With the advent of Large Language Models boasting context windows of 1 million tokens, many have declared retrieval-augmented generation (RAG) dead and no longer necessary. But is this actually true? Or simply short-sighted?
Join Timescale and special guest presenter Lance Martin, engineer at @LangChain, for a deep dive into the present and future of RAG in the era of long-context LLMs.
🛠 𝗥𝗲𝗹𝗲𝘃𝗮𝗻𝘁 𝗥𝗲𝘀𝗼𝘂𝗿𝗰𝗲𝘀
📌 Free trial of Timescale Vector ⇒ https://tsdb.co/webinar-signup
📌 Presentation slides ⇒ https://tsdb.co/is-rag-dead-slides
📌 Twitter thread with paper links ⇒ https://tsdb.co/is-rag-dead-papers
📌 Getting started with LangChain and Timescale Vector tutorial ⇒ https://tsdb.co/langchain-tutorial
🐯 𝗔𝗯𝗼𝘂𝘁 𝗧𝗶𝗺𝗲𝘀𝗰𝗮𝗹𝗲
Timescale is a mature cloud PostgreSQL platform engineered for demanding workloads like time-series, vector, events, and analytics data.
💻 𝗙𝗶𝗻𝗱 𝗨𝘀 𝗢𝗻𝗹𝗶𝗻𝗲!
🔍 Website ⇒ https://tsdb.co/homepage
🔍 Slack ⇒ https://slack.timescale.com
🔍 GitHub ⇒ https://github.com/timescale
🔍 Twitter ⇒ https://twitter.com/timescaledb
🔍 Twitch ⇒ https://www.twitch.tv/timescaledb
🔍 LinkedIn ⇒ https://www.linkedin.com/company/timescaledb
🔍 Timescale Blog ⇒ https://tsdb.co/blog
🔍 Timescale Documentation ⇒ https://tsdb.co/docs
📚 𝗖𝗵𝗮𝗽𝘁𝗲𝗿𝘀
00:00 Introduction
03:08 What is RAG?
03:52 Needle in a Haystack
08:15 RAG isn't dead. But it will change.
10:10 Documents as a minimum retrieval unit
11:32 Representation Indexing
13:06 RAPTOR - Questions that reference many documents
15:14 Reasoning: Self-RAG
17:02 Summary
18:23 Question and Answer (Variety of Topics)
18:39 Cost of long context models
21:52 Needle in a Haystack in different LLMs
24:10 Self-RAG Eval Metrics
25:45 Self-RAG Latency
27:12 Semantic Utility of Long Context Embeddings
28:40 Testing Methods for RAG Pipelines
31:07 RAG Use Cases
37:31 Public Benchmarks for RAG
39:17 Improvements to Needle in a Haystack
41:53 Number of needles and performance
43:13 RAG and Few Shot Training
45:39 RAFT: Retrieval Augmented Fine-Tuning