In this video we learn how to make Retrieval Augmented Generation (RAG) super fast for chatbots, Large Language Models (LLMs), and agents. We focus on designing RAG-powered conversational agents that use NVIDIA's NeMo Guardrails to decide when to call retrieval tools.
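As a rough sketch of the approach covered in the video: NeMo Guardrails uses Colang flows to route user questions to retrieval actions instead of letting the LLM decide via an agent loop. The flow below is illustrative only — the action names `retrieve` and `rag` and the exact intent examples are assumptions; see the linked notebook for the actual config.

```
define user ask question
  "what is retrieval augmented generation?"
  "tell me about llama 2"

define flow answer question
  user ask question
  # hypothetical action names registered from Python via register_action
  $contexts = execute retrieve(query=$last_user_message)
  $answer = execute rag(query=$last_user_message, contexts=$contexts)
  bot $answer
```

Because the routing decision is made by lightweight intent matching rather than a full agent reasoning step, the retrieval path avoids an extra LLM call, which is where the speedup comes from.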
Article:
https://www.pinecone.io/learn/fast-retrieval-augmented-generation/
Code:
https://github.com/pinecone-io/examples/blob/master/learn/generation/chatbots/nemo-guardrails/03-rag-with-actions.ipynb
Subscribe for Latest Articles and Videos:
https://www.pinecone.io/newsletter-signup/
AI Consulting:
https://aurelio.ai
Discord:
https://discord.gg/c5QtDB9RAP
Twitter: https://twitter.com/jamescalam
LinkedIn: https://www.linkedin.com/in/jamescalam/
00:00 Making RAG Faster
00:20 Different Types of RAG
01:03 Naive Retrieval Augmented Generation
02:22 RAG with Agents
05:06 Making RAG Faster
08:55 Implementing Fast RAG with Guardrails
11:02 Creating the Vector Database
12:52 RAG Functions in Guardrails
14:32 Guardrails Colang Config
16:13 Guardrails Register Actions
17:03 Testing RAG with Guardrails
19:42 RAG, Agents, and LLMs