In this video, we'll learn how to build a Retrieval Augmented Generation (RAG) pipeline around a Large Language Model (LLM), using open-source models from Hugging Face deployed on AWS SageMaker. A MiniLM sentence transformer creates the embeddings for our semantic search component, and Pinecone serves as the vector index.
📌 Code:
https://github.com/pinecone-io/examples/blob/master/learn/generation/aws/sagemaker/sagemaker-huggingface-rag.ipynb
📕 Article:
https://www.pinecone.io/learn/sagemaker-rag/
🌲 Subscribe for Latest Articles and Videos:
https://www.pinecone.io/newsletter-signup/
👋🏼 AI Consulting:
https://aurelio.ai
👾 Discord:
https://discord.gg/c5QtDB9RAP
Twitter: https://twitter.com/jamescalam
LinkedIn: https://www.linkedin.com/in/jamescalam/
00:00 Open Source LLMs on AWS SageMaker
00:27 Open Source RAG Pipeline
04:25 Deploying Hugging Face LLM on SageMaker
08:33 LLM Responses with Context
10:39 Why Retrieval Augmented Generation
11:50 Deploying our MiniLM Embedding Model
14:34 Creating the Context Embeddings
19:49 Downloading the SageMaker FAQs Dataset
20:23 Creating the Pinecone Vector Index
24:51 Making Queries in Pinecone
25:58 Implementing Retrieval Augmented Generation
30:00 Deleting our Running Instances
#artificialintelligence #nlp #aws #opensource #chatbot