In this video, we're going to learn how to do naive/basic RAG (Retrieval-Augmented Generation) with llama.cpp on our own machine.
* Mixed Bread AI - https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1
* Llama3 - https://huggingface.co/bartowski/Llama-3-Instruct-8B-SPPO-Iter3-GGUF
* llama.cpp - https://llama-cpp-python.readthedocs.io/en/latest/
* Qdrant - https://github.com/qdrant/qdrant-client
* langchain-text-splitters - https://pypi.org/project/langchain-text-splitters/
* LangChain Q&A with RAG - https://python.langchain.com/v0.1/docs/use_cases/question_answering/
* This Day in AI Podcast - https://podcast.thisdayinai.com/episodes/ep68-we-sonnet-3-5-rabbit-r2-exclusive-openai-voice-delay-gemma-2-and-udio-suno-lawsuit
* Ingestion code - https://github.com/mneedham/LearnDataWithMark/blob/main/llamacpp-rag/import.py
* Querying code - https://github.com/mneedham/LearnDataWithMark/blob/main/llamacpp-rag/query.py
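For reference, the naive RAG flow covered in the video is: split the source text into chunks, embed each chunk, store the vectors, then at query time embed the question, retrieve the most similar chunks, and paste them into the LLM prompt as context. Here is a minimal, self-contained sketch of that flow; it is not the code from import.py/query.py. The bag-of-words "embedding" and in-memory list are stand-ins for mxbai-embed-large-v1 and Qdrant, and the chunker is a crude substitute for langchain-text-splitters, so the pipeline shape runs anywhere without models installed.

```python
import math
from collections import Counter

def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into fixed-size word chunks (langchain-text-splitters
    offers smarter, overlap-aware splitting)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    """Placeholder embedding: a bag-of-words count vector
    (the video uses mxbai-embed-large-v1 via llama.cpp instead)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(store: list[str], query: str, k: int = 2) -> list[str]:
    """Rank stored chunks by similarity to the query and return the top k
    (Qdrant does this server-side over real embedding vectors)."""
    q = embed(query)
    return sorted(store, key=lambda c: cosine(embed(c), q), reverse=True)[:k]

# Ingestion: chunk the document and keep the chunks as our "vector store".
store = chunk(
    "llama.cpp runs GGUF models locally. Qdrant stores embedding vectors. "
    "The podcast transcript is split into chunks before ingestion."
)

# Querying: retrieve the most relevant chunks for the question.
context = retrieve(store, "Which database stores the vectors?")
# These retrieved chunks would be stitched into the Llama 3 prompt as context.
```

The design point this illustrates: retrieval and generation are decoupled, so you can swap the toy pieces here for the real embedding model and vector database without changing the overall shape of the pipeline.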