A better Hugging Face model search with OpenAI, RAG, pgvector

Efficient NLP 1,688 1 year ago

Try Voice Writer - speak your thoughts and let AI handle the grammar: https://voicewriter.io

In this tutorial, learn how to build a chatbot that recommends Hugging Face models using the RAG (retrieval-augmented generation) pattern. We use OpenAI embeddings and chat models. Along the way, learn how to combine pgvector, response reranking, and streaming protocols, and how to overcome resource constraints in deployment.

This tool has been discontinued and is no longer available for use. Sorry for the inconvenience.

0:00 - Introduction
0:25 - Designing the model chat tool
1:44 - Retrieval Augmented Generation (RAG)
3:17 - Scraping Hugging Face models and readmes
5:01 - Trying out LlamaIndex
6:28 - Which model to use?
7:51 - Generating embeddings
8:40 - Implementing the bot in Python
11:08 - Popularity reranking
12:34 - Results so far
15:09 - Deploying the app
16:55 - Optimizing memory usage
17:52 - Comparing vector databases and pgvector
19:55 - Streaming protocol and server-sent events
21:34 - Final demo
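
For readers who want a feel for the pipeline before watching, here is a minimal sketch of the retrieve, rerank, and stream steps named in the description. It is not the code from the video. The catalog entries, the 3-dimensional vectors, the `weight` parameter, and the log-download blending formula are all illustrative assumptions. In the real tool the vectors would come from OpenAI embeddings and the similarity search would run in pgvector rather than in a Python loop.

```python
import math

# Hypothetical toy catalog. Real embeddings would come from an embedding
# model applied to scraped readmes; these 3-d vectors are made up.
MODELS = [
    {"id": "org/sentiment-small", "downloads": 120, "embedding": [0.9, 0.1, 0.0]},
    {"id": "org/sentiment-popular", "downloads": 2_000_000, "embedding": [0.8, 0.2, 0.1]},
    {"id": "org/translation-base", "downloads": 500_000, "embedding": [0.0, 0.1, 0.9]},
]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, models, k=2):
    """Return the k (similarity, model) pairs closest to the query."""
    scored = [(cosine_similarity(query_embedding, m["embedding"]), m) for m in models]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def rerank_by_popularity(scored, weight=0.05):
    """Reorder retrieved models by similarity plus a small popularity bonus.

    The bonus (weight * log10(1 + downloads)) is an assumed formula chosen
    for illustration, not the one used in the video.
    """
    def blended(pair):
        similarity, model = pair
        return similarity + weight * math.log10(1 + model["downloads"])

    return [model for _, model in sorted(scored, key=blended, reverse=True)]


def build_prompt(question, models):
    """Assemble the retrieved models into context for a chat model."""
    context = "\n".join(f"- {m['id']} ({m['downloads']} downloads)" for m in models)
    return f"Candidate models:\n{context}\n\nQuestion: {question}"


def sse_event(text):
    """Frame a chunk of text as one server-sent event.

    Each line of the payload gets its own "data:" field, and a blank line
    terminates the event.
    """
    return "".join(f"data: {line}\n" for line in text.split("\n")) + "\n"


if __name__ == "__main__":
    query = [1.0, 0.0, 0.0]  # stand-in for an embedded user question
    ranked = rerank_by_popularity(retrieve(query, MODELS))
    print(build_prompt("Which model for sentiment analysis?", ranked))
    print(sse_event("first streamed chunk"), end="")
```

Reranking only the retrieved candidates, rather than the whole catalog, keeps popularity as a tie-breaker among relevant models instead of letting heavily downloaded but unrelated models dominate.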
