Multimodal RAG: A Beginner-friendly Guide (with Python Code)

Shaw Talebi 8,596 views 4 months ago

Get exclusive access to AI resources and project ideas: https://the-data-entrepreneurs.kit.com/shaw

Multimodal RAG (MRAG) improves an AI model's responses by giving it relevant information stored in both text and non-text formats. Here, I discuss three ways to build an MRAG system and share an example implementation in Python.
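To make the idea concrete, here is a minimal sketch of the retrieval step, not the code from the video or the repo. It assumes text chunks and images have already been embedded into one shared vector space; the item list, the three-dimensional toy vectors, the file path, and the helper names `retrieve` and `build_prompt` are all hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, items, k=2):
    """Return the k items most similar to the query, regardless of modality."""
    ranked = sorted(
        items,
        key=lambda item: cosine_similarity(query_embedding, item["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, retrieved):
    """Split retrieved items into text context and image attachments.

    Returns the prompt text plus a list of image paths that a multimodal
    model would receive alongside it.
    """
    text_context = [i["content"] for i in retrieved if i["modality"] == "text"]
    image_paths = [i["content"] for i in retrieved if i["modality"] == "image"]
    prompt = "Context:\n" + "\n".join(text_context) + "\n\nQuestion: " + question
    return prompt, image_paths


# Toy knowledge base: the embeddings are made-up numbers, not model outputs.
items = [
    {"modality": "text", "content": "RAG adds retrieved context to a prompt.",
     "embedding": [0.9, 0.1, 0.0]},
    {"modality": "image", "content": "figures/rag_diagram.png",
     "embedding": [0.8, 0.2, 0.1]},
    {"modality": "text", "content": "Gradio builds simple web demos.",
     "embedding": [0.0, 0.1, 0.9]},
]

query_embedding = [1.0, 0.0, 0.0]  # stand-in for an embedded user question
top_items = retrieve(query_embedding, items, k=2)
prompt, image_paths = build_prompt("What is RAG?", top_items)
```

In a real system the vectors would come from a multimodal embedding model and the prompt plus images would be passed to a multimodal LLM; both steps are left out here.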

Resources:
📰 Blog: https://medium.com/towards-data-science/multimodal-rag-process-any-file-type-with-ai-e6921342c903?sk=dabb0a46b1c53c3072f8f61772afa554
💻 GitHub Repo: https://github.com/ShawhinT/YouTube-Blog/tree/main/multimodal-ai

References:
[1] RAG: https://youtu.be/Ylz779Op9Pw
[2] Multimodal LLMs: https://youtu.be/Ot2c5MKN_-w
[3] Multimodal Embeddings: https://youtu.be/YOvxh_ma5qE

--
Homepage: https://www.shawhintalebi.com

Introduction - 0:00
What is RAG? - 1:12
Multimodal RAG (MRAG) - 4:01
3 Levels of MRAG - 5:26
Example code: Multimodal Blog QA Assistant - 10:52
Demo (Gradio) - 24:44
Limitations - 25:28
