☕ Buy me a coffee:
To support the channel and encourage new videos, please consider buying me a coffee here:
https://ko-fi.com/bugbytes
⭐ Top resource to learn Python - https://datacamp.pxf.io/kOjKkV ⭐
In this video, we'll take a look at the ChromaDB vector database, which can be used to store embedding data and retrieve embeddings that are most similar to an input query.
We'll load and embed a real-life text dataset, then query it for similar vectors. We'll also look at Chroma's client options for in-memory and persistent databases, and at how to integrate with OpenAI's embeddings API.
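As a rough illustration of what "retrieving the embeddings most similar to a query" means, here is a minimal pure-Python sketch. It does not use ChromaDB or OpenAI: the `store` dictionary, the three-dimensional toy vectors, and the helper names `cosine_similarity` and `most_similar` are all made up for this example, and a real vector database would typically use an index rather than scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, store, n_results=2):
    """Return the n_results (id, score) pairs with the highest similarity."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n_results]


# Toy "embeddings" (hypothetical values; real ones have hundreds of dimensions).
store = {
    "sports": [0.9, 0.1, 0.0],
    "politics": [0.1, 0.9, 0.1],
    "football": [0.8, 0.2, 0.1],
}

query = [1.0, 0.0, 0.0]
top = most_similar(query, store)
print([doc_id for doc_id, _ in top])
```

The brute-force scan is fine for a handful of vectors. The point of a vector database is to do the same lookup efficiently over a large collection, and to store documents and metadata alongside the vectors.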
Chapters:
00:00 Intro
00:56 ChromaDB introduction
02:18 Creating a ChromaDB client and collections
03:32 Adding documents to a collection
08:45 Passing filters to collection queries
10:24 Reading in real-life dataset with Polars
13:10 Creating Embeddings with OpenAI APIs
18:11 Adding OpenAI vectors to ChromaDB
28:27 Persisting the ChromaDB database
Social Media:
Blog: https://bugbytes.io/posts/vector-databases-pgvector-and-langchain/
GitHub: https://github.com/bugbytes-io/
Twitter: https://twitter.com/bugbytesio
Further reading and information:
ChromaDB: https://docs.trychroma.com/
ChromaDB Embeddings: https://docs.trychroma.com/guides/embeddings
ChromaDB Integrations: https://docs.trychroma.com/integrations
Kaggle News Articles Dataset: https://www.kaggle.com/datasets/asad1m9a9h6mood/news-articles
#python #chromadb #datascience