Advanced Data Prep and Visualisation Techniques for Fine-tuning LLMs

Trelis Research 1,250 5 days ago

📜 Get repo access at Trelis.com/ADVANCED-fine-tuning

Tip: If you subscribe here on YouTube, click the bell to be notified of new vids

💡 Need Technical or Market Assistance? Book a Consult Here: https://forms.gle/wJXVZXwioKMktjyVA

🤝 Are You a Top Developer? Work for Trelis: https://trelis.com/jobs/

💸 Starting a New Project/Venture? Apply for a Trelis Grant: https://trelis.com/trelis-ai-grants/

📧 Get Trelis AI Tutorials by Email
Subscribe on Substack: https://trelis.substack.com

Video Links:
- slides: https://docs.google.com/presentation/d/1VOtBNgmz1gutHQbtyDHC8Tfxtj1Ychpn_-1pDLkGGuk/edit?usp=sharing

TIMESTAMPS:
0:00 Advanced Data Preparation Techniques
0:33 Video Overview
1:52 Synthetic Dataset Generation Goals
3:48 Synthetic Data Generation Pipeline
5:34 Document Ingestion Approaches (e.g. pdf to markdown) - comparing markitdown marker and Gemini
13:44 Chunking Approaches and Trade-offs
22:45 Question-Answer Pair Generation Approaches
31:56 Q-A pair visualization with embeddings or tags AND how to choose a model for synthetic data generation
44:29 How to create an Evaluation Dataset? Best Practice.
54:41 Preview of the upcoming fine-tuning video
