Train an LLM to Self-Correct with Verifiable Backtracking

Trelis Research 3,852 2 months ago

📜 Get repo access at Trelis.com/ADVANCED-fine-tuning

Tip: If you subscribe here on YouTube, click the bell to be notified of new vids

🛠 Build & Deploy Faster
Fine-tuning, Inference, Audio, Evals, and Vision Tools: https://trelis.com

💡 Need Technical or Market Assistance?
Book a Consult Here: https://forms.gle/wJXVZXwioKMktjyVA

🤝 Are You a Top Developer?
Join the Trelis team: https://trelis.com/developer-collaborations/

💸 Starting a New Project/Venture?
Apply for a Trelis Grant: https://trelis.com/trelis-ai-grants/

📧 Get Trelis AI Tutorials by Email
Subscribe on Substack: https://trelis.substack.com

📸 Thumbnail Tutorial
See How It’s Made: https://youtu.be/ThKYjTdkyP8

Video Links:
- s1 paper: https://arxiv.org/pdf/2501.19393

TIMESTAMPS:
00:00 Introduction to Verifiable Backtracking
00:45 Understanding Backtracking in LLMs
01:55 Budget Forcing Technique
06:01 Verifiable Backtracking Explained
10:49 Implementing Verifiable Backtracking in Code
17:39 Evaluating the Performance
22:43 Conclusion and Final Thoughts
