Data Versioning: Towards Reproducibility in Machine Learning - Nicolás Eiris - TryoLabs

DVCorg 496 2 years ago

Nicolás Eiris, Machine Learning Engineer at Tryolabs, presents the "Data Versioning: Towards Reproducibility in Machine Learning" tutorial at the May 2022 Embedded Vision Summit.

Surprisingly, in 2022 reproducibility is still a major pain point in most data science workflows. Version control is a critical requirement for reproducibility, yet machine learning notoriously lacks standards for it, so developers typically resort to crafting ad hoc workflows. Developers also frequently reinvent the wheel because they are unaware of existing solutions.

In this talk, Eiris introduces DVC, short for "Data Version Control," an open-source tool that Tryolabs has found can significantly alleviate the pain of reproducibility in data science workflows. He covers the motivation for such a tool, digs into its main features, and aims to convince you that integrating it into your next project will make your life much better. Everything is illustrated through a real-world example of an end-to-end ML pipeline.

See more from the Embedded Vision Summit here: https://www.youtube.com/@UCoyivR_HZGuzPtsOXMMg3zg
And learn about the conference here: https://embeddedvisionsummit.com/

-------

*Try out the DVC Extension for VS Code here:* https://marketplace.visualstudio.com/items?itemName=Iterative.dvc

To learn more about Iterative's open-source and SaaS tools, please visit:

🧑🏽‍💻 *Our free online course:* https://learn.iterative.ai

✍🏼 *Our docs:*
https://dvc.org/doc (Data Version Control, Pipelines, Experiments)
https://cml.dev/doc (CI/CD for Machine Learning)
https://mlem.ai/doc (Package and Serve your models)
https://studio.iterative.ai (Team Collaboration, Experiments, Model Registry)

*Join the Community on our Discord server:* https://discord.gg/W49xzNmycw

For more information on the HighLoad Conference: https://highload.rs/2023/

#dvc #machinelearning #datascience #generativeai
