Get the slides: https://www.datacouncil.ai/talks/building-systems-to-monitor-data-and-model-health-in-production-systems
ABOUT THE TALK
Unlike traditional deterministic software, models in production start to degrade as soon as they have been deployed.
Typically these degradations are caused by three broad types of problems:
1. Bugs in the data pipeline: These can manifest in many different ways, e.g. someone changes code upstream and a particular column is now filled with NaNs.
2. Input distribution shifts: These are caused by changes in real-world phenomena, e.g. a bank has a churn prediction model and a competitor has just launched a new campaign. As customer behavior changes in response, the distribution of the model's inputs shifts.
3. Concept drift: This happens when the actual relationship between the input and output changes.
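As a rough illustration of how each of the three problem types might be checked, here is a minimal Python sketch. It is not taken from the talk: the function names, the choice of the population stability index (PSI) as the drift measure, and the use of accuracy on recently labeled data as a proxy for concept drift are all assumptions made for the example.

```python
import math


def nan_fraction(values):
    """Pipeline-bug check: fraction of entries that are None or NaN."""
    if not values:
        return 0.0
    missing = sum(
        1 for v in values
        if v is None or (isinstance(v, float) and math.isnan(v))
    )
    return missing / len(values)


def psi(reference, current, bins=10, eps=1e-6):
    """Input-shift check: population stability index of `current`
    against `reference`, using equal-width bins over the reference range.
    Values outside that range are clamped into the first or last bin."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant reference column; avoid dividing by zero

    def proportions(sample):
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0)
        return [max(c / len(sample), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def accuracy(y_true, y_pred):
    """Concept-drift proxy: accuracy on recently labeled examples,
    to be compared against the accuracy measured at training time."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: a training-time sample and a shifted production sample.
reference = [float(x) for x in range(100)]
shifted = [x + 50.0 for x in reference]

print(nan_fraction([1.0, float("nan"), None, 2.0]))
print(psi(reference, reference))
print(psi(reference, shifted))
```

A commonly quoted rule of thumb treats a PSI above roughly 0.2 as a shift worth investigating, but any alert threshold should be tuned per feature.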
Traditional testing infrastructure is not currently suited to handling the dynamic nature of machine learning models.
In this talk, Mohammed Ridwanul, Product Manager at Dessa, will speak about writing tests to monitor machine learning models, creating data contracts and metadata stores for reference checks and building automated systems around model testing to prevent degradation in production.
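To make the idea of a data contract concrete, here is a minimal Python sketch of one way it could be expressed and enforced. It is an assumption for illustration only and does not reflect how Dessa's tooling works: the `ColumnContract` class, the `check_row` function, and the example column names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnContract:
    """Hypothetical contract for one column: expected type and allowed range."""
    dtype: type
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    nullable: bool = False


def check_row(row, contract):
    """Return a list of human-readable contract violations for one record."""
    violations = []
    for name, rule in contract.items():
        if name not in row:
            violations.append(f"{name}: missing column")
            continue
        value = row[name]
        if value is None:
            if not rule.nullable:
                violations.append(f"{name}: unexpected null")
            continue
        if not isinstance(value, rule.dtype):
            violations.append(
                f"{name}: expected {rule.dtype.__name__}, "
                f"got {type(value).__name__}"
            )
            continue
        if rule.min_value is not None and value < rule.min_value:
            violations.append(f"{name}: below minimum {rule.min_value}")
        if rule.max_value is not None and value > rule.max_value:
            violations.append(f"{name}: above maximum {rule.max_value}")
    return violations


# Hypothetical contract for a churn model's input records.
contract = {
    "age": ColumnContract(int, min_value=0, max_value=120),
    "balance": ColumnContract(float, nullable=True),
}

print(check_row({"age": 34, "balance": None}, contract))
print(check_row({"age": -1}, contract))
```

Storing contracts like this alongside reference statistics is one way a metadata store could back automated checks; the exact schema is a design choice.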
ABOUT THE SPEAKER
Mohammed Ridwanul is a Product Manager at Dessa, where he leads the design, development, and execution of Foundations, Dessa's Machine Learning Infrastructure and Continuous Delivery Toolkit. Mohammed focuses on two products within the toolkit: Atlas, which helps organizations convert their infrastructure into an ML Platform-as-a-Service for their data scientists, and Orbit, which helps with monitoring ML models in production.
Prior to Dessa, Mohammed worked on Shopify's Data Platform Engineering team, where he built some of Shopify's core data streaming libraries surrounding Apache Kafka. Mohammed holds a Bachelor's degree in Electrical & Computer Engineering from the University of Waterloo.
ABOUT DATA COUNCIL:
Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers. Make sure to subscribe to our channel for more videos, including DC_THURS, our series of live online interviews with leading data professionals from top open source projects and startups.
FOLLOW DATA COUNCIL:
Twitter: https://twitter.com/DataCouncilAI
LinkedIn: https://www.linkedin.com/company/datacouncil-ai
Facebook: https://www.facebook.com/datacouncilai
Eventbrite: https://www.eventbrite.com/o/data-council-30357384520