How unsupervised machine learning can scale data quality monitoring in Databricks

Databricks 4,318 views 2 years ago

Technologies like Databricks Delta Lake and Databricks SQL enable enterprises to store and query their data. But existing rules-based and metrics-based approaches to monitoring the quality of this data are tedious to set up and maintain, fail to catch unexpected issues, and generate false-positive alerts that lead to alert fatigue.

In this talk, Jeremy will describe a set of fully unsupervised machine learning algorithms for monitoring data quality at scale in Databricks. He will cover how the algorithms work, their strengths and weaknesses, and how they are tested and calibrated.

Participants will leave this talk with an understanding of unsupervised data quality monitoring, its strengths and weaknesses, and how to begin using it to monitor data in Databricks.

Connect with us:
Website: https://databricks.com
Facebook: https://www.facebook.com/databricksinc
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/data...
Instagram: https://www.instagram.com/databricksinc/
