Lakehouse data validation with Great Expectations in Microsoft Fabric

10+ hours of FREE Fabric Training: https://www.skool.com/microsoft-fabric/classroom/d154aad4?md=3b108b0e216c46c88d891407ccd8647b
End-to-end project playlist: https://www.youtube.com/playlist?list=PLug2zSFKZmV1BHk129X_2xi53ZRwQaTkb
GitHub code for the notebooks: https://github.com/LearnMicrosoftFabric
Great Expectations docs: https://docs.greatexpectations.io/docs/

When users spot errors in your data or dashboards, they lose trust immediately, and that trust can be very hard to regain. So in this video, we look at (in my biased opinion) THE most important part of any data analysis, business intelligence, or data science workflow: data validation. We learn how to implement Great Expectations, an industry-standard Python library for data testing and validation.

The video contains two parts, with a separate notebook for each part:
Part 1: Initial setup and configuration of Great Expectations within Microsoft Fabric
Part 2: Notebook to run validation on new datasets (for example, when loading and validating data between the bronze and silver layers of a medallion architecture)

--BROWSE MY OTHER FABRIC PLAYLISTS--
DATA ENGINEERING https://www.youtube.com/playlist?list=PLug2zSFKZmV1NvKfnRzG9e3Fl-8QLD5MK
END-TO-END FABRIC PROJECT https://www.youtube.com/playlist?list=PLug2zSFKZmV1BHk129X_2xi53ZRwQaTkb
INTRO TO MICROSOFT FABRIC https://www.youtube.com/playlist?list=PLug2zSFKZmV0Yaya7NxRQfrrPtfF2vj0K
DATA FACTORY https://www.youtube.com/playlist?list=PLug2zSFKZmV3FkUFDxlyrfSJMQ3CeqyEF

#microsoftfabric #lakehouse #datavalidation #greatexpectations

--TIMELINE--
0:00 Coming up
0:29 Why this is so important
4:43 End-to-end project recap
5:37 Plan for this video
6:11 Intro to Great Expectations
8:00 NOTEBOOK 1 START: Installing Great Expectations
9:48 Setting up the GX Data Context
12:47 Adding data sources/assets to the Context
15:07 Defining our tests (Expectations)
17:46 Defining and running a checkpoint
19:50 Initial look at results
21:37 IMPORTANT! Copying configuration to Lakehouse Files
24:20 NOTEBOOK 2 START
24:57 Re-initialize data context from Files
26:03 Feeding in fresh data and running the validation
29:01 Handling the results
34:08 Wrapping up

--LINKEDIN--
Not following the LinkedIn page yet? Here's the link: https://www.linkedin.com/company/learnmicrosoftfabric/

--ABOUT WILL--
Hi, I'm Will! I'm hugely passionate about data and using it to create a better world. I currently work as a Consultant, focusing on Data Strategy, Data Engineering and Business Intelligence (within the Microsoft/Azure/Fabric environment). I have previously worked as a Data Scientist. I started Learn Microsoft Fabric to share my learnings on how Microsoft Fabric works and help you build your career and build meaningful things in Fabric.

--SUBSCRIBE--
Not subscribed yet? You should! There are lots of new videos in the pipeline covering all aspects of Microsoft Fabric. https://youtube.com/@LearnMicrosoftFabric?sub_confirmation=1
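
--ILLUSTRATIVE SKETCH--
The two-part workflow described above (define a set of tests once, then run them against fresh data before promoting it from bronze to silver) can be sketched in plain Python. This is only a rough illustration of the idea and deliberately does NOT use the Great Expectations API: the helpers `not_null`, `between`, and `run_checkpoint`, and the sample rows, are all hypothetical names made up for this example. The actual notebooks are in the GitHub repo linked above.

```python
from typing import Any, Callable

# Hypothetical types for this sketch; none of this is the Great Expectations API.
Row = dict[str, Any]
Expectation = tuple[str, Callable[[list[Row]], bool]]


def not_null(column: str) -> Expectation:
    """Expectation: every row has a non-null value in `column`."""
    return (
        f"not_null({column})",
        lambda rows: all(r.get(column) is not None for r in rows),
    )


def between(column: str, low: float, high: float) -> Expectation:
    """Expectation: every value in `column` is present and within [low, high]."""
    return (
        f"between({column}, {low}, {high})",
        lambda rows: all(
            r.get(column) is not None and low <= r[column] <= high for r in rows
        ),
    )


def run_checkpoint(rows: list[Row], suite: list[Expectation]) -> dict[str, Any]:
    """Run every expectation in the suite and report overall success."""
    results = {name: check(rows) for name, check in suite}
    return {"success": all(results.values()), "results": results}


# "Notebook 1" idea: define the suite of tests once.
suite = [not_null("id"), between("amount", 0, 1000)]

# "Notebook 2" idea: validate a fresh bronze batch before loading it to silver.
bronze_batch = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 25.5}]
report = run_checkpoint(bronze_batch, suite)
if report["success"]:
    print("Validation passed: safe to load into silver")
else:
    print("Validation failed:", report["results"])
```

In this sketch, a failed report would be the signal to stop the load and investigate, which is the "handling the results" step in spirit.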