
Databricks Real Time Project- Preparing for Production: Automating Databricks Notebook-Part 1

The Data Master 4,998 9 months ago

Hello everyone, I am Naval Yemul. Welcome back to Data Master!
Follow me on LinkedIn: https://www.linkedin.com/in/naval-yemul-a5803523/

In this video, we dive into productionizing and automating Databricks notebooks, an essential skill for anyone looking to upskill in Databricks or add impactful projects to their resume. We walk through a real-world example from one of our e-commerce clients on Amazon, who provides raw CSV files in Azure Data Lake Storage (ADLS).

Here's what you'll learn:
- Ingesting raw data into the bronze layer.
- Automating the ingestion process using Databricks workflow jobs.

If you enjoy this video, please like it, share it with your friends and colleagues, and subscribe to the channel. Have specific topics or questions? Leave a comment below! Check out the medallion architecture and other related videos linked in the "i" button.

Links:
Medallion Architecture Explained: https://youtu.be/sbch85VU4IA?si=8w2d275FjvxXzoN5
Creating Storage Credentials and External Locations: https://youtu.be/KBB14qcELD4?si=vaA024iOWkYNqneg
Managed and External Tables: https://youtu.be/p4_QLr6Ybbs?si=6lYw1S7wuBg59EEh
Databricks Playlist: https://www.youtube.com/playlist?list=PL7S7dD8r4QdVzOYRzIG2UJdCaCasqBv1F
Databricks Certification Playlist: https://www.youtube.com/playlist?list=PL7S7dD8r4QdVnjXJQ3aObQPWjlaZwRVDo

0:00 Introduction to the Real Time Project / Understanding the client's requirements
3:22 Databricks Workspace
4:16 Understanding the sample Amazon dataset
7:10 Uploading raw data to the ADLS container
8:37 Exploring External Locations / Storage Credentials
10:18 Preparing the notebook for production
15:11 Renaming columns using toDF
17:17 Writing to an external Delta table
23:31 Explaining widgets
24:31 Creating widgets
29:03 Removing the raw file
29:28 Uploading a new file
33:24 Scheduling the notebook

#Databricks #DataEngineering #Productionizing #Automation #DatabricksNotebooks #AzureDataLake #BigData #EcommerceData #ETL #DataAutomation #CloudComputing #ResumeProjects #DataOps #TechTutorials
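
For readers who want a rough idea of what the notebook steps named in the chapters (renaming columns with toDF, writing to an external Delta table, driving the run with widgets) might look like, here is a minimal PySpark sketch. It is not the code from the video: the helper names `clean_column_names` and `ingest_to_bronze`, the widget names, and the cleaning rule (lower-case, underscores instead of spaces and punctuation) are all assumptions made for illustration. The cleaning step is there because Delta tables by default reject column names that contain spaces or certain special characters, which raw CSV headers often do.

```python
import re


def clean_column_names(columns):
    """Hypothetical helper: trim each name, collapse every run of
    non-alphanumeric characters into one underscore, and lower-case it."""
    cleaned = []
    for name in columns:
        name = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()
        cleaned.append(name)
    return cleaned


def ingest_to_bronze(spark, source_path, target_table, target_path):
    """Hypothetical helper: read raw CSV files, rename the columns with toDF,
    and append the rows to an external Delta table stored at target_path."""
    raw_df = spark.read.option("header", "true").csv(source_path)

    # toDF takes the full list of new column names, in positional order.
    bronze_df = raw_df.toDF(*clean_column_names(raw_df.columns))

    (
        bronze_df.write
        .format("delta")
        .mode("append")
        .option("path", target_path)  # an explicit path makes the table external
        .saveAsTable(target_table)
    )
    return bronze_df


# In a Databricks notebook, where `spark` and `dbutils` are provided by the
# runtime, widgets could supply the parameters (widget names are assumptions):
#
#   dbutils.widgets.text("source_path", "")
#   dbutils.widgets.text("target_table", "")
#   dbutils.widgets.text("target_path", "")
#   ingest_to_bronze(
#       spark,
#       dbutils.widgets.get("source_path"),
#       dbutils.widgets.get("target_table"),
#       dbutils.widgets.get("target_path"),
#   )
```

Keeping the paths and table name in widgets means a workflow job can pass different values on each scheduled run without editing the notebook itself.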
