Building Robust Data Pipelines for Modern Data Engineering | End to End Data Engineering Project

CodeWithYu 70,075 1 year ago

In this video, we set up an end-to-end data engineering project using Apache Spark, Azure Databricks, and Data Build Tool (DBT), with Azure as our cloud provider. The project illustrates data ingestion into the lakehouse, data integration with Azure Data Factory (ADF), and data transformation with Databricks and DBT.

Timestamps:
0:00 Introduction
0:49 System Architecture
3:01 Creating resource groups on Azure
5:02 Setting up the medallion architecture storage account
8:46 Setting up Azure Data Factory
10:18 Azure Key Vault setup for secrets
14:19 Azure database with automatic data population
25:32 Azure Data Factory pipeline orchestration
47:00 Setting up Databricks
49:50 Azure Databricks Secret Scope and Key Vault
54:33 Verifying Databricks - Key Vault - Secret Scope Integration
1:06:00 Azure Data Factory - Databricks Integration
1:21:19 DBT Setup
1:24:15 DBT Configuration with Azure Databricks
1:32:12 DBT Snapshots with Azure Databricks and ADLS Gen2
1:45:06 DBT Data Marts with Azure Databricks and ADLS Gen2
1:55:00 DBT Documentation
1:58:58 Outro

Resources:
Medium Article: https://medium.com/@yusuf.ganiyu/robust-data-pipelines-with-databricks-spark-dbt-and-azure-data-engineering-project-e5780fbc07a6
Full Code: https://github.com/airscholar/modern-data-eng-dbt-databricks-azure

If you find our content valuable, support us by joining our channel membership, where you'll get exclusive access to behind-the-scenes content, Q&A sessions, and much more! https://www.youtube.com/@CodeWithYu/join

💬 Join the Conversation: We love hearing from you! Share your thoughts, questions, or experiences related to data engineering or this project in the comments below. Don't forget to like, subscribe, and hit the bell icon to stay updated with our latest content.
Tags: Big Data, Data Engineering, Apache Spark, Databricks, DBT, Azure, Cloud Computing, Data Analytics, ETL, Data Warehouse, Technology, Analytics, Machine Learning, Data Science

Hashtags: #BigData, #DataEngineering, #ApacheSpark, #Databricks, #DBT, #Azure, #CloudComputing, #DataAnalytics, #ETL, #DataWarehouse, #TechTalk, #MachineLearning, #DataScience, #BigDataAnalytics

🙏 Thank You for Watching! Remember to subscribe and hit the bell icon for notifications. Stay curious and keep exploring the fascinating world of data engineering!
