In this video, we set up an end-to-end data engineering project using Apache Spark, Azure Databricks, and Data Build Tool (DBT), with Azure as our cloud provider. The project illustrates data ingestion into the lakehouse, data integration with Azure Data Factory (ADF), and data transformation with Databricks and DBT.
Timestamps:
0:00 Introduction
0:49 System Architecture
3:01 Creating resource groups on Azure
5:02 Setting up the medallion architecture storage account
8:46 Setting up Azure Data Factory
10:18 Azure Key Vault setup for secrets
14:19 Azure database with automatic data population
25:32 Azure Data Factory pipeline orchestration
47:00 Setting up Databricks
49:50 Azure Databricks Secret Scope and Key Vault
54:33 Verifying Databricks - Key Vault - Secret Scope Integration
1:06:00 Azure Data Factory - Databricks Integration
1:21:19 DBT Setup
1:24:15 DBT Configuration with Azure Databricks
1:32:12 DBT Snapshots with Azure Databricks and ADLS Gen2
1:45:06 DBT Data Marts with Azure Databricks and ADLS Gen2
1:55:00 DBT Documentation
1:58:58 Outro
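As a rough illustration of the medallion storage layout set up at 5:02, the sketch below builds ADLS Gen2 `abfss://` URIs for the bronze, silver, and gold layers. It assumes one container per layer, and the storage account name `mystorageacct` and the helper `layer_uri` are hypothetical placeholders, not names taken from the video.

```python
# Minimal sketch: build ADLS Gen2 URIs for a medallion layout.
# Assumption: each layer (bronze/silver/gold) is its own container.
# "mystorageacct" and layer_uri are hypothetical, for illustration only.
LAYERS = ("bronze", "silver", "gold")


def layer_uri(account: str, layer: str, path: str = "") -> str:
    """Return the abfss:// URI for a path inside one medallion layer."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    base = f"abfss://{layer}@{account}.dfs.core.windows.net/"
    # Strip a leading slash so the result never contains a double slash.
    return base + path.lstrip("/")


if __name__ == "__main__":
    for layer in LAYERS:
        print(layer_uri("mystorageacct", layer, "sales/orders"))
```

Raw data would typically land in bronze, cleaned data in silver, and reporting-ready marts in gold; the exact container names used in the video may differ.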
Resources:
Medium Article: https://medium.com/@yusuf.ganiyu/robust-data-pipelines-with-databricks-spark-dbt-and-azure-data-engineering-project-e5780fbc07a6
Full Code: https://github.com/airscholar/modern-data-eng-dbt-databricks-azure
If you find our content valuable, support us by joining our channel membership, where you'll get exclusive access to behind-the-scenes content, Q&A sessions, and much more!
https://www.youtube.com/@CodeWithYu/join
💬 Join the Conversation:
We love hearing from you! Share your thoughts, questions, or experiences related to data engineering or this project in the comments below. Don't forget to like, subscribe, and hit the bell icon to stay updated with our latest content.
Tags:
Big Data, Data Engineering, Apache Spark, Databricks, DBT, Azure, Cloud Computing, Data Analytics, ETL, Data Warehouse, Technology, Analytics, Machine Learning, Data Science
Hashtags:
#BigData, #DataEngineering, #ApacheSpark, #Databricks, #DBT, #Azure, #CloudComputing, #DataAnalytics, #ETL, #DataWarehouse, #TechTalk, #MachineLearning, #DataScience, #BigDataAnalytics
🙏 Thank You for Watching!
Stay curious and keep exploring the fascinating world of data engineering!