Azure End-To-End Data Engineering Project | Azure Data Factory | Databricks | Delta Lake
In this 4+ hour video, we build an end-to-end Azure data engineering project using Azure Data Factory, Databricks, PySpark, and Delta Lake. Whether you're a beginner or looking to sharpen your skills, the video walks through each stage of a data pipeline, from ingesting data with Azure Data Factory to connecting Databricks with a BI tool.
🔍 What You'll Learn:
- Real-time scenarios in Azure Data Factory
- Databricks Tutorial (detailed) along with PySpark
- How to use Delta Tables (Delta Lake) for efficient data storage and management
- Best practices for using Medallion Architecture in data workflows
- Common interview questions in the end-to-end data engineering domain
Data Source Link: https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page
CSV Files Link: https://github.com/anshlambagit/NYC-TAXI-DE-Project/tree/main/Raw%20Data
Databricks Notebook: https://github.com/anshlambagit/NYC-TAXI-DE-Project/tree/main/Raw%20Data
PySpark Full Course: https://youtu.be/94w6hPk7nkM?si=Y2EUnifXrZRK3MOj
Data Engineering Masterclass: https://youtu.be/ZRz-7E-7X7c?si=bzl2xL1ngZodTx-M
Azure Synapse Analytics Project: https://youtu.be/0GTZ-12hYtU?si=_9Gyh6KvDt5HX_OT
Timestamps:
0:00 Introduction
3:20 Data Architecture
9:06 Medallion Architecture
14:45 Azure Fundamentals
20:31 Azure Free Account
23:24 Data Understanding (NYC Taxi Data)
34:30 Azure Resource Group
38:04 Azure Storage Account
40:45 Azure Data Lake
47:59 Azure Data Factory Tutorial
59:15 Data Ingestion with Azure Data Factory using API
1:11:43 Azure Data Factory Real-Time scenario
1:37:44 Dynamic Data Pipelines in Azure Data Factory
1:55:49 Access Data Lake using Databricks
2:01:40 Azure Databricks Free Account
2:04:34 Databricks Overview
2:05:40 Databricks Cluster
2:24:29 Reading data using PySpark
2:45:13 Data Transformation using PySpark functions
3:14:04 Data Analysis using PySpark
3:18:23 Managed Tables vs External Delta Tables (Delta Lake)
3:36:17 Delta Tables in Databricks using PySpark
3:43:50 What is Delta Log (Delta Lake)
3:48:15 Querying Delta Lake
3:50:55 Data Versioning in Delta Tables
4:00:00 Time Travel in Delta Tables (Delta Lake)
4:10:33 Connecting Databricks with BI tool
Connect with me: https://www.linkedin.com/in/ansh-lamba-793681184/
Please hit the SUBSCRIBE button ❤️ to support me and my hard work.
⭐Hashtags⭐
#azure #dataengineering #databricks #deltalake