Databricks Tutorial | PySpark | Azure Databricks | Delta Lake
This 4-hour Databricks Tutorial video covers everything from the fundamentals to advanced concepts, making it perfect for both beginners and experienced professionals.
🔍 What You'll Learn:
- Spark Architecture: Understand the core components and how they work together.
- Databricks Fundamentals: Get familiar with the Databricks environment and its features.
- DBFS (Databricks File System): Learn how to manage files and data efficiently.
- PySpark Transformations: Dive into data manipulation and processing techniques.
- External Data: Discover how to connect and read data from Azure Data Lake.
- Delta Lake: Explore Delta Lake in detail and its advantages for data management.
- Autoloader: Incrementally load data with Spark streaming DataFrames.
- Real-time Scenarios: Apply your knowledge to practical use cases.
- Interview Questions: Prepare for your next job interview with common questions in big data engineering.
Data Source Link : https://github.com/anshlambagit/Databricks-Masterclass/tree/main/Resources
Databricks Notebook : https://github.com/anshlambagit/Databricks-Masterclass/tree/main/Resources
PySpark Full Course : https://youtu.be/94w6hPk7nkM?si=Y2EUnifXrZRK3MOj
Data Engineering Masterclass : https://youtu.be/ZRz-7E-7X7c?si=bzl2xL1ngZodTx-M
End-To-End Azure Data Engineering Project : https://youtu.be/0GTZ-12hYtU?si=_9Gyh6KvDt5HX_OT
Timestamps:
0:00 Introduction
7:01 What is a Spark Cluster
11:26 Spark Architecture
19:38 What is Databricks
20:42 Databricks with Azure, AWS, GCP
24:52 Free Azure Account
27:08 Azure Portal
32:22 Azure Data Lake
40:36 Creating Azure Databricks Workspace
44:30 Azure Databricks Overview
50:37 Databricks Spark Cluster
58:23 Magic Commands in Databricks
1:08:49 Databricks File System (DBFS)
1:13:42 Access Azure Data Lake Storage from Databricks
1:19:39 Service Principal Azure
1:30:55 Databricks Utilities
1:35:52 dbutils widgets
1:39:30 dbutils Secrets
1:51:19 Reading Data (Ingestion) using PySpark
1:57:45 Free PySpark Full Course
1:59:14 Data Transformation using PySpark
2:04:49 Real-Time Scenarios in PySpark
2:09:25 Delta Lake Architecture
2:20:30 Output Modes in PySpark
2:28:58 External Delta Table vs Managed Delta Table
2:41:12 CRUD operations in Delta Tables
3:02:30 Delta Log Explained in Detail
3:13:01 Data Versioning in Delta Lake
3:17:01 Time Travel in Delta Lake
3:20:08 VACUUM in Databricks
3:29:01 Delta Table Optimization
3:35:57 ZORDER BY in Databricks in Detail
3:46:53 Autoloader Databricks
3:53:16 Streaming in Databricks
3:58:42 Incremental Data Ingestion with Autoloader
4:05:56 Databricks Workflows
4:08:30 Databricks Job
4:14:35 Azure End-To-End Data Engineering Project
Connect with ME - https://www.linkedin.com/in/ansh-lamba-793681184/
Please hit the SUBSCRIBE button ❤️ to support me and my hard work.
⭐Hashtags⭐
#databricks #pyspark #azure #dataengineering #tutorial #apachespark