In this video, you will learn to design, implement, and maintain secure, scalable, and cost-effective lakehouse architectures using Apache Spark, Apache Kafka, Apache Flink, Delta Lake, AWS, and open-source tools, and to unlock your data's full potential through advanced analytics and machine learning.
Part 2: https://youtu.be/K84MEdiC1tM
FULL COURSE AVAILABLE: https://sh.datamasterylab.com/costsaver
Like this video?
Support us: https://www.youtube.com/@CodeWithYu/join
Timestamps:
0:00 Introduction
1:24 The system architecture
4:59 The modern system architecture
9:15 Implementation of the Current Data Lakehouse on AWS Cloud
11:33 Creating Databases for the Data Lakehouse
12:12 Using a Glue crawler for the Data Lakehouse
17:19 Using a Lambda function to automate data orchestration on AWS Cloud
21:03 Coding the Lambda function
43:57 Optimising the Lambda function
48:46 Verification of Results
53:43 Outro
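For quick reference, the Lambda-plus-Glue automation covered from 17:19 onwards boils down to something like the minimal sketch below. It is illustrative only: the crawler name, trigger, and error handling are placeholder assumptions, not the exact code from the video (see the source-code link under Resources for that).

import boto3

# Glue client used to kick off the crawler that catalogues new lakehouse data
glue = boto3.client("glue")

CRAWLER_NAME = "lakehouse-crawler"  # placeholder name, not necessarily the one used in the video

def lambda_handler(event, context):
    # Triggered (e.g. by an S3 event or a schedule) to refresh the Glue Data Catalog
    try:
        glue.start_crawler(Name=CRAWLER_NAME)
        return {"statusCode": 200, "body": f"Started crawler {CRAWLER_NAME}"}
    except glue.exceptions.CrawlerRunningException:
        # Crawler is already running; treat as a no-op instead of failing the invocation
        return {"statusCode": 200, "body": "Crawler already running"}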
Resources:
YouTube Source Code:
https://buymeacoffee.com/yusuf.ganiyu/youtube-source-code-building-cost-effective-data-lakehouse
🌟 Please LIKE ❤️ and SUBSCRIBE for more AMAZING content! 🌟
👦🏻 My Linkedin: https://www.linkedin.com/in/yusuf-ganiyu-b90140107/
🚀 X(Twitter): https://x.com/YusufOGaniyu
📝 Medium: https://medium.com/@yusuf.ganiyu
Hashtags:
#dataengineering #bigdata #dataanalytics #realtimeanalytics #streaming #datalakehouse #datalake #datawarehouse #dataintegration #datatransformation #datagovernance #datasecurity #apachespark #apachekafka #apacheflink #deltalake #aws #opensource #dataingestion #structureddata #unstructureddata #semistructureddata #dataanalysis #advancedanalytics #dataarchitecture #costoptimization #cloudcomputing #awscloud