Building Realtime Data Warehouses from Scratch | End to End Data Engineering Project

CodeWithYu 9,672 views 6 months ago

In today’s fast-paced digital world, data is the new currency. Businesses thrive on making decisions in real time, but how do they achieve that? Welcome to the era of real-time data warehousing.

In this video, you will build an end-to-end real-time data warehouse, including a visualisation dashboard that lets you watch the data evolve in real time.

What You Will Learn:
✅ Design & implement a complex real-time data warehouse architecture
✅ Set up Apache Airflow, Kafka, and Apache Pinot for seamless data pipelines
✅ Develop custom Apache Airflow hooks for Kafka & Pinot integration
✅ Ingest batch & streaming data into Apache Pinot for real-time analytics
✅ Create a dynamic dashboard with Apache Superset to visualise evolving data in real-time
✅ Apply dimensional modeling for better data organisation and reporting
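
To make the dimensional modeling point above concrete, here is a minimal sketch in plain Python. It is not taken from the video: the table names (`customer_dim`, `branch_dim`, `transaction_facts`), the columns, and the sample rows are all hypothetical, chosen only to illustrate the idea of fact rows that hold measures and point to descriptive dimension tables.

```python
from collections import defaultdict

# Hypothetical dimension tables, keyed by id (names and values are illustrative only).
customer_dim = {
    "C1": {"name": "Ada", "country": "NG"},
    "C2": {"name": "Lin", "country": "SG"},
}
branch_dim = {
    "B1": {"city": "Lagos"},
    "B2": {"city": "Singapore"},
}

# Hypothetical fact rows: a numeric measure plus foreign keys into the dimensions.
transaction_facts = [
    {"customer_id": "C1", "branch_id": "B1", "amount": 120.0},
    {"customer_id": "C2", "branch_id": "B2", "amount": 75.5},
    {"customer_id": "C1", "branch_id": "B2", "amount": 30.0},
]


def total_amount_by(facts, dim, key, attribute):
    """Sum the `amount` measure, grouped by one attribute of a dimension."""
    totals = defaultdict(float)
    for row in facts:
        # Look up the dimension row through the foreign key, then group by its attribute.
        totals[dim[row[key]][attribute]] += row["amount"]
    return dict(totals)


by_country = total_amount_by(transaction_facts, customer_dim, "customer_id", "country")
by_city = total_amount_by(transaction_facts, branch_dim, "branch_id", "city")
print(by_country)
print(by_city)
```

The same fact rows answer both questions ("by country" and "by city") without duplicating descriptive attributes, which is the reporting benefit dimensional modeling aims for. In a real warehouse the lookups would be joins in a query engine rather than dictionary accesses.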

Timestamps:
0:00 Introduction
2:42 System Architecture
5:20 Setting up the project
20:25 Creating Dimensional Modelling with Apache Airflow
54:50 Creating Apache Airflow Hook for Kafka
1:19:20 Creating Apache Airflow Hook for Apache Pinot
1:29:00 Connecting Apache Pinot to Kafka
1:37:00 Batch Data Ingestion for Apache Pinot
1:43:39 Setting up Apache Superset for Data Visualisation
1:50:00 Creating Superset Dataset for Visualisation
1:56:11 Creating Apache Superset Realtime DW Dashboard
2:06:55 Wrapping up
2:08:00 Outro

Resources
✅ Full Source Code: https://buymeacoffee.com/yusuf.ganiyu/full-source-code-building-real-time-data-warehouse-scratch
✅ Apache Airflow Docker Compose: https://airflow.apache.org/docs/apache-airflow/2.10.2/docker-compose.yaml
✅ Full Article: https://link.medium.com/bwmohuYvhNb

If you find our content valuable, support us by joining our channel membership, where you'll get exclusive access to behind-the-scenes content, Q&A sessions, and much more!
https://www.youtube.com/@CodeWithYu/join

💬 Join the Conversation:
We love hearing from you! Share your thoughts, questions, or experiences related to data engineering or this project in the comments below. Don't forget to like, subscribe, and hit the bell icon to stay updated with our latest content.

Tags:
Big Data, Data Engineering, Redpanda, Apache Pinot, Apache Airflow, Data Analysis, Data Analytics, ETL, Data Warehouse, Technology, Analytics

Hashtags:
#BigData, #DataEngineering, #ApachePinot, #DataAnalysis, #DataAnalytics, #ETL, #DataWarehouse, #TechTalk, #BigDataAnalytics

🙏 Thank You for Watching!
Remember to like, subscribe and hit the bell icon for notifications. Stay curious and keep exploring the fascinating world of data engineering!
