Cache, Persist & StorageLevels In Apache Spark

Afaque Ahmad 8,962 2 years ago

Welcome to this easy-to-follow guide on Spark performance tuning, focused on the essentials of caching in Apache Spark. Ever been curious about lazy evaluation in Spark? I've got it broken down for you. The video walks through Spark's lineage graph and its role in performance, and it tackles the age-old Spark persist vs. cache debate to clear up any confusion. You will also learn about the different storage levels used with persist and how they can make a difference in your tasks.

📄 Complete Code on GitHub: https://github.com/afaqueahmad7117/spark-experiments/blob/main/spark/4_caching.ipynb
🎥 Full Spark Performance Tuning Playlist: https://www.youtube.com/playlist?list=PLWAuYt0wgRcLCtWzUxNg4BjnYlCZNEVth
🔗 LinkedIn: https://www.linkedin.com/in/afaque-ahmad-5a5847129/

Table credits (Storage Levels, When to use what?): https://sparkbyexamples.com/spark/spark-persistence-storage-levels/

Chapters:
00:00 Introduction
00:39 Why Should You Use Caching?
06:45 Lazy Evaluation & How Could Caching Help You?
10:12 Code + Spark UI Explanation: Caching vs No Caching
14:21 Persist & Storage Levels In Persist

#spark #dataengineering #apachespark #lazyevaluation #lineagegraph #storagelevel #persist #cache #persistvscache #sparkperformancetuning #sparkoptimization #uncache #unpersist #interviewquestions #dataengineerinterviewquestions #azuredataengineer #dataanalystinterview
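
To make the lazy evaluation and caching ideas from the chapters concrete, here is a minimal pure-Python sketch. It is a toy model, not Spark and not the code from the linked notebook: the `LazyDataset` class and the `expensive` function are hypothetical names made up for illustration. It only assumes the general behaviour the video discusses: transformations are recorded rather than run, each action replays the lineage, and a cached dataset is computed once and then reused.

```python
from typing import Callable, List, Optional


class LazyDataset:
    """Toy stand-in for a Spark DataFrame (hypothetical, for illustration only)."""

    def __init__(self, compute: Callable[[], List[int]]):
        self._compute = compute                   # lineage: how to rebuild the data
        self._cached: Optional[List[int]] = None  # stored result, if any
        self._use_cache = False

    def map(self, fn: Callable[[int], int]) -> "LazyDataset":
        # Transformation: nothing runs here, the lineage is only extended.
        return LazyDataset(lambda: [fn(x) for x in self._materialize()])

    def filter(self, pred: Callable[[int], bool]) -> "LazyDataset":
        # Transformation: also lazy.
        return LazyDataset(lambda: [x for x in self._materialize() if pred(x)])

    def cache(self) -> "LazyDataset":
        # Only marks the dataset; the data is stored when the first action runs.
        self._use_cache = True
        return self

    def unpersist(self) -> "LazyDataset":
        # Drops the stored result so later actions recompute from the lineage.
        self._use_cache = False
        self._cached = None
        return self

    def _materialize(self) -> List[int]:
        if self._use_cache:
            if self._cached is None:
                self._cached = self._compute()
            return self._cached
        return self._compute()

    def count(self) -> int:
        # Action: triggers the computation.
        return len(self._materialize())


calls = {"expensive": 0}


def expensive(x: int) -> int:
    """Pretend this is a costly step; count how often it runs."""
    calls["expensive"] += 1
    return x * x


base = LazyDataset(lambda: list(range(5)))

# No caching: each action replays the whole lineage.
squared = base.map(expensive)
calls_before_action = calls["expensive"]          # still 0, nothing has run yet
even_count = squared.filter(lambda x: x % 2 == 0).count()
total_count = squared.count()
no_cache_calls = calls["expensive"]               # 5 rows x 2 actions = 10

# With caching: the expensive step runs once and is reused.
calls["expensive"] = 0
squared_cached = base.map(expensive).cache()
squared_cached.filter(lambda x: x % 2 == 0).count()
squared_cached.count()
cached_calls = calls["expensive"]                 # 5

print(no_cache_calls, cached_calls)
```

In PySpark the corresponding calls are `df.cache()`, `df.persist(StorageLevel.MEMORY_AND_DISK)` (with `from pyspark import StorageLevel`), and `df.unpersist()`. Roughly speaking, `cache()` is `persist()` with the default storage level, while `persist()` lets you pick the level explicitly.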
