The BEST library for building Data Pipelines...

Rob Mulla 84,218 2 years ago

Building data pipelines with #python is an important skill for data engineers and data scientists. But what's the best library to use? In this video we look at three options: pandas, Polars, and Spark (PySpark).

Timeline:
00:00 Data Pipelines
01:11 The Data
02:32 Pandas
04:34 Polars
06:15 PySpark
09:15 Spark SQL

Follow me on Twitch for live coding streams: https://www.twitch.tv/medallionstallion_

My other videos:
Speed Up Your Pandas Code: https://www.youtube.com/watch?v=SAFmrTnEHLg
Intro to Pandas video: https://www.youtube.com/watch?v=_Eb0utIRdkw
Exploratory Data Analysis Video: https://www.youtube.com/watch?v=xi0vhXFPegw
Working with Audio data in Python: https://www.youtube.com/watch?v=ZqpSb5p1xQo
Efficient Pandas Dataframes: https://www.youtube.com/watch?v=u4_c2LDi4b8

* YouTube: https://youtube.com/@robmulla?sub_confirmation=1
* Discord: https://discord.gg/HZszek7DQc
* Twitch: https://www.twitch.tv/medallionstallion_
* Twitter: https://twitter.com/Rob_Mulla
* Kaggle: https://www.kaggle.com/robikscube

#python #polars #spark #dataengineering
