This INCREDIBLE trick will speed up your data processes.

Rob Mulla 276,604 3 years ago


In this video we discuss the best way to save data as files using Python and pandas. When you are working with large datasets, there comes a time when you need to store your data. Most people turn to CSV files because they are easy to share and universally used. But there are much better options out there! Watch as Rob Mulla, Kaggle grandmaster, discusses some alternative ways of saving data files: pickle, parquet and feather files. I run some benchmarks to show that you can save time and space, and keep the important metadata about your files in the process!

Timeline
00:00 Intro
00:49 Creating our Data
02:08 CSVs
04:39 Setting dtypes for CSVs
06:15 Pickle Files
07:16 Parquet ❤️
09:07 Feather
10:31 Other Options
11:02 Benchmarking
12:19 Takeaways
12:43 Outro

Code Gist: https://gist.github.com/RobMulla/738491f7bf7cfe79168c7e55c622efa5

Follow me on Twitch for live coding streams: https://www.twitch.tv/medallionstallion_

Other Videos:
Speed up Pandas: https://www.youtube.com/watch?v=SAFmrTnEHLg
Efficient Pandas Dataframes: https://www.youtube.com/watch?v=u4_c2LDi4b8
Introduction to Pandas: https://www.youtube.com/watch?v=_Eb0utIRdkw
Exploratory Data Analysis Video: https://www.youtube.com/watch?v=xi0vhXFPegw
Audio Data in Python: https://www.youtube.com/watch?v=ZqpSb5p1xQo
Image Data in Python: https://www.youtube.com/watch?v=kSqxn6zGE0c

* YouTube: https://youtube.com/@robmulla?sub_confirmation=1
* Discord: https://discord.gg/HZszek7DQc
* Twitch: https://www.twitch.tv/medallionstallion_
* Twitter: https://twitter.com/Rob_Mulla
* Kaggle: https://www.kaggle.com/robikscube

#python #code #datascience #pandas
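
The point about keeping metadata can be illustrated with a minimal sketch. This is not the code from the video or the linked gist. The column names, the helper names `make_frame` and `dtypes_after_roundtrip`, and the file names are all made up for illustration. It assumes pandas and NumPy are installed. Parquet and Feather are only attempted if an engine such as pyarrow is available, and are skipped otherwise.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd


def make_frame(n_rows: int = 1_000) -> pd.DataFrame:
    """Build a small example frame with a datetime, a categorical and a float32 column."""
    rng = np.random.default_rng(0)
    return pd.DataFrame(
        {
            "when": pd.date_range("2020-01-01", periods=n_rows, freq="D"),
            "group": pd.Categorical(rng.choice(["a", "b", "c"], size=n_rows)),
            "value": rng.random(n_rows).astype("float32"),
        }
    )


def dtypes_after_roundtrip(df: pd.DataFrame, folder: Path) -> dict:
    """Write df in several formats, read it back, and record the resulting dtypes."""
    results = {}

    # CSV is plain text, so dtypes are re-inferred when the file is read.
    csv_path = folder / "data.csv"
    df.to_csv(csv_path, index=False)
    results["csv"] = pd.read_csv(csv_path).dtypes.astype(str).to_dict()

    # Pickle serializes the Python object itself, dtypes included.
    pkl_path = folder / "data.pkl"
    df.to_pickle(pkl_path)
    results["pickle"] = pd.read_pickle(pkl_path).dtypes.astype(str).to_dict()

    # Parquet and Feather need an optional engine (assumed here: pyarrow).
    optional = {
        "parquet": (df.to_parquet, pd.read_parquet),
        "feather": (df.to_feather, pd.read_feather),
    }
    for name, (write, read) in optional.items():
        path = folder / f"data.{name}"
        try:
            write(path)
            results[name] = read(path).dtypes.astype(str).to_dict()
        except ImportError:
            results[name] = None  # engine not installed, format skipped
    return results


if __name__ == "__main__":
    frame = make_frame()
    with tempfile.TemporaryDirectory() as tmp:
        report = dtypes_after_roundtrip(frame, Path(tmp))
    print("original", frame.dtypes.astype(str).to_dict())
    for fmt, dtypes in report.items():
        print(fmt, dtypes if dtypes is not None else "skipped")
```

With a plain `pd.read_csv` call, the datetime and categorical columns come back as generic string columns and the `float32` column as `float64`, which is why dtypes have to be set again by hand for CSVs. The pickle round trip returns the same dtypes that went in.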
