Step-by-Step Data Cleaning in Python with Pandas | Jupyter Notebook Tutorial | CSV to visualization

Data Geek 11,415 views 4 months ago

A step-by-step guide to data cleaning in Python: removing duplicates and nulls, and visualizing data. Master data cleaning with Python Pandas, from CSV file to visualization in Jupyter Notebook.

*Ways To Support My Channel:*
Buy Me A Coffee: https://buymeacoffee.com/datageekismyname
💎Support my channel and hit the Subscribe button 💎

In this video, I’ll show you how to clean data using Python, Pandas, and Jupyter Notebook. We'll start by loading a CSV file, then tackle common data issues like duplicates and missing values. I'll also guide you through dropping unnecessary data and creating visualizations such as pie charts and bar charts. Plus, we’ll create a new variable to categorize age groups, giving you practical tools for effective data cleaning.
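The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration, not the code from the video: the inline CSV and its column names (`Customer ID`, `Age`, `Gender`, `Season`, `Purchase Amount`, `Notes`) are hypothetical stand-ins for the clothing-store dataset, so adjust them to match the real file.

```python
import io
import pandas as pd

# Hypothetical stand-in for the clothing-store CSV; the real file's
# column names and values may differ.
RAW_CSV = """Customer ID,Age,Gender,Season,Purchase Amount,Notes
1,23,Male,Winter,53,a
2,35,Female,Summer,64,b
2,35,Female,Summer,64,b
3,,Female,Fall,20,c
4,61,Male,Spring,90,d
"""

# In a notebook you would usually pass a file path to pd.read_csv instead.
df = pd.read_csv(io.StringIO(RAW_CSV))

df.info()                        # column dtypes and non-null counts
print(df.columns.tolist())       # review column names for misspellings

df = df.drop(columns=["Notes"])  # drop a column that is not needed

print(df.isnull().sum())         # missing values per column
df = df.dropna()                 # drop rows containing any missing value

print(df.duplicated().sum())     # number of fully duplicated rows
df = df.drop_duplicates()        # keep the first copy of each duplicate

print(df.describe())             # count, mean, std, min, max, etc.
```

Dropping every row with a missing value is the simplest option; depending on the dataset, filling values with `fillna` may be a better fit.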

Continue your learning with Python:
https://learnpython.com?ref=mgzmzjn

Timestamps:
00:00 Intro to Data Cleaning
02:04 Reviewing the dataset
03:07 Upload dataset (CSV) to Jupyter notebook
04:07 Create a new workbook and rename the workbook (document)
04:45 Import libraries into your workbook to start coding
05:02 Python code to import the CSV file to Jupyter notebook & print out CSV dataset
06:26 Check information on the dataset
07:11 Review the variable names for misspellings, etc.
07:53 Drop a column by using Python code
08:23 Create a new column naming it "Age Group" to group the ages in my dataset.
09:10 Checking for null values or N/A values
09:50 Dropping N/A (null) values
10:37 Checking for any duplicate values within the dataset
11:25 Dropping duplicates in the dataset using Python code
11:54 Looking at the description of the dataset for "count, mean, std, min, max, etc."
12:27 Starting visualization
13:01 Pie chart visualization
16:19 Bar chart visualization
19:26 Pie chart visualization to compare another variable, "Season"
20:58 Bar chart visualization to review the variable "Age Group"
21:55 Bar chart visualization to review the variable "Age Group & Gender"
23:08 Thank you, and I hope you enjoyed the video. Please like and subscribe.
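
The "Age Group" column and the chart steps listed in the timestamps could look something like the sketch below. The sample rows, the bin edges, and the group labels are assumptions chosen for illustration; the video may use different cut-offs and column names.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows standing in for the cleaned dataset.
df = pd.DataFrame({
    "Age": [19, 23, 35, 42, 58, 61, 70],
    "Gender": ["Male", "Female", "Female", "Male", "Female", "Male", "Female"],
    "Season": ["Winter", "Summer", "Summer", "Fall", "Spring", "Winter", "Summer"],
})

# Assumed bin edges: (17, 24], (24, 44], (44, 64], (64, 120].
bins = [17, 24, 44, 64, 120]
labels = ["18-24", "25-44", "45-64", "65+"]
df["Age Group"] = pd.cut(df["Age"], bins=bins, labels=labels)

season_counts = df["Season"].value_counts()
age_counts = df["Age Group"].value_counts().sort_index()  # keep groups in age order
age_by_gender = pd.crosstab(df["Age Group"], df["Gender"])

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Pie chart: share of purchases per season.
season_counts.plot.pie(ax=axes[0], autopct="%1.0f%%")
axes[0].set_ylabel("")
axes[0].set_title("Season")

# Bar chart: number of customers per age group.
age_counts.plot.bar(ax=axes[1])
axes[1].set_title("Age Group")

# Grouped bar chart: age group split by gender.
age_by_gender.plot.bar(ax=axes[2])
axes[2].set_title("Age Group & Gender")

fig.tight_layout()
```

In a Jupyter notebook the figure renders inline after the cell runs, so no explicit save or show call is needed.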

*** FREE code and dataset HERE***
The dataset used is "customer data on purchases" for a clothing store.
Get free access to the code and dataset. All I ask is that you subscribe to my channel! Thanks for your support.

Get the dataset here:
https://docs.google.com/spreadsheets/d/1oKXmVa8hqQeTIk2wsmD744nBhKlixY5Es6Wz0O5quz8/edit?usp=sharing
Get the code here: https://docs.google.com/document/d/1NbtEd2uQGTlSNZqsrONADdnIyhxGJfE2HE7mXulpOUM/edit?usp=sharing

**Get free resources to continue learning:**
https://www.excelcampus.com/161.php

===Great Books For Mastering Data Science and Data Cleaning===
Python For Data Analysis: https://amzn.to/4dQUOaF
Python Data Science HandBook: https://amzn.to/3BV6hsk
Hands On Machine Learning with Scikit-Learn & TensorFlow: https://amzn.to/4h8IxRS
Python Machine Learning by Sebastian Raschka: https://amzn.to/401eIMU
Modern Python Cookbook (updated): https://amzn.to/3BV6sE0

Mouse Map: Python Cheat Sheet Desk Mat for Software Engineers, Hackers, and Programmers Quick Key: https://amzn.to/421qAQ4

__________________________________________________________________________________________
Disclaimer:
This content is for educational purposes only. Affiliate links may be included, and I may earn a small commission at no extra cost to you. Thank you for supporting the channel!

#datacleaning #pythontutorial #pandas #jupyternotebook #dataanalysis #datavisualization #dataanalyst
