title
Data Cleaning in Excel - 10 Tricks (Beginner to PRO)
description
Dirty data on your mind?
Just spray the amazing "data cleaner" on it.
In this video, learn how you can use 5 Excel features to clean data with 10 examples.
You will learn about:
1) Flash Fill
2) Text to Columns
3) Remove Duplicates
4) Formulas
5) Power Query
Video Topics โฑ๐
===============
0:00 - Data Clean-up - the challenge
1:17 - Extracting the first name (Flash Fill)
2:10 - Get the city from location (Flash Fill)
3:12 - Get the city from location (Text to Columns)
4:20 - Splitting cost centre & ID number (Text to Columns)
5:35 - Duplicate Valvalues (Remove Duplicates)
6:33 - But there is a problem!!!
7:16 - Remove extra spaces from name (Formulas)
8:01 - Full time or Part time (Formulas)
8:49 - Power Query!!!
10:04 - Splitting Cost Centre & ID with Power Query
10:57 - Fixing inconsistent dates with Power Query
11:40 - Adding a Full / Part column with Power Query
12:30 - Bonus Example - Fixing or Removing incomplete data
13:50 - Loading Cleaned up data to Excel & Data Updates (Refresh)
Download the sample file ๐ฉ
=======================
Here is a blank data file. Use it to practice these data clean-up tasks:
https://chandoo.org/wp/wp-content/uploads/2022/04/data-cleaning-in-excel.xlsx
Watch next ๐ป๐
==============
๐งน10 Quick and Smart Data Cleaning Tricks - https://youtu.be/SNVjndgWBlw
๐ฉโ๐ป Detailed Video on Power Query - https://youtu.be/PiFAa_jjaEI
๐ถ Power Query Playlist - https://www.youtube.com/watch?v=PiFAa_jjaEI&list=PLmejDGrsgFyBqyHBq4jqdUzobCo5BDKY2
๐คทโโ๏ธ How to _____ in Excel - Playlist - https://www.youtube.com/watch?v=S2RRGqR_MNw&list=PLmejDGrsgFyAYGXOFlJ-KsMr4e_XgKY-4
~
#Excel #powerquery
May your data be clean and your days be serene ๐
detail
{'title': 'Data Cleaning in Excel - 10 Tricks (Beginner to PRO)', 'heatmap': [{'end': 597.103, 'start': 560.767, 'weight': 0.846}, {'end': 920, 'start': 910.8, 'weight': 0.867}], 'summary': "Covers excel data cleanup techniques with five powerful functionalities for automating data cleaning, data manipulation using flash fill, 'text to columns,' 'remove duplicates,' and excel formulas, and power query for splitting columns, fixing date inconsistencies, and automating data cleanup, enabling easy refresh for efficient data cleaning.", 'chapters': [{'end': 77.723, 'segs': [{'end': 77.723, 'src': 'embed', 'start': 0.229, 'weight': 0, 'content': [{'end': 7.333, 'text': 'Famous English poet Samuel Taylor Coleridge said, water, water everywhere, not a drop to drink.', 'start': 0.229, 'duration': 7.104}, {'end': 14.217, 'text': "I'm pretty sure if he is alive today, he would say data, data everywhere, not an item to use.", 'start': 8.294, 'duration': 5.923}, {'end': 19.561, 'text': 'Dirty data is a big problem in many data analysis situations.', 'start': 15.138, 'duration': 4.423}, {'end': 28.366, 'text': "I wish we have a data cleaner handy so we can spray the dirty data with this and it'll automatically clean it.", 'start': 20.701, 'duration': 7.665}, {'end': 32.406, 'text': "Well, don't worry, we have got the next best thing.", 'start': 29.744, 'duration': 2.662}, {'end': 41.331, 'text': 'Excel has got five powerful functionalities using which we can clean data in an automated fashion.', 'start': 33.186, 'duration': 8.145}, {'end': 46.894, 'text': "In this video, I'm going to show you how to use these five functionalities with ten examples.", 'start': 41.891, 'duration': 5.003}, {'end': 48.114, 'text': "Let's get in.", 'start': 47.634, 'duration': 0.48}, {'end': 54.856, 'text': 'Here is the sample data set that I will be using to explain the data cleanup techniques in Excel.', 'start': 48.574, 'duration': 6.282}, {'end': 60.817, 'text': 'This data set has several columns and it represents a typical employee data set.', 'start': 55.196, 'duration': 5.621}, {'end': 65.479, 'text': 'While it does look fairly clean, there are some gnarly problems in the data.', 'start': 61.157, 'duration': 4.322}, {'end': 67.999, 'text': 'For example, the start date is all messed up.', 'start': 65.759, 'duration': 2.24}, {'end': 70.84, 'text': 'There are several blank values in the gender column.', 'start': 68.079, 'duration': 2.761}, {'end': 74.301, 'text': 'The department has some incorrect values as null.', 'start': 71.12, 'duration': 3.181}, {'end': 77.723, 'text': 'and some of the location details are all messed up.', 'start': 75.021, 'duration': 2.702}], 'summary': 'Dirty data is a big problem in data analysis. excel offers five powerful functionalities for automated data cleaning.', 'duration': 77.494, 'max_score': 0.229, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/q7EpoOwBcnM/pics/q7EpoOwBcnM229.jpg'}], 'start': 0.229, 'title': 'Excel data cleanup techniques', 'summary': 'Discusses the challenge of dirty data in data analysis situations and introduces five powerful functionalities in excel to automate data cleaning, featuring a sample employee dataset with issues in start dates, blank gender values, incorrect department entries, and messy location details.', 'chapters': [{'end': 77.723, 'start': 0.229, 'title': 'Excel data cleanup techniques', 'summary': 'Discusses the challenge of dirty data in data analysis situations and introduces five powerful functionalities in excel to automate data cleaning, featuring a sample employee dataset with issues in start dates, blank gender values, incorrect department entries, and messy location details.', 'duration': 77.494, 'highlights': ['Excel offers five powerful functionalities for automated data cleaning, addressing issues such as incorrect start dates, blank gender values, incorrect department entries, and messy location details.', 'The sample employee dataset used for demonstration contains various issues, including messed up start dates, blank gender values, incorrect department entries, and messy location details.', "Famous English poet Samuel Taylor Coleridge's quote about water can be adapted to describe the prevalence of dirty data in data analysis situations today, emphasizing the need for effective data cleaning solutions.", 'The chapter emphasizes the prevalence of dirty data in data analysis situations and the need for effective data cleaning solutions, drawing parallels with a famous quote by Samuel Taylor Coleridge.', "The chapter highlights the challenge of dirty data in data analysis situations and introduces Excel's powerful functionalities for automated data cleaning, emphasizing the need for such solutions with a reference to a quote by Samuel Taylor Coleridge."]}], 'duration': 77.494, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/q7EpoOwBcnM/pics/q7EpoOwBcnM229.jpg', 'highlights': ['Excel offers five powerful functionalities for automated data cleaning, addressing issues such as incorrect start dates, blank gender values, incorrect department entries, and messy location details.', 'The sample employee dataset used for demonstration contains various issues, including messed up start dates, blank gender values, incorrect department entries, and messy location details.', 'The chapter emphasizes the prevalence of dirty data in data analysis situations and the need for effective data cleaning solutions, drawing parallels with a famous quote by Samuel Taylor Coleridge.', "The chapter highlights the challenge of dirty data in data analysis situations and introduces Excel's powerful functionalities for automated data cleaning, emphasizing the need for such solutions with a reference to a quote by Samuel Taylor Coleridge.", "Famous English poet Samuel Taylor Coleridge's quote about water can be adapted to describe the prevalence of dirty data in data analysis situations today, emphasizing the need for effective data cleaning solutions."]}, {'end': 531.201, 'segs': [{'end': 174.815, 'src': 'embed', 'start': 98.373, 'weight': 0, 'content': [{'end': 104.995, 'text': 'you will see that Excel has predicted what you are going for and it has automatically filled up all the values.', 'start': 98.373, 'duration': 6.622}, {'end': 112.157, 'text': 'At this point, if you just press enter, all the values for your table will be automatically calculated.', 'start': 105.555, 'duration': 6.602}, {'end': 121.239, 'text': 'This functionality is called Flash Fill and it is kind of like a simple pattern recognition engine built right into Excel.', 'start': 112.417, 'duration': 8.822}, {'end': 129.101, 'text': 'If the values are incorrect or you want to re-trigger it, you can use the Ctrl E shortcut to run this again.', 'start': 121.639, 'duration': 7.462}, {'end': 132.842, 'text': "Let's demonstrate the flash fill with another example.", 'start': 129.92, 'duration': 2.922}, {'end': 138.503, 'text': 'For this time, we are going to extract the work location, just the city name alone.', 'start': 133.002, 'duration': 5.501}, {'end': 148.166, 'text': 'Notice that several of our employees work from remote locations, but other people work from either Seattle, Hyderabad, Wellington or Columbus, etc.', 'start': 139.024, 'duration': 9.142}, {'end': 152.827, 'text': "so i'm going to type city here and then start typing the names.", 'start': 148.766, 'duration': 4.061}, {'end': 157.629, 'text': 'so second cell has a value seattle, seattle, hyderabad.', 'start': 152.827, 'duration': 4.802}, {'end': 163.211, 'text': 'and at this point flash fill did predict what we are going for and it kind of suggested.', 'start': 157.629, 'duration': 5.582}, {'end': 169.433, 'text': 'but notice that the location city where the city is remote is kind of incorrect.', 'start': 163.211, 'duration': 6.222}, {'end': 171.954, 'text': 'this is one of the limitations of flash fill.', 'start': 169.433, 'duration': 2.521}, {'end': 174.815, 'text': 'the other limitation of flash fill is it is manual.', 'start': 171.954, 'duration': 2.861}], 'summary': "Excel's flash fill predicts and fills values, with limitations such as manual corrections and recognizing simple patterns.", 'duration': 76.442, 'max_score': 98.373, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/q7EpoOwBcnM/pics/q7EpoOwBcnM98373.jpg'}, {'end': 273.554, 'src': 'embed', 'start': 246.382, 'weight': 3, 'content': [{'end': 251.166, 'text': "And when you finish, it's going to split the work location into two separate columns.", 'start': 246.382, 'duration': 4.784}, {'end': 257.267, 'text': "notice that if there is an extra space in the beginning, it's going to leave that space there later on.", 'start': 251.906, 'duration': 5.361}, {'end': 260.629, 'text': "in this video i'm going to show you how to remove that space.", 'start': 257.267, 'duration': 3.362}, {'end': 267.492, 'text': 'text to columns is an amazing way to split the data, not just by delimiter, but also by fixed width.', 'start': 260.629, 'duration': 6.863}, {'end': 269.633, 'text': "let's take a look at another example.", 'start': 267.492, 'duration': 2.141}, {'end': 273.554, 'text': 'in our data, we have our employee id with a specific format.', 'start': 269.633, 'duration': 3.921}], 'summary': 'Demonstrates how to split work location into two columns using text to columns, and remove extra spaces.', 'duration': 27.172, 'max_score': 246.382, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/q7EpoOwBcnM/pics/q7EpoOwBcnM246382.jpg'}, {'end': 436.053, 'src': 'embed', 'start': 386.717, 'weight': 4, 'content': [{'end': 393.782, 'text': "and then we'll click OK and it will remove any duplicate records and all the relevant rows as well.", 'start': 386.717, 'duration': 7.065}, {'end': 395.423, 'text': 'But there is a problem.', 'start': 394.042, 'duration': 1.381}, {'end': 404.538, 'text': 'While flash, fill text to columns and the remove duplicates are all amazing ways to clean data.', 'start': 399.113, 'duration': 5.425}, {'end': 409.202, 'text': 'they have one serious flaw, and that is they are all manual.', 'start': 404.538, 'duration': 4.664}, {'end': 414.147, 'text': 'So if your data changes, you must rerun this process to have clean data.', 'start': 409.763, 'duration': 4.384}, {'end': 418.028, 'text': 'This is where our next set of functionalities come into picture.', 'start': 414.807, 'duration': 3.221}, {'end': 425.23, 'text': "I'm going to show you how to use a completely automated setup to clean up data for the next half of the video.", 'start': 418.568, 'duration': 6.662}, {'end': 429.451, 'text': 'The next Excel functionality that we are going to look at is the Excel formulas.', 'start': 425.67, 'duration': 3.781}, {'end': 436.053, 'text': 'My favorite functions when it comes to working with data cleanup are usually text and date and time functions.', 'start': 429.851, 'duration': 6.202}], 'summary': "Manual data cleaning methods have flaws, next we'll focus on using automated excel formulas for data cleanup.", 'duration': 49.336, 'max_score': 386.717, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/q7EpoOwBcnM/pics/q7EpoOwBcnM386717.jpg'}], 'start': 78.003, 'title': 'Excel data manipulation', 'summary': "Covers excel's flash fill for data extraction, highlighting its prediction and limitations, and presents data cleaning tips including 'text to columns,' 'remove duplicates,' and excel formulas for efficient data cleanup.", 'chapters': [{'end': 174.815, 'start': 78.003, 'title': 'Excel flash fill for data extraction', 'summary': "Explains how to use excel's flash fill functionality to extract data, such as first names and city names, and highlights its automatic prediction and limitations.", 'duration': 96.812, 'highlights': ["Excel's flash fill functionality allows for easy extraction of data, such as first names and city names, by simply typing the desired values and letting Excel predict and automatically fill up the remaining values, saving time and effort.", 'The flash fill feature can be re-triggered using the Ctrl E shortcut if incorrect values are predicted, showcasing its flexibility and user control.', 'One of the limitations of flash fill is its inability to accurately predict values for certain patterns, such as remote locations, which may result in incorrect suggestions.', 'Another limitation of flash fill is its manual nature, requiring user intervention in certain cases, potentially affecting its efficiency.']}, {'end': 531.201, 'start': 175.555, 'title': 'Excel data cleaning tips', 'summary': "Presents excel data cleaning tips including 'text to columns' for splitting data, 'remove duplicates' for eliminating duplicate records, and excel formulas for automated data cleanup, promoting dynamicity and efficiency.", 'duration': 355.646, 'highlights': ["The 'Text to Columns' feature in Excel can be used to split data based on delimiters or fixed width, providing a dynamic and efficient way to clean and organize data. The 'Text to Columns' feature in Excel allows for efficient data splitting based on delimiters or fixed width, promoting dynamic and efficient data organization.", "The 'Remove Duplicates' function in Excel enables the elimination of duplicate records, enhancing data cleanliness and accuracy. The 'Remove Duplicates' function in Excel facilitates the removal of duplicate records, contributing to improved data cleanliness and accuracy.", "Excel formulas, such as 'TRIM' and 'IF', offer automated data cleanup, ensuring dynamic adjustments to data changes and promoting efficiency in data management. Excel formulas, including 'TRIM' and 'IF', provide automated data cleanup and dynamic adjustments to data changes, enhancing efficiency in data management."]}], 'duration': 453.198, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/q7EpoOwBcnM/pics/q7EpoOwBcnM78003.jpg', 'highlights': ["Excel's flash fill predicts and fills data, saving time and effort.", 'Ctrl E shortcut re-triggers flash fill, showcasing flexibility and user control.', 'Flash fill has limitations in accurately predicting values for certain patterns.', 'Text to Columns feature in Excel efficiently splits data based on delimiters or fixed width.', 'Remove Duplicates function in Excel enhances data cleanliness and accuracy.', 'Excel formulas like TRIM and IF offer automated data cleanup and dynamic adjustments.']}, {'end': 909.729, 'segs': [{'end': 555.805, 'src': 'embed', 'start': 531.201, 'weight': 2, 'content': [{'end': 542.43, 'text': 'Excel also has a really powerful and completely automated way to clean data where your data could be even more messy or the cleanup process is more complicated.', 'start': 531.201, 'duration': 11.229}, {'end': 545.833, 'text': 'And that particular feature is called Power Query.', 'start': 542.931, 'duration': 2.902}, {'end': 555.805, 'text': "So this next part of the video, I'm going to show you how to use Power Query with three different examples plus a bonus example.", 'start': 548.942, 'duration': 6.863}], 'summary': "Excel's power query automates data cleaning with three examples and a bonus.", 'duration': 24.604, 'max_score': 531.201, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/q7EpoOwBcnM/pics/q7EpoOwBcnM531201.jpg'}, {'end': 597.103, 'src': 'heatmap', 'start': 560.767, 'weight': 0.846, 'content': [{'end': 565.228, 'text': 'And now we will load this tab into Power Query so that we can clean it up.', 'start': 560.767, 'duration': 4.461}, {'end': 574.672, 'text': "To start, you need to just select the data and then from the data ribbon, because our data is already here, I'm just going to say from within sheet.", 'start': 565.468, 'duration': 9.204}, {'end': 580.255, 'text': 'At this point, Power Query ideally wants your data to be in a table, so it wants to turn this into a table.', 'start': 575.052, 'duration': 5.203}, {'end': 582.676, 'text': "This is fine with me, so I'm going to click OK.", 'start': 580.555, 'duration': 2.121}, {'end': 587.679, 'text': 'Our data gets turned into a table and it will transform into Power Query.', 'start': 583.056, 'duration': 4.623}, {'end': 590.92, 'text': 'If this is your first time working within Power Query,', 'start': 587.879, 'duration': 3.041}, {'end': 597.103, 'text': "then my recommendation is just pay attention to what I'm doing and do not worry about all the screen buttons.", 'start': 590.92, 'duration': 6.183}], 'summary': 'Loading tab into power query to clean data and transform into a table.', 'duration': 36.336, 'max_score': 560.767, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/q7EpoOwBcnM/pics/q7EpoOwBcnM560767.jpg'}, {'end': 622.439, 'src': 'embed', 'start': 597.243, 'weight': 0, 'content': [{'end': 603.927, 'text': "I have got another video where you can learn more about Power Query at a detailed level, and I'll show that to you towards the end of this.", 'start': 597.243, 'duration': 6.684}, {'end': 610.891, 'text': "Let's start with the simple problem, which is I got my employee ID that has my cost center and ID value mixed up,", 'start': 604.167, 'duration': 6.724}, {'end': 612.893, 'text': 'and I want to split this into two columns.', 'start': 610.891, 'duration': 2.002}, {'end': 614.674, 'text': 'So this is my column.', 'start': 613.313, 'duration': 1.361}, {'end': 622.439, 'text': 'Right click and then from there I can split the column by either a delimiter, number of characters or positions.', 'start': 614.854, 'duration': 7.585}], 'summary': 'Demonstrating how to split a column using power query.', 'duration': 25.196, 'max_score': 597.243, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/q7EpoOwBcnM/pics/q7EpoOwBcnM597243.jpg'}, {'end': 729.042, 'src': 'embed', 'start': 699.471, 'weight': 1, 'content': [{'end': 705.973, 'text': 'Let us see how we can use the FTE column to figure out whether an employee is a full time or a part time person.', 'start': 699.471, 'duration': 6.502}, {'end': 710.054, 'text': 'This kind of an operation requires the if condition in Excel with formula.', 'start': 706.293, 'duration': 3.761}, {'end': 713.095, 'text': 'We can use the same logic within Power Query as well.', 'start': 710.374, 'duration': 2.721}, {'end': 721.417, 'text': 'Because we want that as a new column, we can go from home to add column ribbon and then use the conditional column option.', 'start': 713.195, 'duration': 8.222}, {'end': 729.042, 'text': 'give your conditional column a name and then here i can just select fte and i can specify my condition.', 'start': 722.137, 'duration': 6.905}], 'summary': 'Using excel and power query to determine full-time or part-time employee based on fte column.', 'duration': 29.571, 'max_score': 699.471, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/q7EpoOwBcnM/pics/q7EpoOwBcnM699471.jpg'}], 'start': 531.201, 'title': 'Power query data cleaning in excel', 'summary': 'Demonstrates using power query in excel to clean messy data, including splitting columns, fixing date inconsistencies, adding conditional columns, and automating the data cleanup process, enabling easy refresh with new data, facilitating efficient data cleaning for analysis. it recommends paying attention to the steps and exploring more detailed tutorials available.', 'chapters': [{'end': 699.246, 'start': 531.201, 'title': 'Power query data cleaning in excel', 'summary': 'Demonstrates using power query in excel to clean messy data, including splitting columns and fixing date inconsistencies, with the recommendation to pay attention to the steps and explore more detailed tutorials available.', 'duration': 168.045, 'highlights': ['The chapter demonstrates using Power Query in Excel to clean messy data, including splitting columns and fixing date inconsistencies, with the recommendation to pay attention to the steps and explore more detailed tutorials available.', 'The feature Power Query is automated and powerful for data cleaning, with three examples shown in the tutorial.', "The tutorial includes splitting a column using the 'number of characters' option in Power Query to separate employee ID and cost center, addressing common corporate data inconsistencies in start dates by changing the data type to date, and utilizing locale options for different date formatting.", 'The demonstration emphasizes the importance of paying attention to the process while working with Power Query for the first time and suggests exploring more detailed tutorials available.']}, {'end': 909.729, 'start': 699.471, 'title': 'Data cleanup with power query', 'summary': 'Demonstrates using power query to clean and transform data in excel, including adding conditional columns, replacing null values, and automating the data cleanup process, enabling easy refresh with new data, facilitating efficient data cleaning for analysis.', 'duration': 210.258, 'highlights': ['Using Power Query to add conditional columns for full time and part time employees Demonstrates using Power Query to add a conditional column to categorize employees as full time or part time based on the FTE column values.', 'Replacing null values in the gender and department columns for data cleanup Illustrates the process of replacing null values in the gender and department columns to ensure data cleanliness and consistency.', 'Automating the data cleanup process using Power Query for easy refresh with new data Explains how Power Query allows the creation of a series of data cleanup steps that can be easily reapplied with new data, streamlining the data cleaning process for efficiency.']}], 'duration': 378.528, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/q7EpoOwBcnM/pics/q7EpoOwBcnM531201.jpg', 'highlights': ["The tutorial includes splitting a column using the 'number of characters' option in Power Query to separate employee ID and cost center, addressing common corporate data inconsistencies in start dates by changing the data type to date, and utilizing locale options for different date formatting.", 'Using Power Query to add conditional columns for full time and part time employees Demonstrates using Power Query to add a conditional column to categorize employees as full time or part time based on the FTE column values.', 'Automating the data cleanup process using Power Query for easy refresh with new data Explains how Power Query allows the creation of a series of data cleanup steps that can be easily reapplied with new data, streamlining the data cleaning process for efficiency.', 'The feature Power Query is automated and powerful for data cleaning, with three examples shown in the tutorial.', 'The chapter demonstrates using Power Query in Excel to clean messy data, including splitting columns and fixing date inconsistencies, with the recommendation to pay attention to the steps and explore more detailed tutorials available.']}], 'highlights': ['Excel offers five powerful functionalities for automated data cleaning, addressing issues such as incorrect start dates, blank gender values, incorrect department entries, and messy location details.', "The tutorial includes splitting a column using the 'number of characters' option in Power Query to separate employee ID and cost center, addressing common corporate data inconsistencies in start dates by changing the data type to date, and utilizing locale options for different date formatting.", 'The sample employee dataset used for demonstration contains various issues, including messed up start dates, blank gender values, incorrect department entries, and messy location details.', 'The chapter emphasizes the prevalence of dirty data in data analysis situations and the need for effective data cleaning solutions, drawing parallels with a famous quote by Samuel Taylor Coleridge.', 'Using Power Query to add conditional columns for full time and part time employees Demonstrates using Power Query to add a conditional column to categorize employees as full time or part time based on the FTE column values.']}