title
Python Pandas Tutorial (Part 4): Filtering - Using Conditionals to Filter Rows and Columns
description
In this video, we will be learning how to filter our Pandas dataframes using conditionals.
This video is sponsored by Brilliant. Go to https://brilliant.org/cms to sign up for free. Be one of the first 200 people to sign up with this link and get 20% off your premium subscription.
In this Python Programming video, we will be learning how to write conditionals in order to filter the data in our Pandas dataframes. This is a fundamental skill to have when using Pandas because it is one of the first things most people do when starting a new Pandas project. Let's get started...
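As a rough illustration of the filtering pattern covered in the video, here is a minimal sketch. The column names (`first`, `last`, `email`) and the sample rows are assumptions made up for this example, not the exact notebook from the video. A comparison on a column produces a boolean Series (a mask), and that mask can be passed to plain brackets or to `.loc`, combined with `&` (and) and `|` (or), or negated with `~`.

```python
import pandas as pd

# Hypothetical sample data; names and addresses are placeholders.
df = pd.DataFrame({
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
    "email": [
        "corey.schafer@example.com",
        "jane.doe@example.com",
        "john.doe@example.com",
    ],
})

# A comparison returns a boolean Series, one True/False per row.
# The name `filt` avoids shadowing Python's built-in filter() function.
filt = (df["last"] == "Doe")

# Apply the mask with plain brackets (returns the matching rows)...
doe_rows = df[filt]

# ...or with .loc, which also lets us pick columns in the same step.
doe_emails = df.loc[filt, "email"]

# Combine conditions with & (and) and | (or); each condition needs
# its own parentheses because & and | bind tighter than ==.
john_doe = (df["last"] == "Doe") & (df["first"] == "John")
schafer_or_john = (df["last"] == "Schafer") | (df["first"] == "John")

# A tilde negates a mask and returns the rows that did NOT match.
everyone_else = df.loc[~schafer_or_john, "email"]

print(doe_rows)
print(everyone_else)
```

Assigning the mask to a variable first is only a readability choice; putting the comparison directly inside the brackets gives the same result.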
The code for this video can be found at:
http://bit.ly/Pandas-04
StackOverflow Survey Download Page - http://bit.ly/SO-Survey-Download
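The same idea carries over to the survey data. The sketch below uses a tiny made-up frame instead of the real download, and it assumes the survey columns are named `Country`, `LanguageWorkedWith`, and `ConvertedComp`; check the survey schema for the actual names in your copy. It shows a numeric threshold, matching against a list of values with `isin`, and a substring test with `str.contains`, where `na=False` treats missing answers as non-matches.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the survey results (values are invented).
df = pd.DataFrame({
    "Country": ["United States", "India", "France", "Canada"],
    "LanguageWorkedWith": ["Python;SQL", "Java;C++", np.nan, "HTML/CSS;Python"],
    "ConvertedComp": [95000.0, 20000.0, 72000.0, np.nan],
})

# Numeric comparison: rows with a salary over 70,000.
# Missing salaries compare as False, so they are left out.
high_salary = (df["ConvertedComp"] > 70000)
columns = ["Country", "LanguageWorkedWith", "ConvertedComp"]
high_earners = df.loc[high_salary, columns]

# Match any value from a list instead of chaining many | comparisons.
countries = ["United States", "India", "United Kingdom", "Germany", "Canada"]
in_countries = df["Country"].isin(countries)

# Substring test on a string column; na=False turns missing values
# into False so the result is a clean boolean mask.
knows_python = df["LanguageWorkedWith"].str.contains("Python", na=False)

print(high_earners)
print(df.loc[in_countries, "Country"])
print(df.loc[knows_python, "LanguageWorkedWith"])
```

The 70,000 threshold and the country list are only example values; any threshold or list can be substituted.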
▶️ You Can Find Me On:
My Website - http://coreyms.com/
My Second Channel - https://www.youtube.com/c/coreymschafer
Facebook - https://www.facebook.com/CoreyMSchafer
Twitter - https://twitter.com/CoreyMSchafer
Instagram - https://www.instagram.com/coreymschafer/
#Python #Pandas
detail
{'title': 'Python Pandas Tutorial (Part 4): Filtering - Using Conditionals to Filter Rows and Columns', 'heatmap': [{'end': 735.568, 'start': 718.691, 'weight': 1}, {'end': 1084.982, 'start': 1048.678, 'weight': 0.762}], 'summary': "This python pandas tutorial series provides a comprehensive guide on filtering data, emphasizing the basics of filtering data frames and series objects, applying filter operators like 'and' and 'or' to retrieve specific rows, and filtering survey data for high salaries and by country. the series also demonstrates the use of 'filter' and '.loc' indexers for data filtering in pandas.", 'chapters': [{'end': 53.511, 'segs': [{'end': 53.511, 'src': 'embed', 'start': 0.149, 'weight': 0, 'content': [{'end': 6.175, 'text': "Hey there, how's it going everybody? In this video, we're going to go over the basics of filtering data from data frames and series objects.", 'start': 0.149, 'duration': 6.026}, {'end': 13.023, 'text': 'So for example, if we wanted to look at our survey data and only look at people who know Python, then we can filter that data out.', 'start': 6.556, 'duration': 6.467}, {'end': 19.83, 'text': 'Or maybe we only want to see results from a specific country or people that have a specific salary range, anything like that.', 'start': 13.423, 'duration': 6.407}, {'end': 26.672, 'text': "We can do all of that by filtering out data from our series and data frame objects, and we'll learn how to do that in this video.", 'start': 20.29, 'duration': 6.382}, {'end': 29.994, 'text': 'So filtering is one of the main things to learn with pandas,', 'start': 27.053, 'duration': 2.941}, {'end': 36.016, 'text': "because it's basically how we begin every project by filtering the data that we want from the data that we don't.", 'start': 29.994, 'duration': 6.022}, {'end': 41.26, 'text': "Now I'd also like to mention that we do have a sponsor for this video and that is Brilliant.org.", 'start': 36.576, 'duration': 4.684}, {'end': 48.887, 'text': 
"So I'd really like to thank Brilliant for sponsoring this series and it would be great if you all could check them out using the link in the description section below and support the sponsors.", 'start': 41.54, 'duration': 7.347}, {'end': 51.269, 'text': "And I'll talk more about their services in just a bit.", 'start': 49.227, 'duration': 2.042}, {'end': 53.511, 'text': "So with that said, let's go ahead and get started.", 'start': 51.669, 'duration': 1.842}], 'summary': 'Learn how to filter data in pandas for data analysis.', 'duration': 53.362, 'max_score': 0.149, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY149.jpg'}], 'start': 0.149, 'title': 'Filtering data basics with pandas', 'summary': 'Covers the basics of filtering data from data frames and series objects in pandas, emphasizing the importance of filtering at the start of every project and mentioning the sponsorship by brilliant.org.', 'chapters': [{'end': 53.511, 'start': 0.149, 'title': 'Filtering data basics with pandas', 'summary': 'Covers the basics of filtering data from data frames and series objects in pandas, emphasizing the importance of filtering at the start of every project and mentioning the sponsorship by brilliant.org.', 'duration': 53.362, 'highlights': ["Filtering data is a fundamental skill with pandas as it's how every project begins, by extracting the desired data from the available pool.", 'The video discusses filtering data from data frames and series objects to extract specific information, such as people who know Python or results from a specific country or with a specific salary range.', 'The importance of learning how to filter data from series and data frame objects in pandas is emphasized, indicating its significance in data analysis and manipulation.', 'The video is sponsored by Brilliant.org, with a request to check out their services using the provided link in the description section below, showcasing support for 
the sponsors.']}], 'duration': 53.362, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY149.jpg', 'highlights': ["Filtering data is a fundamental skill with pandas as it's how every project begins, by extracting the desired data from the available pool.", 'The importance of learning how to filter data from series and data frame objects in pandas is emphasized, indicating its significance in data analysis and manipulation.', 'The video discusses filtering data from data frames and series objects to extract specific information, such as people who know Python or results from a specific country or with a specific salary range.', 'The video is sponsored by Brilliant.org, with a request to check out their services using the provided link in the description section below, showcasing support for the sponsors.']}, {'end': 209.39, 'segs': [{'end': 124.04, 'src': 'embed', 'start': 95.866, 'weight': 0, 'content': [{'end': 101.029, 'text': 'And now we can just say if that last name equals equals doe.', 'start': 95.866, 'duration': 5.163}, {'end': 106.091, 'text': 'So if I run this, then what we get back is a series object.', 'start': 101.549, 'duration': 4.542}, {'end': 108.292, 'text': 'And this might not be what you expected.', 'start': 106.511, 'duration': 1.781}, {'end': 114.735, 'text': 'So maybe you thought we would just get a data frame back with all of the values that met our criteria.', 'start': 108.692, 'duration': 6.043}, {'end': 119.658, 'text': 'But what we got back is a series with a bunch of true false values.', 'start': 115.176, 'duration': 4.482}, {'end': 124.04, 'text': 'Now, these true false values actually correspond to our original data frame.', 'start': 120.038, 'duration': 4.002}], 'summary': 'Using a conditional statement to filter data resulted in a series object with true/false values corresponding to the original data frame.', 'duration': 28.174, 'max_score': 95.866, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY95866.jpg'}, {'end': 188.413, 'src': 'embed', 'start': 140.328, 'weight': 1, 'content': [{'end': 143.03, 'text': 'And these two last names here with Doe are true.', 'start': 140.328, 'duration': 2.702}, {'end': 151.879, 'text': 'So this is a filter mask, and when you apply it to a data frame, it will give you all of the rows that meet that filter criteria.', 'start': 143.65, 'duration': 8.229}, {'end': 154.962, 'text': "So now let's apply this filter to our data frame.", 'start': 152.339, 'duration': 2.623}, {'end': 163.03, 'text': "So first, I'm going to assign this return series here to a variable, and I'm just going to call this variable Filt.", 'start': 155.422, 'duration': 7.608}, {'end': 166.514, 'text': "So I'll say Filt is equal to, and then this comparison here.", 'start': 163.23, 'duration': 3.284}, {'end': 172.359, 'text': 'Now, filter is a built-in Python keyword, so be sure to use something else.', 'start': 166.954, 'duration': 5.405}, {'end': 181.808, 'text': 'Anytime I assign these two variables, I usually just use this Filt keyword here, or not keyword, but variable name.', 'start': 172.819, 'duration': 8.989}, {'end': 188.413, 'text': 'Now, I also usually like to wrap my entire filter in parentheses because I find it easier to read.', 'start': 182.528, 'duration': 5.885}], 'summary': 'Using a filter mask on a data frame to retrieve rows meeting the filter criteria.', 'duration': 48.085, 'max_score': 140.328, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY140328.jpg'}], 'start': 54.011, 'title': 'Filtering data in python', 'summary': 'Demonstrates using basic comparisons to filter data frames in python, resulting in a series object with true and false values and applying this filter to retrieve the rows meeting the criteria.', 'chapters': [{'end': 209.39, 'start': 54.011, 'title': 'Filtering data in 
python', 'summary': 'Demonstrates how to use basic comparisons to filter data frames in python, resulting in a series object with true and false values corresponding to the rows that met the filter criteria, and then applying this filter to the data frame to retrieve the rows that meet the criteria.', 'duration': 155.379, 'highlights': ["The comparison 'DF['last name'] == doe' results in a series object with true and false values corresponding to the rows that met the filter criteria, with two rows meeting the criteria for the last name 'Doe'.", "The filtered series is assigned to a variable 'Filt', and is wrapped in parentheses for improved readability and clarity in the code.", "The filter variable 'Filt' holds the series of true and false values, which when applied to a data frame, gives all the rows that meet the filter criteria."]}], 'duration': 155.379, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY54011.jpg', 'highlights': ["The comparison 'DF['last name'] == doe' results in a series object with true and false values corresponding to the rows that met the filter criteria, with two rows meeting the criteria for the last name 'Doe'.", "The filter variable 'Filt' holds the series of true and false values, which when applied to a data frame, gives all the rows that meet the filter criteria.", "The filtered series is assigned to a variable 'Filt', and is wrapped in parentheses for improved readability and clarity in the code."]}, {'end': 407.93, 'segs': [{'end': 274.63, 'src': 'embed', 'start': 209.93, 'weight': 0, 'content': [{'end': 214.956, 'text': "and now let's apply this filter to our data frame, and we can do this in a couple of ways.", 'start': 209.93, 'duration': 5.026}, {'end': 217.278, 'text': 'so you might see some people do it like this.', 'start': 214.956, 'duration': 2.322}, {'end': 221.943, 'text': 'we can just pass that directly in, like we are searching for a column.', 'start': 217.278, 
'duration': 4.665}, {'end': 230.11, 'text': 'we can pass in a filter there and if I run this, Oops and I got an error there because I did not run this cell to set that variable.', 'start': 221.943, 'duration': 8.167}, {'end': 231.371, 'text': "So I'll rerun that.", 'start': 230.51, 'duration': 0.861}, {'end': 241.377, 'text': 'And now if I run this, then now we can see that we get a data frame back where it returned all of the rows that have the last name of Doe.', 'start': 231.811, 'duration': 9.566}, {'end': 246.899, 'text': "Now, we only assigned the filter on a different line because I think that's easier to read.", 'start': 241.957, 'duration': 4.942}, {'end': 252.381, 'text': 'But you might see some people put these comparisons directly in the brackets for the data frame.', 'start': 247.459, 'duration': 4.922}, {'end': 255.101, 'text': 'So you might see something like this.', 'start': 252.801, 'duration': 2.3}, {'end': 261.584, 'text': "So I'm just going to comment that out right now and just grab this entire filter here.", 'start': 255.262, 'duration': 6.322}, {'end': 266.665, 'text': 'You might see some people do it like this and just paste it or put it directly in there.', 'start': 261.983, 'duration': 4.682}, {'end': 268.226, 'text': 'And we can see that that works too.', 'start': 266.985, 'duration': 1.241}, {'end': 274.63, 'text': "Now, I think that that's a little more difficult to read than just assigning this to a variable.", 'start': 268.846, 'duration': 5.784}], 'summary': "Applying a filter to a data frame to return rows with the last name 'Doe'.", 'duration': 64.7, 'max_score': 209.93, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY209930.jpg'}, {'end': 352.083, 'src': 'embed', 'start': 320.884, 'weight': 1, 'content': [{'end': 326.728, 'text': 'because there are multiple things that you can pass into these different brackets to get different results.', 'start': 320.884, 'duration': 
5.844}, {'end': 335.653, 'text': 'so, like i said before, dot loc is used to look up rows and columns by label, but if you pass in a series of booleans, like we did here,', 'start': 326.728, 'duration': 8.925}, {'end': 337.535, 'text': 'then you can also filter data out.', 'start': 335.653, 'duration': 1.882}, {'end': 339.656, 'text': 'now the reason that I like using.', 'start': 338.075, 'duration': 1.581}, {'end': 344.819, 'text': 'loc for this is because we can still grab the specific columns that we want as well.', 'start': 339.656, 'duration': 5.163}, {'end': 352.083, 'text': 'so, for example, if I wanted the email column, then I could simply say pass in a second value here into.', 'start': 344.819, 'duration': 7.264}], 'summary': 'Using dot loc to filter data with booleans and retrieve specific columns.', 'duration': 31.199, 'max_score': 320.884, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY320884.jpg'}, {'end': 407.93, 'src': 'embed', 'start': 384.542, 'weight': 3, 'content': [{'end': 392.687, 'text': "Now, we can't use the Python built-in AND and OR keywords for our filters, so we're going to be using some other symbols.", 'start': 384.542, 'duration': 8.145}, {'end': 403.029, 'text': "And the symbols that we're going to use here are the ampersand for and let me write these out here the ampersand for an and and this vertical bar for an or.", 'start': 393.167, 'duration': 9.862}, {'end': 407.93, 'text': 'So these symbols carry over from other programming conventions, so you may have seen them before.', 'start': 403.429, 'duration': 4.501}], 'summary': 'Python filters use ampersand for and and vertical bar for or.', 'duration': 23.388, 'max_score': 384.542, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY384542.jpg'}], 'start': 209.93, 'title': 'Filtering data in pandas', 'summary': "Demonstrates applying a filter to a data frame with two 
methods, showcasing the preference for assigning the filter on a separate line for readability. it also explains filtering data using the 'filter' and '.loc' indexer, discussing the use of and and or operators for filtering.", 'chapters': [{'end': 255.101, 'start': 209.93, 'title': 'Applying filter to data frame', 'summary': 'Demonstrates applying a filter to a data frame to retrieve rows with a specific criterion, showcasing two methods of implementation and highlighting the preference for assigning the filter on a separate line for readability.', 'duration': 45.171, 'highlights': ["Applying a filter to a data frame returns all rows that meet the specified condition, such as retrieving rows with a specific last name like 'dough'.", 'There are multiple ways to apply filters to a data frame, either by passing the filter directly as a comparison in the brackets for the data frame or by assigning the filter on a separate line for improved readability.']}, {'end': 407.93, 'start': 255.262, 'title': 'Filtering and locating data in pandas', 'summary': "Explains two methods for filtering data, using the 'filter' and '.loc' indexer, with a preference for the latter due to its ability to select specific columns. 
it also discusses the use of and and or operators for filtering.", 'duration': 152.668, 'highlights': ['The .loc indexer allows filtering data by passing in a series of booleans, providing flexibility to select specific columns as well.', 'Explains the use of AND and OR operators for filtering, utilizing the ampersand for AND and the vertical bar for OR.', 'Demonstrates an alternative method of filtering data using the filter function and highlights the readability advantage of using a variable for the filter.']}], 'duration': 198, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY209930.jpg', 'highlights': ["Applying a filter to a data frame returns all rows that meet the specified condition, such as retrieving rows with a specific last name like 'dough'.", 'The .loc indexer allows filtering data by passing in a series of booleans, providing flexibility to select specific columns as well.', 'There are multiple ways to apply filters to a data frame, either by passing the filter directly as a comparison in the brackets for the data frame or by assigning the filter on a separate line for improved readability.', 'Explains the use of AND and OR operators for filtering, utilizing the ampersand for AND and the vertical bar for OR.', 'Demonstrates an alternative method of filtering data using the filter function and highlights the readability advantage of using a variable for the filter.']}, {'end': 574.233, 'segs': [{'end': 483.24, 'src': 'embed', 'start': 459.788, 'weight': 0, 'content': [{'end': 468.654, 'text': "So now if I rerun that filter and then rerun our dot loc, then we can see that now we're just getting that single email address.", 'start': 459.788, 'duration': 8.866}, {'end': 472.877, 'text': 'And that email again is where all of the last names were equal to doe.', 'start': 469.034, 'duration': 3.843}, {'end': 475.278, 'text': 'And the first names were equal to John.', 'start': 473.457, 'duration': 
1.821}, {'end': 477.038, 'text': "So in this case, it's just one result.", 'start': 475.318, 'duration': 1.72}, {'end': 480.219, 'text': "So now let's look at an example using the OR operator.", 'start': 477.458, 'duration': 2.761}, {'end': 483.24, 'text': 'Now for this, we can use the vertical bar character.', 'start': 480.719, 'duration': 2.521}], 'summary': 'Filtering data resulted in one email address with specific criteria. demonstrated use of or operator with vertical bar character.', 'duration': 23.452, 'max_score': 459.788, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY459788.jpg'}, {'end': 563.881, 'src': 'embed', 'start': 529.1, 'weight': 2, 'content': [{'end': 534.982, 'text': 'Now I could go in here and fiddle around with this query that I currently have trying to get everything right.', 'start': 529.1, 'duration': 5.882}, {'end': 543.244, 'text': "Or I could simply add in a tilde at the beginning of this filter and it will give me everything that didn't match that filter.", 'start': 535.522, 'duration': 7.722}, {'end': 552.331, 'text': 'So if I just come in here and put a tilde there, then that is going to negate that filter and give me the opposite of those results.', 'start': 543.724, 'duration': 8.607}, {'end': 563.881, 'text': "So we can see here that we get Jane Doe because that is all the results where the last name was not Schaefer or the first name wasn't John.", 'start': 552.852, 'duration': 11.029}], 'summary': 'Adding a tilde at the beginning of the filter yields results opposite to the given criteria.', 'duration': 34.781, 'max_score': 529.1, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY529100.jpg'}], 'start': 408.41, 'title': 'Pandas filter operators', 'summary': "Explains how to use filter operators in pandas like 'and' (&) and 'or' (|) to retrieve specific rows based on given conditions, demonstrating the results in a 
practical example.", 'chapters': [{'end': 574.233, 'start': 408.41, 'title': 'Pandas filter operators', 'summary': "Explains how to use filter operators in pandas like 'and' (&) and 'or' (|) to retrieve specific rows based on given conditions, demonstrating the results in a practical example.", 'duration': 165.823, 'highlights': ["Demonstrates using 'and' operator for filtering rows based on multiple conditions, resulting in a single result.", "Illustrates using 'or' operator to retrieve rows based on multiple conditions, resulting in multiple results.", 'Shows how to get the opposite of a filter using the tilde (~) character.']}], 'duration': 165.823, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY408410.jpg', 'highlights': ["Illustrates using 'or' operator to retrieve rows based on multiple conditions, resulting in multiple results.", "Demonstrates using 'and' operator for filtering rows based on multiple conditions, resulting in a single result.", 'Shows how to get the opposite of a filter using the tilde (~) character.']}, {'end': 800.085, 'segs': [{'end': 600.71, 'src': 'embed', 'start': 574.633, 'weight': 4, 'content': [{'end': 580.721, 'text': "But it's more mathematical related than programming related, although the two do overlap very frequently.", 'start': 574.633, 'duration': 6.088}, {'end': 585.103, 'text': 'Okay, so that kind of covers the basics of filtering on a small data frame.', 'start': 581.141, 'duration': 3.962}, {'end': 593.067, 'text': "But now let's go back to our larger data set of survey data and look at some real world examples of some filters that we might want to take a look at.", 'start': 585.483, 'duration': 7.584}, {'end': 595.808, 'text': "So I'm going to bring up my other notebook here.", 'start': 593.447, 'duration': 2.361}, {'end': 600.71, 'text': "And here we have the Stack Overflow survey data that we've been using throughout the series.", 'start': 596.248, 'duration': 
4.462}], 'summary': 'Mathematics and programming overlap in data filtering. survey data used in examples.', 'duration': 26.077, 'max_score': 574.633, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY574633.jpg'}, {'end': 657.47, 'src': 'embed', 'start': 615.419, 'weight': 0, 'content': [{'end': 621.782, 'text': "So for example, let's say that we wanted to look at the data for people who are making a salary over a certain amount.", 'start': 615.419, 'duration': 6.363}, {'end': 628.207, 'text': 'Now, maybe we want to take a look at what languages are earning the higher salaries or something like that.', 'start': 622.223, 'duration': 5.984}, {'end': 632.15, 'text': "so in order to do that, I'm going to first create a filter.", 'start': 628.207, 'duration': 3.943}, {'end': 635.653, 'text': "now, if you don't know which column in the data frame gives the salary,", 'start': 632.15, 'duration': 3.503}, {'end': 640.997, 'text': "then you can always find that using the schema data frame that we've seen throughout the series.", 'start': 635.653, 'duration': 5.344}, {'end': 645.681, 'text': 'that tells us what each of these columns here means.', 'start': 640.997, 'duration': 4.684}, {'end': 651.485, 'text': "but for the sake of time here I'll just tell you that the column for salary if I go over here,", 'start': 645.681, 'duration': 5.804}, {'end': 657.47, 'text': 'I think I can find it here pretty quick it is this converted comp right here.', 'start': 651.485, 'duration': 5.985}], 'summary': 'Analyzing salaries to identify high-earning languages using data filters and schema data frame.', 'duration': 42.051, 'max_score': 615.419, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY615419.jpg'}, {'end': 743.818, 'src': 'heatmap', 'start': 706.181, 'weight': 1, 'content': [{'end': 710.324, 'text': "We want that to be over, let's say a high salary is over $70, 000.", 
'start': 706.181, 'duration': 4.143}, {'end': 715.589, 'text': "You know, this is kind of subjective, but we'll do that as a filter here.", 'start': 710.324, 'duration': 5.265}, {'end': 718.311, 'text': "And now let's apply that filter to our data frame.", 'start': 715.969, 'duration': 2.342}, {'end': 725.757, 'text': 'So just like we saw before, I can say df.loc and I can pass in that high salary filter there.', 'start': 718.691, 'duration': 7.066}, {'end': 729.821, 'text': 'And now we can see that we get some results here.', 'start': 726.517, 'duration': 3.304}, {'end': 731.824, 'text': "And this isn't all of our respondents.", 'start': 730.242, 'duration': 1.582}, {'end': 735.568, 'text': "We can see that now it's respondent six and nine and 13.", 'start': 731.864, 'duration': 3.704}, {'end': 743.818, 'text': 'So if I scroll over to our converted comp, then all of these salaries here should be over 70, 000.', 'start': 735.568, 'duration': 8.25}], 'summary': 'Applying a high salary filter over $70,000 to the data frame results in respondents 6, 9, and 13 having salaries over $70,000.', 'duration': 37.637, 'max_score': 706.181, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY706181.jpg'}, {'end': 784.416, 'src': 'embed', 'start': 756.284, 'weight': 3, 'content': [{'end': 759.445, 'text': 'So to do this, remember we can just pass these into .', 'start': 756.284, 'duration': 3.161}, {'end': 762.186, 'text': "loc So up here where we're doing .", 'start': 759.445, 'duration': 2.741}, {'end': 767.168, 'text': "loc, I'll put in a comma here, and now I'll put in a list for the columns that we want.", 'start': 762.186, 'duration': 4.982}, {'end': 769.789, 'text': "And let's say that we want to get the country.", 'start': 767.708, 'duration': 2.081}, {'end': 775.272, 'text': 'We also want to get the programming languages that these people have worked with.', 'start': 770.43, 'duration': 4.842}, {'end': 784.416, 
'text': 'And this here, and like I said, you can look all of these up in the schema, but this is under languages worked with.', 'start': 775.892, 'duration': 8.524}], 'summary': 'Pass parameters to retrieve country and programming languages from schema.', 'duration': 28.132, 'max_score': 756.284, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY756284.jpg'}], 'start': 574.633, 'title': 'Filtering and selecting high salaries', 'summary': 'Covers filtering survey data for high salaries, using examples of salary amounts and a quick data schema reference. it also demonstrates filtering a data frame to show respondents with salaries over $70,000 and selecting specific columns such as country, programming languages, and salary.', 'chapters': [{'end': 679.248, 'start': 574.633, 'title': 'Filtering survey data for high salaries', 'summary': 'Covers filtering basics on a small data frame and then proceeds to demonstrate filtering on a larger survey dataset from stack overflow, aiming to analyze high salaries based on programming languages with examples of salary amounts and a quick data schema reference.', 'duration': 104.615, 'highlights': ['Demonstrating filtering on a larger survey dataset from Stack Overflow to analyze high salaries based on programming languages.', 'Providing examples of salary amounts and referencing the data schema for column identification.', 'Covering filtering basics on a small data frame.']}, {'end': 800.085, 'start': 679.648, 'title': 'Filtering high salaries and selecting columns', 'summary': 'Demonstrates filtering a data frame to show respondents with salaries over $70,000 and selecting specific columns such as country, programming languages, and salary.', 'duration': 120.437, 'highlights': ['The chapter demonstrates applying a filter to the data frame to show salaries over $70,000, resulting in respondent indices 13, 9, and 6.', 'It showcases the process of selecting specific columns 
like country, programming languages, and salary from the data frame using .loc method.', 'The speaker explains the use of a conditional filter to display salaries over $70,000, a subjective but applicable threshold for the demonstration.']}], 'duration': 225.452, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY574633.jpg', 'highlights': ['Demonstrating filtering on a larger survey dataset from Stack Overflow to analyze high salaries based on programming languages.', 'The chapter demonstrates applying a filter to the data frame to show salaries over $70,000, resulting in respondent indices 13, 9, and 6.', 'Providing examples of salary amounts and referencing the data schema for column identification.', 'It showcases the process of selecting specific columns like country, programming languages, and salary from the data frame using .loc method.', 'Covering filtering basics on a small data frame.', 'The speaker explains the use of a conditional filter to display salaries over $70,000, a subjective but applicable threshold for the demonstration.']}, {'end': 1371.006, 'segs': [{'end': 867.349, 'src': 'embed', 'start': 821.397, 'weight': 0, 'content': [{'end': 828.466, 'text': "So now that I'm actually seeing the countries here, that reminds me that we might want to do some filtering with multiple values.", 'start': 821.397, 'duration': 7.069}, {'end': 838.219, 'text': 'So for example, you know, my YouTube audience comes mainly from the United States, India, the United Kingdom, Germany and Canada.', 'start': 828.826, 'duration': 9.393}, {'end': 844.642, 'text': "Well, that's where the largest percentages of the audience of people who are watching the videos come from.", 'start': 838.719, 'duration': 5.923}, {'end': 852.525, 'text': "So let's say that I wanted to filter out the survey results here so that I only see the results from those five countries that I mentioned.", 'start': 845.182, 'duration': 7.343}, 
{'end': 856.267, 'text': 'Now I could create a super long filter up here.', 'start': 852.945, 'duration': 3.322}, {'end': 863.449, 'text': 'where I say you know if the country is equal to the United States, or if the country is equal to India,', 'start': 857.027, 'duration': 6.422}, {'end': 867.349, 'text': 'or if the country is equal to the United Kingdom, but that would take up a lot of space.', 'start': 863.449, 'duration': 3.9}], 'summary': 'Filter survey results for top 5 countries: us, india, uk, germany, canada', 'duration': 45.952, 'max_score': 821.397, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY821397.jpg'}, {'end': 984.092, 'src': 'embed', 'start': 951.12, 'weight': 1, 'content': [{'end': 955.261, 'text': "Now let me show you one more common filter operation that you'll probably use a lot.", 'start': 951.12, 'duration': 4.141}, {'end': 965.303, 'text': 'So we can actually use string methods within pandas as well to do some alterations to our data frame, or in this case, to help with a conditional.', 'start': 955.701, 'duration': 9.602}, {'end': 967.124, 'text': 'So let me show you what I mean.', 'start': 965.823, 'duration': 1.301}, {'end': 973.587, 'text': "So let's say that we only want to look at people who answered that they knew Python as a programming language.", 'start': 967.604, 'duration': 5.983}, {'end': 975.328, 'text': "So let's see how we do this.", 'start': 973.987, 'duration': 1.341}, {'end': 984.092, 'text': 'So first of all, the column that lists the programming languages that each person said that they know is that language worked with column.', 'start': 975.788, 'duration': 8.304}], 'summary': 'Demonstrating filtering for people who know python using string methods in pandas.', 'duration': 32.972, 'max_score': 951.12, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY951120.jpg'}, {'end': 1076.657, 'src': 'embed', 
'start': 1048.678, 'weight': 4, 'content': [{'end': 1052.46, 'text': "So I'm going to say dot str dot contains.", 'start': 1048.678, 'duration': 3.782}, {'end': 1055.841, 'text': 'And then I will pass in Python.', 'start': 1053.32, 'duration': 2.521}, {'end': 1061.265, 'text': 'Now also we can see that we have some nan values here, but not a number.', 'start': 1056.241, 'duration': 5.024}, {'end': 1066.909, 'text': "Now we need to also set a fill value for those or else we're going to probably get an error.", 'start': 1061.945, 'duration': 4.964}, {'end': 1070.952, 'text': 'So that is part of the contains method here.', 'start': 1067.35, 'duration': 3.602}, {'end': 1074.295, 'text': 'I can just say na is equal to false.', 'start': 1071.313, 'duration': 2.982}, {'end': 1076.657, 'text': "We're just not going to do anything with those.", 'start': 1074.435, 'duration': 2.222}], 'summary': 'Using dot str dot contains in python to handle nan values with fill value set to avoid errors.', 'duration': 27.979, 'max_score': 1048.678, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY1048678.jpg'}, {'end': 1084.982, 'src': 'heatmap', 'start': 1048.678, 'weight': 0.762, 'content': [{'end': 1052.46, 'text': "So I'm going to say dot str dot contains.", 'start': 1048.678, 'duration': 3.782}, {'end': 1055.841, 'text': 'And then I will pass in Python.', 'start': 1053.32, 'duration': 2.521}, {'end': 1061.265, 'text': 'Now also we can see that we have some nan values here, but not a number.', 'start': 1056.241, 'duration': 5.024}, {'end': 1066.909, 'text': "Now we need to also set a fill value for those or else we're going to probably get an error.", 'start': 1061.945, 'duration': 4.964}, {'end': 1070.952, 'text': 'So that is part of the contains method here.', 'start': 1067.35, 'duration': 3.602}, {'end': 1074.295, 'text': 'I can just say na is equal to false.', 'start': 1071.313, 'duration': 2.982}, {'end': 1076.657, 'text': 
"We're just not going to do anything with those.", 'start': 1074.435, 'duration': 2.222}, {'end': 1079.299, 'text': 'So let me explain this one more time here.', 'start': 1077.197, 'duration': 2.102}, {'end': 1084.982, 'text': "So this filter that we're putting in place here it's saying okay for this column.", 'start': 1079.819, 'duration': 5.163}], 'summary': 'Using contains method in python to handle nan values and set fill values.', 'duration': 36.304, 'max_score': 1048.678, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY1048678.jpg'}, {'end': 1158.546, 'src': 'embed', 'start': 1128.182, 'weight': 5, 'content': [{'end': 1134.367, 'text': "Now this one here, number eight, we can't actually see it here, but we have these ellipses here.", 'start': 1128.182, 'duration': 6.185}, {'end': 1137.53, 'text': "So it's probably just being truncated here.", 'start': 1134.908, 'duration': 2.622}, {'end': 1142.1, 'text': 'Now, in my last video, I kept saying that these were being concatenated.', 'start': 1137.938, 'duration': 4.162}, {'end': 1143.04, 'text': 'I meant truncated.', 'start': 1142.22, 'duration': 0.82}, {'end': 1144.921, 'text': 'A few people pointed that out in the comments.', 'start': 1143.08, 'duration': 1.841}, {'end': 1147.442, 'text': 'So yeah, these are being truncated here.', 'start': 1145.441, 'duration': 2.001}, {'end': 1150.823, 'text': "So we just can't see the Python value there, but they are there.", 'start': 1147.942, 'duration': 2.881}, {'end': 1158.546, 'text': "And I'll probably do a complete video on string methods here in the future, since there's so much more that we can do with these.", 'start': 1151.263, 'duration': 7.283}], 'summary': 'In the video, python values are being truncated, prompting plans for a future complete video on string methods.', 'duration': 30.364, 'max_score': 1128.182, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY1128182.jpg'}, {'end': 1320.061, 'src': 'embed', 'start': 1285.78, 'weight': 2, 'content': [{'end': 1290.745, 'text': 'And also the first 200 people to go to that link will get 20% off the annual premium subscription.', 'start': 1285.78, 'duration': 4.965}, {'end': 1293.508, 'text': 'And you can find that link in the description section below.', 'start': 1291.106, 'duration': 2.402}, {'end': 1298.074, 'text': "Again, that's brilliant.org forward slash CMS.", 'start': 1294.009, 'duration': 4.065}, {'end': 1301.556, 'text': "Okay, so I think that's going to do it for this pandas video.", 'start': 1299.395, 'duration': 2.161}, {'end': 1308.078, 'text': "I hope you feel like you got a good idea for how to filter the data within our data frames to find the information that you're looking for.", 'start': 1301.916, 'duration': 6.162}, {'end': 1314.8, 'text': 'Like I said, this is a fundamental skill in pandas, which is usually one of the first things that we do with our data.', 'start': 1308.638, 'duration': 6.162}, {'end': 1320.061, 'text': "In the next video, we'll be learning how to alter the data in our data frames and make changes.", 'start': 1315.3, 'duration': 4.761}], 'summary': 'First 200 people get 20% off annual premium subscription at brilliant.org/cms. 
next video covers altering data in data frames.', 'duration': 34.281, 'max_score': 1285.78, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY1285780.jpg'}, {'end': 1371.006, 'src': 'embed', 'start': 1356.563, 'weight': 7, 'content': [{'end': 1359.424, 'text': 'The easiest way is to simply like the video and give it a thumbs up,', 'start': 1356.563, 'duration': 2.861}, {'end': 1362.964, 'text': "and also it's a huge help to share these videos with anyone who you think would find them useful.", 'start': 1359.424, 'duration': 3.54}, {'end': 1367.845, 'text': "And if you have the means, you can contribute through Patreon, and there's a link to that page in the description section below.", 'start': 1363.384, 'duration': 4.461}, {'end': 1371.006, 'text': 'Be sure to subscribe for future videos, and thank you all for watching.', 'start': 1368.285, 'duration': 2.721}], 'summary': 'Engage with the video by liking, sharing, and contributing through patreon for support.', 'duration': 14.443, 'max_score': 1356.563, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY1356563.jpg'}], 'start': 800.545, 'title': 'Data filtering in pandas', 'summary': 'Covers filtering survey results by country to include the largest percentages of the audience from specific countries, filtering data with string methods in pandas based on a condition, and emphasizes the importance of understanding underlying concepts for data filtering in pandas.', 'chapters': [{'end': 950.7, 'start': 800.545, 'title': 'Filtering survey results by country', 'summary': 'Demonstrates filtering survey results by country to include only the largest percentages of the audience from the united states, india, the united kingdom, germany, and canada, ultimately achieving the desired results with the applied filter.', 'duration': 150.155, 'highlights': ['The chapter demonstrates filtering survey results by country to 
include only the largest percentages of the audience from the United States, India, the United Kingdom, Germany, and Canada.', 'Achieving the desired results with the applied filter.', 'Utilizing a list of countries to efficiently filter survey results.']}, {'end': 1220.163, 'start': 951.12, 'title': 'Filtering data with string methods in pandas', 'summary': 'Demonstrates how to use string methods in pandas to filter data based on a specific condition, with the example of filtering people who know python as a programming language and how the filter applies a mask to the data frame.', 'duration': 269.043, 'highlights': ['Using string methods in Pandas to filter data based on specific conditions, such as identifying people who know Python as a programming language, which is a common operation in data analysis.', "Demonstrating the use of the contains method to check if 'Python' is within the string of programming languages known by each person, resulting in a filter that returns a series of true/false values.", 'Explaining how the true/false values from the filter apply a mask to the data frame, allowing the retrieval of rows with true values and excluding those with false values.', 'Highlighting the potential truncation of data when displaying results, as indicated by the presence of ellipses and emphasizing the usefulness of string methods for tasks beyond filtering, such as text replacement and value splitting.']}, {'end': 1371.006, 'start': 1220.343, 'title': 'Pandas data filtering', 'summary': 'Highlights the fundamental skill of filtering data in pandas, emphasizes the importance of understanding underlying concepts, and promotes brilliant.org as a supplement for data science skills.', 'duration': 150.663, 'highlights': ['Brilliant.org offers problem-solving courses and lessons on data science, with a 20% discount for the first 200 sign-ups via the provided link.', 'The next video will cover altering data in data frames, including making changes to specific 
values and across the entire data frame.', 'Encouragement to engage with the content through likes, shares, and support via Patreon, with a reminder to subscribe for future videos.']}], 'duration': 570.461, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/Lw2rlcxScZY/pics/Lw2rlcxScZY800545.jpg', 'highlights': ['Demonstrating filtering survey results by country for the largest audience percentages', 'Using string methods in Pandas to filter data based on specific conditions', 'Emphasizing the importance of understanding underlying concepts for data filtering', 'Utilizing a list of countries to efficiently filter survey results', 'Explaining the use of the contains method to check for specific conditions', 'Highlighting the potential truncation of data when displaying results', "Offering a 20% discount for the first 200 sign-ups for Brilliant.org's data science courses", 'Encouraging engagement with content through likes, shares, and support via Patreon', 'Announcing the next video will cover altering data in data frames']}], 'highlights': ["Filtering data is a fundamental skill with pandas as it's how every project begins, by extracting the desired data from the available pool.", 'The importance of learning how to filter data from series and data frame objects in pandas is emphasized, indicating its significance in data analysis and manipulation.', "Applying a filter to a data frame returns all rows that meet the specified condition, such as retrieving rows with a specific last name like 'Doe'.", 'The video discusses filtering data from data frames and series objects to extract specific information, such as people who know Python or results from a specific country or with a specific salary range.', 'Demonstrating filtering survey results by country for the largest audience percentages', "Illustrates using the 'or' operator to retrieve rows based on multiple conditions, so that rows matching either condition are returned.", 'The chapter demonstrates applying a 
filter to the data frame to show salaries over $70,000, resulting in respondent indices 13, 9, and 6.', 'The .loc indexer allows filtering data by passing in a series of booleans, providing flexibility to select specific columns as well.', "The comparison DF['last name'] == 'Doe' results in a Series of True and False values marking which rows meet the filter criteria, with two rows matching the last name 'Doe'.", 'Using string methods in Pandas to filter data based on specific conditions']}