title
Python Pandas Tutorial (Part 8): Grouping and Aggregating - Analyzing and Exploring Your Data

description
In this video, we will be learning how to group and aggregate our data. This video is sponsored by Brilliant. Go to https://brilliant.org/cms to sign up for free. This will allow us to explore our data in ways we have not yet done in this series. We will be able to answer questions such as: "What is the most popular social media site for each country?" We will be using the groupby method, and also some aggregate functions such as mean, median, value_counts, etc. Let's get started... Video Timestamps: Aggregate Column - 2:00; Aggregate DataFrame - 3:55; Value Counts - 7:51; Grouping - 12:30; Multiple Aggregates on Group - 26:00; People Who Know Python By Country - 27:20; Practice Question - 34:20; Concat Series - 37:27. The code for this video can be found at: http://bit.ly/Pandas-08 StackOverflow Survey Download Page - http://bit.ly/SO-Survey-Download #Python #Pandas
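To give a rough idea of what the groupby method and aggregate functions like mean, median, and value_counts look like in practice, here is a minimal sketch. It is not the code from the video: the rows are made up, and the column names (Country, ConvertedComp, SocialMedia) are assumptions chosen to resemble the survey, so adjust them to whatever your own DataFrame uses.

```python
import pandas as pd

# Made-up rows standing in for the survey data (assumption, not real results).
# The column names are also assumptions for illustration.
df = pd.DataFrame({
    "Country": ["United States", "United States", "United States",
                "India", "India", "India"],
    "ConvertedComp": [90000, 110000, 400000, 10000, 15000, 35000],
    "SocialMedia": ["Reddit", "Twitter", "Reddit",
                    "WhatsApp", "WhatsApp", "YouTube"],
})

# Aggregate a single column: one large salary pulls the mean well above the median.
median_salary = df["ConvertedComp"].median()
mean_salary = df["ConvertedComp"].mean()

# Tally the unique answers in a column, as raw counts and as fractions.
social_counts = df["SocialMedia"].value_counts()
social_share = df["SocialMedia"].value_counts(normalize=True)

# Split the rows into one group per country.
country_grp = df.groupby("Country")

# A single group is an ordinary DataFrame, much like applying a filter.
us_rows = country_grp.get_group("United States")

# Run an aggregate on every group at once.
median_by_country = country_grp["ConvertedComp"].median()

# value_counts on a grouped column returns a Series indexed by
# (Country, SocialMedia); .loc picks out one country's tally.
social_by_country = country_grp["SocialMedia"].value_counts()
india_social = social_by_country.loc["India"]

print(median_salary, mean_salary)
print(median_by_country)
print(india_social)
```

With the grouped result in hand, switching countries is just a different label passed to .loc, rather than building a new filter each time.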

detail
{'title': 'Python Pandas Tutorial (Part 8): Grouping and Aggregating - Analyzing and Exploring Your Data', 'heatmap': [{'end': 1093.175, 'start': 971.302, 'weight': 1}, {'end': 1268.52, 'start': 1203.925, 'weight': 0.9}], 'summary': 'Tutorial on python pandas covers grouping and aggregating data, analyzing developer survey data, using aggregate functions such as median and describe, analyzing survey data and grouping by country, social media analysis, and analyzing python usage in survey data, showcasing practical examples and statistics including median salary, social media platform usage, and python knowledge percentages for different countries.', 'chapters': [{'end': 50.438, 'segs': [{'end': 50.438, 'src': 'embed', 'start': 16.863, 'weight': 0, 'content': [{'end': 23.809, 'text': "So this will be the first video where we actually get some statistics back on our data sets and aren't just modifying our data frames in different ways.", 'start': 16.863, 'duration': 6.946}, {'end': 28.513, 'text': 'So for example, maybe you want to know what the average salary for a developer is.', 'start': 24.209, 'duration': 4.304}, {'end': 34.139, 'text': 'Or maybe you want to know how many people from each country knows Python or another programming language.', 'start': 28.934, 'duration': 5.205}, {'end': 38.323, 'text': "So what we're going to learn here is going to allow us to answer those types of questions.", 'start': 34.479, 'duration': 3.844}, {'end': 42.768, 'text': 'Now I would like to mention that we do have a sponsor for this series of videos and that is Brilliant.', 'start': 38.723, 'duration': 4.045}, {'end': 50.438, 'text': 'So I really want to thank Brilliant for sponsoring this series and it would be great if you all could check them out using the link in the description section below and support the sponsors.', 'start': 43.069, 'duration': 7.369}], 'summary': 'First video with statistics on data sets, answering salary and language proficiency questions. 
sponsored by brilliant.', 'duration': 33.575, 'max_score': 16.863, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut6416863.jpg'}], 'start': 0.189, 'title': 'Grouping and aggregating data', 'summary': 'Covers the concept of grouping and aggregating data to analyze it in a meaningful sense, allowing users to obtain statistics from the data sets, such as average salary for a developer or the number of people from each country who know a specific programming language.', 'chapters': [{'end': 50.438, 'start': 0.189, 'title': 'Grouping and aggregating data', 'summary': 'Covers the concept of grouping and aggregating data to analyze it in a meaningful sense, allowing users to obtain statistics from the data sets, such as average salary for a developer or the number of people from each country who know a specific programming language.', 'duration': 50.249, 'highlights': ['Learning how to group and aggregate data enables users to obtain statistics and analyze the data in a meaningful way, such as determining the average salary for a developer or the number of people from each country who know a specific programming language.', 'The chapter marks the transition from simply modifying data frames to actually obtaining statistics from the data sets.', 'The video introduces Brilliant as a sponsor for the series.']}], 'duration': 50.249, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut64189.jpg', 'highlights': ['Learning how to group and aggregate data enables users to obtain statistics and analyze the data in a meaningful way, such as determining the average salary for a developer or the number of people from each country who know a specific programming language.', 'The chapter marks the transition from simply modifying data frames to actually obtaining statistics from the data sets.', 'The video introduces Brilliant as a sponsor for the series.']}, {'end': 715.375, 
'segs': [{'end': 117.728, 'src': 'embed', 'start': 91.665, 'weight': 4, 'content': [{'end': 99.487, 'text': 'these are aggregate functions because they take multiple values and give you either the mean median or mode of those results.', 'start': 91.665, 'duration': 7.822}, {'end': 106.555, 'text': 'So, if we wanted to run some analysis on our developer survey here, one question we might ask is okay,', 'start': 100.047, 'duration': 6.508}, {'end': 110.199, 'text': 'what is a typical salary for developers who answered this survey?', 'start': 106.555, 'duration': 3.644}, {'end': 117.728, 'text': "So that might be some good information to have if you're looking for a job and want to get an idea of what these salaries look like at the moment.", 'start': 110.699, 'duration': 7.029}], 'summary': 'Aggregate functions provide mean, median, or mode of multiple values. useful for analyzing developer survey data, e.g., determining typical salaries.', 'duration': 26.063, 'max_score': 91.665, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut6491665.jpg'}, {'end': 239.665, 'src': 'embed', 'start': 210.817, 'weight': 2, 'content': [{'end': 214.741, 'text': "Now this probably doesn't give us as much information as we'd really like to have.", 'start': 210.817, 'duration': 3.924}, {'end': 221.628, 'text': 'So, for example, different countries pay different amounts since there are different costs of living and things like that.', 'start': 215.121, 'duration': 6.507}, {'end': 225.932, 'text': "So it'd be nice if we could look at the median salary broken down by country.", 'start': 222.008, 'duration': 3.924}, {'end': 229.335, 'text': "And we'll look at that here in a second when we learn about grouping data.", 'start': 226.292, 'duration': 3.043}, {'end': 234, 'text': 'But first, I want to cover a few more basic concepts before we move on to grouping.', 'start': 229.816, 'duration': 4.184}, {'end': 239.665, 'text': "So one thing 
that I'd like to look at is running these aggregate functions on our entire data frame.", 'start': 234.5, 'duration': 5.165}], 'summary': 'Analyzing median salary by country and grouping data to understand aggregate functions.', 'duration': 28.848, 'max_score': 210.817, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut64210817.jpg'}, {'end': 326.304, 'src': 'embed', 'start': 275.311, 'weight': 0, 'content': [{'end': 285.519, 'text': 'So, for example, we can see that the median age down here at the bottom for this survey was 29 years old and the median number of work hours per week.', 'start': 275.311, 'duration': 10.208}, {'end': 288.261, 'text': 'That was 40, which is pretty standard.', 'start': 286.14, 'duration': 2.121}, {'end': 289.062, 'text': 'So that makes sense.', 'start': 288.281, 'duration': 0.781}, {'end': 297.968, 'text': 'Now if you want to get a broad overview of your data and a statistical overview, we can use the describe method on our data frame instead.', 'start': 289.682, 'duration': 8.286}, {'end': 308.634, 'text': 'So if I instead run describe instead of median and I run this, then this is going to give us a broad overview of some different stats.', 'start': 298.468, 'duration': 10.166}, {'end': 315.338, 'text': 'So if we look at the converted comp column here, then we can see a few different stats about this column.', 'start': 309.035, 'duration': 6.303}, {'end': 326.304, 'text': 'So it gives us the count, it gives us the mean, it gives us the standard deviation, the minimum, and then it also gives us the 25, 50,', 'start': 315.779, 'duration': 10.525}], 'summary': 'Median age: 29 years, work hours: 40 per week; describe method provides statistical overview of data frame.', 'duration': 50.993, 'max_score': 275.311, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut64275311.jpg'}, {'end': 414.154, 'src': 'embed', 'start': 371.063, 
'weight': 1, 'content': [{'end': 377.787, 'text': "it's not really a good metric to use, because a few outliers can affect the average very heavily.", 'start': 371.063, 'duration': 6.724}, {'end': 381.488, 'text': 'we can see that the mean salary up here.', 'start': 377.787, 'duration': 3.701}, {'end': 388.452, 'text': "if I highlight this right here, if we were to count this up, then that's actually about a hundred and twenty seven thousand dollars on average.", 'start': 381.488, 'duration': 6.964}, {'end': 394.858, 'text': 'But that gives us an unrealistic expectation of what a typical developer salary is,', 'start': 388.972, 'duration': 5.886}, {'end': 399.322, 'text': 'because the largest salaries in our data set are just pulling up that average so heavily.', 'start': 394.858, 'duration': 4.464}, {'end': 403.126, 'text': 'So in cases like that, you definitely want to use the mean instead.', 'start': 399.703, 'duration': 3.423}, {'end': 404.607, 'text': "I think that's a better representation.", 'start': 403.146, 'duration': 1.461}, {'end': 408.992, 'text': "Or I'm sorry, you're going to want to use the median instead because that's a better representation.", 'start': 404.948, 'duration': 4.044}, {'end': 414.154, 'text': 'Now, if we only wanted to get this overview for a single column,', 'start': 409.872, 'duration': 4.282}], 'summary': 'Mean salary of about $127,000 skews data, use median instead.', 'duration': 43.091, 'max_score': 371.063, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut64371063.jpg'}, {'end': 547.681, 'src': 'embed', 'start': 520.844, 'weight': 6, 'content': [{'end': 527.348, 'text': 'so you might get the survey results back and you might think to yourself okay, well, I can see the responses here in the survey,', 'start': 520.844, 'duration': 6.504}, {'end': 531.691, 'text': 'but I just want to know how many people answered yes and how many people answered no.', 'start': 527.348, 
'duration': 4.343}, {'end': 533.352, 'text': 'so how would we do that?', 'start': 531.691, 'duration': 1.661}, {'end': 537.215, 'text': 'well, we can get that information with the value counts function.', 'start': 533.352, 'duration': 3.863}, {'end': 547.681, 'text': 'so If I just look at the value counts and that is value underscore counts if we run that method on that series,', 'start': 537.215, 'duration': 10.466}], 'summary': 'Using the value_counts function to analyze survey responses for quantifiable data.', 'duration': 26.837, 'max_score': 520.844, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut64520844.jpg'}], 'start': 50.779, 'title': 'Analyzing developer survey data and aggregate functions', 'summary': 'Introduces basic aggregations by calculating the median salary of developers from a survey, providing an example of $57,000, discusses the need to break down the median salary by country. it also covers using aggregate functions like median and describe to get statistical insights from the entire data frame, including obtaining median values for age and work hours, understanding the limitations of mean, and utilizing value counts to analyze the distribution of responses in specific columns.', 'chapters': [{'end': 229.335, 'start': 50.779, 'title': 'Analyzing developer survey data', 'summary': 'Introduces basic aggregations by calculating the median salary of developers from a survey, providing an example of $57,000, and discussing the need to break down the median salary by country for more comprehensive insights.', 'duration': 178.556, 'highlights': ['The median salary for developers from the survey was around $57,000.', 'The need to break down the median salary by country for more comprehensive insights.', 'Introduction to basic aggregations by calculating the median salary of developers.']}, {'end': 715.375, 'start': 229.816, 'title': 'Aggregate functions and data overview', 'summary': 'Covers 
using aggregate functions like median and describe to get statistical insights from the entire data frame, including obtaining median values for age and work hours, understanding the limitations of mean and the importance of using median, and utilizing value counts to analyze the distribution of responses in specific columns.', 'duration': 485.559, 'highlights': ['The describe method provides a broad overview of statistical measures such as count, mean, standard deviation, and quantiles for columns in the data frame, allowing for a comprehensive understanding of the data distribution and variability.', 'The median age from the survey data is 29 years old, and the median number of work hours per week is 40, providing specific numerical insights into the attributes of the dataset.', 'The limitations of using the mean for salaries are highlighted, as it is heavily affected by outliers, resulting in an unrealistic average salary of about $127,000, emphasizing the importance of using the median for a more accurate representation of typical salaries.', "The value counts function is demonstrated to analyze the distribution of responses in specific columns, such as determining the number of 'yes' and 'no' responses for a particular survey question, providing valuable insights into the distribution of categorical data."]}], 'duration': 664.596, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut6450779.jpg', 'highlights': ['The describe method provides statistical measures like count, mean, and quantiles for data frame columns.', 'The median salary for developers from the survey was around $57,000.', 'The need to break down the median salary by country for comprehensive insights.', 'The limitations of using the mean for salaries are highlighted due to outliers.', 'Introduction to basic aggregations by calculating the median salary of developers.', 'The median age from the survey data is 29 years old, and the median work 
hours per week is 40.', 'The value counts function is demonstrated to analyze the distribution of responses in specific columns.']}, {'end': 1294.527, 'segs': [{'end': 780.123, 'src': 'embed', 'start': 739.424, 'weight': 1, 'content': [{'end': 741.786, 'text': "And now we're going to get these broken down by percentage.", 'start': 739.424, 'duration': 2.362}, {'end': 749.993, 'text': 'So 17% of the people said that they use Reddit, 16 said YouTube, about 16 said WhatsApp, and so on.', 'start': 742.186, 'duration': 7.807}, {'end': 754.738, 'text': 'Okay, so we can see that we have some social media sites here from some other countries.', 'start': 750.453, 'duration': 4.285}, {'end': 758.122, 'text': 'So obviously this is most likely a regional thing.', 'start': 755.138, 'duration': 2.984}, {'end': 764.889, 'text': "My guess would be that the popularity of these social media platforms varies a lot based on what country you're in.", 'start': 758.542, 'duration': 6.347}, {'end': 771.616, 'text': 'So how would we break up these results so that we can see the most popular social media sites for each country?', 'start': 765.49, 'duration': 6.126}, {'end': 775.96, 'text': "Now, in order to do this, we're going to have to learn about grouping our data.", 'start': 772.176, 'duration': 3.784}, {'end': 780.123, 'text': 'So again, this is a topic that can be a little confusing when you first see it.', 'start': 776.46, 'duration': 3.663}], 'summary': '17% use reddit, 16% use youtube, and whatsapp. 
data varies by country.', 'duration': 40.699, 'max_score': 739.424, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut64739424.jpg'}, {'end': 873.062, 'src': 'embed', 'start': 843.929, 'weight': 0, 'content': [{'end': 848.491, 'text': 'So to do this, we can just access the country column.', 'start': 843.929, 'duration': 4.562}, {'end': 855.454, 'text': 'And if I run this, we can see that this gives us the country that each respondent said that they were from.', 'start': 849.011, 'duration': 6.443}, {'end': 861.777, 'text': 'And if we look at the value counts for this, then this is going to tally up all of the unique responses.', 'start': 855.974, 'duration': 5.803}, {'end': 867.599, 'text': 'So we can see that the majority of this survey was answered by developers in the United States.', 'start': 862.137, 'duration': 5.462}, {'end': 873.062, 'text': 'And in second was India, then Germany, United Kingdom, Canada, and so on.', 'start': 868.14, 'duration': 4.922}], 'summary': 'Survey shows majority of developers from united states, followed by india and germany.', 'duration': 29.133, 'max_score': 843.929, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut64843929.jpg'}, {'end': 1093.175, 'src': 'heatmap', 'start': 971.302, 'weight': 1, 'content': [{'end': 979.504, 'text': 'so to do this we can say country group dot, get underscore group and then pass in the name of the group.', 'start': 971.302, 'duration': 8.202}, {'end': 983.005, 'text': 'in this case I want to get the group for United States.', 'start': 979.504, 'duration': 3.501}, {'end': 987.886, 'text': 'so if I run this cell, whoops and this is telling me that country group is not defined,', 'start': 983.005, 'duration': 4.881}, {'end': 992.087, 'text': "and it's because I didn't rerun this cell up here after I set that variable.", 'start': 987.886, 'duration': 4.201}, {'end': 1001.137, 'text': 'so If I run 
this and grab the group for the United States, then we can see that we get a data frame returned here with some survey results.', 'start': 992.087, 'duration': 9.05}, {'end': 1004.321, 'text': "So this doesn't look like anything special yet.", 'start': 1001.618, 'duration': 2.703}, {'end': 1010.068, 'text': 'But if I look at the country name for each of these survey results, the country is listed right here.', 'start': 1004.762, 'duration': 5.306}, {'end': 1016.514, 'text': 'then we can see that all of these responses are from people who said that they were from the United States.', 'start': 1010.729, 'duration': 5.785}, {'end': 1026.842, 'text': 'And if I look at the group for India, so if I instead change United States to India here and grab that group, if we look at the country here,', 'start': 1017.134, 'duration': 9.708}, {'end': 1030.685, 'text': 'then these are all the survey results for people who said that they were from India.', 'start': 1027.321, 'duration': 3.364}, {'end': 1035.752, 'text': "So that's what our data frame group by object that we saw before consists of.", 'start': 1031.087, 'duration': 4.665}, {'end': 1041.259, 'text': 'It has broken up all of the different responses into groups by country name.', 'start': 1036.434, 'duration': 4.825}, {'end': 1045.545, 'text': 'So this would be similar to running a filter on our original data frame.', 'start': 1041.72, 'duration': 3.825}, {'end': 1053.728, 'text': "So I should be able to get these same results for a single country just by doing what we've seen in previous videos and creating a filter.", 'start': 1045.944, 'duration': 7.784}, {'end': 1063.832, 'text': 'So I could say, OK, I want to grab I want our filter to be equal to any time the country is equal to the.', 'start': 1053.948, 'duration': 9.884}, {'end': 1066.634, 'text': 'United States.', 'start': 1065.393, 'duration': 1.241}, {'end': 1077.122, 'text': 'and then I can apply this to our data frame by saying okay, DF dot loc and give me 
all the results that match that filter.', 'start': 1066.634, 'duration': 10.488}, {'end': 1086.53, 'text': 'and if I run this cell, then we can see over here in the country column that all of these results are respondents from the United States.', 'start': 1077.122, 'duration': 9.408}, {'end': 1093.175, 'text': "so if we're just looking to get information on a single country, then it's very similar to just creating a filter, like we did here,", 'start': 1086.53, 'duration': 6.645}], 'summary': 'Grouped survey results by country to analyze responses, e.g., for united states and india.', 'duration': 121.873, 'max_score': 971.302, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut64971302.jpg'}, {'end': 1066.634, 'src': 'embed', 'start': 1017.134, 'weight': 3, 'content': [{'end': 1026.842, 'text': 'And if I look at the group for India, so if I instead change United States to India here and grab that group, if we look at the country here,', 'start': 1017.134, 'duration': 9.708}, {'end': 1030.685, 'text': 'then these are all the survey results for people who said that they were from India.', 'start': 1027.321, 'duration': 3.364}, {'end': 1035.752, 'text': "So that's what our data frame group by object that we saw before consists of.", 'start': 1031.087, 'duration': 4.665}, {'end': 1041.259, 'text': 'It has broken up all of the different responses into groups by country name.', 'start': 1036.434, 'duration': 4.825}, {'end': 1045.545, 'text': 'So this would be similar to running a filter on our original data frame.', 'start': 1041.72, 'duration': 3.825}, {'end': 1053.728, 'text': "So I should be able to get these same results for a single country just by doing what we've seen in previous videos and creating a filter.", 'start': 1045.944, 'duration': 7.784}, {'end': 1063.832, 'text': 'So I could say, OK, I want to grab I want our filter to be equal to any time the country is equal to the.', 'start': 1053.948, 'duration': 
9.884}, {'end': 1066.634, 'text': 'United States.', 'start': 1065.393, 'duration': 1.241}], 'summary': 'Data frame grouped by country, with survey results for india and united states.', 'duration': 49.5, 'max_score': 1017.134, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut641017134.jpg'}, {'end': 1136.571, 'src': 'embed', 'start': 1113.362, 'weight': 8, 'content': [{'end': 1120.165, 'text': 'well, like I mentioned before, maybe we want to see the most popular social media sites broken down by country.', 'start': 1113.362, 'duration': 6.803}, {'end': 1129.568, 'text': "now, if you just wanted to get the most popular social media sites by the United States or by India, then we've already seen how we can do this.", 'start': 1120.165, 'duration': 9.403}, {'end': 1136.571, 'text': 'so right here I have some filtered results down to where we have the responses for the United States.', 'start': 1129.568, 'duration': 7.003}], 'summary': 'Analyze popular social media sites by country, including specific data for the united states and india.', 'duration': 23.209, 'max_score': 1113.362, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut641113362.jpg'}, {'end': 1268.52, 'src': 'heatmap', 'start': 1203.925, 'weight': 0.9, 'content': [{'end': 1208.526, 'text': 'then it will combine those groups to give us the results for all of those unique countries.', 'start': 1203.925, 'duration': 4.601}, {'end': 1211.907, 'text': 'So I think this will make sense once we just see this here.', 'start': 1209.007, 'duration': 2.9}, {'end': 1216.809, 'text': 'So remember, I called our group up here country group.', 'start': 1212.428, 'duration': 4.381}, {'end': 1222.812, 'text': 'so if we come down here to the bottom, then we can say okay for the country group.', 'start': 1217.449, 'duration': 5.363}, {'end': 1233.889, 'text': 'now I want to look at the social media column and I want to 
grab the value counts for that column, for that entire country group.', 'start': 1222.812, 'duration': 11.077}, {'end': 1241.594, 'text': 'So if I run this, then what this returns is a series with the most popular social media sites broken down by country.', 'start': 1234.429, 'duration': 7.165}, {'end': 1244.076, 'text': 'Now this actually cuts off a little early here.', 'start': 1242.054, 'duration': 2.022}, {'end': 1248.639, 'text': 'So let me grab a larger chunk of this series to get a better idea of what this looks like.', 'start': 1244.416, 'duration': 4.223}, {'end': 1254.543, 'text': "So right here at the end, I'm just going to say dot head and look at the top 50 results or so.", 'start': 1249.019, 'duration': 5.524}, {'end': 1256.986, 'text': 'so if we run this,', 'start': 1255.263, 'duration': 1.723}, {'end': 1265.597, 'text': 'then we can see here that our first country is Afghanistan and we can look at the most popular social media for that and then go down the list Albania,', 'start': 1256.986, 'duration': 8.611}, {'end': 1268.52, 'text': 'Algeria, Argentina and so on.', 'start': 1265.597, 'duration': 2.923}], 'summary': 'Data analysis reveals most popular social media sites by country.', 'duration': 64.595, 'max_score': 1203.925, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut641203925.jpg'}], 'start': 715.735, 'title': 'Analyzing survey data and grouping by country', 'summary': 'Demonstrates using value counts to analyze survey data, revealing percentages of social media platform usage, and introduces grouping data by country. it also explains how to use the group by function in pandas to filter and access individual groups, exemplified by survey results for the united states and india. 
additionally, it showcases analyzing social media data by country, revealing the most popular social media sites broken down by country and showcasing the results for all unique countries in a series with multiple indexes.', 'chapters': [{'end': 867.599, 'start': 715.735, 'title': 'Analyzing survey data with value counts and group by', 'summary': 'Demonstrates using the value counts function to analyze survey data, revealing that 17% use reddit, 16% use youtube, and 16% use whatsapp, highlighting the regional popularity of social media platforms. it also introduces the concept of grouping data by country to identify the most popular social media sites for each country.', 'duration': 151.864, 'highlights': ['The majority of the survey was answered by developers in the United States.', '17% of the people said that they use Reddit, 16% said YouTube, and about 16% said WhatsApp, showing the popularity of social media platforms.', 'Introduces the concept of grouping data by country to identify the most popular social media sites for each country.']}, {'end': 1093.175, 'start': 868.14, 'title': 'Grouping data by country in pandas', 'summary': 'Explains how to use the group by function in pandas to group survey results by country, showcasing the process of splitting the object, accessing individual groups, and filtering data, exemplified by grouping survey results for the united states and india.', 'duration': 225.035, 'highlights': ['Using the group by function on the country column in Pandas allows for grouping survey results by country, with the United States having a specific group.', 'The group by object contains multiple groups of survey results, each representing responses from a specific country, with the ability to access individual groups such as those for the United States and India.', 'Filtering data for a single country can be achieved by creating a filter based on the country name, exemplified by creating a filter for the United States and applying it to the 
data frame to obtain results specifically for that country.']}, {'end': 1294.527, 'start': 1093.655, 'title': 'Grouping and analyzing social media data', 'summary': 'Demonstrates how to use the group by function to analyze social media data by country, revealing the most popular social media sites broken down by country and showcasing the results for all unique countries in a series with multiple indexes.', 'duration': 200.872, 'highlights': ['Using group by function to analyze social media data by country', 'Revealing most popular social media sites broken down by country', 'Showcasing results for all unique countries in a series with multiple indexes']}], 'duration': 578.792, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut64715735.jpg', 'highlights': ['The majority of the survey was answered by developers in the United States.', '17% of the people said that they use Reddit, 16% said YouTube, and about 16% said WhatsApp, showing the popularity of social media platforms.', 'Introduces the concept of grouping data by country to identify the most popular social media sites for each country.', 'Using the group by function on the country column in Pandas allows for grouping survey results by country, with the United States having a specific group.', 'The group by object contains multiple groups of survey results, each representing responses from a specific country, with the ability to access individual groups such as those for the United States and India.', 'Filtering data for a single country can be achieved by creating a filter based on the country name, exemplified by creating a filter for the United States and applying it to the data frame to obtain results specifically for that country.', 'Using group by function to analyze social media data by country', 'Revealing most popular social media sites broken down by country', 'Showcasing results for all unique countries in a series with multiple indexes']}, 
{'end': 1647.605, 'segs': [{'end': 1319.974, 'src': 'embed', 'start': 1294.967, 'weight': 1, 'content': [{'end': 1304.115, 'text': 'So again, if I wanted to grab those most popular social media sites for India, for example, then I could just come up here.', 'start': 1294.967, 'duration': 9.148}, {'end': 1308.219, 'text': "And with that returned series, actually let's take a look at this again.", 'start': 1304.935, 'duration': 3.284}, {'end': 1310.302, 'text': "So here's the index here.", 'start': 1308.8, 'duration': 1.502}, {'end': 1313.586, 'text': 'I can grab that series just by saying .', 'start': 1310.322, 'duration': 3.264}, {'end': 1316.93, 'text': 'loc and then looking for India.', 'start': 1313.586, 'duration': 3.344}, {'end': 1319.974, 'text': 'And we can see that those are the same results that we got before.', 'start': 1317.391, 'duration': 2.583}], 'summary': 'Identified popular social media sites for india using data analysis.', 'duration': 25.007, 'max_score': 1294.967, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut641294967.jpg'}, {'end': 1365.715, 'src': 'embed', 'start': 1343.362, 'weight': 2, 'content': [{'end': 1352.287, 'text': 'you know changing a filter over and over, I could just, you know, go here and look at the United States index for this return series,', 'start': 1343.362, 'duration': 8.925}, {'end': 1354.368, 'text': 'and now we can see those results.', 'start': 1352.287, 'duration': 2.081}, {'end': 1360.452, 'text': "so I think it's really interesting being able to play around with your data like this and being able to explore.", 'start': 1354.368, 'duration': 6.084}, {'end': 1365.715, 'text': "I really like seeing the different results for different countries, and a lot of these sites I've never heard of.", 'start': 1360.452, 'duration': 5.263}], 'summary': "Analyzing united states index for return series allows exploration of different countries' results.", 'duration': 22.353, 
'max_score': 1343.362, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut641343362.jpg'}, {'end': 1466.279, 'src': 'embed', 'start': 1435.139, 'weight': 5, 'content': [{'end': 1440.224, 'text': 'so we can see that this Russian social media site here has 30 percent,', 'start': 1435.139, 'duration': 5.085}, {'end': 1446.69, 'text': 'or 30 percent of the people from Russia said that that was their most popular social network.', 'start': 1440.224, 'duration': 6.466}, {'end': 1449.752, 'text': 'and if we go back to China,', 'start': 1446.69, 'duration': 3.062}, {'end': 1459.116, 'text': 'then we can see that for this one here at the top, 67% of the developers from China said that that was the social media site that they used the most.', 'start': 1449.752, 'duration': 9.364}, {'end': 1466.279, 'text': 'So I just thought that was really interesting being able to play around with these numbers and seeing the different results for different countries.', 'start': 1459.496, 'duration': 6.783}], 'summary': '30% of respondents from russia named one russian site as their most used social network; 67% of developers from china named another.', 'duration': 31.14, 'max_score': 1435.139, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut641435139.jpg'}, {'end': 1506.026, 'src': 'embed', 'start': 1478.766, 'weight': 3, 'content': [{'end': 1482.608, 'text': 'Now bringing this back to what we were discussing at the beginning of the video.', 'start': 1478.766, 'duration': 3.842}, {'end': 1488.272, 'text': 'we can also use this to run more traditional aggregate functions like mean, median, and things like that.', 'start': 1482.608, 'duration': 5.664}, {'end': 1495.757, 'text': "So before we looked at the median salaries for the entire survey, but now let's break these down by country instead.", 'start': 1488.772, 'duration': 6.985}, {'end': 1502.443, 'text': 'So just like we looked at the value counts of the social media column, we can look at the 
median of the salary column.', 'start': 1496.138, 'duration': 6.305}, {'end': 1506.026, 'text': 'And that salary column is labeled ConvertedComp.', 'start': 1502.904, 'duration': 3.122}], 'summary': 'Demonstrates using aggregate functions to analyze median salaries by country.', 'duration': 27.26, 'max_score': 1478.766, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut641478766.jpg'}, {'end': 1568.907, 'src': 'embed', 'start': 1540.736, 'weight': 0, 'content': [{'end': 1545.097, 'text': 'and this is the result that we get here, and these are our indexes.', 'start': 1540.736, 'duration': 4.361}, {'end': 1548.898, 'text': 'So the indexes are the country names.', 'start': 1545.517, 'duration': 3.381}, {'end': 1552.619, 'text': 'So if I want to grab a specific country, then I can just use .', 'start': 1549.298, 'duration': 3.321}, {'end': 1554.72, 'text': 'loc and type in the country name.', 'start': 1552.619, 'duration': 2.101}, {'end': 1561.022, 'text': 'So if I run this, then we can see that the median salary here in Germany is about $63,000.', 'start': 1555.2, 'duration': 5.822}, {'end': 1568.907, 'text': "Now maybe you're working on some analysis where you want to group your data, but you also want to run multiple aggregate functions on your group.", 'start': 1561.023, 'duration': 7.884}], 'summary': 'Median salary in germany is about $63,000.', 'duration': 28.171, 'max_score': 1540.736, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut641540736.jpg'}], 'start': 1294.967, 'title': 'Social media analysis and aggregate functions', 'summary': 'Covers the use of pandas to analyze social media site popularity across different countries, with examples from india, the united states, china, and russia. 
it also demonstrates the application of aggregate functions like median and mean to analyze salary data, presenting the median salary in germany as $63,000 and providing mean and median salaries for every country.', 'chapters': [{'end': 1478.286, 'start': 1294.967, 'title': 'Analyzing popular social media sites in different countries', 'summary': 'Discusses using pandas to analyze and compare the popularity of social media sites in different countries, showcasing the ability to access and manipulate data to gain insights and discover unexpected trends, with examples including the most popular social media sites in india, the united states, china, and russia.', 'duration': 183.319, 'highlights': ['Using pandas to access and manipulate data allows for easy comparison of the popularity of social media sites in different countries, eliminating the need to run filters on individual countries, as demonstrated by examples for India, the United States, China, and Russia.', 'The ability to explore and compare results for different countries using pandas provides insights into the popularity of social media sites, with examples such as WeChat and Weibo being popular in China, and the Russian social media site having 30% popularity among Russian users.', "By setting 'normalize' to true, pandas can provide percentage results instead of raw numbers, enabling a more in-depth analysis of the popularity of social media sites in different countries."]}, {'end': 1647.605, 'start': 1478.766, 'title': 'Aggregate function usage in data analysis', 'summary': 'Demonstrates how to use aggregate functions like median and mean to analyze salary data by country, showing median salary in germany as $63,000 and providing a data frame with mean and median salaries for every country.', 'duration': 168.839, 'highlights': ['The median salary in Germany is about $63,000, demonstrating the use of aggregate functions to analyze salary data by country.', 'The data frame provides the mean and median 
salaries for every country, allowing for specific country analysis and comparison.', 'The chapter also explains how to use the agg method to run multiple aggregate functions on the data, showcasing the use of median and mean aggregate functions for data analysis.']}], 'duration': 352.638, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut641294967.jpg', 'highlights': ['Using pandas to access and manipulate data allows for easy comparison of social media site popularity across different countries.', 'The ability to explore and compare results for different countries using pandas provides insights into the popularity of social media sites.', "By setting 'normalize' to true, pandas can provide percentage results instead of raw numbers, enabling a more in-depth analysis of social media site popularity.", 'The median salary in Germany is about $63,000, demonstrating the use of aggregate functions to analyze salary data by country.', 'The data frame provides the mean and median salaries for every country, allowing for specific country analysis and comparison.', 'The chapter also explains how to use the agg method to run multiple aggregate functions on the data, showcasing the use of median and mean aggregate functions for data analysis.']}, {'end': 2432.863, 'segs': [{'end': 1765.441, 'src': 'embed', 'start': 1736.343, 'weight': 0, 'content': [{'end': 1746.266, 'text': 'then I can say dot, str and use the string class on that return series and say okay, we want where the str dot contains.', 'start': 1736.343, 'duration': 9.923}, {'end': 1756.714, 'text': "So this will return true for the rows that have Python in the languages worked with and false for the responses that don't.", 'start': 1747.847, 'duration': 8.867}, {'end': 1765.441, 'text': 'So if I run this, then this just returns a series of true and false values where it tells us whether the language worked with column,', 'start': 1757.254, 'duration': 8.187}], 
'summary': "Using string class to filter rows by containing 'python'.", 'duration': 29.098, 'max_score': 1736.343, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut641736343.jpg'}, {'end': 1860.046, 'src': 'embed', 'start': 1835.729, 'weight': 1, 'content': [{'end': 1844.215, 'text': 'I want to look at this language worked with column and then see the strings that contain Python and sum those up.', 'start': 1835.729, 'duration': 8.486}, {'end': 1848.618, 'text': 'but if I run this here, then we can see that we get an error now.', 'start': 1844.215, 'duration': 4.403}, {'end': 1855.523, 'text': 'like I said in a previous video, sometimes it can be hard to read these pandas errors and understand exactly what we did wrong,', 'start': 1848.618, 'duration': 6.905}, {'end': 1860.046, 'text': 'but in this case it actually gives us a pretty good clue as to what we did wrong.', 'start': 1855.523, 'duration': 4.523}], 'summary': "Finding strings containing 'python' and summing them up resulted in an error, which provided a clue for understanding the mistake.", 'duration': 24.317, 'max_score': 1835.729, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut641835729.jpg'}, {'end': 2347.225, 'src': 'embed', 'start': 2318.977, 'weight': 2, 'content': [{'end': 2323.458, 'text': "And then finally, I'm also going to put sort is equal to false.", 'start': 2318.977, 'duration': 4.481}, {'end': 2327.9, 'text': "Now, if you watched a previous video, this isn't absolutely necessary.", 'start': 2323.878, 'duration': 4.022}, {'end': 2332.241, 'text': 'But if you run it without sort equal to false,', 'start': 2328.9, 'duration': 3.341}, {'end': 2340.383, 'text': "then it'll give you a warning saying that in a future version of pandas that it'll sort by default or sort by false on default.", 'start': 2332.241, 'duration': 8.142}, {'end': 2347.225, 'text': "so it's better just to go ahead 
and specify if you want the resulting data frame sorted or not.", 'start': 2340.743, 'duration': 6.482}], 'summary': 'Setting sort to false avoids future warnings in pandas.', 'duration': 28.248, 'max_score': 2318.977, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut642318977.jpg'}, {'end': 2415.15, 'src': 'embed', 'start': 2385.781, 'weight': 5, 'content': [{'end': 2390.064, 'text': 'We can see here that this one is just called country and this one is called languages worked with.', 'start': 2385.781, 'duration': 4.283}, {'end': 2396.085, 'text': "So let's rename these so that they make more sense in the context of what we're actually trying to do.", 'start': 2390.584, 'duration': 5.501}, {'end': 2399.986, 'text': 'And we saw how to rename columns in a previous video as well.', 'start': 2396.666, 'duration': 3.32}, {'end': 2404.968, 'text': 'But if you forgot, then you can do this just by grabbing our data frame here.', 'start': 2400.386, 'duration': 4.582}, {'end': 2410.069, 'text': "And I'll say Python DF, which is our data frame dot rename.", 'start': 2405.568, 'duration': 4.501}, {'end': 2415.15, 'text': 'And now, what do we want to rename, we want to rename the columns.', 'start': 2410.889, 'duration': 4.261}], 'summary': 'Renaming columns in a data frame with python.', 'duration': 29.369, 'max_score': 2385.781, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut642385781.jpg'}], 'start': 1647.605, 'title': 'Python data analysis', 'summary': 'Covers filtering data to determine the number of people with python skills in a specific country and analyzing python usage in survey data, including identifying respondents who know python and calculating the percentage of python users from each country.', 'chapters': [{'end': 1713.349, 'start': 1647.605, 'title': 'Filtering data for python skills', 'summary': 'Demonstrates filtering data to determine the number of 
people in a specific country who know how to use python, using string methods and a filtering approach.', 'duration': 65.744, 'highlights': ['The chapter demonstrates using a filtering approach to determine the number of people in a specific country who know how to use Python.', 'It shows the utilization of string methods to filter the responses of people from a particular country who claimed to know Python.', 'The chapter emphasizes the process of filtering data to identify the number of individuals in a specific country who are familiar with Python.']}, {'end': 2432.863, 'start': 1713.849, 'title': 'Analyzing python usage in survey data', 'summary': 'Discusses identifying the number of survey respondents who know python, using the sum function on boolean values to count, encountering errors when applying functions on a group by object, and constructing a data frame to calculate the percentage of respondents from each country who know python.', 'duration': 719.014, 'highlights': ['Identifying the number of survey respondents who know Python', 'Encountering errors when applying functions on a group by object', 'Constructing a data frame to calculate the percentage of respondents from each country who know Python']}], 'duration': 785.258, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut641647605.jpg', 'highlights': ['Constructing a data frame to calculate the percentage of respondents from each country who know Python', 'The chapter demonstrates using a filtering approach to determine the number of people in a specific country who know how to use Python', 'The chapter emphasizes the process of filtering data to identify the number of individuals in a specific country who are familiar with Python', 'It shows the utilization of string methods to filter the responses of people from a particular country who claimed to know Python', 'Identifying the number of survey respondents who know Python', 'Encountering errors 
when applying functions on a group by object']}, {'end': 2932.451, 'segs': [{'end': 2505.753, 'src': 'embed', 'start': 2461.795, 'weight': 0, 'content': [{'end': 2471.367, 'text': 'So if I run that and then look at our data frame one more time, then we can see that it has been updated with those new columns.', 'start': 2461.795, 'duration': 9.572}, {'end': 2480.318, 'text': 'Now we have the total number of respondents from each country and the number of people who know Python from each country in one data frame.', 'start': 2471.887, 'duration': 8.431}, {'end': 2484.7, 'text': 'So we have all the information that we need to calculate a percentage.', 'start': 2480.818, 'duration': 3.882}, {'end': 2488.203, 'text': 'Now all we need to do is create a new column and calculate this.', 'start': 2485.161, 'duration': 3.042}, {'end': 2495.727, 'text': 'So if you remember, in order to create a new column, we can simply just assign it.', 'start': 2488.803, 'duration': 6.924}, {'end': 2502.111, 'text': 'So I will call this column PCT for percentage knows Python.', 'start': 2496.087, 'duration': 6.024}, {'end': 2505.753, 'text': 'and now, what do we want this to be equal to?', 'start': 2502.891, 'duration': 2.862}], 'summary': 'The data frame has been updated with total respondents and number of python users from each country, now ready to calculate a percentage.', 'duration': 43.958, 'max_score': 2461.795, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut642461795.jpg'}, {'end': 2709.125, 'src': 'embed', 'start': 2670.976, 'weight': 2, 'content': [{'end': 2671.857, 'text': "That's not bad either.", 'start': 2670.976, 'duration': 0.881}, {'end': 2676.08, 'text': "We have about 21,000 here, about 10,000 knew Python, so that's 48%.", 'start': 2671.897, 'duration': 4.183}, {'end': 2678.862, 'text': "So that's in the higher range.", 'start': 2676.08, 'duration': 2.782}, {'end': 2679.483, 'text': "That's pretty good.", 'start': 
2678.902, 'duration': 0.581}, {'end': 2683.866, 'text': 'So yeah, I think this is a great way to practice working with pandas.', 'start': 2679.943, 'duration': 3.923}, {'end': 2688.27, 'text': "And also it's just fun being able to explore your information in this way.", 'start': 2684.367, 'duration': 3.903}, {'end': 2691.873, 'text': 'And now that we have a data frame with all this information,', 'start': 2688.83, 'duration': 3.043}, {'end': 2699.298, 'text': 'then we can also inspect a specific country to see what percentage of developers from that country know Python.', 'start': 2692.293, 'duration': 7.005}, {'end': 2709.125, 'text': 'So, for example, if I wanted to see Japan, instead of looking through all of these, I could just say, OK, Python.', 'start': 2699.658, 'duration': 9.467}], 'summary': 'about 21,000 respondents, about 10,000 of whom knew python, which is 48%. great for practicing with pandas and exploring information in a fun way.', 'duration': 38.149, 'max_score': 2670.976, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut642670976.jpg'}, {'end': 2739.438, 'src': 'embed', 'start': 2715.948, 'weight': 3, 'content': [{'end': 2723.43, 'text': 'then we can just do a .loc of Japan and then we can see that we get these statistics for that specific country.', 'start': 2715.948, 'duration': 7.482}, {'end': 2728.452, 'text': 'Okay, so I know that that may have been a lot to take in and that we covered a lot of ground in this video.', 'start': 2723.43, 'duration': 5.022}, {'end': 2732.774, 'text': 'We definitely covered some more advanced topics here than we did in previous videos,', 'start': 2728.912, 'duration': 3.862}, {'end': 2739.438, 'text': 'but I hope this kind of got you a little excited to learn what you can do with pandas and the types of problems that we can solve.', 'start': 2732.774, 'duration': 6.664}], 'summary': 'Covered advanced topics in using pandas 
for data analysis, sparking excitement for problem-solving.', 'duration': 23.49, 'max_score': 2715.948, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut642715948.jpg'}, {'end': 2818.034, 'src': 'embed', 'start': 2792.01, 'weight': 5, 'content': [{'end': 2796.294, 'text': "And I'll take a look at those and I'll highlight some if they are better than what I did here.", 'start': 2792.01, 'duration': 4.284}, {'end': 2801.498, 'text': 'Okay, so before we end here, I would like to mention the sponsor of this video.', 'start': 2796.954, 'duration': 4.544}, {'end': 2803.2, 'text': 'And that is Brilliant.', 'start': 2801.938, 'duration': 1.262}, {'end': 2808.044, 'text': "So in this series, we've been learning about pandas and how to analyze data in Python.", 'start': 2803.92, 'duration': 4.124}, {'end': 2812.469, 'text': 'And Brilliant would be an excellent way to supplement what you learn here with their hands-on courses.', 'start': 2808.425, 'duration': 4.044}, {'end': 2818.034, 'text': 'They have some excellent courses and lessons that do a deep dive on how to think about and analyze data correctly.', 'start': 2812.889, 'duration': 5.145}], 'summary': "Introduction to brilliant's data analysis courses in python", 'duration': 26.024, 'max_score': 2792.01, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut642792010.jpg'}, {'end': 2907.095, 'src': 'embed', 'start': 2870.74, 'weight': 6, 'content': [{'end': 2875.401, 'text': 'I would really encourage you to take some time after this video and play around with the data a bit.', 'start': 2870.74, 'duration': 4.661}, {'end': 2879.543, 'text': 'See if you can answer certain questions that someone might have about this data.', 'start': 2875.881, 'duration': 3.662}, {'end': 2885.325, 'text': 'So, for example, what is the most common education level for people who answered the survey?', 'start': 2880.103, 
'duration': 5.222}, {'end': 2889.167, 'text': "That's definitely something that we could answer by what we learned here.", 'start': 2885.726, 'duration': 3.441}, {'end': 2894.149, 'text': 'So I hope you feel like you got a good introduction to being able to answer those types of questions.', 'start': 2889.587, 'duration': 4.562}, {'end': 2899.992, 'text': "Now in the next video, we're going to be learning about how to handle missing data and how to clean up your data.", 'start': 2894.589, 'duration': 5.403}, {'end': 2902.673, 'text': "It's very common for data to have missing values.", 'start': 2900.372, 'duration': 2.301}, {'end': 2907.095, 'text': 'So knowing how to sanitize and clean our data is definitely going to be important.', 'start': 2903.013, 'duration': 4.082}], 'summary': 'Encouraged to explore data, answer questions, and learn data cleaning in next video.', 'duration': 36.355, 'max_score': 2870.74, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut642870740.jpg'}], 'start': 2432.863, 'title': 'Python knowledge percentage and pandas data analysis', 'summary': 'Demonstrates modifying a data frame to calculate the percentage of people who know python, sorting countries by the largest percentage of python knowledge, with uganda having 65% and the united states having 48%, and discusses pandas data analysis and brilliant sponsorship.', 'chapters': [{'end': 2575.095, 'start': 2432.863, 'title': 'Calculating percentage of people who know python', 'summary': 'Demonstrates modifying a data frame to add columns displaying the total number of respondents and number of people who know python from each country, and then calculating the percentage of people who know python.', 'duration': 142.232, 'highlights': ['Modifying the data frame to add columns displaying the total number of respondents and number of people who know Python from each country.', 'Calculating the percentage of people who know Python by 
creating a new column and using the formula (number of people who know Python / total number of respondents) * 100.']}, {'end': 2791.55, 'start': 2575.596, 'title': 'Sorting countries by percentage of python knowledge', 'summary': 'Demonstrates sorting countries by the largest percentage of respondents who know python, with uganda having 65%, and united states having 48% of developers knowing python, highlighting the practical application of working with pandas and exploring data.', 'duration': 215.954, 'highlights': ['The chapter demonstrates sorting countries by the largest percentage of respondents who know Python, with Uganda having 65%.', 'United States has about 21,000 developers, with 48% of them knowing Python.', 'Practical application of working with pandas and exploring data is highlighted, emphasizing the process of inspecting specific countries to see the percentage of developers who know Python.']}, {'end': 2932.451, 'start': 2792.01, 'title': 'Pandas data analysis and brilliant sponsorship', 'summary': 'Discusses pandas data analysis, encourages viewers to explore data independently, promotes brilliant as a learning platform, and announces the upcoming video on handling missing data.', 'duration': 140.441, 'highlights': ['Brilliant sponsorship for pandas data analysis series', 'Encouragement to explore data independently', 'Upcoming video on handling missing data and cleaning up data']}], 'duration': 499.588, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/txMdrV1Ut64/pics/txMdrV1Ut642432863.jpg', 'highlights': ['Sorting countries by the largest percentage of respondents who know Python, with Uganda having 65%', 'United States has about 21,000 developers, with 48% of them knowing Python', 'Modifying the data frame to add columns displaying the total number of respondents and number of people who know Python from each country', 'Calculating the percentage of people who know Python by creating a new column and using the 
formula (number of people who know Python / total number of respondents) * 100', 'Practical application of working with pandas and exploring data is highlighted, emphasizing the process of inspecting specific countries to see the percentage of developers who know Python', 'Brilliant sponsorship for pandas data analysis series', 'Encouragement to explore data independently', 'Upcoming video on handling missing data and cleaning up data']}], 'highlights': ['Learning how to group and aggregate data for meaningful statistics and analysis', 'The describe method provides statistical measures like count, mean, and quantiles', 'The majority of the survey was answered by developers in the United States', 'Sorting countries by the largest percentage of respondents who know Python, with Uganda having 65%', 'Using pandas to access and manipulate data allows for easy comparison of social media site popularity across different countries', 'Constructing a data frame to calculate the percentage of respondents from each country who know Python', 'The median salary for developers from the survey was around $57,000', 'The need to break down the median salary by country for comprehensive insights', '17% of the people said that they use Reddit, 16% said YouTube, and about 16% said WhatsApp, showing the popularity of social media platforms', 'The ability to explore and compare results for different countries using pandas provides insights into the popularity of social media sites']}