title
Python Data Analysis Projects for 2022 | Data Analysis With Python | Python Training | Simplilearn

description
🔥Post Graduate Program In Data Analytics: https://www.simplilearn.com/pgp-data-analytics-certification-training-course?utm_campaign=PythonDataAnalysisProjectsFor2022-G9NmACvXh8w&utm_medium=Descriptionff&utm_source=youtube 🔥IIT Kanpur Professional Certificate Course In Data Analytics (India Only): https://www.simplilearn.com/iitk-professional-certificate-course-data-analytics?utm_campaign=PythonDataAnalysisProjectsFor2022-G9NmACvXh8w&utm_medium=Descriptionff&utm_source=youtube 🔥Caltech Data Analytics Bootcamp(US Only): https://www.simplilearn.com/data-analytics-bootcamp?utm_campaign=PythonDataAnalysisProjectsFor2022-G9NmACvXh8w&utm_medium=Descriptionff&utm_source=youtube 🔥Data Analyst Masters Program (Discount Code - YTBE15): https://www.simplilearn.com/data-analyst-masters-certification-training-course?utm_campaign=PythonDataAnalysisProjectsFor2022-G9NmACvXh8w&utm_medium=Descriptionff&utm_source=youtube This video on Python Data Analysis Projects for 2022 will help you learn how to use real-world data and perform exploratory data analysis. You will analyze and visualize coronavirus and Olympics datasets with libraries such as NumPy, Pandas, Matplotlib and Seaborn. This Data Analysis with Python project will give you experience tackling real-world problems. ✅Subscribe to our Channel to learn more programming languages: https://bit.ly/3eGepgQ ⏩ Check out the Python for beginners playlist: https://www.youtube.com/playlist?list=PLEiEAq2VkUUJO27b6PyoSd7CJjWIPyHYO #PythonDataAnalysisProjectsFor2022 #DataAnalysisWithPython #PythonTraining #PythonTutorial #PythonProgramming #Python #Simplilearn What is Python? Python is a high-level, object-oriented programming language that Guido van Rossum began developing in 1989 and first released in 1991. Python is often called a batteries-included language due to its comprehensive standard library. 
A fun fact about Python is that the name Python was taken from the popular BBC comedy show of that time, Monty Python's Flying Circus. Python is widely used these days for data analytics, machine learning, and web development. Python allows you to write programs in fewer lines of code than most other programming languages. The following are the standard or built-in data types of Python: 1. Numeric data types 2. Text data type 3. Sequence data type 4. Mapping data type 5. Set data type 6. Boolean data type 7. Binary data type A programming language needs to have support for numbers to carry out calculations. In Python, numbers are categorized into different data types, and these types are implemented as classes. 🔥Enroll for Free Python Course & Get Your Completion Certificate: https://www.simplilearn.com/learn-python-basics-free-course-skillup?utm_campaign=PythonDataAnalysisProjectsFor2022&utm_medium=Description&utm_source=youtube ➡️ About Post Graduate Program In Data Analytics This Data Analytics Program is ideal for all working professionals and prior programming knowledge is not required. It covers topics like data analysis, data visualization, regression techniques, and supervised learning in-depth via our applied learning model with live sessions by leading practitioners and industry projects. ✅ Key Features - Post Graduate Program certificate and Alumni Association membership - Exclusive hackathons and Ask me Anything sessions by IBM - 8X higher live interaction in live online classes by industry experts - Capstone from 3 domains and 14+ Data Analytics Projects with Industry datasets from Google PlayStore, Lyft, World Bank etc. 
- Master Classes delivered by Purdue faculty and IBM experts - Simplilearn's JobAssist helps you get noticed by top hiring companies - Resume preparation and LinkedIn profile building - 1:1 mock interview - Career accelerator webinars ✅ Skills Covered - Data Analytics - Statistical Analysis using Excel - Data Analysis Python and R - Data Visualization Tableau and Power BI - Linear and logistic regression modules - Clustering using kmeans - Supervised Learning 👉 Learn More at: https://www.simplilearn.com/pgp-data-analytics-certification-training-course?utm_campaign=PythonDataAnalysisProjectsFor2022-G9NmACvXh8w&utm_medium=Description&utm_source=youtube 🔥Caltech Data Analytics Bootcamp(US Only): https://www.simplilearn.com/data-analytics-bootcamp?utm_campaign=PythonDataAnalysisProjectsFor2022-G9NmACvXh8w&utm_medium=Description&utm_source=youtube 🔥🔥 Interested in Attending Live Classes? Call Us: IN - 18002127688 / US - +18445327688
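The exploratory analysis that the video walks through on the COVID-19 data (drop unused columns, parse the date column, derive active cases, build a state-wise pivot table, add recovery and mortality rates, then sort by confirmed cases) can be sketched roughly as below. This is a minimal sketch, not the presenter's notebook: the tiny data frame is made up for illustration, the column names are assumptions inferred from the narration, and the `max` aggregation is an assumption as well (the narration speaks of summing; `max` is used here on the assumption that the counts are cumulative, so swap in `sum` if they are daily increments).

```python
import pandas as pd

# Made-up sample standing in for the real dataset (assumption: column names
# follow what the narration describes; the real file may differ).
covid_df = pd.DataFrame({
    "Sno": [1, 2, 3, 4],
    "Date": ["2021-08-10", "2021-08-11", "2021-08-10", "2021-08-11"],
    "Time": ["8:00 AM"] * 4,
    "State/UnionTerritory": ["Kerala", "Kerala", "Goa", "Goa"],
    "Cured": [80, 90, 40, 45],
    "Deaths": [5, 6, 2, 3],
    "Confirmed": [100, 120, 50, 60],
})

# Drop columns that the analysis does not need.
covid_df = covid_df.drop(columns=["Sno", "Time"])

# Parse the date strings into datetime values.
covid_df["Date"] = pd.to_datetime(covid_df["Date"], format="%Y-%m-%d")

# Active cases = confirmed - (cured + deaths).
covid_df["Active_Cases"] = covid_df["Confirmed"] - (
    covid_df["Cured"] + covid_df["Deaths"]
)

# One row per state. aggfunc="max" is an assumption (cumulative counts).
statewise = pd.pivot_table(
    covid_df,
    values=["Confirmed", "Deaths", "Cured"],
    index="State/UnionTerritory",
    aggfunc="max",
)

# Rates expressed as percentages of confirmed cases.
statewise["Recovery Rate"] = statewise["Cured"] * 100 / statewise["Confirmed"]
statewise["Mortality Rate"] = statewise["Deaths"] * 100 / statewise["Confirmed"]

# Largest outbreaks first.
statewise = statewise.sort_values(by="Confirmed", ascending=False)
print(statewise)
```

With real data, the `pd.DataFrame(...)` literal would be replaced by a `pd.read_csv(...)` call on the downloaded file, and the list passed to `drop` would match whichever columns that file actually contains.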

detail
{'title': 'Python Data Analysis Projects for 2022 | Data Analysis With Python | Python Training | Simplilearn', 'heatmap': [{'end': 1021.234, 'start': 919.47, 'weight': 0.858}, {'end': 1755.14, 'start': 1650.327, 'weight': 0.716}, {'end': 2768.25, 'start': 2578.276, 'weight': 0.705}], 'summary': 'Covers hands-on python data analysis projects focusing on covid-19 and olympics data, including analyzing global covid-19 impact with over 22 crore infections and 4.5 million deaths. it also delves into olympics data analysis, showcasing insights on male and female athlete participation and top countries in gold medals.', 'chapters': [{'end': 404.357, 'segs': [{'end': 59.482, 'src': 'embed', 'start': 8.382, 'weight': 0, 'content': [{'end': 12.963, 'text': 'Hi everyone! Welcome to this video tutorial on Python Data Analysis project for 2022.', 'start': 8.382, 'duration': 4.581}, {'end': 18.005, 'text': 'In this video, we will be covering two interesting hands-on projects using Python programming.', 'start': 12.963, 'duration': 5.042}, {'end': 22.526, 'text': 'You will learn how to use real-world data to perform data analysis and data visualization.', 'start': 18.545, 'duration': 3.981}, {'end': 25.847, 'text': 'The two projects are based on Coronavirus and Olympics data.', 'start': 22.986, 'duration': 2.861}, {'end': 34.401, 'text': 'You will learn to collect, analyze, clean, manipulate and visualize data with the help of Python libraries such as NumPy, Pandas,', 'start': 26.553, 'duration': 7.848}, {'end': 35.582, 'text': 'Matplotlib and Seaborn.', 'start': 34.401, 'duration': 1.181}, {'end': 40.867, 'text': 'These two projects will give you the idea to solve real-world problems using exploratory data analysis.', 'start': 36.083, 'duration': 4.784}, {'end': 43.19, 'text': "So let's get started with our first project.", 'start': 41.488, 'duration': 1.702}, {'end': 48.458, 'text': 'Today, we are going to perform two hands-on projects on COVID data analysis using Python and 
Tableau.', 'start': 43.976, 'duration': 4.482}, {'end': 55.321, 'text': "This is going to be a really interesting and fun session where I'll be asking you a few generic quiz questions related to coronavirus.", 'start': 49.178, 'duration': 6.143}, {'end': 58.062, 'text': 'Please make sure to answer them in the comment section of the video.', 'start': 55.781, 'duration': 2.281}, {'end': 59.482, 'text': "We'll be happy to hear from you.", 'start': 58.362, 'duration': 1.12}], 'summary': 'Python data analysis tutorial covers coronavirus and olympics projects using real-world data for 2022.', 'duration': 51.1, 'max_score': 8.382, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w8382.jpg'}, {'end': 114.614, 'src': 'embed', 'start': 85.143, 'weight': 4, 'content': [{'end': 89.527, 'text': 'The virus has so far infected over 22 crore people and killed more than 4.5 million innocents.', 'start': 85.143, 'duration': 4.384}, {'end': 99.55, 'text': 'In India, there have been over 3.3 crore confirmed cases and nearly 4,41, 000 deaths have been reported so far.', 'start': 91.508, 'duration': 8.042}, {'end': 105.732, 'text': 'This data is according to official figures released by the Union Ministry of Health and Family Welfare.', 'start': 101.371, 'duration': 4.361}, {'end': 109.913, 'text': 'As the world tries to cope up with this deadly virus,', 'start': 107.312, 'duration': 2.601}, {'end': 114.614, 'text': 'we request all our viewers and their family members to follow all the necessary precautions to avoid getting infected.', 'start': 109.913, 'duration': 4.701}], 'summary': 'Over 22 crore infected, 4.5 million deaths worldwide; india reports 3.3 crore cases, 4,41,000 deaths.', 'duration': 29.471, 'max_score': 85.143, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w85143.jpg'}, {'end': 190.052, 'src': 'embed', 'start': 168.695, 'weight': 5, 'content': [{'end': 
177.929, 'text': 'The project will give you an idea about the impact of coronavirus globally in terms of confirmed cases, deaths reported, the number of recoveries,', 'start': 168.695, 'duration': 9.234}, {'end': 178.989, 'text': 'as well as active cases.', 'start': 177.929, 'duration': 1.06}, {'end': 190.052, 'text': 'We will also see how India has been affected since the pandemic started and dive into the different states and union territories to learn more about the COVID-19 influence and the vaccination status.', 'start': 180.609, 'duration': 9.443}], 'summary': "Analyzing global covid-19 impact and india's situation with state-wise data.", 'duration': 21.357, 'max_score': 168.695, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w168695.jpg'}], 'start': 8.382, 'title': 'Covid-19 data analysis projects', 'summary': 'Covers hands-on projects analyzing covid and olympics data, using python libraries like numpy, pandas, matplotlib, and seaborn. it provides statistics on global covid-19 impact, including over 22 crore infections and 4.5 million deaths, along with insights on vaccination status.', 'chapters': [{'end': 137.082, 'start': 8.382, 'title': 'Covid-19 data analysis projects', 'summary': 'Covers two python data analysis projects on covid and olympics data, emphasizing the importance of using python libraries like numpy, pandas, matplotlib, and seaborn, and provides statistics on the global impact of covid-19, including over 22 crore infections and 4.5 million deaths.', 'duration': 128.7, 'highlights': ['The chapter covers two Python data analysis projects on COVID and Olympics data. The tutorial focuses on two hands-on projects using Python programming, specifically on COVID and Olympics data analysis.', 'Emphasizing the importance of using Python libraries like NumPy, Pandas, Matplotlib, and Seaborn. 
The projects teach how to utilize Python libraries such as NumPy, Pandas, Matplotlib, and Seaborn for data analysis and data visualization.', 'Provides statistics on the global impact of COVID-19, including over 22 crore infections and 4.5 million deaths. The video provides global statistics on COVID-19, with over 22 crore infections and more than 4.5 million deaths due to the virus.']}, {'end': 404.357, 'start': 137.623, 'title': 'Covid-19 data analysis project', 'summary': 'Covers a hands-on project using three covid-19 datasets to perform data analysis and visualization using python and tableau, providing insights on the impact of coronavirus globally and in india, along with vaccination status.', 'duration': 266.734, 'highlights': ['The project involves using three different COVID-19 datasets for data analysis and visualization using Python and Tableau. The project emphasizes hands-on experience working with real-world COVID-19 datasets and utilizing Python libraries and Tableau to analyze and visualize data.', "Insights on the impact of coronavirus globally and in India, as well as the vaccination status, will be provided. The project aims to showcase the impact of coronavirus globally, including confirmed cases, deaths, recoveries, and active cases, along with a focus on India's COVID-19 influence and vaccination status.", "The project includes the demonstration of using two datasets, 'COVID-19 India' and 'COVID vaccine statewide,' for data analysis in Python. 
The project involves the use of specific datasets, such as 'COVID-19 India' and 'COVID vaccine statewide,' for conducting data analysis and visualization using Python."]}], 'duration': 395.975, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w8382.jpg', 'highlights': ['The chapter covers two Python data analysis projects on COVID and Olympics data.', 'Emphasizing the importance of using Python libraries like NumPy, Pandas, Matplotlib, and Seaborn.', 'The tutorial focuses on two hands-on projects using Python programming, specifically on COVID and Olympics data analysis.', 'The project involves using three different COVID-19 datasets for data analysis and visualization using Python and Tableau.', 'Provides statistics on the global impact of COVID-19, including over 22 crore infections and 4.5 million deaths.', 'Insights on the impact of coronavirus globally and in India, as well as the vaccination status, will be provided.']}, {'end': 1415.066, 'segs': [{'end': 480.161, 'src': 'embed', 'start': 427.734, 'weight': 0, 'content': [{'end': 430.616, 'text': 'we are going to use python jupyter notebook.', 'start': 427.734, 'duration': 2.882}, {'end': 438.02, 'text': "i'll just rename this notebook as covid data analysis project.", 'start': 430.616, 'duration': 7.404}, {'end': 442.476, 'text': 'click on rename.', 'start': 440.575, 'duration': 1.901}, {'end': 443.697, 'text': 'all right.', 'start': 442.476, 'duration': 1.221}, {'end': 449.44, 'text': 'so first and foremost, we need to import all the necessary libraries that we are going to use.', 'start': 443.697, 'duration': 5.743}, {'end': 451.642, 'text': "so first i'm importing pandas as pd.", 'start': 449.44, 'duration': 2.202}, {'end': 454.624, 'text': 'this is for data manipulation.', 'start': 451.642, 'duration': 2.982}, {'end': 459.546, 'text': 'then we have numpy as np. numpy is used for numerical computation.', 'start': 454.624, 'duration': 4.922}, {'end': 
463.349, 'text': 'then we are importing matplotlib, seaborn and plotly.', 'start': 459.546, 'duration': 3.803}, {'end': 469.678, 'text': 'these three libraries will be used for plotting our data and creating interesting visualizations.', 'start': 464.156, 'duration': 5.522}, {'end': 475.459, 'text': "finally, i'm also importing my date time function.", 'start': 469.678, 'duration': 5.781}, {'end': 480.161, 'text': "all right, so i'll hit shift, enter to run the first cell.", 'start': 475.459, 'duration': 4.702}], 'summary': 'Using python jupyter notebook for covid data analysis, importing libraries for data manipulation and visualization.', 'duration': 52.427, 'max_score': 427.734, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w427734.jpg'}, {'end': 681.9, 'src': 'embed', 'start': 611.006, 'weight': 7, 'content': [{'end': 615.469, 'text': 'We have 18, 110 rows of information starting from zero till 18, 109.', 'start': 611.006, 'duration': 4.463}, {'end': 622.995, 'text': 'You see here the different types of variables or column names that we have.', 'start': 615.469, 'duration': 7.526}, {'end': 626.058, 'text': 'Then it has information about the memory usage as well.', 'start': 623.696, 'duration': 2.362}, {'end': 632.175, 'text': 'and this side you can see the data types.', 'start': 627.851, 'duration': 4.324}, {'end': 644.724, 'text': "cool, now we'll use another very important function, which is to get some idea about statistical analysis, the basic statistics, about your data set.", 'start': 632.175, 'duration': 12.549}, {'end': 650.208, 'text': "for that i'll be using the describe function.", 'start': 644.724, 'duration': 5.484}, {'end': 661.706, 'text': 'okay, so if you can see here, describe function is for numerical columns only and you have the measures such as count,', 'start': 650.208, 'duration': 11.498}, {'end': 672.351, 'text': 'the mean standard deviation maximum minimum, the 25th percentile, 
50th percentile and the 75th percentile value.', 'start': 661.706, 'duration': 10.645}, {'end': 681.9, 'text': "okay, now, let's move ahead and import the second data set, which is related to vaccination.", 'start': 672.351, 'duration': 9.549}], 'summary': 'Data set contains 18,110 rows with statistics on numerical columns and data types.', 'duration': 70.894, 'max_score': 611.006, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w611006.jpg'}, {'end': 797.092, 'src': 'embed', 'start': 720.839, 'weight': 3, 'content': [{'end': 724.843, 'text': 'okay, so this is the data set that we saw.', 'start': 720.839, 'duration': 4.004}, {'end': 728.546, 'text': 'covid underscore vaccine, underscore state wise.', 'start': 724.843, 'duration': 3.703}, {'end': 741.138, 'text': "all right, let me run it cool and let's display the first seven rows of information from this data frame.", 'start': 728.546, 'duration': 12.592}, {'end': 744.561, 'text': "i'll be using the head function and inside the function i'll pass in seven.", 'start': 741.138, 'duration': 3.423}, {'end': 747.18, 'text': 'there you go.', 'start': 746.4, 'duration': 0.78}, {'end': 750.002, 'text': 'so here you can see we have from 0 till 6.', 'start': 747.18, 'duration': 2.822}, {'end': 753.684, 'text': 'there are total 24 columns.', 'start': 750.002, 'duration': 3.682}, {'end': 755.705, 'text': 'a lot of them have null values.', 'start': 753.684, 'duration': 2.021}, {'end': 764.35, 'text': 'you can see here all right now.', 'start': 755.705, 'duration': 8.645}, {'end': 775.247, 'text': "from the first data set, which is the covid underscore df data frame, we'll be dropping a few unnecessary columns,", 'start': 764.35, 'duration': 10.897}, {'end': 783.793, 'text': 'such as the time column confirmed Indian national and confirmed foreign national, as well as the s number.', 'start': 775.247, 'duration': 8.546}, {'end': 785.134, 'text': "we don't need these 
columns.", 'start': 783.793, 'duration': 1.341}, {'end': 788.836, 'text': "it's better to learn how to drop the columns for our analysis.", 'start': 785.134, 'duration': 3.702}, {'end': 797.092, 'text': "so I'll say covid underscore df dot.", 'start': 790.99, 'duration': 6.102}], 'summary': 'Data set: covid_vaccine_statewise, 24 columns, dropping unnecessary columns for analysis.', 'duration': 76.253, 'max_score': 720.839, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w720839.jpg'}, {'end': 914.746, 'src': 'embed', 'start': 874.294, 'weight': 2, 'content': [{'end': 880.619, 'text': 'okay, now we have removed these four columns.', 'start': 874.294, 'duration': 6.325}, {'end': 886.501, 'text': 'let me show you the data set now.', 'start': 880.619, 'duration': 5.882}, {'end': 888.162, 'text': 'there you go.', 'start': 886.501, 'duration': 1.661}, {'end': 893.944, 'text': 'so we have only the date column, state or union territory, cured, deaths and confirmed.', 'start': 888.162, 'duration': 5.782}, {'end': 901.727, 'text': "now let's see how you can change the format of the date column.", 'start': 893.944, 'duration': 7.783}, {'end': 902.648, 'text': 'for that.', 'start': 901.727, 'duration': 0.921}, {'end': 905.889, 'text': 'you have the function called to underscore datetime.', 'start': 902.648, 'duration': 3.241}, {'end': 911.163, 'text': "I'll say covid underscore df.", 'start': 907, 'duration': 4.163}, {'end': 914.746, 'text': "I'll pass in my column name, that is date.", 'start': 911.163, 'duration': 3.583}], 'summary': "Data set was reduced to the date, state/union territory, cured, deaths, and confirmed columns; date format changed using the 'to_datetime' function.", 'duration': 40.452, 'max_score': 874.294, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w874294.jpg'}, {'end': 1021.234, 'src': 'heatmap', 'start': 919.47, 'weight': 0.858, 'content': [{'end': 
927.676, 'text': "I'll use the pandas function, that is to underscore date time.", 'start': 919.47, 'duration': 8.206}, {'end': 933, 'text': "I'll say covid underscore df, which is my data frame name.", 'start': 927.676, 'duration': 5.324}, {'end': 935.682, 'text': 'pass in my variable, which is date.', 'start': 933, 'duration': 2.682}, {'end': 944.761, 'text': "give a comma and I'll use my argument that is format equal to.", 'start': 938.137, 'duration': 6.624}, {'end': 949.964, 'text': "I'll say %y, give a dash.", 'start': 944.761, 'duration': 5.203}, {'end': 956.829, 'text': 'say %m, give another dash and say %d.', 'start': 949.964, 'duration': 6.865}, {'end': 964.413, 'text': "let's run it and I'll print the head of the data frame.", 'start': 956.829, 'duration': 7.584}, {'end': 967.255, 'text': 'cool now, moving ahead.', 'start': 964.413, 'duration': 2.842}, {'end': 972.977, 'text': 'now we will see how to find the total number of active cases.', 'start': 968.834, 'duration': 4.143}, {'end': 982.623, 'text': 'so active cases, nothing, but the total number of confirmed cases, minus the sum of cured cases plus deaths reported.', 'start': 972.977, 'duration': 9.646}, {'end': 986.405, 'text': "so let's find the active cases.", 'start': 982.623, 'duration': 3.782}, {'end': 989.987, 'text': "i'll give a comment.", 'start': 986.405, 'duration': 3.582}, {'end': 999.914, 'text': "okay, i'll first write my data frame name, that is covid underscore df within square brackets.", 'start': 989.987, 'duration': 9.927}, {'end': 1005.259, 'text': "I'll give my new column, which is active underscore cases.", 'start': 999.914, 'duration': 5.345}, {'end': 1021.234, 'text': "I'll say equal to covid underscore DF and my first column would be the confirmed cases, column minus.", 'start': 1005.259, 'duration': 15.975}], 'summary': 'Using pandas to format datetime and calculate active cases in a covid dataframe.', 'duration': 101.764, 'max_score': 919.47, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w919470.jpg'}, {'end': 1113.301, 'src': 'embed', 'start': 1086.838, 'weight': 9, 'content': [{'end': 1097.647, 'text': 'so in this table we will be summing all the confirmed deaths and cured cases for each of the states and union territories.', 'start': 1086.838, 'duration': 10.809}, {'end': 1100.83, 'text': 'so we will be using the pivot underscore table function for this.', 'start': 1097.647, 'duration': 3.183}, {'end': 1108.256, 'text': 'I will create a variable called statewise and say pd dot.', 'start': 1100.83, 'duration': 7.426}, {'end': 1113.301, 'text': 'I will use the pivot underscore table function.', 'start': 1110.68, 'duration': 2.621}], 'summary': 'Summing confirmed deaths and cured cases for states and union territories using pivot table function.', 'duration': 26.463, 'max_score': 1086.838, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w1086838.jpg'}, {'end': 1334.175, 'src': 'embed', 'start': 1260.295, 'weight': 10, 'content': [{'end': 1263.979, 'text': 'so this time we are going to find out the mortality rate.', 'start': 1260.295, 'duration': 3.684}, {'end': 1271.466, 'text': 'so mortality rate is nothing but the total number of deaths, divided by the total number of confirmed cases into 100.', 'start': 1263.979, 'duration': 7.487}, {'end': 1276.511, 'text': "so I'm just going to replace the names here.", 'start': 1271.466, 'duration': 5.045}, {'end': 1291.328, 'text': "I'll say mortality, all right, and then instead of cured, i'll say my deaths column into 100, divided by the confirmed cases.", 'start': 1276.511, 'duration': 14.817}, {'end': 1293.95, 'text': "let's run it okay.", 'start': 1291.328, 'duration': 2.622}, {'end': 1304.759, 'text': "now we are going to sort the values based on the confirmed cases column, and we'll sort it in descending order.", 'start': 1293.95, 'duration': 10.809}, 
{'end': 1305.919, 'text': 'so let me show you how to do it.', 'start': 1304.759, 'duration': 1.16}, {'end': 1307.841, 'text': "i'll say state wise, equal to.", 'start': 1305.919, 'duration': 1.922}, {'end': 1312.579, 'text': "I'll use the function sort underscore values.", 'start': 1309.577, 'duration': 3.002}, {'end': 1323.047, 'text': "so I'll pass in my variable state wise dot and use the sort underscore values function I'll say by.", 'start': 1312.579, 'duration': 10.468}, {'end': 1334.175, 'text': 'I want to sort it by my confirmed cases column.', 'start': 1323.047, 'duration': 11.128}], 'summary': 'Analyzing mortality rate and sorting data by confirmed cases.', 'duration': 73.88, 'max_score': 1260.295, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w1260295.jpg'}], 'start': 404.357, 'title': 'Covid-19 data analysis', 'summary': 'Covers the recent development on coronavirus, emphasizes the importance of staying updated on the latest news, demonstrates the process of setting up a covid data analysis project using python, loading and manipulating covid-19 and vaccination data for india, including 18,110 rows of covid-19 data with 9 columns, loading and display of the first 7 rows from the vaccination dataset with 24 columns, and data manipulation techniques including formatting date column, calculating active cases, creating pivot table, finding recovery and mortality rates, sorting values, and plotting pivot table visually.', 'chapters': [{'end': 480.161, 'start': 404.357, 'title': 'Covid data analysis using python', 'summary': 'Covers the recent development on coronavirus, emphasizes the importance of staying updated on the latest news, and demonstrates the process of setting up a covid data analysis project using python and jupyter notebook by importing necessary libraries.', 'duration': 75.804, 'highlights': ['The chapter covers the recent development on coronavirus, emphasizes the importance of 
staying updated on the latest news, and demonstrates the process of setting up a COVID data analysis project using Python and Jupyter Notebook by importing necessary libraries.', 'The importation of pandas, numpy, matplotlib, seaborn, and plotly libraries for data manipulation, numerical computation, and visualization is demonstrated to facilitate COVID data analysis using Python.', 'The demonstration also includes the importation of the date time function for time-related operations in the COVID data analysis project.']}, {'end': 874.294, 'start': 480.161, 'title': 'Loading covid-19 data and vaccination data', 'summary': 'Covers loading and manipulating covid-19 and vaccination data for india, including 18,110 rows of covid-19 data with 9 columns and the initial steps for filtering unnecessary columns in the covid-19 dataset, followed by the loading and display of the first 7 rows from the vaccination dataset with 24 columns and null values in multiple columns.', 'duration': 394.133, 'highlights': ['The COVID-19 dataset consists of 18,110 rows of information with 9 columns, including confirmed cases, deaths reported, and different states and union territories.', "The initial steps for data manipulation involve dropping unnecessary columns such as 'S number', 'confirmed Indian national', 'confirmed foreign national', and 'time' from the COVID-19 dataset to streamline the analysis.", 'The vaccination dataset contains 24 columns with multiple columns having null values, and the chapter demonstrates the loading and display of the first 7 rows of information from this dataset.', 'The describe function provides basic statistical analysis for numerical columns, including count, mean, standard deviation, maximum, minimum, and percentile values, offering insights into the data distribution and variability.', 'The info function offers a comprehensive overview of the dataset, including the total number of columns, entries, data types, and memory usage, aiding in 
understanding the structure and characteristics of the data.']}, {'end': 1415.066, 'start': 874.294, 'title': 'Data analysis: covid-19 dataset manipulation', 'summary': 'Covers data manipulation techniques including formatting date column, calculating active cases, creating pivot table, finding recovery and mortality rates, sorting values, and plotting pivot table visually.', 'duration': 540.772, 'highlights': ['The chapter demonstrates how to create a pivot table using the pandas library, summing confirmed, deaths, and cured cases for each state and union territory.', 'The chapter shows how to calculate recovery rate as total cured cases multiplied by 100 and divided by total confirmed cases, and mortality rate as total deaths multiplied by 100 and divided by total confirmed cases.', 'The chapter explains the process of sorting values based on the confirmed cases column in descending order.', 'The chapter provides guidance on formatting the date column using the to_datetime function and changing the date format to %Y-%m-%d.']}], 'duration': 1010.709, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w404357.jpg', 'highlights': ['Demonstrates setting up a COVID data analysis project using Python and Jupyter Notebook.', 'Importation of pandas, numpy, matplotlib, seaborn, and plotly libraries for data manipulation and visualization.', 'Importation of the date time function for time-related operations in the COVID data analysis project.', 'COVID-19 dataset consists of 18,110 rows of information with 9 columns.', 'Initial steps for data manipulation involve dropping unnecessary columns from the COVID-19 dataset.', 'Vaccination dataset contains 24 columns with multiple columns having null values.', 'Demonstrates loading and display of the first 7 rows of information from the vaccination dataset.', 'Describe function provides basic statistical analysis for numerical columns.', 'Info function offers a comprehensive overview of the 
dataset, aiding in understanding the structure and characteristics of the data.', 'Demonstrates creating a pivot table using the pandas library, summing confirmed, deaths, and cured cases for each state and union territory.', 'Shows how to calculate recovery rate and mortality rate based on total cured, deaths, and confirmed cases.', 'Provides guidance on sorting values based on the confirmed cases column in descending order.', 'Guidance on formatting the date column using the to_datetime function and changing the date format.']}, {'end': 2746.682, 'segs': [{'end': 1479.106, 'src': 'embed', 'start': 1449.13, 'weight': 2, 'content': [{'end': 1453.231, 'text': 'now, as i said in the beginning, there are a few discrepancies in the data set.', 'start': 1449.13, 'duration': 4.101}, {'end': 1461.176, 'text': "so here you can see there's one called Maharashtra and there's also Maharashtra triple star.", 'start': 1453.231, 'duration': 7.945}, {'end': 1463.697, 'text': 'this you can ignore, even if I scroll down.', 'start': 1461.176, 'duration': 2.521}, {'end': 1466.258, 'text': 'you have Madhya Pradesh, followed by three asterisks.', 'start': 1463.697, 'duration': 2.561}, {'end': 1468.819, 'text': 'you can ignore this value as well, even for Bihar we have.', 'start': 1466.258, 'duration': 2.561}, {'end': 1479.106, 'text': 'so these have been duplicated and here you can see the different state names and union territories.', 'start': 1470.663, 'duration': 8.443}], 'summary': 'Data set contains duplicated entries for states and union territories.', 'duration': 29.976, 'max_score': 1449.13, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w1449130.jpg'}, {'end': 1529.837, 'src': 'embed', 'start': 1502.788, 'weight': 0, 'content': [{'end': 1511.531, 'text': 'Our data says that Maharashtra has the highest number of cases, followed by Kerala, Karnataka, Tamil Nadu, Andhra Pradesh and Uttar Pradesh.', 'start': 1502.788, 
'duration': 8.743}, {'end': 1517.273, 'text': 'So these are the top five states which have the highest number of confirmed cases.', 'start': 1511.551, 'duration': 5.722}, {'end': 1521.294, 'text': 'Even if you see the mortality rate is also high for Maharashtra.', 'start': 1518.113, 'duration': 3.181}, {'end': 1529.837, 'text': 'And if I scroll down, the mortality rate is also high for Uttarakhand if you see here.', 'start': 1523.015, 'duration': 6.822}], 'summary': 'Maharashtra has the highest number of cases, followed by kerala, karnataka, tamil nadu, andhra pradesh, and uttar pradesh, with high mortality rates in maharashtra and uttarakhand.', 'duration': 27.049, 'max_score': 1502.788, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w1502788.jpg'}, {'end': 1755.14, 'src': 'heatmap', 'start': 1650.327, 'weight': 0.716, 'content': [{'end': 1663.39, 'text': 'let me bring this to the next line underscore cases.', 'start': 1650.327, 'duration': 13.063}, {'end': 1676.833, 'text': 'give a comma and say ascending equal to false, say dot and then reset my index.', 'start': 1663.39, 'duration': 13.443}, {'end': 1682.413, 'text': "For that I'll use reset underscore index function.", 'start': 1679.172, 'duration': 3.241}, {'end': 1686.854, 'text': "Okay Let's check if everything is fine.", 'start': 1683.553, 'duration': 3.301}, {'end': 1691.216, 'text': 'I have missed a square bracket here.', 'start': 1688.155, 'duration': 3.061}, {'end': 1693.396, 'text': 'Let me give another square bracket here.', 'start': 1691.256, 'duration': 2.14}, {'end': 1708.321, 'text': 'Okay And all this we are going to store in our variable called top underscore 10 underscore active underscore cases.', 'start': 1694.737, 'duration': 13.584}, {'end': 1715.08, 'text': 'okay, now let me go ahead and run this cell.', 'start': 1710.618, 'duration': 4.462}, {'end': 1721.902, 'text': 'okay, there is a syntax error here.', 'start': 1715.08, 
'duration': 6.822}, {'end': 1723.022, 'text': "now let's run it.", 'start': 1721.902, 'duration': 1.12}, {'end': 1729.485, 'text': "okay. now i'll create another variable called fig.", 'start': 1723.022, 'duration': 6.463}, {'end': 1738.487, 'text': "here we'll pass in the plt, which is for matplotlib library,", 'start': 1731.761, 'duration': 6.726}, {'end': 1755.14, 'text': "and we'll give the figure size using the fig size argument it's equal to and within a tuple and passing the size, let's say 16, comma 9.", 'start': 1738.487, 'duration': 16.653}], 'summary': 'Using underscore cases and matplotlib, creating variables for top 10 active cases and figure size 16x9.', 'duration': 104.813, 'max_score': 1650.327, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w1650327.jpg'}, {'end': 1967.883, 'src': 'embed', 'start': 1923.703, 'weight': 5, 'content': [{'end': 1926.825, 'text': 'let me change it to cases here.', 'start': 1923.703, 'duration': 3.122}, {'end': 1932.266, 'text': 'now run it.', 'start': 1930.885, 'duration': 1.381}, {'end': 1934.427, 'text': 'okay, the x axis also has a mistake.', 'start': 1932.266, 'duration': 2.161}, {'end': 1938.59, 'text': 'this should be state slash union territory.', 'start': 1934.427, 'duration': 4.163}, {'end': 1940.391, 'text': 'now let me run it.', 'start': 1938.59, 'duration': 1.801}, {'end': 1941.132, 'text': 'there you go.', 'start': 1940.391, 'duration': 0.741}, {'end': 1952.779, 'text': 'we have our plot created, but as you can see here the labels of the different states and union territories are overlapping.', 'start': 1941.132, 'duration': 11.647}, {'end': 1958.634, 'text': 'so for that let me first pass in the X labels.', 'start': 1952.779, 'duration': 5.855}, {'end': 1967.883, 'text': "I'll say PLT dot X label and my X label would be states.", 'start': 1958.634, 'duration': 9.249}], 'summary': 'Changed cases to state/union territory labels, fixed x-axis mistake, 
and adjusted labels for plot creation.', 'duration': 44.18, 'max_score': 1923.703, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w1923703.jpg'}, {'end': 2091.712, 'src': 'embed', 'start': 2048.61, 'weight': 3, 'content': [{'end': 2057.054, 'text': 'on the top you can see the title top 10 states with most active cases, and you see the edges are in red color for all the bars.', 'start': 2048.61, 'duration': 8.444}, {'end': 2062.077, 'text': 'on the x-axis you have the different state names maharashtra, karnataka, kerala.', 'start': 2057.054, 'duration': 5.023}, {'end': 2064.659, 'text': 'you also have andhra pradesh, gujarat, west bengal and chhattisgarh.', 'start': 2062.077, 'duration': 2.582}, {'end': 2076.688, 'text': 'So, as you can see, Maharashtra has the highest number of active cases based on our data, followed by Karnataka, Kerala and Tamil Nadu at 2nd,', 'start': 2065.438, 'duration': 11.25}, {'end': 2081.752, 'text': '3rd and 4th place respectively, and in the 9th place we have West Bengal.', 'start': 2076.688, 'duration': 5.064}, {'end': 2083.213, 'text': 'in the 10th place we have Chhattisgarh.', 'start': 2081.752, 'duration': 1.461}, {'end': 2091.712, 'text': 'on the y-axis you can see the total active cases which are in lakhs.', 'start': 2084.786, 'duration': 6.926}], 'summary': 'Maharashtra has the highest active cases, followed by karnataka, kerala, and tamil nadu, with west bengal and chhattisgarh also in the top 10 states with active cases in lakhs.', 'duration': 43.102, 'max_score': 2048.61, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w2048610.jpg'}, {'end': 2401.483, 'src': 'embed', 'start': 2361.791, 'weight': 7, 'content': [{'end': 2376.999, 'text': 'x label would be states my y label will be total death cases.', 'start': 2361.791, 'duration': 15.208}, {'end': 2383.1, 'text': 'then I will write plt.show.', 'start': 2376.999, 
'duration': 6.101}, {'end': 2385.42, 'text': "now let's run it.", 'start': 2383.1, 'duration': 2.32}, {'end': 2386.92, 'text': 'there you go.', 'start': 2385.42, 'duration': 1.5}, {'end': 2389.241, 'text': 'you can see here we have a nice bar plot.', 'start': 2386.92, 'duration': 2.321}, {'end': 2393.001, 'text': 'on the top we have the title top 10 states with most deaths.', 'start': 2389.241, 'duration': 3.76}, {'end': 2398.702, 'text': 'now, what I specifically wanted you to see was these discrepancies in the data.', 'start': 2393.001, 'duration': 5.701}, {'end': 2399.803, 'text': 'you can see here.', 'start': 2398.702, 'duration': 1.101}, {'end': 2401.483, 'text': 'Maharashtra is repeated twice.', 'start': 2399.803, 'duration': 1.68}], 'summary': 'Bar plot shows discrepancies in death data, with maharashtra repeated twice.', 'duration': 39.692, 'max_score': 2361.791, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w2361791.jpg'}, {'end': 2456.577, 'src': 'embed', 'start': 2429.608, 'weight': 8, 'content': [{'end': 2436.358, 'text': 'so we have Maharashtra, Karnataka, Tamil Nadu, Delhi, then Uttar Pradesh, West Bengal, Kerala, Punjab,', 'start': 2429.608, 'duration': 6.75}, {'end': 2443.265, 'text': 'Andhra Pradesh and Chhattisgarh with the states that have the most number of deaths reported.', 'start': 2436.358, 'duration': 6.907}, {'end': 2456.577, 'text': "okay, now we'll create a line plot to see the growth or the trend of active cases for top five states with most number of confirmed cases.", 'start': 2443.265, 'duration': 13.312}], 'summary': 'Top 5 states with most deaths: maharashtra, karnataka, tamil nadu, delhi, uttar pradesh.', 'duration': 26.969, 'max_score': 2429.608, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w2429608.jpg'}], 'start': 1415.066, 'title': 'Covid-19 data analysis and visualization', 'summary': 'Delves into a covid data 
analysis project, uncovering dataset discrepancies, identifying top states with high confirmed cases and mortality rates. it introduces pandas group by function to find top 10 states with active cases, creating visualizations, and addressing label issues. additionally, it highlights top states with active cases and deaths, showcasing maharashtra, karnataka, kerala, and tamil nadu. furthermore, it presents a line plot to visualize growth trends of active cases in the top 5 affected states.', 'chapters': [{'end': 1569.619, 'start': 1415.066, 'title': 'Covid data analysis', 'summary': 'Explores a covid data analysis project, revealing discrepancies in the dataset, highlighting the top five states with the highest number of confirmed cases, and discussing the mortality rates for various states.', 'duration': 154.553, 'highlights': ['The top five states with the highest number of confirmed cases are Maharashtra, Kerala, Karnataka, Tamil Nadu, and Andhra Pradesh.', 'The mortality rate is high for Maharashtra, Uttarakhand, and Punjab.', 'Discrepancies in the dataset include duplicated state names and union territories, such as Maharashtra and Maharashtra triple star, Madhya Pradesh followed by three asterisks, and Bihar.', "The chapter visualizes the COVID data using a color map named 'cube helix' and presents a pivot table showing confirmed cases, cured cases, deaths reported, and calculated columns for recovery rate and mortality rate."]}, {'end': 2048.61, 'start': 1569.619, 'title': 'Pandas group by & visualization', 'summary': 'Introduces the group by function in pandas to find the top 10 states with the most active cases in india, creating a bar plot to visualize the data, and addressing issues with overlapping labels.', 'duration': 478.991, 'highlights': ['Introducing group by function in pandas to find top 10 states with most active cases in India The function group by is used to find the top 10 states with the most active cases in India.', 'Creating a bar plot to 
visualize the top 10 states with most active cases in India A bar plot is created to visualize the top 10 states with the most active cases in India.', 'Addressing issues with overlapping labels in the bar plot Steps are taken to address the issue of overlapping labels in the created bar plot.']}, {'end': 2429.608, 'start': 2048.61, 'title': 'Top 10 states: active cases & deaths', 'summary': 'Presents the top 10 states with the most active cases, highlighting maharashtra as having the highest number of active cases, followed by karnataka, kerala, and tamil nadu. it also showcases the top 10 states with the most deaths, revealing discrepancies in the data, particularly the repetition of maharashtra and misspelling of karnataka.', 'duration': 380.998, 'highlights': ['Maharashtra has the highest number of active cases, followed by Karnataka, Kerala, and Tamil Nadu at 2nd, 3rd, and 4th place respectively, with West Bengal in 9th place and Chhattisgarh in 10th place. Maharashtra leads with the highest number of active cases, followed by Karnataka, Kerala, and Tamil Nadu at 2nd, 3rd, and 4th place respectively, with West Bengal in 9th place and Chhattisgarh in 10th place.', 'The bar plot depicts the top 10 states with the most deaths, revealing discrepancies in the data, such as the repetition of Maharashtra and misspelling of Karnataka. The bar plot illustrates the top 10 states with the most deaths, exposing data discrepancies like the repetition of Maharashtra and misspelling of Karnataka.']}, {'end': 2746.682, 'start': 2429.608, 'title': 'Covid-19: top 5 affected states line plot', 'summary': 'Presents the top 5 affected states in india with the most number of deaths reported, followed by creating a line plot to visualize the growth trend of active cases for maharashtra, karnataka, kerala, tamil nadu, and uttar pradesh.', 'duration': 317.074, 'highlights': ['The top 5 affected states in India are Maharashtra, Karnataka, Kerala, Tamil Nadu, and Uttar Pradesh. 
These states have the most number of deaths reported.', 'A line plot is created to visualize the growth trend of active cases for the top 5 affected states. The line plot shows how the active cases surged around April and May for Maharashtra and Karnataka, and the trend for Uttar Pradesh, Tamil Nadu, and Kerala.', 'The line plot uses different colors to represent the 5 states and their respective active cases trend. The colors help differentiate the active cases trend for each state, providing a visual comparison of the surge and decline over time.']}], 'duration': 1331.616, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w1415066.jpg', 'highlights': ['The top five states with the highest number of confirmed cases are Maharashtra, Kerala, Karnataka, Tamil Nadu, and Andhra Pradesh.', 'The mortality rate is high for Maharashtra, Uttarakhand, and Punjab.', 'Discrepancies in the dataset include duplicated state names and union territories, such as Maharashtra and Maharashtra triple star, Madhya Pradesh followed by three asterisks, and Bihar.', 'Introducing group by function in pandas to find top 10 states with most active cases in India The function group by is used to find the top 10 states with the most active cases in India.', 'Creating a bar plot to visualize the top 10 states with most active cases in India A bar plot is created to visualize the top 10 states with the most active cases in India.', 'Addressing issues with overlapping labels in the bar plot Steps are taken to address the issue of overlapping labels in the created bar plot.', 'Maharashtra has the highest number of active cases, followed by Karnataka, Kerala, and Tamil Nadu at 2nd, 3rd, and 4th place respectively, with West Bengal in 9th place and Chhattisgarh in 10th place. 
Maharashtra leads with the highest number of active cases, followed by Karnataka, Kerala, and Tamil Nadu at 2nd, 3rd, and 4th place respectively, with West Bengal in 9th place and Chhattisgarh in 10th place.', 'The bar plot depicts the top 10 states with the most deaths, revealing discrepancies in the data, such as the repetition of Maharashtra and misspelling of Karnataka. The bar plot illustrates the top 10 states with the most deaths, exposing data discrepancies like the repetition of Maharashtra and misspelling of Karnataka.', 'The top 5 affected states in India are Maharashtra, Karnataka, Kerala, Tamil Nadu, and Uttar Pradesh. These states have the most number of deaths reported.', 'A line plot is created to visualize the growth trend of active cases for the top 5 affected states. The line plot shows how the active cases surged around April and May for Maharashtra and Karnataka, and the trend for Uttar Pradesh, Tamil Nadu, and Kerala.', 'The line plot uses different colors to represent the 5 states and their respective active cases trend. The colors help differentiate the active cases trend for each state, providing a visual comparison of the surge and decline over time.']}, {'end': 3858.655, 'segs': [{'end': 2782.133, 'src': 'embed', 'start': 2746.682, 'weight': 0, 'content': [{'end': 2761.247, 'text': 'you can see one common trend that after March, so around April, the cases started to emerge very rapidly and later after July, they started dipping.', 'start': 2746.682, 'duration': 14.565}, {'end': 2768.25, 'text': 'okay, now we are going to use our second data set, which is related to vaccination.', 'start': 2761.247, 'duration': 7.003}, {'end': 2782.133, 'text': 'alright, so first and foremost, let me go ahead and print the data frame for you so that you know data that we are going to use.', 'start': 2769.645, 'duration': 12.488}], 'summary': 'Covid-19 cases rapidly increased after march, dipped after july. 
vaccination data to be used.', 'duration': 35.451, 'max_score': 2746.682, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w2746682.jpg'}, {'end': 2998.876, 'src': 'embed', 'start': 2969.841, 'weight': 2, 'content': [{'end': 2981.063, 'text': 'so almost all the columns have missing values, but columns such as male individuals vaccinated, female individuals vaccinated.', 'start': 2969.841, 'duration': 11.222}, {'end': 2987.147, 'text': 'even for columns like the age groups, there are a lot of missing values.', 'start': 2981.063, 'duration': 6.084}, {'end': 2994.312, 'text': 'now, in the next step we are going to drop a few missing value columns.', 'start': 2987.147, 'duration': 7.165}, {'end': 2998.876, 'text': "so i'll say a new variable called vaccination.", 'start': 2994.312, 'duration': 4.564}], 'summary': 'Data analysis reveals high amount of missing values across multiple columns. plan to drop missing value columns and create new variable for vaccination.', 'duration': 29.035, 'max_score': 2969.841, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w2969841.jpg'}, {'end': 3433.998, 'src': 'embed', 'start': 3359.099, 'weight': 1, 'content': [{'end': 3364.44, 'text': "so we'll take some time to create the pie chart.", 'start': 3359.099, 'duration': 5.341}, {'end': 3366.501, 'text': 'here we go.', 'start': 3364.44, 'duration': 2.061}, {'end': 3371.522, 'text': 'see, here i have my title male and female vaccination.', 'start': 3366.501, 'duration': 5.021}, {'end': 3375.303, 'text': 'so these are the two pies or the areas.', 'start': 3371.522, 'duration': 3.781}, {'end': 3379.671, 'text': 'you have the label female and the value.', 'start': 3376.809, 'duration': 2.862}, {'end': 3382.553, 'text': 'here you have the label male and the value.', 'start': 3379.671, 'duration': 2.882}, {'end': 3392.68, 'text': 'so from our data you can see 53% male individuals 
have been vaccinated compared to 47% for females.', 'start': 3382.553, 'duration': 10.127}, {'end': 3401.707, 'text': 'okay, now we are going to drop all those rows where our state was mentioned as India in the original data set.', 'start': 3392.68, 'duration': 9.027}, {'end': 3403.608, 'text': 'I had shown this to you in the beginning.', 'start': 3401.707, 'duration': 1.901}, {'end': 3410.287, 'text': 'so we are going to remove rows where state is india.', 'start': 3403.608, 'duration': 6.679}, {'end': 3422.556, 'text': "okay, so to do that i'll say vaccine.", 'start': 3410.287, 'duration': 12.269}, {'end': 3433.998, 'text': "i'm creating another variable called vaccine and we'll use our original data frame that we had imported, which is vaccine underscore DF.", 'start': 3422.556, 'duration': 11.442}], 'summary': "53% males and 47% females vaccinated, removing rows with state 'india'.", 'duration': 74.899, 'max_score': 3359.099, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w3359099.jpg'}, {'end': 3690.685, 'src': 'embed', 'start': 3657.287, 'weight': 4, 'content': [{'end': 3659.167, 'text': 'I want the top five states.', 'start': 3657.287, 'duration': 1.88}, {'end': 3666.329, 'text': "Now let's go ahead and print it.", 'start': 3662.008, 'duration': 4.321}, {'end': 3669.929, 'text': 'So this will create a nice table for us.', 'start': 3668.049, 'duration': 1.88}, {'end': 3670.85, 'text': 'There you go.', 'start': 3670.47, 'duration': 0.38}, {'end': 3675.55, 'text': 'Here you can see we have Maharashtra, Uttar Pradesh, Rajasthan,', 'start': 3671.71, 'duration': 3.84}, {'end': 3679.651, 'text': 'Gujarat and West Bengal as the top five states with most number of vaccinated individuals.', 'start': 3675.55, 'duration': 4.101}, {'end': 3686.504, 'text': 'Now we are going to use this table and convert it into a chart now.', 'start': 3680.811, 'duration': 5.693}, {'end': 3690.685, 'text': 'I had already written 
my code for the bar plot.', 'start': 3686.504, 'duration': 4.181}], 'summary': 'The top five states with the most vaccinated individuals are maharashtra, uttar pradesh, rajasthan, gujarat, and west bengal.', 'duration': 33.398, 'max_score': 3657.287, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w3657287.jpg'}], 'start': 2746.682, 'title': 'Covid-19 cases, vaccination trends, and data visualization', 'summary': 'Analyzes covid-19 case trends, including a rapid increase from april to july, transitions to vaccination data, revealing a 53% male and 47% female vaccination distribution, and covers data cleaning, renaming columns, and visualizing the top 5 vaccinated states in india.', 'chapters': [{'end': 3157.177, 'start': 2746.682, 'title': 'Analysis of covid-19 cases and vaccination data', 'summary': 'Explores the trend of covid-19 cases, with a rapid increase from april to july, and then transitions to analyzing vaccination data, addressing missing values and dropping specific columns.', 'duration': 410.495, 'highlights': ['The chapter explores the trend of COVID-19 cases, with a rapid increase from April to July, and then transitions to analyzing vaccination data. The cases started to emerge rapidly after March, peaking in April and declining after July, indicating the trend of COVID-19 cases. The analysis then shifts to the examination of vaccination data.', 'Addressing missing values in the vaccination data. The data set for vaccination analysis contains numerous missing values across various columns, such as male and female individuals vaccinated, and age groups, necessitating the need to address the missing values.', "Dropping specific columns from the vaccination data. 
Columns like 'doses administered', 'Sputnik V doses administered', 'AEFI', '18 to 44 years doses administered', '45 to 60 years doses administered', and '60 plus years doses administered' are identified and subsequently dropped from the vaccination data set."]}, {'end': 3392.68, 'start': 3157.177, 'title': 'Male vs female vaccination plot', 'summary': 'Discusses the process of creating a pie plot using the plotly library to visualize the vaccination distribution between male and female individuals, revealing that 53% of male individuals have been vaccinated compared to 47% for females.', 'duration': 235.503, 'highlights': ['53% of male individuals have been vaccinated compared to 47% for females The pie chart visualization shows that 53% of male individuals have been vaccinated compared to 47% for females, providing quantifiable data on the vaccination distribution.', 'Creating a pie plot using the Plotly library to visualize the vaccination distribution between male and female individuals The process of creating a pie plot using the Plotly library to visualize the vaccination distribution between male and female individuals is described, highlighting the use of visualization techniques for data analysis and presentation.', 'Filtering out male and female individuals from the data frame and calculating their respective sums The process of filtering out male and female individuals from the data frame and calculating their respective sums is detailed, demonstrating the data manipulation steps involved in preparing the data for visualization.']}, {'end': 3654.26, 'start': 3392.68, 'title': 'Data cleaning and visualization in python', 'summary': 'Covers data cleaning by removing rows with state as india, renaming a column, and visualizing the states with the most and least vaccinated individuals using python.', 'duration': 261.58, 'highlights': ["The chapter begins by dropping rows where the state is mentioned as India in the original dataset, followed by renaming the 
'total individuals vaccinated' column to 'total'.", 'The next step involves creating visualizations to find the states with the most and least number of vaccinated individuals using Python.', "A variable 'max_vac' is created to group the data by state and find the total sum of vaccinated individuals for each state, sorting it in descending order to identify the states with the highest number of individuals vaccinated."]}, {'end': 3858.655, 'start': 3657.287, 'title': 'Top 5 vaccinated states in india', 'summary': 'Presents the top 5 states with the most vaccinated individuals in india, showcasing maharashtra as the highest, followed by uttar pradesh, rajasthan, gujarat, and west bengal. it also encourages viewers to identify and create a bar plot for the bottom 5 vaccinated states.', 'duration': 201.368, 'highlights': ['Maharashtra has the highest number of vaccinations Maharashtra is identified as the state with the highest number of vaccinated individuals.', 'Encouragement for viewers to create a bar plot for the bottom 5 vaccinated states Viewers are encouraged to identify and create a bar plot for the states with the least number of vaccinated individuals.', 'Use of two datasets for COVID-19 analysis in India Two datasets were utilized for the COVID-19 data analysis project, including one for COVID-19 data for different states and union territories, and another for analyzing the vaccination status in different states in India.']}], 'duration': 1111.973, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w2746682.jpg', 'highlights': ['The chapter explores the trend of COVID-19 cases, with a rapid increase from April to July, and then transitions to analyzing vaccination data.', '53% of male individuals have been vaccinated compared to 47% for females.', 'Addressing missing values in the vaccination data.', "The chapter begins by dropping rows where the state is mentioned as India in the original dataset, 
followed by renaming the 'total individuals vaccinated' column to 'total'.", 'Maharashtra has the highest number of vaccinations.']}, {'end': 4732.47, 'segs': [{'end': 3924.843, 'src': 'embed', 'start': 3887.171, 'weight': 0, 'content': [{'end': 3892.074, 'text': 'there are 192 countries in our data set.', 'start': 3887.171, 'duration': 4.903}, {'end': 3894.056, 'text': 'let me go to the top.', 'start': 3892.074, 'duration': 1.982}, {'end': 3899.279, 'text': 'so we have country or region.', 'start': 3894.056, 'duration': 5.223}, {'end': 3903.882, 'text': 'we also have the code for each of the countries.', 'start': 3899.279, 'duration': 4.603}, {'end': 3907.264, 'text': 'then we have a column called last updated.', 'start': 3903.882, 'duration': 3.382}, {'end': 3918.277, 'text': "so here you can see our data is till 12th of March 2021 and there are two columns which don't have any information, that is,", 'start': 3907.264, 'duration': 11.013}, {'end': 3921.701, 'text': 'people hospitalized and people tested.', 'start': 3918.277, 'duration': 3.424}, {'end': 3924.843, 'text': 'then we have a column called active cases.', 'start': 3921.701, 'duration': 3.142}], 'summary': 'Data set includes 192 countries with last update on 12th march 2021, with some missing information such as people hospitalized and people tested.', 'duration': 37.672, 'max_score': 3887.171, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w3887171.jpg'}, {'end': 4016.496, 'src': 'embed', 'start': 3953.274, 'weight': 5, 'content': [{'end': 3968.858, 'text': "confirmed deaths and the recovered cases column and we'll be using the tableau software to create interesting visualizations and we'll convert those visuals and put it in the form of a dashboard towards the end of the project.", 'start': 3953.274, 'duration': 15.584}, {'end': 3974.06, 'text': 'okay, so let me close this file or let it be open.', 'start': 3968.858, 'duration': 5.202}, {'end': 
3975.942, 'text': "i'll search for tableau public.", 'start': 3974.06, 'duration': 1.882}, {'end': 3980.584, 'text': 'you can see here i have tableau public installed.', 'start': 3975.942, 'duration': 4.642}, {'end': 3985.727, 'text': 'you can use the desktop version of tableau or the public version of tableau.', 'start': 3980.584, 'duration': 5.143}, {'end': 3994.196, 'text': 'so for this project we are going to use the tableau public edition.', 'start': 3987.011, 'duration': 7.185}, {'end': 3998.318, 'text': 'so here you can see, i have created a few visuals already.', 'start': 3994.196, 'duration': 4.122}, {'end': 4002.901, 'text': 'if you want, you can clear them out.', 'start': 3998.318, 'duration': 4.583}, {'end': 4010.992, 'text': 'okay, now, the first thing we need to do is to connect our csv data set to tableau.', 'start': 4002.901, 'duration': 8.091}, {'end': 4016.496, 'text': 'for that, under connect, we have to a file.', 'start': 4010.992, 'duration': 5.504}], 'summary': 'Using tableau public to visualize covid-19 data for project.', 'duration': 63.222, 'max_score': 3953.274, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w3953274.jpg'}, {'end': 4237.106, 'src': 'embed', 'start': 4203.367, 'weight': 1, 'content': [{'end': 4219.161, 'text': 'so tableau how easily has shown me a table where you can see the different country names and on the right you have the total number of cases that have emerged from these countries.', 'start': 4203.367, 'duration': 15.794}, {'end': 4221.621, 'text': 'now, this is not the most recent data.', 'start': 4219.161, 'duration': 2.46}, {'end': 4227.003, 'text': 'as i told you, this data was till march 2021.', 'start': 4221.621, 'duration': 5.382}, {'end': 4236.406, 'text': 'cool, now you can go ahead and sort the table based on the total confirmed cases here on the top.', 'start': 4227.003, 'duration': 9.403}, {'end': 4237.106, 'text': 'you can see here.', 'start': 
4236.406, 'duration': 0.7}], 'summary': 'Tableau displays country names and total cases till march 2021.', 'duration': 33.739, 'max_score': 4203.367, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w4203367.jpg'}, {'end': 4288.073, 'src': 'embed', 'start': 4266.479, 'weight': 2, 'content': [{'end': 4276.223, 'text': 'if you want, you can also sort it in descending order so that the countries with the highest number of confirmed cases appear at the top.', 'start': 4266.479, 'duration': 9.744}, {'end': 4278.504, 'text': 'you can see it here.', 'start': 4276.223, 'duration': 2.281}, {'end': 4281.426, 'text': 'and these are the countries that have very few COVID cases.', 'start': 4278.504, 'duration': 2.922}, {'end': 4288.073, 'text': 'now you can convert this simple table into different charts and graphs.', 'start': 4282.731, 'duration': 5.342}], 'summary': 'Sort countries by confirmed covid cases, and convert table into charts.', 'duration': 21.594, 'max_score': 4266.479, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w4266479.jpg'}, {'end': 4532.3, 'src': 'embed', 'start': 4493.043, 'weight': 3, 'content': [{'end': 4502.294, 'text': 'similarly, you can change it here as well top 10 confirmed countries.', 'start': 4493.043, 'duration': 9.251}, {'end': 4511.748, 'text': "okay, now let's move on to our second visualization, where we will create a global map.", 'start': 4502.294, 'duration': 9.454}, {'end': 4521.353, 'text': 'so this global map will show the different countries present in the map and the cases that they have.', 'start': 4511.748, 'duration': 9.605}, {'end': 4532.3, 'text': "so to create the map, first i'll drag my longitude generated field onto columns and then i'll drag latitude onto rows.", 'start': 4521.353, 'duration': 10.947}], 'summary': 'Visualize top 10 confirmed countries and create a global map displaying cases by country.', 
'duration': 39.257, 'max_score': 4493.043, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w4493043.jpg'}], 'start': 3859.276, 'title': 'Using tableau for covid-19 data visualization', 'summary': 'Covers the use of tableau to visualize a global coronavirus dataset with 192 countries, focusing on active, confirmed, and recovered cases. it also demonstrates creating visualizations for total confirmed cases, sorting data, and highlighting countries with the highest cases and deaths.', 'chapters': [{'end': 4083.324, 'start': 3859.276, 'title': 'Using tableau for global covid data visualization', 'summary': 'Covers the use of tableau software to visualize a global coronavirus dataset containing 192 countries, focusing on active, confirmed, and recovered cases, with a plan to create visualizations and a dashboard using tableau public edition.', 'duration': 224.048, 'highlights': ['The dataset contains information for 192 countries, including active cases, confirmed cases, and number of deaths. This quantifies the size of the dataset and highlights the key fields to be used for analysis.', 'The plan involves using Tableau software to create visualizations and convert them into a dashboard. This indicates the intended outcome of the project and the use of Tableau for visualization.', 'The chapter emphasizes the use of Tableau Public edition for the project. 
This clarifies the specific version of Tableau to be utilized for the project.']}, {'end': 4732.47, 'start': 4083.324, 'title': 'Tableau visualizations for covid-19 data', 'summary': 'Demonstrates the creation of tableau visualizations for global covid-19 data, showcasing the total number of confirmed cases for each country, sorting the data in ascending and descending order, transforming tables into bar and map visualizations, and highlighting countries with the highest number of cases and deaths.', 'duration': 649.146, 'highlights': ['Creating a table to display total confirmed cases for each country Tableau is used to create a table displaying the total confirmed cases for each country, providing a visual representation of the COVID-19 impact worldwide.', 'Sorting the table in ascending and descending order based on confirmed cases The table can be sorted in ascending or descending order based on the total confirmed cases, enabling easy identification of countries with varying COVID-19 case counts.', 'Transforming the table into bar and map visualizations The table visualization is transformed into bar and map visualizations, offering diverse ways to represent and analyze the global COVID-19 data.', 'Highlighting countries with the highest number of cases and deaths The visualization highlights countries with the highest number of COVID-19 cases and deaths, providing valuable insights into the global impact of the pandemic.']}], 'duration': 873.194, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w3859276.jpg', 'highlights': ['The dataset contains information for 192 countries, including active cases, confirmed cases, and number of deaths.', 'Creating a table to display total confirmed cases for each country Tableau is used to create a table displaying the total confirmed cases for each country, providing a visual representation of the COVID-19 impact worldwide.', 'Sorting the table in ascending and 
descending order based on confirmed cases The table can be sorted in ascending or descending order based on the total confirmed cases, enabling easy identification of countries with varying COVID-19 case counts.', 'Transforming the table into bar and map visualizations The table visualization is transformed into bar and map visualizations, offering diverse ways to represent and analyze the global COVID-19 data.', 'Highlighting countries with the highest number of cases and deaths The visualization highlights countries with the highest number of COVID-19 cases and deaths, providing valuable insights into the global impact of the pandemic.', 'The plan involves using Tableau software to create visualizations and convert them into a dashboard. This indicates the intended outcome of the project and the use of Tableau for visualization.', 'The chapter emphasizes the use of Tableau Public edition for the project. This clarifies the specific version of Tableau to be utilized for the project.']}, {'end': 5671.99, 'segs': [{'end': 4904.838, 'src': 'embed', 'start': 4830.263, 'weight': 0, 'content': [{'end': 4839.849, 'text': 'so this was a tree map that we created to show the top 10 countries with the highest number of deaths.', 'start': 4830.263, 'duration': 9.586}, {'end': 4847.494, 'text': "i'll just change the title to top 10 death cases.", 'start': 4839.849, 'duration': 7.645}, {'end': 4854.741, 'text': 'now you can also edit the title in terms of the font, color and everything.', 'start': 4849.798, 'duration': 4.943}, {'end': 4860.705, 'text': "so i'll keep it Tableau Bold, and i'll choose this as 14.", 'start': 4854.741, 'duration': 5.964}, {'end': 4863.967, 'text': 'click on apply and okay, there you go.', 'start': 4860.705, 'duration': 3.262}, {'end': 4871.512, 'text': "similarly, let's rename the sheet to death cases.", 'start': 4863.967, 'duration': 7.545}, {'end': 4880.529, 'text': "cool. 
Now moving to the next visualization, where we'll see about the recoveries.", 'start': 4871.512, 'duration': 9.017}, {'end': 4890.234, 'text': "So, before we move on to create a visualization for the top 10, or let's say, the bottom 10 countries in terms of recoveries,", 'start': 4881.87, 'duration': 8.364}, {'end': 4893.836, 'text': 'let me show you how to change the color of the tree map.', 'start': 4890.234, 'duration': 3.602}, {'end': 4900.44, 'text': 'So here, by default, Tableau selects the automatic palette.', 'start': 4894.797, 'duration': 5.643}, {'end': 4904.838, 'text': 'If you want, you can change it to different palettes.', 'start': 4901.201, 'duration': 3.637}], 'summary': 'Created tree map showing top 10 countries by deaths and visualization customization', 'duration': 74.575, 'max_score': 4830.263, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w4830263.jpg'}, {'end': 5047.593, 'src': 'embed', 'start': 5020.755, 'weight': 3, 'content': [{'end': 5026.765, 'text': "I'll just rename this sheet to recovery cases.", 'start': 5020.755, 'duration': 6.01}, {'end': 5031.987, 'text': "cool, now let's move on to our next visual.", 'start': 5026.765, 'duration': 5.222}, {'end': 5040.21, 'text': 'okay, so here we are going to create a scatter plot to analyze the total confirmed cases versus the total number of deaths.', 'start': 5031.987, 'duration': 8.223}, {'end': 5047.593, 'text': 'so whenever you have two numeric fields, it is better to create a scatter plot to visualize the cases.', 'start': 5040.21, 'duration': 7.383}], 'summary': "Renamed sheet to 'recovery cases', creating scatter plot for total confirmed cases vs total deaths.", 'duration': 26.838, 'max_score': 5020.755, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w5020755.jpg'}, {'end': 5154.935, 'src': 'embed', 'start': 5116.681, 'weight': 4, 'content': [{'end': 5126.049, 'text': 'now, 
from this we are going to filter out the top 10 countries with the highest number of confirmed cases and the deaths.', 'start': 5116.681, 'duration': 9.368}, {'end': 5132.735, 'text': "so for that i'll drag my country column onto filters and i'll go to top.", 'start': 5126.049, 'duration': 6.686}, {'end': 5134.777, 'text': 'filter by field.', 'start': 5132.735, 'duration': 2.042}, {'end': 5139.901, 'text': "i'll say top 10 and i'll click on apply and ok.", 'start': 5134.777, 'duration': 5.124}, {'end': 5154.935, 'text': 'so here you can see we have siberia, sweden, Belgium, Netherlands, Spain, Italy, France, Russia, UK and you have Brazil.', 'start': 5139.901, 'duration': 15.034}], 'summary': 'Filtered top 10 countries with highest confirmed cases and deaths.', 'duration': 38.254, 'max_score': 5116.681, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w5116681.jpg'}, {'end': 5387.313, 'src': 'embed', 'start': 5272.594, 'weight': 1, 'content': [{'end': 5291.503, 'text': "okay, so, first I'll drag mortality rate onto columns and then I'll have my country column under rows.", 'start': 5272.594, 'duration': 18.909}, {'end': 5293.304, 'text': 'so there you go.', 'start': 5291.503, 'duration': 1.801}, {'end': 5308.272, 'text': 'now let me go ahead and sort it in descending order, so you can see here you have Yemen, there is Mexico, Syria, Sudan, Egypt, Ecuador,', 'start': 5293.304, 'duration': 14.968}, {'end': 5312.974, 'text': 'China and other countries which have the highest mortality rate.', 'start': 5308.272, 'duration': 4.702}, {'end': 5320.929, 'text': 'and you can also select the country region under colors.', 'start': 5312.974, 'duration': 7.955}, {'end': 5324.232, 'text': "let's go ahead and edit the color.", 'start': 5320.929, 'duration': 3.303}, {'end': 5334.263, 'text': "instead of automatic, let's select summer, or you can go for some other palette, let's say Tableau Classic Medium.", 'start': 
5324.232, 'duration': 10.031}, {'end': 5336.004, 'text': "i'll assign this palette.", 'start': 5334.263, 'duration': 1.741}, {'end': 5339.328, 'text': "yes, i'll apply, and okay, all right.", 'start': 5336.004, 'duration': 3.324}, {'end': 5351.583, 'text': "finally, we'll move ahead to our last visualization, where we'll create a dual axis chart.", 'start': 5341.19, 'duration': 10.393}, {'end': 5357.611, 'text': 'so the dual axis chart in Tableau will have one column and two different visualizations on rows.', 'start': 5351.583, 'duration': 6.028}, {'end': 5364.124, 'text': 'so here you have one column and on the right also, you will have one column with their own axis values.', 'start': 5357.611, 'duration': 6.513}, {'end': 5370.407, 'text': 'So this dual axis chart will be for recovered cases and the death cases.', 'start': 5365.045, 'duration': 5.362}, {'end': 5372.087, 'text': 'Let me show you how to do it.', 'start': 5371.047, 'duration': 1.04}, {'end': 5387.313, 'text': "So I'll drag my country field onto columns and then I'll choose recovered column onto rows.", 'start': 5373.508, 'duration': 13.805}], 'summary': 'Visualizing mortality rates and creating dual axis chart for recovered and death cases.', 'duration': 114.719, 'max_score': 5272.594, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w5272594.jpg'}, {'end': 5567.369, 'src': 'embed', 'start': 5492.763, 'weight': 7, 'content': [{'end': 5503.967, 'text': 'okay, now, before we move ahead and create our final dashboard, I want to show you one more feature in Tableau, that is,', 'start': 5492.763, 'duration': 11.204}, {'end': 5511.79, 'text': 'to format the labels so you can see here, instead of writing 530,821,', 'start': 5503.967, 'duration': 7.823}, {'end': 5519.174, 'text': "we can change this in terms of millions or thousands and write it as, let's say, 530k.", 'start': 5511.791, 'duration': 7.383}, {'end': 5528.759, 'text': "to do that, what 
I'll do is I'll go to format and here we have the option to select font,", 'start': 5519.174, 'duration': 9.585}, {'end': 5542.277, 'text': "and in font i'll go to fields and click on sum of deaths and here in this pane i'll choose number.", 'start': 5530.792, 'duration': 11.485}, {'end': 5553.401, 'text': "custom decimal places will be zero and display units i'll select thousands.", 'start': 5542.277, 'duration': 11.124}, {'end': 5556.463, 'text': 'you can see here the values have been changed to 531k, 193k, 158k, so on and so forth.', 'start': 5553.401, 'duration': 3.062}, {'end': 5564.545, 'text': "cool, now let's go ahead and create our final dashboard.", 'start': 5560.64, 'duration': 3.905}, {'end': 5567.369, 'text': 'here is the option to create a new dashboard.', 'start': 5564.545, 'duration': 2.824}], 'summary': 'In tableau, labels can be formatted to display values in thousands or millions, for example 530k, by adjusting the font settings and selecting display units as thousands.', 'duration': 74.606, 'max_score': 5492.763, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w5492763.jpg'}], 'start': 4732.47, 'title': 'Covid-19 data visualization', 'summary': 'Covers creating a tree map to visualize top 10 countries with the highest death cases, scatter plots for total confirmed cases versus deaths, and a dual axis chart for recovered and death cases in tableau. it also includes mortality rate analysis and dashboard creation.', 'chapters': [{'end': 5020.755, 'start': 4732.47, 'title': 'Tree map: top 10 deaths cases', 'summary': 'Demonstrates the creation of a tree map in tableau to visualize the top 10 countries with the highest number of deaths and also explains the process of changing the color palette and creating a visualization for the top 10 recoveries.', 'duration': 288.285, 'highlights': ['Created a tree map to display the top 10 countries with the highest number of deaths. 
The tree map was used to visualize the top 10 countries with the highest number of deaths, including countries like US, Brazil, Mexico, India, United Kingdom, Italy, Russia, Spain, France, and Germany.', 'Demonstrated the process of changing the color palette in Tableau. The process of changing the color palette in Tableau was explained, including the selection of different color palettes such as gold purple diverging and red, green, white diverging.', 'Created a visualization for the top 10 recoveries using a packed bubble chart. A visualization for the top 10 recoveries was created using a packed bubble chart, showcasing countries with the highest average number of recoveries like India, Brazil, Turkey, Italy, Argentina, Germany, Colombia, Russia, Poland, and Mexico.']}, {'end': 5320.929, 'start': 5020.755, 'title': 'Visual analysis of covid-19 data', 'summary': 'Demonstrates the creation of scatter plots to visualize total confirmed cases versus total deaths, filtering the top 10 countries with the highest number of deaths and active cases, and analyzing mortality rates for each country using tableau.', 'duration': 300.174, 'highlights': ['The chapter demonstrates the creation of scatter plots to visualize total confirmed cases versus total deaths. The speaker explains the process of creating a scatter plot to analyze the total confirmed cases versus the total number of deaths, emphasizing the importance of using scatter plots when dealing with two numeric fields.', 'Filtering the top 10 countries with the highest number of deaths and active cases. The speaker filters out the top 10 countries with the highest number of confirmed cases and deaths, providing a list of countries with the highest number of active cases and deaths reported so far.', 'Analyzing mortality rates for each country using Tableau. 
The chapter showcases the process of analyzing mortality rates for each country, demonstrating the sorting of countries in descending order based on their mortality rates and the use of color to represent country regions.']}, {'end': 5671.99, 'start': 5320.929, 'title': 'Creating dual axis chart in tableau', 'summary': 'Demonstrates creating a dual axis chart in tableau with recovered and death cases, showcasing how to synchronize axis values, format labels to display in thousands, and create a final dashboard with various visualizations.', 'duration': 351.061, 'highlights': ['Creating a dual axis chart with recovered and death cases The chapter demonstrates creating a dual axis chart in Tableau with recovered and death cases, showcasing how to synchronize axis values, format labels to display in thousands.', 'Formatting labels to display in thousands The presenter formats the labels to display in thousands by selecting font, choosing number, setting custom decimal places to zero, and displaying units in thousands.', 'Creating a final dashboard with various visualizations The chapter concludes by creating a final dashboard with visualizations including global cases, top 10 confirmed cases, death cases, and confirmed versus deaths, and emphasizes the option to customize the dashboards further.']}], 'duration': 939.52, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w4732470.jpg', 'highlights': ['Created a tree map to display the top 10 countries with the highest number of deaths.', 'Demonstrated the process of changing the color palette in Tableau.', 'Created a visualization for the top 10 recoveries using a packed bubble chart.', 'The chapter demonstrates the creation of scatter plots to visualize total confirmed cases versus total deaths.', 'Filtering the top 10 countries with the highest number of deaths and active cases.', 'Analyzing mortality rates for each country using Tableau.', 'Creating a dual axis 
chart with recovered and death cases.', 'Formatting labels to display in thousands.', 'Creating a final dashboard with various visualizations.']}, {'end': 6137.305, 'segs': [{'end': 5793.442, 'src': 'embed', 'start': 5768.302, 'weight': 2, 'content': [{'end': 5773.227, 'text': 'The first modern Olympics took place in Athens, Greece in 1896.', 'start': 5768.302, 'duration': 4.925}, {'end': 5783.916, 'text': 'As per National Geographic, the original Olympics took place in 776 BC, so they began as part of an ancient Greek festival which celebrated Zeus,', 'start': 5773.227, 'duration': 10.689}, {'end': 5785.357, 'text': 'the Greek god of sky and weather.', 'start': 5783.916, 'duration': 1.441}, {'end': 5793.442, 'text': 'The rings in the Olympics logo represent the five continents, Europe, Africa, Asia, the Americas and Oceania.', 'start': 5786.397, 'duration': 7.045}], 'summary': 'The first modern olympics were held in athens, greece in 1896, with origins dating back to 776 bc as part of an ancient greek festival celebrating zeus.', 'duration': 25.14, 'max_score': 5768.302, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w5768302.jpg'}, {'end': 5845.843, 'src': 'embed', 'start': 5820.034, 'weight': 4, 'content': [{'end': 5825.756, 'text': 'The Summer Olympics in Tokyo began on the 23rd of July and recently concluded on the 8th of August.', 'start': 5820.034, 'duration': 5.722}, {'end': 5833.657, 'text': 'We got to witness some thriller matches that went down to the wire, some amazing victories and sadly there were a lot of heartbreaks as well.', 'start': 5826.536, 'duration': 7.121}, {'end': 5836.458, 'text': 'Winning and losing are part and parcel of any game.', 'start': 5834.357, 'duration': 2.101}, {'end': 5845.843, 'text': "Fans across the world were really happy to see this global event happen this year following last year's postponement due to the coronavirus pandemic.", 'start': 5838.737, 'duration': 
7.106}], 'summary': 'Tokyo summer olympics concluded on 8th august after thrilling matches and global fan excitement.', 'duration': 25.809, 'max_score': 5820.034, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w5820034.jpg'}, {'end': 6027.1, 'src': 'embed', 'start': 5949.451, 'weight': 0, 'content': [{'end': 5951.733, 'text': 'So this is a huge data set that we are going to use.', 'start': 5949.451, 'duration': 2.282}, {'end': 5957.96, 'text': 'The other data set we are going to use in this demo is called NOC underscore regions.', 'start': 5952.474, 'duration': 5.486}, {'end': 5963.026, 'text': 'Now here one thing to note is NOC stands for National Olympic Committee.', 'start': 5958.581, 'duration': 4.445}, {'end': 5966.83, 'text': 'This is a three letter code that is given by the Olympic Committee.', 'start': 5963.286, 'duration': 3.544}, {'end': 5974.91, 'text': 'we have the region names, so AFG is for Afghanistan, you have ALB for Albania, ALG for Algeria.', 'start': 5967.805, 'duration': 7.105}, {'end': 5977.212, 'text': 'we have also a column called notes.', 'start': 5974.91, 'duration': 2.302}, {'end': 5988.58, 'text': 'you find some notes about the region, for example here Antigua is actually Antigua and Barbuda. all right,', 'start': 5977.212, 'duration': 11.368}, {'end': 5992.662, 'text': "let's move to the first data set that i showed you.", 'start': 5988.58, 'duration': 4.082}, {'end': 5994.604, 'text': 'this is our primary data set for our demo.', 'start': 5992.662, 'duration': 1.942}, {'end': 6000.615, 'text': 'So these datasets have been directly taken from the internet and were not validated.', 'start': 5996.533, 'duration': 4.082}, {'end': 6004.797, 'text': 'So the results that you will see in the demo are purely based on the data that we have collected.', 'start': 6000.815, 'duration': 3.982}, {'end': 6016.141, 'text': 'So the file athlete underscore events dot csv contains nearly 271116 rows 
of information and there are 15 columns.', 'start': 6006.777, 'duration': 9.364}, {'end': 6022.044, 'text': 'So each row corresponds to an individual athlete competing in an individual Olympic event.', 'start': 6017.322, 'duration': 4.722}, {'end': 6027.1, 'text': 'so here id is actually a unique number for each athlete.', 'start': 6022.995, 'duration': 4.105}], 'summary': 'Demo uses huge dataset with 271116 rows of athlete events, 15 columns.', 'duration': 77.649, 'max_score': 5949.451, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w5949451.jpg'}], 'start': 5671.99, 'title': 'Olympics data analysis and female athletes', 'summary': 'Covers analysis of olympics data using tableau and python, with approximately 271,116 rows of athlete_events data and 15 columns, including noc information and details of olympic events.', 'chapters': [{'end': 5878.48, 'start': 5671.99, 'title': 'Olympics data analysis', 'summary': 'Covers the completion of a coronavirus data analysis project using tableau, the commencement of an olympics dataset analysis using python, including historical context, and specific questions related to the olympics, along with the use of python libraries for exploratory data analysis.', 'duration': 206.49, 'highlights': ['The first modern Olympics took place in Athens, Greece in 1896. The first modern Olympics began in Athens, Greece in 1896.', 'The Summer Olympics in Tokyo began on the 23rd of July and recently concluded on the 8th of August. The Summer Olympics in Tokyo started on July 23 and ended on August 8.', 'The rings in the Olympics logo represent the five continents, Europe, Africa, Asia, the Americas, and Oceania. 
The rings in the Olympics logo symbolize the five continents.']}, {'end': 6137.305, 'start': 5879.721, 'title': 'First olympics with all female athletes', 'summary': 'Discusses the datasets used for the analysis, including the number of rows and columns, the definition of noc, and the information contained in the athlete_events dataset, with approximately 271,116 rows of data and 15 columns, detailing athlete specifics and olympic event information.', 'duration': 257.584, 'highlights': ['The athlete_events dataset contains nearly 271,116 rows of information and 15 columns, including athlete specifics and Olympic event information. The athlete_events dataset contains around 271,116 rows and 15 columns, providing athlete specifics such as name, gender, age, height, weight, team name, NOC, games, year, season, city, sport name, event, and medal information.', 'The NOC (National Olympic Committee) regions dataset contains three-letter codes for countries and their corresponding region names. The NOC regions dataset contains three-letter codes for countries, along with their region names, and also includes a column for notes about the regions.', 'The chapter emphasizes the unreliability of the datasets, stating that the results shown in the demo are purely based on the collected data. 
The datasets used in the analysis are directly taken from the internet and not validated, with the presenter noting that the results demonstrated in the demo are solely based on the collected data, implying potential limitations in accuracy and reliability.']}], 'duration': 465.315, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w5671990.jpg', 'highlights': ['The athlete_events dataset contains around 271,116 rows and 15 columns, providing athlete specifics such as name, gender, age, height, weight, team name, NOC, games, year, season, city, sport name, event, and medal information.', 'The NOC regions dataset contains three-letter codes for countries, along with their region names, and also includes a column for notes about the regions.', 'The rings in the Olympics logo symbolize the five continents.', 'The first modern Olympics began in Athens, Greece in 1896.', 'The Summer Olympics in Tokyo started on July 23 and ended on August 8.', 'The datasets used in the analysis are directly taken from the internet and not validated, with the presenter noting that the results demonstrated in the demo are solely based on the collected data, implying potential limitations in accuracy and reliability.']}, {'end': 7805.508, 'segs': [{'end': 6232.93, 'src': 'embed', 'start': 6190.722, 'weight': 1, 'content': [{'end': 6196.247, 'text': 'now the next thing is to load the datasets using pandas.', 'start': 6190.722, 'duration': 5.525}, {'end': 6197.688, 'text': 'read underscore csv function.', 'start': 6196.247, 'duration': 1.441}, {'end': 6204.673, 'text': "so I'll create a variable called athletes.", 'start': 6197.688, 'duration': 6.985}, {'end': 6207.573, 'text': "I'll say equal to pd dot.", 'start': 6204.673, 'duration': 2.9}, {'end': 6217.335, 'text': "I'll use the read underscore csv function and inside this function will give the location of the files.", 'start': 6207.573, 'duration': 9.762}, {'end': 6223.276, 'text': 
"so I'll copy this location and here I'll paste.", 'start': 6217.335, 'duration': 5.941}, {'end': 6232.93, 'text': 'make sure this is within quotations and these are all forward slash.', 'start': 6223.276, 'duration': 9.654}], 'summary': 'Loading datasets using pandas read_csv function.', 'duration': 42.208, 'max_score': 6190.722, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w6190722.jpg'}, {'end': 6515.418, 'src': 'embed', 'start': 6484.584, 'weight': 2, 'content': [{'end': 6488.486, 'text': 'if you see here, the rest of the columns start with a capital letter.', 'start': 6484.584, 'duration': 3.902}, {'end': 6493.248, 'text': 'but if you see, the last two columns that we just added now start with a lowercase letter.', 'start': 6488.486, 'duration': 4.762}, {'end': 6500.851, 'text': "so we'll use the rename function to make the column names consistent.", 'start': 6493.248, 'duration': 7.603}, {'end': 6508.654, 'text': "but before that let's see the shape of the data frame.", 'start': 6500.851, 'duration': 7.803}, {'end': 6512.937, 'text': "so i'll print the shape of the data frame to know the total number of rows and columns.", 'start': 6508.654, 'duration': 4.283}, {'end': 6515.418, 'text': "i'll say athletes.", 'start': 6512.937, 'duration': 2.481}], 'summary': 'Using rename function to standardize column names, checking data frame shape.', 'duration': 30.834, 'max_score': 6484.584, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w6484584.jpg'}, {'end': 6783.408, 'src': 'embed', 'start': 6757.472, 'weight': 0, 'content': [{'end': 6765.878, 'text': 'and then we have the 25th percentile, the 50th percentile and the 75th percentile value of the columns id, age, height, weight and year.', 'start': 6757.472, 'duration': 8.406}, {'end': 6775.806, 'text': 'all right now, one thing to notice here is, if you see the year column, the minimum year is 1896.', 
'start': 6767.543, 'duration': 8.263}, {'end': 6783.408, 'text': 'so this is when Olympics started and, until recently, the Rio Olympics that was held in 2016.', 'start': 6775.806, 'duration': 7.602}], 'summary': 'Data includes 25th, 50th, and 75th percentiles for id, age, height, and weight; olympics data ranges from 1896 to 2016.', 'duration': 25.936, 'max_score': 6757.472, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w6757472.jpg'}, {'end': 7362.842, 'src': 'embed', 'start': 7336.258, 'weight': 3, 'content': [{'end': 7342.62, 'text': 'first we have united states, which has the highest participation since the beginning of the olympics.', 'start': 7336.258, 'duration': 6.362}, {'end': 7349.902, 'text': 'then we have france, great britain, italy, germany, canada and we have the rest of the countries.', 'start': 7342.62, 'duration': 7.282}, {'end': 7359.379, 'text': 'cool, Now, moving ahead, the next visualization we are going to see is the age distribution of the athletes.', 'start': 7349.902, 'duration': 9.477}, {'end': 7362.842, 'text': 'So for this, we are going to create a histogram.', 'start': 7360.159, 'duration': 2.683}], 'summary': 'Usa has highest olympic participation. 
next, age distribution visualization will be a histogram.', 'duration': 26.584, 'max_score': 7336.258, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w7336258.jpg'}, {'end': 7566.16, 'src': 'embed', 'start': 7535.24, 'weight': 4, 'content': [{'end': 7537.922, 'text': 'on the y-axis, you have the number of participants.', 'start': 7535.24, 'duration': 2.682}, {'end': 7548.089, 'text': 'on the x-axis, we have the age values ranging from 10 to 80, with a size of 2, and if you see this,', 'start': 7537.922, 'duration': 10.167}, {'end': 7554.733, 'text': 'we have most number of athletes who have an age between 20 to 30.', 'start': 7548.089, 'duration': 6.644}, {'end': 7562.118, 'text': 'you can see here so early 20s, we have maximum number of athletes participating in the olympics.', 'start': 7554.733, 'duration': 7.385}, {'end': 7566.16, 'text': 'we also have a few athletes who are beyond 40 years of age.', 'start': 7562.118, 'duration': 4.042}], 'summary': 'Most olympic athletes are in their early 20s, with few over 40.', 'duration': 30.92, 'max_score': 7535.24, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w7535240.jpg'}, {'end': 7627.661, 'src': 'embed', 'start': 7596.705, 'weight': 6, 'content': [{'end': 7602.238, 'text': 'moving ahead Now, in the initial slides of the video, we discussed the Summer and Winter Olympic Games.', 'start': 7596.705, 'duration': 5.533}, {'end': 7606.402, 'text': "Let's look at the different sporting events that are part of the Summer and Winter Olympic Games.", 'start': 7602.839, 'duration': 3.563}, {'end': 7613.149, 'text': 'Just to give you a heads up, the Winter Olympic Games are held once every four years for sports practiced on snow and ice.', 'start': 7607.023, 'duration': 6.126}, {'end': 7618.153, 'text': 'So here I have my variable name winter underscore sports.', 'start': 7613.949, 'duration': 4.204}, {'end': 
7627.661, 'text': "So I'm extracting for season equal to equal to winter, and i am going to display only the unique values.", 'start': 7618.454, 'duration': 9.207}], 'summary': 'The winter olympic games are held once every four years for sports practiced on snow and ice.', 'duration': 30.956, 'max_score': 7596.705, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w7596705.jpg'}, {'end': 7754.495, 'src': 'embed', 'start': 7717.339, 'weight': 7, 'content': [{'end': 7735.336, 'text': "i'll use my data frame, that is, athletes underscore df dot my variable name, that is, sex dot value underscore counts,", 'start': 7717.339, 'duration': 17.997}, {'end': 7740.079, 'text': 'and then we are going to print gender underscore counts.', 'start': 7735.336, 'duration': 4.743}, {'end': 7744.748, 'text': "let's run this there you go.", 'start': 7741.505, 'duration': 3.243}, {'end': 7754.495, 'text': 'so since the inception of olympics, we have more number of male participants than female participants.', 'start': 7744.748, 'duration': 9.747}], 'summary': 'More male than female olympic participants since inception.', 'duration': 37.156, 'max_score': 7717.339, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w7717339.jpg'}], 'start': 6137.305, 'title': 'Olympic dataset analysis', 'summary': 'Covers analysis of olympics dataset using jupyter notebook, including importing datasets, loading and merging using pandas, checking for null values, identifying columns with the most null values, and visualizing athlete age distribution, revealing insights such as top participating countries and gender distribution.', 'chapters': [{'end': 6783.408, 'start': 6137.305, 'title': 'Olympics dataset analysis', 'summary': 'Covers the analysis of olympics dataset using jupyter notebook, including importing datasets, loading and merging using pandas, renaming columns, and displaying statistical 
information.', 'duration': 646.103, 'highlights': ['The dataset comprises 271,116 rows and 17 columns, with the year column ranging from 1896 to 2016. Quantifiable data: 271,116 rows, 17 columns, year range 1896-2016.', "The process involves importing datasets, loading using pandas' read_csv function, and merging them using the merge function. Key points: Importing and loading datasets, merging using pandas functions.", 'The column names are made consistent using the rename function, ensuring uniformity in the dataset. Key point: Standardizing column names for consistency.']}, {'end': 7507.369, 'start': 6783.408, 'title': 'Checking null values & analyzing athlete data', 'summary': 'Covers checking for null values in columns, identifying the columns with the most null values, filtering data for specific countries, finding the top 10 countries with the most participants in the olympics since 1896, and visualizing the age distribution of athletes.', 'duration': 723.961, 'highlights': ['Identifying the columns with the most null values There are nearly six columns with missing values: age, height, weight, medal, region, and notes.', 'Filtering data for specific countries The process of filtering data for specific countries like India and Japan, and displaying the details of the athletes from those countries.', 'Finding the top 10 countries with the most participants in the Olympics since 1896 United States has the highest number of participants in the Olympics, followed by France, Great Britain, Italy, Germany, Canada, Japan, Sweden, Australia, and Hungary.', 'Visualizing the age distribution of athletes using a histogram Creating a histogram to visualize the age distribution of athletes, with bins ranging from 10 to 80 and a bin size of 2.']}, {'end': 7805.508, 'start': 7507.369, 'title': 'Olympic data analysis', 'summary': 'Discusses histogram distribution of athlete ages, comparison of summer and winter olympic sports, and gender distribution of olympic 
participants, revealing that most athletes are in their early 20s, more sports are played in summer olympics, and there are more male participants than female.', 'duration': 298.139, 'highlights': ['The histogram shows most athletes are in their early 20s with a peak between 20 to 30 years, revealing age distribution in the Olympics. The histogram displays the age distribution of athletes, showing a peak between 20 to 30 years, with a range from 10 to 80 years and most athletes falling within this range.', "More sports are played in Summer Olympics compared to Winter Olympics, as indicated by the unique values extracted for each season's sports. The analysis reveals that there are more Olympic sports played during the Summer Olympics compared to the Winter Olympics, as demonstrated by the larger number of unique sports associated with the Summer Olympics.", 'There are more male participants than female participants since the inception of the Olympics, as shown by the gender distribution pie chart. The pie chart illustrates a higher number of male participants in the Olympics compared to female participants, indicating a gender disparity in the total number of participants since the inception of the games.']}], 'duration': 1668.203, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w6137305.jpg', 'highlights': ['The dataset comprises 271,116 rows and 17 columns, with the year column ranging from 1896 to 2016. Quantifiable data: 271,116 rows, 17 columns, year range 1896-2016.', "The process involves importing datasets, loading using pandas' read_csv function, and merging them using the merge function. Key points: Importing and loading datasets, merging using pandas functions.", 'The column names are made consistent using the rename function, ensuring uniformity in the dataset. 
Key point: Standardizing column names for consistency.', 'Finding the top 10 countries with the most participants in the Olympics since 1896 United States has the highest number of participants in the Olympics, followed by France, Great Britain, Italy, Germany, Canada, Japan, Sweden, Australia, and Hungary.', 'Visualizing the age distribution of athletes using a histogram Creating a histogram to visualize the age distribution of athletes, with bins ranging from 10 to 80 and a bin size of 2.', 'The histogram shows most athletes are in their early 20s with a peak between 20 to 30 years, revealing age distribution in the Olympics. The histogram displays the age distribution of athletes, showing a peak between 20 to 30 years, with a range from 10 to 80 years and most athletes falling within this range.', "More sports are played in Summer Olympics compared to Winter Olympics, as indicated by the unique values extracted for each season's sports. The analysis reveals that there are more Olympic sports played during the Summer Olympics compared to the Winter Olympics, as demonstrated by the larger number of unique sports associated with the Summer Olympics.", 'There are more male participants than female participants since the inception of the Olympics, as shown by the gender distribution pie chart. 
The pie chart illustrates a higher number of male participants in the Olympics compared to female participants, indicating a gender disparity in the total number of participants since the inception of the games.']}, {'end': 9209.157, 'segs': [{'end': 7836.803, 'src': 'embed', 'start': 7805.508, 'weight': 2, 'content': [{'end': 7814.123, 'text': 'so you can see here this is my pie chart that shows you the distribution of the male and female participation.', 'start': 7805.508, 'duration': 8.615}, {'end': 7817.486, 'text': 'so for male it is 72.5 percent.', 'start': 7814.123, 'duration': 3.363}, {'end': 7821.73, 'text': 'for female, so far it is 27.5 percent.', 'start': 7817.486, 'duration': 4.244}, {'end': 7834.681, 'text': "as per the data set that we have, we can change this start angle to, let's say, 180 degree and it will change the pie chart to this direction.", 'start': 7821.73, 'duration': 12.951}, {'end': 7836.803, 'text': 'cool, now, moving ahead.', 'start': 7834.681, 'duration': 2.122}], 'summary': 'Pie chart shows 72.5% male and 27.5% female participation.', 'duration': 31.295, 'max_score': 7805.508, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w7805508.jpg'}, {'end': 8065.793, 'src': 'embed', 'start': 8030.264, 'weight': 0, 'content': [{'end': 8037.357, 'text': 'so you can see here from 1900, 1904, all these years, you can see the female participation.', 'start': 8030.264, 'duration': 7.093}, {'end': 8044.132, 'text': 'let me change it to tail so that we have the recent data of the Olympics.', 'start': 8038.286, 'duration': 5.846}, {'end': 8047.255, 'text': 'you can see it here for the Beijing Olympics, 5816 women athletes participated.', 'start': 8044.132, 'duration': 3.123}, {'end': 8049.357, 'text': 'in 2012 London Olympics, we had 5815 Olympics.', 'start': 8047.255, 'duration': 2.102}, {'end': 8065.793, 'text': 'similarly, for the 2016 Rio Olympics, we had more participation than the London 
Olympics, so 6223 women athletes had participated.', 'start': 8049.357, 'duration': 16.436}], 'summary': "Female participation in olympics: 2016 rio had 6223 athletes, higher than 2012 london's 5815.", 'duration': 35.529, 'max_score': 8030.264, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w8030264.jpg'}, {'end': 8328.163, 'src': 'embed', 'start': 8298.133, 'weight': 4, 'content': [{'end': 8310.943, 'text': 'cool. now I want to see the athletes who have secured a gold medal beyond the age of 60 years, which is very rare.', 'start': 8298.133, 'duration': 12.81}, {'end': 8321.9, 'text': "so we'll see the total number of athletes with more than 60 years of age having won a gold medal.", 'start': 8314.977, 'duration': 6.923}, {'end': 8328.163, 'text': "so here I'm going to say gold medals.", 'start': 8321.9, 'duration': 6.263}], 'summary': 'Winning a gold medal beyond the age of 60 is very rare.', 'duration': 30.03, 'max_score': 8298.133, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w8298133.jpg'}, {'end': 8843.878, 'src': 'embed', 'start': 8807.328, 'weight': 1, 'content': [{'end': 8816.133, 'text': 'so in Rio Olympics, united states secured the most number of gold medals.', 'start': 8807.328, 'duration': 8.805}, {'end': 8822.336, 'text': 'now the reason this is 137 is we have also counted the team events, for example basketball.', 'start': 8816.133, 'duration': 6.203}, {'end': 8829.937, 'text': 'similarly, great britain had 64 gold medals in total, russia 50.', 'start': 8823.59, 'duration': 6.347}, {'end': 8831.619, 'text': 'we have brazil 34, argentina 21, france 20 and japan 17.', 'start': 8829.937, 'duration': 1.682}, {'end': 8843.878, 'text': 'all right. 
now, using the above result, we are going to create a bar plot.', 'start': 8831.619, 'duration': 12.259}], 'summary': 'In the rio olympics, the united states won 137 gold medals, with great britain at 64, russia at 50, brazil at 34, argentina at 21, france at 20, and japan at 17.', 'duration': 36.55, 'max_score': 8807.328, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w8807328.jpg'}, {'end': 8926.604, 'src': 'embed', 'start': 8895.33, 'weight': 3, 'content': [{'end': 8905.233, 'text': 'now it can be a gold medal, a silver medal or a bronze medal, but before that we need to filter the data only for athletes who have won a medal.', 'start': 8895.33, 'duration': 9.903}, {'end': 8916.397, 'text': 'now, if you had noticed here, our medals column also had a few null values.', 'start': 8905.233, 'duration': 11.164}, {'end': 8920.178, 'text': 'so we are not going to consider those null values here.', 'start': 8916.397, 'duration': 3.781}, {'end': 8926.604, 'text': 'we are only going to consider for the athletes who have won a medal.', 'start': 8921.161, 'duration': 5.443}], 'summary': 'Filter data for athletes who have won a medal, excluding null values.', 'duration': 31.274, 'max_score': 8895.33, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w8895330.jpg'}], 'start': 7805.508, 'title': 'Olympics data analysis', 'summary': 'Encompasses the analysis of male and female participation, with males at 72.5% and females at 27.5%, an increase in female athlete participation over time, 6 gold medalists over 60, and top countries in gold medals at the 2016 rio olympics, including the united states with 137, great britain with 64, and russia with 50.', 'chapters': [{'end': 7999.612, 'start': 7805.508, 'title': 'Pie chart distribution and medal counts', 'summary': 'Discusses the distribution of male and female participation in a pie chart, with males at 72.5% and females at 27.5%, and the 
analysis of total medals won, where the gold, silver, and bronze medals are similar in numbers.', 'duration': 194.104, 'highlights': ["The distribution of male and female participation in a pie chart is shown, indicating 72.5% for males and 27.5% for females.", "Analysis of total medals won, showing similar counts for gold, silver, and bronze medals."]}, {'end': 8209.638, 'start': 7999.612, 'title': 'Female athlete participation analysis', 'summary': 'Analyzes the participation of female athletes in the summer olympics, showing an increase in participation over time, with 6223 women athletes participating in the 2016 rio olympics and a continuous increase in participation since 1980.', 'duration': 210.026, 'highlights': ['Female participation in the 2016 Rio Olympics reached 6223, surpassing the 2012 London Olympics count of 5815, demonstrating a continuous increase since 1980.', 'The line graph illustrates the increasing trend of female athlete participation in the Olympics over time, with a slight decrease in the 1950s and 1980 before a continuous rise.', 'A count plot using the seaborn (sns) library demonstrates the increase in female athlete participation, with the 2016 Olympics showing the highest number of participants.']}, {'end': 8566.609, 'start': 8209.638, 'title': 'Analyzing gold medalists over 60', 'summary': 'Involves filtering athletes who won gold medals, identifying those over 60 years old, and visualizing the sports and country distribution, revealing 6 gold medalists over 60 and their sports distribution.', 'duration': 356.971, 'highlights': ["Filtered and displayed the top 5 athletes who won gold medals, presenting records with the 'gold' medal condition.", 'Identified and verified that there are 6 athletes who have won a gold medal at the age of more than 60 years.', 'Visualized the count of gold medals 
for athletes over 60 years for different sports, showing archery with 3, art competition with 1, roque with 1, and shooting with 1 medal.']}, {'end': 9209.157, 'start': 8566.609, 'title': 'Olympics data analysis', 'summary': 'Explores the olympics dataset, analyzing the top countries in terms of gold medals, creating visualizations, and examining the height and weight of medal-winning athletes, with the united states securing the most gold medals at 137, followed by great britain with 64, russia with 50, and others.', 'duration': 642.548, 'highlights': ['In the Rio Olympics, the United States secured the most gold medals at 137, followed by Great Britain with 64 and Russia with 50.', 'A horizontal bar plot was created to display the top 20 nations with the most gold medals in the Olympics dataset, showcasing the United States, Great Britain, Russia, Germany, China, and others.', 'A scatter plot was generated to visualize the height and weight of male and female athletes who won a medal, with blue points representing male athletes and orange points for female athletes. 
A scatter plot was generated to visualize the height and weight of male and female athletes who won a medal, with blue points representing male athletes and orange points for female athletes.']}], 'duration': 1403.649, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/G9NmACvXh8w/pics/G9NmACvXh8w7805508.jpg', 'highlights': ['Female participation in the 2016 Rio Olympics reached 6223, surpassing the 2012 London Olympics count of 5815, demonstrating a continuous increase since 1980.', 'The United States secured the most gold medals at 137, followed by Great Britain with 64 and Russia with 50.', 'The distribution of male and female participation in a pie chart is shown, indicating 72.5% for males and 27.5% for females.', "Filtered and displayed the top 5 athletes who won gold medals, presenting records with the 'gold' medal condition.", 'Identified and verified that there are 6 athletes who have won a gold medal at the age of more than 60 years.']}], 'highlights': ['The chapter covers two Python data analysis projects on COVID and Olympics data.', 'Covers hands-on python data analysis projects focusing on covid-19 and olympics data, including analyzing global covid-19 impact with over 22 crore infections and 4.5 million deaths.', 'The project involves using three different COVID-19 datasets for data analysis and visualization using Python and Tableau.', 'The top five states with the highest number of confirmed cases are Maharashtra, Kerala, Karnataka, Tamil Nadu, and Andhra Pradesh.', 'The mortality rate is high for Maharashtra, Uttarakhand, and Punjab.', 'The dataset contains information for 192 countries, including active cases, confirmed cases, and number of deaths.', 'The athlete_events dataset contains around 271,116 rows and 15 columns, providing athlete specifics such as name, gender, age, height, weight, team name, NOC, games, year, season, city, sport name, event, and medal information.', 'The NOC regions dataset contains 
three-letter codes for countries, along with their region names, and also includes a column for notes about the regions.', 'The datasets used in the analysis are directly taken from the internet and not validated, with the presenter noting that the results demonstrated in the demo are solely based on the collected data, implying potential limitations in accuracy and reliability.', 'The dataset comprises 271,116 rows and 17 columns, with the year column ranging from 1896 to 2016. Quantifiable data: 271,116 rows, 17 columns, year range 1896-2016.', 'Finding the top 10 countries with the most participants in the Olympics since 1896 United States has the highest number of participants in the Olympics, followed by France, Great Britain, Italy, Germany, Canada, Japan, Sweden, Australia, and Hungary.', 'Female participation in the 2016 Rio Olympics reached 6223, surpassing the 2012 London Olympics count of 5815, demonstrating a continuous increase since 1980.', 'The United States secured the most gold medals at 137, followed by Great Britain with 64 and Russia with 50.']}