Coursnap

title
Python For Data Analysis | Data Analysis Using Python | Python Data Analysis Tutorial | Edureka

description
Edureka Python Data Science Training: https://www.edureka.co/data-science-python-certification-course
This Edureka Python For Data Analysis tutorial (Python Tutorial Blog: https://goo.gl/wd28Zr) will help you learn how to use Python for data analysis with the Pandas library. It also includes a use-case in which we analyze data on the percentage of unemployed youth in every country between 2010 and 2014. This tutorial video covers the following topics:
1. What is Data Analysis?
2. What is the Pandas Python Library?
3. Python Pandas Operations
4. Use-case
Check out our Python Training Playlist: https://goo.gl/Na1p9G
Reference: https://YouTube.com/Sentdex
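As a rough sketch of the kind of calculation the use-case involves (the percentage change in youth unemployment between two years), the following uses pandas on a small made-up table. The country names, the figures, and the column name pct_change_2010_2011 are assumptions for illustration only; they are not taken from the tutorial's dataset.

```python
import pandas as pd

# Hypothetical figures for illustration only; not the tutorial's data.
unemployment = pd.DataFrame(
    {
        "Country": ["Country A", "Country B", "Country C"],
        "2010": [20.0, 12.5, 8.0],
        "2011": [22.0, 12.0, 8.0],
    }
).set_index("Country")

# Percentage change from 2010 to 2011: (new - old) / old * 100
unemployment["pct_change_2010_2011"] = (
    (unemployment["2011"] - unemployment["2010"]) / unemployment["2010"] * 100
)

# head(2) shows only the first two rows, as in the slicing part of the session.
print(unemployment.head(2))
```

A positive value in the new column means the unemployment percentage rose between the two years, a negative value means it fell.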

detail
{'title': 'Python For Data Analysis | Data Analysis Using Python | Python Data Analysis Tutorial | Edureka', 'heatmap': [{'end': 1685.376, 'start': 1658.616, 'weight': 1}], 'summary': 'Covers python for data analysis, including applications, pandas, numpy, scipy, web scraping, data manipulation, visualization, and statistics. it also demonstrates youth unemployment analysis and data processing in hadoop.', 'chapters': [{'end': 59.088, 'segs': [{'end': 59.088, 'src': 'embed', 'start': 0.029, 'weight': 0, 'content': [{'end': 6.023, 'text': "Hello everyone, this is Saurabh from Edureka and in today's session we'll be focusing on data analysis with Python.", 'start': 0.029, 'duration': 5.994}, {'end': 9.231, 'text': 'So let us move forward and have a look at the agenda for today.', 'start': 6.745, 'duration': 2.486}, {'end': 12.975, 'text': "So first we'll see various applications of Python.", 'start': 10.714, 'duration': 2.261}, {'end': 18.097, 'text': "After that we'll understand the data lifecycle starting from data warehousing to data visualization.", 'start': 13.515, 'duration': 4.582}, {'end': 22.719, 'text': "Then we'll focus on data analysis and we'll see how we can use Python for that purpose.", 'start': 18.597, 'duration': 4.122}, {'end': 28.461, 'text': "We'll also look at what is Pandas library and we'll also understand a bit about NumPy and SciPy.", 'start': 23.239, 'duration': 5.222}, {'end': 33.043, 'text': "Then we'll focus on various Pandas operations, merging, joining, all those things.", 'start': 29.122, 'duration': 3.921}, {'end': 36.965, 'text': "And we'll see after that Python for statistics and Python for Hadoop.", 'start': 33.624, 'duration': 3.341}, {'end': 43.799, 'text': 'So till now any doubts, are you guys clear with the agenda? 
If you have any questions, any doubts, you can write it down in your chat box.', 'start': 37.776, 'duration': 6.023}, {'end': 44.76, 'text': "I'll be happy to help you.", 'start': 43.839, 'duration': 0.921}, {'end': 47.561, 'text': 'All right, so Jason says all clear.', 'start': 46.141, 'duration': 1.42}, {'end': 49.563, 'text': 'So does Dave, Jessica.', 'start': 47.601, 'duration': 1.962}, {'end': 53.585, 'text': 'What about the others? All right, Ayushi says all clear.', 'start': 50.383, 'duration': 3.202}, {'end': 55.906, 'text': 'Siddharth says move on.', 'start': 54.805, 'duration': 1.101}, {'end': 58.067, 'text': 'Neha says go on.', 'start': 57.227, 'duration': 0.84}, {'end': 59.088, 'text': 'All right, thank you guys.', 'start': 58.207, 'duration': 0.881}], 'summary': 'Saurabh from edureka discusses data analysis with python, covering applications, data lifecycle, pandas, numpy, scipy, and python for statistics and hadoop.', 'duration': 59.059, 'max_score': 0.029, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A29.jpg'}], 'start': 0.029, 'title': 'Data analysis with python', 'summary': 'Covers the agenda for a data analysis session with python, including applications of python, data lifecycle, pandas library, numpy, scipy, pandas operations, python for statistics, and python for hadoop, with participants confirming their understanding.', 'chapters': [{'end': 59.088, 'start': 0.029, 'title': 'Data analysis with python', 'summary': 'Covers the agenda for a data analysis session with python, including applications of python, data lifecycle, pandas library, numpy, scipy, pandas operations, python for statistics, and python for hadoop, with participants confirming their understanding.', 'duration': 59.059, 'highlights': ['Participants confirmed understanding of the agenda for the data analysis session with Python, with Jason, Dave, Jessica, Ayushi, Siddharth, and Neha acknowledging clarity.', 'The session 
will cover various applications of Python, the data lifecycle from warehousing to visualization, and the use of Python, Pandas library, NumPy, SciPy, Pandas operations, Python for statistics, and Python for Hadoop in data analysis.', 'Saurabh from Edureka introduces the focus of the session, which is data analysis with Python, and outlines the agenda, including applications of Python, data lifecycle, Pandas library, NumPy, SciPy, Pandas operations, Python for statistics, and Python for Hadoop.']}], 'duration': 59.059, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A29.jpg', 'highlights': ['The session will cover various applications of Python, the data lifecycle from warehousing to visualization, and the use of Python, Pandas library, NumPy, SciPy, Pandas operations, Python for statistics, and Python for Hadoop in data analysis.', 'Saurabh from Edureka introduces the focus of the session, which is data analysis with Python, and outlines the agenda, including applications of Python, data lifecycle, Pandas library, NumPy, SciPy, Pandas operations, Python for statistics, and Python for Hadoop.', 'Participants confirmed understanding of the agenda for the data analysis session with Python, with Jason, Dave, Jessica, Ayushi, Siddharth, and Neha acknowledging clarity.']}, {'end': 445.856, 'segs': [{'end': 90.969, 'src': 'embed', 'start': 59.548, 'weight': 0, 'content': [{'end': 62.73, 'text': "So we'll move forward and we'll see what are the various applications of Python.", 'start': 59.548, 'duration': 3.182}, {'end': 66.399, 'text': 'So these are the applications of Python.', 'start': 64.837, 'duration': 1.562}, {'end': 69.12, 'text': 'I have listed only four of those although there are many more.', 'start': 66.639, 'duration': 2.481}, {'end': 75.262, 'text': 'So you can perform web scraping with Python that is you can extract certain contents from a particular web page.', 'start': 69.9, 'duration': 5.362}, 
{'end': 77.263, 'text': 'You can perform a web development.', 'start': 75.763, 'duration': 1.5}, {'end': 80.545, 'text': 'You can perform testing as well as you can perform data analysis.', 'start': 77.343, 'duration': 3.202}, {'end': 84.766, 'text': "So for today's session, we'll be focusing on a data analysis part of Python.", 'start': 80.945, 'duration': 3.821}, {'end': 90.969, 'text': 'So are we guys clear? So guys, let us move forward and see what exactly is data lifecycle.', 'start': 85.547, 'duration': 5.422}], 'summary': "Python has various applications including web scraping, web development, testing, and data analysis, which is the focus of today's session.", 'duration': 31.421, 'max_score': 59.548, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A59548.jpg'}, {'end': 137.933, 'src': 'embed', 'start': 103.22, 'weight': 1, 'content': [{'end': 105.181, 'text': 'So data is basically stored in different formats.', 'start': 103.22, 'duration': 1.961}, {'end': 111.563, 'text': 'Now, what do you do you actually convert that data or transform that data into a single format and you store it somewhere.', 'start': 105.601, 'duration': 5.962}, {'end': 114.064, 'text': "That's where data warehousing comes into picture.", 'start': 112.063, 'duration': 2.001}, {'end': 118.026, 'text': 'Now, once you have stored your data, you can perform certain analysis on it.', 'start': 114.524, 'duration': 3.502}, {'end': 120.046, 'text': 'You can perform predictive modeling.', 'start': 118.326, 'duration': 1.72}, {'end': 121.467, 'text': 'You can join merge data.', 'start': 120.166, 'duration': 1.301}, {'end': 124.468, 'text': "So various other things that we are going to see in today's session.", 'start': 121.927, 'duration': 2.541}, {'end': 131.791, 'text': 'Now, once you have done the analysis, you can even plot it in the form of a graph and that stage is called a data visualization.', 'start': 125.348, 'duration': 6.443}, 
{'end': 134.932, 'text': 'So this is just a general overview about data life cycle.', 'start': 132.191, 'duration': 2.741}, {'end': 137.933, 'text': 'If you have any doubts or questions, you can write it down in your chat box.', 'start': 135.032, 'duration': 2.901}], 'summary': 'Data is stored in different formats, converted into a single format via data warehousing, enabling analysis, predictive modeling, and data visualization.', 'duration': 34.713, 'max_score': 103.22, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A103220.jpg'}, {'end': 222.064, 'src': 'embed', 'start': 198.16, 'weight': 3, 'content': [{'end': 204.343, 'text': 'Now what should I do? So basically what I need to do is, in this particular data set, I need to perform certain analysis.', 'start': 198.16, 'duration': 6.183}, {'end': 212.948, 'text': 'That analysis should give me the percentage increase in unemployed youth in Afghanistan between 2010 to 2011.', 'start': 204.924, 'duration': 8.024}, {'end': 215.87, 'text': 'So this basically explains what is data analysis and why we use it.', 'start': 212.948, 'duration': 2.922}, {'end': 220.182, 'text': 'If you have any doubts with respect to what is data analysis, you can write it down in your chat box.', 'start': 216.38, 'duration': 3.802}, {'end': 222.064, 'text': 'It is a very simple concept guys.', 'start': 220.623, 'duration': 1.441}], 'summary': 'Perform data analysis to find % increase in afghan youth unemployment from 2010 to 2011.', 'duration': 23.904, 'max_score': 198.16, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A198160.jpg'}, {'end': 311.073, 'src': 'embed', 'start': 247.15, 'weight': 4, 'content': [{'end': 255.275, 'text': 'What is pandas pandas is a software module written for python programming language, which is used for data manipulation and data analysis.', 'start': 247.15, 'duration': 8.125}, {'end': 261.559, 
'text': 'Now it can perform that at a fairly high performance rate when it is compared to other python procedures.', 'start': 256.035, 'duration': 5.524}, {'end': 267.742, 'text': 'Now we can say that pandas is actually built on top of numpy, scipy and matplotlib.', 'start': 262.459, 'duration': 5.283}, {'end': 271.865, 'text': 'matplotlib is basically a data visualization module that we use in python.', 'start': 267.742, 'duration': 4.123}, {'end': 279.164, 'text': 'Now when we talk about numpy and scipy, numpy is actually a fundamental package for scientific computing in Python.', 'start': 272.608, 'duration': 6.556}, {'end': 288.214, 'text': 'So it contains a powerful n-dimensional array object, It has tools for integrating with C, C++ and it is very useful in performing linear algebra,', 'start': 279.525, 'duration': 8.689}, {'end': 290.917, 'text': 'Fourier, transform, random number capabilities, et cetera.', 'start': 288.214, 'duration': 2.703}, {'end': 298.843, 'text': 'When I talk about SciPy, SciPy is again an open source Python module used for scientific computing and technical computing.', 'start': 291.457, 'duration': 7.386}, {'end': 306.089, 'text': 'SciPy contains module for optimization, linear algebra, integration, interpolation, special functions.', 'start': 299.183, 'duration': 6.906}, {'end': 308.13, 'text': 'Fourier transforms all those things right?', 'start': 306.089, 'duration': 2.041}, {'end': 311.073, 'text': "We'll actually focus on NumPy and SciPy in the next session,", 'start': 308.471, 'duration': 2.602}], 'summary': 'Pandas is a python module for data analysis, built on top of numpy, scipy, and matplotlib, offering high performance rate.', 'duration': 63.923, 'max_score': 247.15, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A247150.jpg'}], 'start': 59.548, 'title': 'Python applications and data analysis', 'summary': 'Covers python applications in data analysis, including web 
scraping, web development, testing, data warehousing, predictive modeling, and data visualization. it also explains data analysis through unemployment data and introduces data analysis with python using the pandas module.', 'chapters': [{'end': 159.143, 'start': 59.548, 'title': 'Applications of python in data analysis', 'summary': 'Discusses the applications of python, focusing on data analysis and the data lifecycle, covering web scraping, web development, testing, and data analysis, with a focus on data warehousing, predictive modeling, data visualization, and various data formats.', 'duration': 99.595, 'highlights': ['Python applications include web scraping, web development, testing, and data analysis. Python can be used for web scraping, web development, testing, and data analysis, among other applications.', 'Data lifecycle involves storing data in different formats, transforming it into a single format, and utilizing data warehousing. The data lifecycle involves storing data in various formats, transforming it into a single format, and utilizing data warehousing for storage.', 'After analysis, data can be visualized in the form of graphs. 
Following analysis, data can be visualized through graph plotting.']}, {'end': 231.149, 'start': 160.523, 'title': 'Understanding data analysis', 'summary': 'Explains data analysis using the example of analyzing unemployment data across countries to find the percentage increase in unemployed youth within a specific country between 2010 to 2011, emphasizing the importance and purpose of data analysis.', 'duration': 70.626, 'highlights': ['Performing data analysis to find the percentage increase in unemployed youth in a specific country between 2010 to 2011 exemplifies the practical application and significance of data analysis.', 'Analyzing unemployment data across countries from 2010 to 2014 to find the percentage of unemployed youth provides a clear illustration of the dataset being utilized for analysis.', 'Emphasizing the importance and purpose of data analysis by demonstrating its practical application in finding specific information, such as the percentage increase in unemployed youth within a country during a specific time frame.']}, {'end': 445.856, 'start': 231.189, 'title': 'Data analysis with python using pandas', 'summary': 'Introduces performing data analysis with python using the pandas module, which offers high performance data manipulation and analysis capabilities, built on top of numpy, scipy, and matplotlib, with upcoming focus on numpy and scipy, and separate session on data visualization using matplotlib.', 'duration': 214.667, 'highlights': ['Pandas is a software module for Python used for data manipulation and analysis, offering high performance compared to other Python procedures. Pandas is a software module for Python used for data manipulation and analysis, offering high performance compared to other Python procedures.', 'Pandas is built on top of numpy, scipy, and matplotlib, with upcoming focus on NumPy and SciPy, and separate session on data visualization using matplotlib. 
Pandas is built on top of numpy, scipy, and matplotlib, with upcoming focus on NumPy and SciPy, and separate session on data visualization using matplotlib.', 'NumPy is a fundamental package for scientific computing in Python, containing a powerful n-dimensional array object and tools for integrating with C, C++, and performing linear algebra, Fourier transform, and random number capabilities. NumPy is a fundamental package for scientific computing in Python, containing a powerful n-dimensional array object and tools for integrating with C, C++, and performing linear algebra, Fourier transform, and random number capabilities.', 'SciPy is an open source Python module used for scientific and technical computing, containing modules for optimization, linear algebra, integration, interpolation, and special functions. SciPy is an open source Python module used for scientific and technical computing, containing modules for optimization, linear algebra, integration, interpolation, and special functions.', 'The chapter introduces performing data analysis with Python using the pandas module, which offers high performance data manipulation and analysis capabilities, built on top of numpy, scipy, and matplotlib, with upcoming focus on NumPy and SciPy, and separate session on data visualization using matplotlib. 
The chapter introduces performing data analysis with Python using the pandas module, which offers high performance data manipulation and analysis capabilities, built on top of numpy, scipy, and matplotlib, with upcoming focus on NumPy and SciPy, and separate session on data visualization using matplotlib.']}], 'duration': 386.308, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A59548.jpg', 'highlights': ['Python applications include web scraping, web development, testing, and data analysis.', 'Data lifecycle involves storing data in different formats, transforming it into a single format, and utilizing data warehousing.', 'After analysis, data can be visualized in the form of graphs.', 'Performing data analysis to find the percentage increase in unemployed youth in a specific country between 2010 to 2011 exemplifies the practical application and significance of data analysis.', 'Pandas is a software module for Python used for data manipulation and analysis, offering high performance compared to other Python procedures.', 'NumPy is a fundamental package for scientific computing in Python, containing a powerful n-dimensional array object and tools for integrating with C, C++, and performing linear algebra, Fourier transform, and random number capabilities.', 'SciPy is an open source Python module used for scientific and technical computing, containing modules for optimization, linear algebra, integration, interpolation, and special functions.', 'The chapter introduces performing data analysis with Python using the pandas module, which offers high performance data manipulation and analysis capabilities, built on top of numpy, scipy, and matplotlib, with upcoming focus on NumPy and SciPy, and separate session on data visualization using matplotlib.']}, {'end': 721.472, 'segs': [{'end': 477.523, 'src': 'embed', 'start': 448.017, 'weight': 0, 'content': [{'end': 452.398, 'text': "So now what I'm going to do is I'm 
going to close this dictionary,", 'start': 448.017, 'duration': 4.381}, {'end': 457.1, 'text': "and now what I'm going to do is I'm going to convert this dictionary into a Pandas data frame.", 'start': 452.398, 'duration': 4.702}, {'end': 458.821, 'text': 'Now how we do that.', 'start': 457.58, 'duration': 1.241}, {'end': 463.763, 'text': "let me first declare a variable, say df, and over here what I'm going to type in.", 'start': 458.821, 'duration': 4.942}, {'end': 473.08, 'text': "I'm going to type in as pd for pandas, pd dot, data frame and the name of my dictionary, which is xyz web.", 'start': 463.763, 'duration': 9.317}, {'end': 477.523, 'text': "Now go ahead and print this data frame and we'll see what exactly happens.", 'start': 473.72, 'duration': 3.803}], 'summary': 'Converts a dictionary into a pandas data frame using pd.DataFrame', 'duration': 29.506, 'max_score': 448.017, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A448017.jpg'}, {'end': 574.458, 'src': 'embed', 'start': 522.864, 'weight': 1, 'content': [{'end': 525.726, 'text': 'So these are the operations that you can perform with pandas data frame.', 'start': 522.864, 'duration': 2.862}, {'end': 531.19, 'text': 'You can slice the data frame that is if you want only a particular part of that data frame, you can do that.', 'start': 526.126, 'duration': 5.064}, {'end': 533.091, 'text': 'You can change the index value.', 'start': 531.55, 'duration': 1.541}, {'end': 535.553, 'text': 'You can convert the data into a different format.', 'start': 533.232, 'duration': 2.321}, {'end': 537.595, 'text': 'You can actually change the column headers.', 'start': 535.873, 'duration': 1.722}, {'end': 543.859, 'text': 'You can perform concatenation of multiple data frames and you can even perform joining and merging of two or more data frames.', 'start': 537.635, 'duration': 6.224}, {'end': 547.122, 'text': 'These are all the basic operations that you can 
perform with pandas.', 'start': 544.4, 'duration': 2.722}, {'end': 550.164, 'text': "So we'll move forward and have a look at these operations one by one.", 'start': 547.542, 'duration': 2.622}, {'end': 552.504, 'text': "First we'll look at slicing.", 'start': 551.343, 'duration': 1.161}, {'end': 558.167, 'text': 'So over here we have a data in which there is an index value which is nothing but the year 2001, 2003 and 2004.', 'start': 553.304, 'duration': 4.863}, {'end': 562.73, 'text': 'Here we have interest rate and here we have US GDP in thousands.', 'start': 558.167, 'duration': 4.563}, {'end': 566.893, 'text': 'Now I want to slice a particular column from this particular data frame.', 'start': 563.291, 'duration': 3.602}, {'end': 568.854, 'text': 'So what will happen if I do that?', 'start': 567.473, 'duration': 1.381}, {'end': 574.458, 'text': 'It should only give me so when I slice only the starting two rows, it will give me only till 2002.', 'start': 569.074, 'duration': 5.384}], 'summary': 'Basic operations with pandas data frame: slicing, index change, format conversion, header change, concatenation, joining, and merging.', 'duration': 51.594, 'max_score': 522.864, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A522864.jpg'}, {'end': 630.512, 'src': 'embed', 'start': 603.43, 'weight': 4, 'content': [{'end': 606.453, 'text': "So I'll keep two here and we'll see what happens when I execute this.", 'start': 603.43, 'duration': 3.023}, {'end': 609.035, 'text': 'So yep, there are only two rows that are present.', 'start': 606.913, 'duration': 2.122}, {'end': 615.74, 'text': 'So this is how you can actually print only a part of the data and if I want only the last part of the data that is the last two rows.', 'start': 609.555, 'duration': 6.185}, {'end': 621.445, 'text': 'So what I can do is I can convert this to tail instead and we can do that as well.', 'start': 616.201, 'duration': 5.244}, {'end': 
625.748, 'text': 'Go ahead and execute this and yep, you can see that it has printed the last two rows.', 'start': 621.525, 'duration': 4.223}, {'end': 627.81, 'text': 'So this is how you can perform slicing.', 'start': 626.329, 'duration': 1.481}, {'end': 630.512, 'text': 'If you have any questions or any doubts, you can ask me right now.', 'start': 627.89, 'duration': 2.622}], 'summary': 'Demonstrating data slicing and extraction, displaying last two rows.', 'duration': 27.082, 'max_score': 603.43, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A603430.jpg'}, {'end': 689.653, 'src': 'embed', 'start': 664.363, 'weight': 5, 'content': [{'end': 671.946, 'text': 'Now these two data frames can be merged together to form a single data frame and we can actually make sure what all columns that we need to keep common.', 'start': 664.363, 'duration': 7.583}, {'end': 682.81, 'text': 'So over here we have common columns as HPI, interest rate and index, but when I talk about GDP, we have two US GDPs, one is X and another is Y.', 'start': 672.506, 'duration': 10.304}, {'end': 689.653, 'text': 'So this is how actually you can perform merging, you can actually make sure what all columns you want in your final merged data frame.', 'start': 682.81, 'duration': 6.843}], 'summary': 'Merging two data frames to form a single data frame, keeping common columns like hpi, interest rate, and index, while managing multiple us gdps.', 'duration': 25.29, 'max_score': 664.363, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A664363.jpg'}], 'start': 448.017, 'title': 'Pandas data manipulation', 'summary': 'Demonstrates converting a dictionary into a pandas data frame, slicing, changing index value, converting data format, changing column headers, concatenating data frames, and merging data frames. 
It also covers data slicing, merging, obtaining specific rows, and forming a single data frame with a practical demonstration of the merge operation.', 'chapters': [{'end': 552.504, 'start': 448.017, 'title': 'Pandas data frame operations', 'summary': 'The chapter demonstrates how to convert a dictionary into a pandas data frame, showcasing basic operations such as slicing, changing index value, converting data format, changing column headers, concatenating data frames, and joining and merging data frames.', 'duration': 104.487, 'highlights': ["The chapter demonstrates how to convert a dictionary into a Pandas data frame. The speaker passes the dictionary 'xyz web' to pd.DataFrame and stores the result in a variable named df.", 'Basic operations such as slicing, changing index value, and converting data format are showcased. The speaker mentions basic operations including slicing, changing index value, and converting data format as introductory examples for using the Pandas library.', 'The speaker discusses operations like changing column headers, concatenating data frames, and joining and merging data frames. The speaker lists changing column headers, concatenating data frames, and joining and merging data frames among the basic operations of Pandas.']}, {'end': 721.472, 'start': 553.304, 'title': 'Data slicing and merging with pandas', 'summary': 'The chapter discusses data slicing and merging with pandas, demonstrating how to slice data to obtain specific rows and how to merge two data frames to form a single data frame, with the explanation of common columns and a practical demonstration of the merge operation.', 'duration': 168.168, 'highlights': ['The process of slicing a data frame is demonstrated, showing how to obtain specific rows by slicing based on index values, such as obtaining data for the years 2001 and 2002. 
Slicing the data frame to obtain specific rows based on index values, such as obtaining data for the years 2001 and 2002.', "Demonstration of how to print only a part of the data by using the 'head' and 'tail' methods, with an example of printing the starting two rows and the last two rows of the data frame. Practical demonstration of printing only a part of the data frame using the 'head' and 'tail' methods, with an example of printing the starting two rows and the last two rows.", 'Explanation of the merging process, showcasing the merging of two data frames to form a single data frame and determining common columns, with emphasis on handling multiple occurrences of a column, such as US GDP. Explanation of the merging process, showcasing the merging of two data frames, determining common columns, and handling multiple occurrences of a column, such as US GDP.']}], 'duration': 273.455, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A448017.jpg', 'highlights': ['The chapter demonstrates converting a dictionary into a Pandas data frame using pd.DataFrame.', 'Basic operations like slicing, changing index value, and converting data format are showcased as introductory examples for using the Pandas library.', 'The speaker lists changing column headers, concatenating data frames, and joining and merging data frames among the basic operations of Pandas.', 'Slicing the data frame to obtain specific rows based on index values, such as obtaining data for the years 2001 and 2002.', "Practical demonstration of printing only a part of the data frame using the 'head' and 'tail' methods, with an example of printing the starting two rows and the last two rows.", 'Explanation of the merging process, showcasing the merging of two data frames, determining common columns, and handling multiple occurrences of a column, such as US GDP.']}, {'end': 1249.908, 'segs': [{'end': 764.262, 'src': 'embed', 
'start': 721.852, 'weight': 3, 'content': [{'end': 724.892, 'text': 'So for that what I need to do is I need to import this Pandas module.', 'start': 721.852, 'duration': 3.04}, {'end': 731.753, 'text': "So for that I'll type in import Pandas as PD and now what I'm going to do is I'm going to define three data frames.", 'start': 724.912, 'duration': 6.841}, {'end': 735.954, 'text': "Let me name it as df1 and over here what I'll type in.", 'start': 732.274, 'duration': 3.68}, {'end': 745.616, 'text': "I'll type in PD dot data frame and I'm going to use a tuple and inside the tuple I'm going to define a dictionary and I'll be using multiple lists inside that dictionary.", 'start': 735.954, 'duration': 9.662}, {'end': 753.314, 'text': "So the first key that I'll use is HPI house pricing index, and the value that I'll assign it to.", 'start': 746.55, 'duration': 6.764}, {'end': 757.197, 'text': "HPI is a list, and in that list I'll place certain values.", 'start': 753.314, 'duration': 3.883}, {'end': 763.68, 'text': 'So let it be 80, 90, 70, 60.', 'start': 757.877, 'duration': 5.803}, {'end': 764.262, 'text': 'All right.', 'start': 763.681, 'duration': 0.581}], 'summary': 'Imported pandas module, defined 3 data frames with house pricing index values as 80, 90, 70, 60.', 'duration': 42.41, 'max_score': 721.852, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A721852.jpg'}, {'end': 934.524, 'src': 'embed', 'start': 907.096, 'weight': 0, 'content': [{'end': 914.378, 'text': 'So as you can see that we have merged the two data frame that is DF1 and DF2 and we have got one single data frame.', 'start': 907.096, 'duration': 7.282}, {'end': 920.54, 'text': "Now, what if I don't wanna keep certain columns as common when I perform the merge operation?", 'start': 915.859, 'duration': 4.681}, {'end': 925.321, 'text': 'So what I can do is I can write in the columns that I wanna keep as common.', 'start': 921.2, 'duration': 
4.121}, {'end': 934.524, 'text': "So suppose, if I want only the HPI column to be common, so I'll just type in here on HPI and when I go ahead and print this.", 'start': 925.341, 'duration': 9.183}], 'summary': 'Merged df1 and df2 to create a single data frame, allowing for selective common columns.', 'duration': 27.428, 'max_score': 907.096, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A907096.jpg'}, {'end': 1000.524, 'src': 'embed', 'start': 971.867, 'weight': 1, 'content': [{'end': 975.591, 'text': 'Let us move forward and have a look at the next operation that is joining.', 'start': 971.867, 'duration': 3.724}, {'end': 980.674, 'text': 'So in joining what happens the two data frames are joined on the basis of their index values.', 'start': 976.772, 'duration': 3.902}, {'end': 983.575, 'text': 'So let me show you that so we have two data frames.', 'start': 981.174, 'duration': 2.401}, {'end': 985.716, 'text': 'one is this and another one is this.', 'start': 983.575, 'duration': 2.141}, {'end': 988.478, 'text': 'so, over here, what happens when we join both of these data frames?', 'start': 985.716, 'duration': 2.762}, {'end': 989.939, 'text': 'So let us see what happens.', 'start': 988.778, 'duration': 1.161}, {'end': 995.421, 'text': 'As you can see that by joining these two data frames we get this one single data frame.', 'start': 991.639, 'duration': 3.782}, {'end': 1000.524, 'text': "Now one thing to notice here guys as I've told you earlier as well joining happens with the index values.", 'start': 996.002, 'duration': 4.522}], 'summary': 'Joining operation merges two data frames based on index values.', 'duration': 28.657, 'max_score': 971.867, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A971867.jpg'}, {'end': 1028.52, 'src': 'embed', 'start': 1006.959, 'weight': 6, 'content': [{'end': 1015.776, 'text': 'that is 2005,, but after joining, the 
2005 index appears in the data frame, but there is no interest rate or US GDP thousands associated with it.', 'start': 1006.959, 'duration': 8.817}, {'end': 1020.858, 'text': 'Similarly when I talk about the data frame 2 that is the second data frame over here.', 'start': 1016.417, 'duration': 4.441}, {'end': 1022.878, 'text': "We don't have any 2002 index value.", 'start': 1020.898, 'duration': 1.98}, {'end': 1028.52, 'text': 'So the value with respect to 2002 will be NAN and for unemployment also it will remain as NAN.', 'start': 1023.359, 'duration': 5.161}], 'summary': 'Missing data for 2005 index and 2002 index in data frames 1 and 2, resulting in nan values for interest rate, us gdp, and unemployment.', 'duration': 21.561, 'max_score': 1006.959, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A1006959.jpg'}, {'end': 1180.468, 'src': 'embed', 'start': 1152.981, 'weight': 2, 'content': [{'end': 1158.342, 'text': 'So we saw exactly how to join two data frames and let us move forward and see what is the other operation.', 'start': 1152.981, 'duration': 5.361}, {'end': 1162.603, 'text': 'Now we are going to change the index and column headers.', 'start': 1160.043, 'duration': 2.56}, {'end': 1165.144, 'text': 'Now let us see how this actually happens.', 'start': 1163.044, 'duration': 2.1}, {'end': 1167.785, 'text': 'So we have two data frames here.', 'start': 1166.464, 'duration': 1.321}, {'end': 1175.567, 'text': 'So one contains index, interest rate and US GDP in thousands, another has index as the year and we have only US GDP thousands.', 'start': 1168.205, 'duration': 7.362}, {'end': 1176.587, 'text': 'there is no interest rate here.', 'start': 1175.567, 'duration': 1.02}, {'end': 1180.468, 'text': 'So what happens when I change the column headers or I change the index?', 'start': 1177.247, 'duration': 3.221}], 'summary': 'Demonstrated joining data frames and changing index and column headers.', 
'duration': 27.487, 'max_score': 1152.981, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A1152981.jpg'}], 'start': 721.852, 'title': 'Python pandas data frame and operations', 'summary': 'Demonstrates creating pandas data frames with key-value pairs and index values for years 2001 to 2008 and covers merge, join, and changing index and column headers operations.', 'chapters': [{'end': 868.615, 'start': 721.852, 'title': 'Python pandas data frame', 'summary': 'Demonstrates the creation of pandas data frames in python by defining three data frames with key-value pairs and index values, containing lists of specific data such as house pricing index, interest rates, and ind gdp, for years 2001 to 2008.', 'duration': 146.763, 'highlights': ['The chapter explains the process of defining three data frames using Pandas in Python. It demonstrates the creation of Pandas data frames in Python by defining three data frames with key-value pairs and index values.', 'The data frames contain specific data such as house pricing index, interest rates, and IND GDP. The data frames contain specific data such as house pricing index, interest rates, and IND GDP.', 'The data is organized in lists for each key in the data frames. The data is organized in lists for each key in the data frames, with specific values such as 80, 90, 70, 60 for HPI, 2, 1, 2, 3 for interest rates, and 50, 45, 45, 67 for IND GDP.', 'The index values for the data frames range from 2001 to 2008. 
The index values for the data frames range from 2001 to 2008, organized as lists for each data frame.']}, {'end': 1249.908, 'start': 868.615, 'title': 'Pandas operations overview', 'summary': 'Covers the merge, join, and changing index and column headers operations in pandas, with practical demonstrations of each process.', 'duration': 381.293, 'highlights': ['The process of merging two data frames is demonstrated, showing how to merge and keep certain columns as common, providing a practical example and the resulting data frame.', 'The joining of two data frames based on their index values is explained, with a demonstration of how the index values appear in the joined data frame and the handling of missing values (NAN) shown with a practical example.', 'Practical demonstration of changing index and column headers in a data frame is provided, showcasing how to modify the index value and column headers, with a specific example using a dictionary to define a data frame and change its structure.']}], 'duration': 528.056, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A721852.jpg', 'highlights': ['The process of merging two data frames is demonstrated, showing how to merge and keep certain columns as common, providing a practical example and the resulting data frame.', 'The joining of two data frames based on their index values is explained, with a demonstration of how the index values appear in the joined data frame and the handling of missing values (NAN) shown with a practical example.', 'Practical demonstration of changing index and column headers in a data frame is provided, showcasing how to modify the index value and column headers, with a specific example using a dictionary to define a data frame and change its structure.', 'The chapter explains the process of defining three data frames using Pandas in Python. 
It demonstrates the creation of Pandas data frames in Python by defining three data frames with key-value pairs and index values.', 'The data frames contain specific data such as house pricing index, interest rates, and IND GDP. The data frames contain specific data such as house pricing index, interest rates, and IND GDP.', 'The data is organized in lists for each key in the data frames. The data is organized in lists for each key in the data frames, with specific values such as 80, 90, 70, 60 for HPI, 2, 1, 2, 3 for interest rates, and 50, 45, 45, 67 for IND GDP.', 'The index values for the data frames range from 2001 to 2008. The index values for the data frames range from 2001 to 2008, organized as lists for each data frame.']}, {'end': 1713.959, 'segs': [{'end': 1504.299, 'src': 'embed', 'start': 1456.52, 'weight': 0, 'content': [{'end': 1458.44, 'text': 'So how am I gonna approach this task?', 'start': 1456.52, 'duration': 1.92}, {'end': 1467.196, 'text': "Now, what I'm going to do is I'm going to type in here as DF equals to DF dot rename columns equal to.", 'start': 1459.349, 'duration': 7.847}, {'end': 1473.722, 'text': 'I want to replace visitors with with what I want to replace it.', 'start': 1467.196, 'duration': 6.526}, {'end': 1475.163, 'text': 'I can replace it with users.', 'start': 1473.722, 'duration': 1.441}, {'end': 1480.327, 'text': 'Now go ahead and print DF and you can see that column header has been changed.', 'start': 1476.164, 'duration': 4.163}, {'end': 1483.25, 'text': 'Instead of visitors we have users now.', 'start': 1481.608, 'duration': 1.642}, {'end': 1487.574, 'text': 'So this is how you can actually change the column headers and index values.', 'start': 1484.773, 'duration': 2.801}, {'end': 1491.175, 'text': 'If you have any doubts, any questions with respect to this particular operation,', 'start': 1488.014, 'duration': 3.161}, {'end': 1495.556, 'text': 'you can write it down in the chat box or related to all the topics that we 
have discussed till now.', 'start': 1491.175, 'duration': 4.381}, {'end': 1500.678, 'text': 'Any questions guys? So we have no questions till now.', 'start': 1495.876, 'duration': 4.802}, {'end': 1504.299, 'text': "So I'll open my slides again and we'll see what are the other operations with pandas.", 'start': 1500.898, 'duration': 3.401}], 'summary': "Demonstrating how to rename column headers and index values in pandas, replacing 'visitors' with 'users'.", 'duration': 47.779, 'max_score': 1456.52, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A1456520.jpg'}, {'end': 1585.345, 'src': 'embed', 'start': 1558.137, 'weight': 2, 'content': [{'end': 1561.799, 'text': 'So yep, as you can see that we have concatenated the two data frames.', 'start': 1558.137, 'duration': 3.662}, {'end': 1570.073, 'text': 'So the index values are from 2001 to 2004 for the first data frame and then it starts from 2005 to 2008 for the second data frame.', 'start': 1562.967, 'duration': 7.106}, {'end': 1573.395, 'text': 'So as you can see that concatenation has been successfully performed.', 'start': 1570.513, 'duration': 2.882}, {'end': 1581.722, 'text': 'So we have index values from 2001 to 2008 and for the first data frame it is till 2004 and for the second data frame it starts from 2005 till 2008.', 'start': 1573.756, 'duration': 7.966}, {'end': 1585.345, 'text': 'So this is how you can perform concatenation.', 'start': 1581.722, 'duration': 3.623}], 'summary': 'Concatenated two data frames, giving index values from 2001 to 2008.', 'duration': 27.208, 'max_score': 1558.137, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A1558137.jpg'}, {'end': 1695.399, 'src': 'heatmap', 'start': 1658.616, 'weight': 3, 'content': [{'end': 1661.257, 'text': 'So now what is the operation that we are going to see here.', 'start': 1658.616, 'duration': 2.641}, {'end': 1665.22, 'text': 'We are going to convert 
the CSV file into an HTML file.', 'start': 1661.317, 'duration': 3.903}, {'end': 1672.284, 'text': "So for that what I'm going to type in country dot to underscore HTML.", 'start': 1666.381, 'duration': 5.903}, {'end': 1681.915, 'text': 'Open close parenthesis and I can say edu.html.', 'start': 1675.146, 'duration': 6.769}, {'end': 1685.376, 'text': "Now go ahead and execute this and we'll see what happens.", 'start': 1682.555, 'duration': 2.821}, {'end': 1692.238, 'text': "And I'm going to open my projects folder as you can see that edu.html is added.", 'start': 1687.596, 'duration': 4.642}, {'end': 1695.399, 'text': 'When I click over there it gives me the HTML code for that.', 'start': 1692.618, 'duration': 2.781}], 'summary': "Converting csv file to html using 'country.to_html()' resulted in 'edu.html' added to projects folder.", 'duration': 34.082, 'max_score': 1658.616, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A1658616.jpg'}], 'start': 1249.909, 'title': 'Pandas operations and visualization', 'summary': 'Demonstrates using pandas to manipulate data, including converting a dictionary to a data frame, setting index values, and changing column headers. 
it also covers practical applications such as visualization with graphs, modifying column headers and index values, concatenating data frames, and converting a csv file to html format.', 'chapters': [{'end': 1414.446, 'start': 1249.909, 'title': 'Using pandas to manipulate data', 'summary': 'Demonstrates the process of converting a dictionary to a pandas data frame, setting index values, and changing column headers, while also introducing the use of matplotlib to plot the data frame.', 'duration': 164.537, 'highlights': ['The process of converting a dictionary to a Pandas data frame is demonstrated using the pd.DataFrame function.', "Setting the index value is illustrated by using the df.set_index method, with the example of setting 'day' as the index value.", "The method for changing a column header is shown using the example of converting 'day' to 'date' in the data frame.", 'The introduction and usage of matplotlib for plotting the data frame is briefly discussed, including the import of the library and the plot command.']}, {'end': 1713.959, 'start': 1416.747, 'title': 'Pandas operations: visualization, header modification, concatenation, and data munging', 'summary': 'Covers practical applications of pandas, including visualizing data with graphs, modifying column headers and index values, performing concatenation of data frames, and converting a csv file to an html format.', 'duration': 297.212, 'highlights': ["Performing concatenation of data frames. Demonstrates concatenation of two data frames using the 'pd.concat' function, resulting in a combined data frame with index values ranging from 2001 to 2008.", "Changing column headers and index values. Illustrates the process of renaming column headers using the 'DF.rename' function, with an example of changing 'visitors' to 'users' and verifying the modification through printing the data frame.", 'Visualizing data with graphs. 
Describes the representation of bounce rate and visitors using a graph, emphasizing the upcoming coverage of data visualization and providing a practical example for understanding.', "Converting CSV file to HTML format. Shows the conversion of a CSV file to an HTML file using the 'to_html' function, enabling the transformation of data format from CSV to HTML for data munging purposes."]}], 'duration': 464.05, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A1249909.jpg', 'highlights': ['Demonstrates using pandas to manipulate data, including converting a dictionary to a data frame, setting index values, and changing column headers.', "Illustrates the process of renaming column headers using the 'DF.rename' function, with an example of changing 'visitors' to 'users' and verifying the modification through printing the data frame.", "Performing concatenation of data frames. Demonstrates concatenation of two data frames using the 'pd.concat' function, resulting in a combined data frame with index values ranging from 2001 to 2008.", "Shows the conversion of a CSV file to an HTML file using the 'to_html' function, enabling the transformation of data format from CSV to HTML for data munging purposes."]}, {'end': 2032.943, 'segs': [{'end': 1802.588, 'src': 'embed', 'start': 1738.376, 'weight': 0, 'content': [{'end': 1745.006, 'text': 'So for every country we have the data of the percentage of unemployed youth from 2010 till 2014.', 'start': 1738.376, 'duration': 6.63}, {'end': 1749.112, 'text': 'So what is the problem statement for this particular case study? 
Let us move forward and see that.', 'start': 1745.006, 'duration': 4.106}, {'end': 1757.765, 'text': 'So basically I want to find the change in the percentage of unemployed youth for every country from 2010 to 2011.', 'start': 1750.199, 'duration': 7.566}, {'end': 1760.547, 'text': 'So what I want, I want to see how the trend is.', 'start': 1757.765, 'duration': 2.782}, {'end': 1767.312, 'text': "What is the percentage change between 2010 to 2011 for every country? So we'll see that.", 'start': 1760.767, 'duration': 6.545}, {'end': 1769.434, 'text': 'First let me show you how the data set looks like.', 'start': 1767.432, 'duration': 2.002}, {'end': 1771.279, 'text': 'So it looks something like this.', 'start': 1770.219, 'duration': 1.06}, {'end': 1773.58, 'text': 'We have the country name, then we have country code.', 'start': 1771.299, 'duration': 2.281}, {'end': 1778.822, 'text': 'But then 2010, the percentage of unemployed youth, same goes for 11, 12, 13, and 14.', 'start': 1773.6, 'duration': 5.222}, {'end': 1781.482, 'text': 'So this is how our data set actually looks like.', 'start': 1778.822, 'duration': 2.66}, {'end': 1783.343, 'text': 'So this is how our data set looks like.', 'start': 1781.943, 'duration': 1.4}, {'end': 1785.804, 'text': 'We have the country name, then we have the country code.', 'start': 1783.363, 'duration': 2.441}, {'end': 1792.946, 'text': 'And then over here we have in 2010, the percentage of unemployed youth, similarly for 11, 12, 13, and 14 as well.', 'start': 1786.284, 'duration': 6.662}, {'end': 1797.647, 'text': 'So let us move forward and actually perform this data analysis,', 'start': 1794.246, 'duration': 3.401}, {'end': 1802.588, 'text': 'in which we are going to find out the percentage change in the unemployed youth between 2010 to 2011..', 'start': 1797.647, 'duration': 4.941}], 'summary': "Analyzing percentage change in unemployed youth from 2010 to 2011 for all countries' data.", 'duration': 64.212, 'max_score': 
1738.376, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A1738376.jpg'}, {'end': 1843.611, 'src': 'embed', 'start': 1819.232, 'weight': 4, 'content': [{'end': 1825.254, 'text': "I wanted pandas, I've imported pandas, matplotlib for visualization and the style is fivethirtyeight.", 'start': 1819.232, 'duration': 6.022}, {'end': 1827.735, 'text': "Let me tell you guys, you don't need to worry a lot about visualization,", 'start': 1825.254, 'duration': 2.481}, {'end': 1831.656, 'text': "because I'm going to teach you visualization in detail in the upcoming sessions.", 'start': 1827.735, 'duration': 3.921}, {'end': 1836.118, 'text': 'So for now just focus on pandas and the various operations that we can perform with it.', 'start': 1832.156, 'duration': 3.962}, {'end': 1843.611, 'text': "Now I've defined one data frame that is country and this is pd.read_csv of the CSV which is present in this particular path.", 'start': 1836.806, 'duration': 6.805}], 'summary': 'Imported pandas and matplotlib for visualization with the fivethirtyeight style. Visualization will be covered in detail in upcoming sessions.', 'duration': 24.379, 'max_score': 1819.232, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A1819232.jpg'}], 'start': 1716.561, 'title': 'Youth unemployment analysis', 'summary': 'Covers global youth unemployment analysis with a dataset containing unemployment percentages for every country from 2010 to 2014. 
it aims to analyze the percentage change between 2010 and 2011 for each country and demonstrates data frame manipulation, visualization, and the percentage change in unemployed youth between 2010 and 2011.', 'chapters': [{'end': 1802.588, 'start': 1716.561, 'title': 'Unemployment trend analysis', 'summary': 'Introduces a use case on global youth unemployment, with a data set containing unemployment percentages for every country from 2010 to 2014, aiming to analyze the percentage change between 2010 and 2011 for each country.', 'duration': 86.027, 'highlights': ['The data set contains the percentage of unemployed youth for every country from 2010 till 2014, providing a comprehensive scope for analysis.', 'The specific problem statement involves finding the change in the percentage of unemployed youth for every country from 2010 to 2011, aiming to understand the trend.', 'The goal is to determine the percentage change between 2010 to 2011 for every country, indicating a detailed analysis of the unemployment trend on a global scale.', 'The data set structure includes the country name, country code, and the percentage of unemployed youth for each year from 2010 to 2014, offering a clear framework for analysis.']}, {'end': 2032.943, 'start': 1802.588, 'title': 'Data frame manipulation and visualization', 'summary': 'Demonstrates the process of manipulating data frames using pandas in python, including importing libraries, defining data frames, re-indexing columns, calculating percentage changes, and visualizing the results through bar plots. it also showcases the percentage change in unemployed youth between 2010 and 2011 for various countries.', 'duration': 230.355, 'highlights': ['The chapter explains the process of manipulating data frames using Pandas in Python, including importing libraries, defining data frames, and re-indexing columns. 
It demonstrates the use of Pandas for data manipulation, including importing libraries like pandas and matplotlib, defining data frames, and re-indexing columns for efficient data analysis.', 'The transcript showcases the percentage change in unemployed youth between 2010 and 2011 for various countries. It presents the percentage change in unemployed youth between 2010 and 2011 for different countries, including specific percentage changes for countries like Afghanistan, Angola, Albania, Arab world, and United Arab Emirates.', 'The process of visualizing the percentage change in unemployed youth through bar plots is demonstrated in the chapter. It exhibits the visualization of percentage change in unemployed youth through bar plots, providing a graphical representation of the data for easy interpretation and analysis.']}], 'duration': 316.382, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A1716561.jpg', 'highlights': ['The data set contains the percentage of unemployed youth for every country from 2010 till 2014, providing a comprehensive scope for analysis.', 'The specific problem statement involves finding the change in the percentage of unemployed youth for every country from 2010 to 2011, aiming to understand the trend.', 'The goal is to determine the percentage change between 2010 to 2011 for every country, indicating a detailed analysis of the unemployment trend on a global scale.', 'The data set structure includes the country name, country code, and the percentage of unemployed youth for each year from 2010 to 2014, offering a clear framework for analysis.', 'The chapter explains the process of manipulating data frames using Pandas in Python, including importing libraries, defining data frames, and re-indexing columns.', 'The transcript showcases the percentage change in unemployed youth between 2010 and 2011 for various countries.', 'The process of visualizing the percentage change in unemployed 
youth through bar plots is demonstrated in the chapter.']}, {'end': 2431.456, 'segs': [{'end': 2058.355, 'src': 'embed', 'start': 2034.25, 'weight': 0, 'content': [{'end': 2040.537, 'text': "So this is one example that I've shown you where we have performed an analysis on global youth unemployment data.", 'start': 2034.25, 'duration': 6.287}, {'end': 2043.86, 'text': "This is just an introductory example, pretty basic example that I've shown you.", 'start': 2040.777, 'duration': 3.083}, {'end': 2046.623, 'text': 'There are a lot more things that you can perform with Pandas.', 'start': 2044.34, 'duration': 2.283}, {'end': 2049.946, 'text': 'So we are going to discuss all those things in the upcoming sessions.', 'start': 2046.963, 'duration': 2.983}, {'end': 2054.871, 'text': 'But for now, this is what Pandas is and this is how you can perform data analysis.', 'start': 2050.487, 'duration': 4.384}, {'end': 2058.355, 'text': 'And if you have any questions, any doubts, you can write it down in your chat box.', 'start': 2055.452, 'duration': 2.903}], 'summary': 'Introductory analysis on global youth unemployment data with pandas.', 'duration': 24.105, 'max_score': 2034.25, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A2034250.jpg'}, {'end': 2098.077, 'src': 'embed', 'start': 2071.002, 'weight': 1, 'content': [{'end': 2076.465, 'text': "So I've shown you four basic operations that are mean, mode, median, and variance.", 'start': 2071.002, 'duration': 5.463}, {'end': 2078.186, 'text': 'Let me explain you all of these terms.', 'start': 2077.005, 'duration': 1.181}, {'end': 2085.871, 'text': 'So what do you mean by mean? 
Mean is nothing but the arithmetic mean or the average value of a particular list or any particular sequence.', 'start': 2078.206, 'duration': 7.665}, {'end': 2090.672, 'text': 'When we talk about median, median is what the median, the middle value.', 'start': 2086.351, 'duration': 4.321}, {'end': 2095.395, 'text': 'So there can be a high median and a low median. When we have a sequence in which there are an odd number of elements,', 'start': 2090.672, 'duration': 4.723}, {'end': 2098.077, 'text': 'so at that time the median will be the centermost value.', 'start': 2095.755, 'duration': 2.322}], 'summary': 'Statistical operations include mean, mode, median, and variance. Mean is the average value, median is the middle value in a sequence.', 'duration': 27.075, 'max_score': 2071.002, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A2071002.jpg'}, {'end': 2358.869, 'src': 'embed', 'start': 2330.922, 'weight': 2, 'content': [{'end': 2334.424, 'text': "So we're actually going to discuss about this later in the upcoming sessions.", 'start': 2330.922, 'duration': 3.502}, {'end': 2336.686, 'text': "So you don't need to worry about it right now,", 'start': 2335.165, 'duration': 1.521}, {'end': 2344.41, 'text': "but I'm just giving a general overview and I'm just basically telling you that you can use Python in order to process big data across the HDFS cluster,", 'start': 2336.686, 'duration': 7.724}, {'end': 2346.271, 'text': 'which is present across the HDFS cluster.', 'start': 2344.41, 'duration': 1.861}, {'end': 2349.793, 'text': 'So if you have any questions, any doubts till now, you can ask me guys.', 'start': 2346.992, 'duration': 2.801}, {'end': 2352.255, 'text': 'Just feel free to ask me any questions that you have in your mind.', 'start': 2349.813, 'duration': 2.442}, {'end': 2358.869, 'text': "All right, fine, so we have no questions there, so we'll move forward and I'll just give you a brief summary of what all things we 
have discussed.", 'start': 2353.766, 'duration': 5.103}], 'summary': 'Python used for processing big data across hdfs cluster, no questions from participants.', 'duration': 27.947, 'max_score': 2330.922, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A2330922.jpg'}, {'end': 2408.918, 'src': 'embed', 'start': 2386.581, 'weight': 3, 'content': [{'end': 2396.388, 'text': 'And we did some analysis and we found out what is the percentage increase in the unemployed youth from 2010 to 2011.', 'start': 2386.581, 'duration': 9.807}, {'end': 2401.132, 'text': 'Then we understood how you can use Python for statistics and how we can use Python for Hadoop.', 'start': 2396.388, 'duration': 4.744}, {'end': 2403.994, 'text': "Thank you guys for attending today's session.", 'start': 2402.313, 'duration': 1.681}, {'end': 2406.376, 'text': 'If you have any questions or doubts, you can ask me right now.', 'start': 2404.234, 'duration': 2.142}, {'end': 2408.918, 'text': 'All right, fine, we have no questions.', 'start': 2407.657, 'duration': 1.261}], 'summary': 'Recap of the analysis of the percentage change in unemployed youth from 2010 to 2011, and of using Python for statistics and Hadoop.', 'duration': 22.337, 'max_score': 2386.581, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A2386581.jpg'}], 'start': 2034.25, 'title': 'Pandas and python statistics', 'summary': 'Introduces pandas for global youth unemployment data analysis and covers python statistics including mean, median, mode, and variance calculations, with an emphasis on further analysis and processing big data in hadoop.', 'chapters': [{'end': 2069.96, 'start': 2034.25, 'title': 'Introduction to pandas for data analysis', 'summary': 'Introduces an analysis on global youth unemployment data using pandas, and emphasizes the potential for further analysis, with upcoming sessions covering more advanced functionalities and statistics using python.', 
'duration': 35.71, 'highlights': ['The chapter introduces an analysis on global youth unemployment data using Pandas, emphasizing the potential for further analysis.', 'The session provides a basic example of utilizing Pandas for data analysis and hints at more advanced functionalities to be covered in upcoming sessions.', 'Upcoming sessions will cover more advanced functionalities and statistics using Python.', 'The presenter encourages questions and engagement by inviting participants to write down any queries in the chat box.']}, {'end': 2431.456, 'start': 2071.002, 'title': 'Python statistics: mean, median, mode, and variance', 'summary': 'Covers the explanation and practical demonstration of mean, median, mode, and variance calculations using python, followed by an introduction to using python for processing big data in hadoop.', 'duration': 360.454, 'highlights': ['Python Statistics: Mean, Median, Mode, and Variance The chapter covers the explanation and practical demonstration of mean, median, mode, and variance calculations using Python.', 'Introduction to using Python for processing big data in Hadoop An introduction to using Python for processing big data in Hadoop is provided, along with a high-level overview of the MapReduce process in HDFS clusters.', "Median calculation demonstration in Python A practical demonstration of calculating the median in Python is provided, showcasing how Python's statistics module can be utilized for such computations.", 'Explanation of using Python for statistics An explanation of how Python can be used for statistical calculations is given, highlighting its capability in handling statistical operations.', 'Overview of using Python for web scraping, data analysis, and testing Various applications of Python, including web scraping, data analysis, and testing, are discussed, emphasizing the versatility of Python in different domains.']}], 'duration': 397.206, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/B42n3Pc-N2A/pics/B42n3Pc-N2A2034250.jpg', 'highlights': ['The chapter introduces an analysis of global youth unemployment data using Pandas, emphasizing the potential for further analysis.', 'The chapter covers the explanation and practical demonstration of mean, median, mode, and variance calculations using Python.', 'An introduction to using Python for processing big data in Hadoop is provided, along with a high-level overview of the MapReduce process in HDFS clusters.', 'Upcoming sessions will cover more advanced functionalities and statistics using Python.']}], 'highlights': ['The session covers various applications of Python, the data lifecycle from warehousing to visualization, and the use of the Pandas library, NumPy, SciPy, Pandas operations, Python for statistics, and Python for Hadoop in data analysis.', 'Python applications include web scraping, web development, testing, and data analysis.', 'The data lifecycle involves storing data in different formats, transforming it into a single format, and utilizing data warehousing.', 'Performing data analysis to find the percentage increase in unemployed youth in a specific country from 2010 to 2011 exemplifies the practical application and significance of data analysis.', 'Pandas is a software module for Python used for data manipulation and analysis, offering high performance compared with plain-Python approaches.', 'NumPy is a fundamental package for scientific computing in Python, providing a powerful n-dimensional array object, tools for integrating C and C++ code, and linear algebra, Fourier transform, and random number capabilities.', 'SciPy is an open-source Python module used for scientific and technical computing, containing modules for optimization, linear algebra, integration, interpolation, and special functions.', 'The chapter demonstrates converting a dictionary into a Pandas data frame using the pd.DataFrame constructor.', 
'The speaker lists further basic Pandas operations: changing column headers, concatenating data frames, and joining and merging data frames.', "Practical demonstration of printing only a part of the data frame using the 'head' and 'tail' methods, with an example of printing the first two rows and the last two rows.", 'The process of merging two data frames is demonstrated, showing how to merge on columns the two frames have in common, with a practical example and the resulting data frame.', 'The joining of two data frames based on their index values is explained, with a demonstration of how the index values appear in the joined data frame and the handling of missing values (NaN) shown with a practical example.', 'Practical demonstration of changing index and column headers in a data frame is provided, showcasing how to modify the index value and column headers, with a specific example using a dictionary to define a data frame and change its structure.', 'The chapter demonstrates defining three Pandas data frames in Python from key-value pairs and index values.', 'Demonstrates using Pandas to manipulate data, including converting a dictionary to a data frame, setting index values, and changing column headers.', "Illustrates the process of renaming column headers using the 'df.rename' method, with an example of changing 'visitors' to 'users' and verifying the modification by printing the data frame.", "Performing concatenation of data frames. 
Demonstrates concatenation of two data frames using the 'pd.concat' function, resulting in a combined data frame with index values ranging from 2001 to 2008.", "Shows the conversion of a CSV file to an HTML file using the 'to_html' method, enabling the transformation of data format from CSV to HTML for data munging purposes.", 'The specific problem statement involves finding the change in the percentage of unemployed youth for every country from 2010 to 2011, aiming to understand the trend.', 'The goal is to determine the percentage change from 2010 to 2011 for every country, indicating a detailed analysis of the unemployment trend on a global scale.', 'The data set structure includes the country name, country code, and the percentage of unemployed youth for each year from 2010 to 2014, offering a clear framework for analysis.', 'The chapter explains the process of manipulating data frames using Pandas in Python, including importing libraries, defining data frames, and re-indexing columns.', 'The chapter shows the percentage change in unemployed youth between 2010 and 2011 for various countries.', 'The process of visualizing the percentage change in unemployed youth through bar plots is demonstrated in the chapter.', 'The chapter introduces an analysis of global youth unemployment data using Pandas, emphasizing the potential for further analysis.', 'The chapter covers the explanation and practical demonstration of mean, median, mode, and variance calculations using Python.', 'An introduction to using Python for processing big data in Hadoop is provided, along with a high-level overview of the MapReduce process in HDFS clusters.', 'Upcoming sessions will cover more advanced functionalities and statistics using Python.']}
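
The highlights above describe concatenating data frames, renaming column headers, computing the 2010 to 2011 change per country, and basic statistics. A minimal sketch of those steps might look like the following. The country names, figures, and column names are hypothetical and are not the data set used in the session. The summaries do not say whether the "percentage change" is a difference in percentage points or a relative change, so both are computed here as an assumption.

```python
import statistics

import pandas as pd

# Hypothetical figures for illustration only; not the session's data set.
df_a = pd.DataFrame(
    {"Country": ["Aland", "Borduria"], "2010": [10.0, 20.0], "2011": [11.0, 18.0]}
)
df_b = pd.DataFrame({"Country": ["Carpania"], "2010": [8.0], "2011": [8.0]})

# Stack the two frames row-wise, then use the country name as the index.
df = pd.concat([df_a, df_b], ignore_index=True).set_index("Country")

# Rename the year columns to more descriptive headers (names are assumptions).
df = df.rename(columns={"2010": "rate_2010", "2011": "rate_2011"})

# Change in percentage points, and relative change in percent.
df["point_change"] = df["rate_2011"] - df["rate_2010"]
df["relative_change_pct"] = df["point_change"] / df["rate_2010"] * 100

# Print only the first two rows, as with the 'head' demonstration.
print(df.head(2))

# Summary statistics of the 2010 rates via the standard-library statistics module.
rates = df["rate_2010"].tolist()
print(statistics.mean(rates), statistics.median(rates), statistics.variance(rates))
```

Note that `statistics.variance` computes the sample variance (dividing by n - 1); `statistics.pvariance` would give the population variance instead.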