title
Python Pandas Tutorial 2: Dataframe Basics

description
This pandas tutorial covers the basics of the DataFrame, the main object in pandas. It is used to represent tabular data (with rows and columns). This tutorial goes over: 1) What is a DataFrame? 2) Creating a DataFrame from a CSV file and a Python dictionary 3) Dealing with rows and columns 4) Operations: mean, max, std, describe 5) Conditional selection 6) The set_index function and why it is useful. Topics covered in this Python pandas video: 0:00 Introduction 0:15 What is a DataFrame? 2:02 Import pandas in Jupyter Notebook 3:34 Create a DataFrame using a Python dictionary 5:15 Use head() method 5:52 Use tail() method 6:10 Use indexing and slicing in a DataFrame 8:12 Insert a new cell below the current cell 8:39 What is the type of your DataFrame? 10:01 Operations with your DataFrame 10:34 Use max() method 11:02 Use mean() method 11:11 Use min() method 11:23 Use describe() method 12:12 Conditionally select data in your DataFrame 14:55 Pandas operations list 15:41 Use set_index() method 18:12 Use reset_index() method Code: https://github.com/codebasics/py/tree/master/pandas/2_dataframe_basics Do you want to learn technology from me? Check https://codebasics.io/?utm_source=description&utm_medium=yt&utm_campaign=description&utm_id=description for my affordable video courses. 
Next Video: Python Pandas Tutorial 3: Different Ways Of Creating DataFrame https://www.youtube.com/watch?v=3k0HbcUGErE&list=PLeo1K3hjS3uuASpe-1LjfG5f14Bnozjwy&index=3 Very Simple Explanation Of Neural Network: https://www.youtube.com/watch?v=ER2It2mIagI Popular Playlist: Complete python course: https://www.youtube.com/playlist?list=PLeo1K3hjS3uv5U-Lmlnucd7gqF-3ehIh0 Data science course: https://www.youtube.com/playlist?list=PLeo1K3hjS3us_ELKYSj_Fth2tIEkdKXvV Machine learning tutorials: https://www.youtube.com/playlist?list=PLeo1K3hjS3uvCeTYTeyfe0-rN5r8zn9rw Pandas tutorials: https://www.youtube.com/playlist?list=PLeo1K3hjS3uuASpe-1LjfG5f14Bnozjwy Git github tutorials: https://www.youtube.com/playlist?list=PLeo1K3hjS3usJuxZZUBdjAcilgfQHkRzW Matplotlib course: https://www.youtube.com/playlist?list=PLeo1K3hjS3uu4Lr8_kro2AqaO6CFYgKOl Data structures course: https://www.youtube.com/playlist?list=PLeo1K3hjS3uu_n_a__MI_KktGTLYopZ12 Data Science Project - Real Estate Price Prediction: https://www.youtube.com/watch?v=rdfbcdP75KI&list=PLeo1K3hjS3uu7clOTtwsp94PcHbzqpAdg To download csv and code for all tutorials: go to https://github.com/codebasics/py, click on a green button to clone or download the entire repository and then go to relevant folder to get access to that specific file. 🌎 My Website For Video Courses: https://codebasics.io/?utm_source=description&utm_medium=yt&utm_campaign=description&utm_id=description Need help building software or data analytics and AI solutions? My company https://www.atliq.com/ can help. Click on the Contact button on that website. 
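The topics listed in the description can be sketched in a few lines of pandas. This is a minimal illustration rather than the code from the video: the sample values in `weather_data` are assumptions chosen to match the figures mentioned in the tutorial (6 rows, 4 columns, a maximum temperature of 35), and names such as `warm_days` and `by_day` are made up for this example.

```python
import pandas as pd

# Hypothetical sample data standing in for the tutorial's weather CSV.
weather_data = {
    "day": ["1/1/2017", "1/2/2017", "1/3/2017", "1/4/2017", "1/5/2017", "1/6/2017"],
    "temperature": [32, 35, 28, 24, 32, 31],
    "windspeed": [6, 7, 2, 7, 4, 2],
    "event": ["Rain", "Sunny", "Snow", "Snow", "Rain", "Sunny"],
}

# Each dictionary key becomes a column; each list holds that column's values.
df = pd.DataFrame(weather_data)

# shape is a (rows, columns) tuple, so it can be unpacked directly.
rows, columns = df.shape          # (6, 4)

# head/tail return the first/last rows (5 by default); slicing works like a list.
first_two = df.head(2)
last_one = df.tail(1)
middle = df[2:5]                  # rows 2, 3 and 4; row 5 is excluded

# A single column is a pandas Series with its own statistics methods.
max_temp = df["temperature"].max()    # 35
mean_temp = df["temperature"].mean()
stats = df.describe()             # summary statistics for the numeric columns

# Conditional selection: a boolean condition inside the brackets filters rows.
warm_days = df[df["temperature"] >= 32]
hottest_day = df["day"][df["temperature"] == df["temperature"].max()]

# set_index returns a new DataFrame; the original df keeps its integer index.
by_day = df.set_index("day")
row = by_day.loc["1/3/2017"]      # look up a row by its index label
restored = by_day.reset_index()   # move "day" back into a regular column
```

Assigning the result of `set_index` to a new name, as done here, is an alternative to passing `inplace=True`; both approaches leave you with a DataFrame indexed by `day`.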

detail
{'title': 'Python Pandas Tutorial 2: Dataframe Basics', 'heatmap': [{'end': 206.898, 'start': 170.881, 'weight': 0.752}, {'end': 240.82, 'start': 213.608, 'weight': 0.909}, {'end': 330.821, 'start': 273.314, 'weight': 0.724}, {'end': 428.501, 'start': 398.331, 'weight': 0.8}, {'end': 516.914, 'start': 496.243, 'weight': 0.731}, {'end': 695.179, 'start': 676.375, 'weight': 0.748}, {'end': 834.965, 'start': 785.378, 'weight': 0.709}, {'end': 1114.716, 'start': 1055.44, 'weight': 0.755}], 'summary': 'The tutorial covers pandas dataframe basics, importing weather data, accessing and printing data, data analysis, conditional selection, and data frame manipulation in python, emphasizing practical examples and key functionalities, such as using jupyter notebook and anaconda for installation.', 'chapters': [{'end': 118.566, 'segs': [{'end': 34.319, 'src': 'embed', 'start': 0.686, 'weight': 0, 'content': [{'end': 5.688, 'text': 'Dear friends in this tutorial we are going to cover data frame basics in pandas.', 'start': 0.686, 'duration': 5.002}, {'end': 12.83, 'text': "Now if you don't know what is pandas then you can watch my tutorials on pandas introduction first.", 'start': 6.568, 'duration': 6.262}, {'end': 20.613, 'text': 'Okay now data frame is a main object in pandas framework.', 'start': 13.951, 'duration': 6.662}, {'end': 25.275, 'text': 'If you are using pandas you are almost always going to use data frame.', 'start': 20.913, 'duration': 4.362}, {'end': 34.319, 'text': 'DataFrame is a data structure used to represent tabular data, such as this Excel file.', 'start': 25.993, 'duration': 8.326}], 'summary': 'Introduction to data frame basics in pandas, a main object in pandas framework for representing tabular data.', 'duration': 33.633, 'max_score': 0.686, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU686.jpg'}, {'end': 95.622, 'src': 'embed', 'start': 65.046, 'weight': 1, 'content': [{'end': 69.787, 
'text': "I'm using Jupyter notebook because it is great with data visualization.", 'start': 65.046, 'duration': 4.741}, {'end': 76.269, 'text': 'so the way you launch Jupyter notebook is simply run Jupyter notebook command on your command prompt,', 'start': 69.787, 'duration': 6.482}, {'end': 82.073, 'text': 'now have a separate tutorial on how to install anaconda.', 'start': 76.269, 'duration': 5.804}, {'end': 89.718, 'text': 'so if you follow that and install anaconda, you will end up installing both jupyter notebook and pandas.', 'start': 82.073, 'duration': 7.645}, {'end': 95.622, 'text': 'okay, so i assume that you have followed that tutorial and install anaconda.', 'start': 89.718, 'duration': 5.904}], 'summary': 'Jupyter notebook is great for data visualization; installing anaconda results in both jupyter notebook and pandas.', 'duration': 30.576, 'max_score': 65.046, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU65046.jpg'}], 'start': 0.686, 'title': 'Pandas data frame basics', 'summary': 'Covers the basics of data frames in pandas, emphasizing their importance and role in representing tabular data, and provides guidance on using jupyter notebook for data visualization and anaconda for installation, with an emphasis on data frame as the main object in pandas.', 'chapters': [{'end': 118.566, 'start': 0.686, 'title': 'Pandas data frame basics', 'summary': 'Covers the basics of data frames in pandas, emphasizing their importance and role in representing tabular data, and provides guidance on using jupyter notebook for data visualization and anaconda for installation, with an emphasis on data frame as the main object in pandas.', 'duration': 117.88, 'highlights': ['Data frame is a main object in pandas framework, used to represent tabular data like Excel files.', 'Jupyter notebook is recommended for data visualization, and its installation is facilitated by anaconda, which also installs pandas.', 'The 
tutorial emphasizes the importance of data frame in pandas and provides guidance on using Jupyter notebook and anaconda for installation.']}], 'duration': 117.88, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU686.jpg', 'highlights': ['Data frame is a main object in pandas framework, used to represent tabular data like Excel files.', 'The tutorial emphasizes the importance of data frame in pandas and provides guidance on using Jupyter notebook and anaconda for installation.', 'Jupyter notebook is recommended for data visualization, and its installation is facilitated by anaconda, which also installs pandas.']}, {'end': 312.245, 'segs': [{'end': 206.898, 'src': 'heatmap', 'start': 119.83, 'weight': 0, 'content': [{'end': 125.432, 'text': 'all right, first thing we are going to do is import the pandas module in our notebook.', 'start': 119.83, 'duration': 5.602}, {'end': 138.298, 'text': 'so you will say import pandas as pd, okay, and then you will create a data frame by saying df is equal to pd dot.', 'start': 125.432, 'duration': 12.866}, {'end': 142.64, 'text': "now the method i'm going to use is read_csv.", 'start': 138.298, 'duration': 4.342}, {'end': 150.305, 'text': 'okay, so I have this comma separated excel file which contains weather data.', 'start': 143.38, 'duration': 6.925}, {'end': 154.749, 'text': 'so it has number of days, the temperature, wind speed, event, etc.', 'start': 150.305, 'duration': 4.444}, {'end': 160.133, 'text': "okay, so I'm going to import this data into my data frame.", 'start': 154.749, 'duration': 5.384}, {'end': 170.881, 'text': 'okay, so read_csv takes the path of your CSV file as an input,', 'start': 160.133, 'duration': 10.748}, {'end': 180.095, 'text': "and if you are having your CSV file at some other location than the notebook that you're running here, then you need to specify the full path.", 'start': 170.881, 'duration': 9.214}, {'end': 187.73, 'text': 'okay, when you run 
this, this star that you are seeing here means that it was running that.', 'start': 181.027, 'duration': 6.703}, {'end': 192.812, 'text': 'okay. now it successfully created your data frame here.', 'start': 187.73, 'duration': 5.082}, {'end': 200.215, 'text': 'now, first thing we are going to do is just type in df and it will show you the content of that data frame.', 'start': 192.812, 'duration': 7.403}, {'end': 206.898, 'text': 'so you can see that this pretty much looks like the excel sheet that we had.', 'start': 200.215, 'duration': 6.683}], 'summary': 'Imported weather data from a csv file to a data frame using pandas in the notebook.', 'duration': 67.9, 'max_score': 119.83, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU119830.jpg'}, {'end': 312.245, 'src': 'heatmap', 'start': 213.608, 'weight': 1, 'content': [{'end': 222.71, 'text': 'here you can also create your data frame by using a Python dictionary, so let me do that quickly.', 'start': 213.608, 'duration': 9.102}, {'end': 229.332, 'text': 'so here, if you have a dictionary like this,', 'start': 222.71, 'duration': 6.622}, {'end': 240.82, 'text': 'where each of these columns are your keys in the dictionary and the values are basically your rows or the column values, okay.', 'start': 229.332, 'duration': 11.488}, {'end': 247.005, 'text': 'so then you can say df is equal to pd.DataFrame.', 'start': 240.82, 'duration': 6.185}, {'end': 260.938, 'text': 'okay. 
so if you do that and run it looks like it worked and if you print the content of your data frame, you will see the same content.', 'start': 248.914, 'duration': 12.024}, {'end': 267.86, 'text': 'so you can see that you can either use read CSV or use this Python dictionary to create your data frame.', 'start': 260.938, 'duration': 6.922}, {'end': 271.632, 'text': 'Now, DataFrame is all about rows and columns.', 'start': 269.028, 'duration': 2.604}, {'end': 273.254, 'text': 'It is a tabular data structure.', 'start': 271.732, 'duration': 1.522}, {'end': 277.159, 'text': 'So the first thing we are going to do is we are going to print a shape.', 'start': 273.314, 'duration': 3.845}, {'end': 279.983, 'text': 'Now, shape means the dimension.', 'start': 277.66, 'duration': 2.323}, {'end': 282.607, 'text': 'So six here represent number of rows.', 'start': 280.063, 'duration': 2.544}, {'end': 283.929, 'text': 'You can see it has six rows.', 'start': 282.627, 'duration': 1.302}, {'end': 285.59, 'text': 'four columns.', 'start': 284.669, 'duration': 0.921}, {'end': 289.332, 'text': 'okay, so if you and this thing is a tuple.', 'start': 285.59, 'duration': 3.742}, {'end': 303.62, 'text': 'so if you want to store number of rows and columns, you can say rows, columns equal to DF dot shape, and when you print rows it says six,', 'start': 289.332, 'duration': 14.288}, {'end': 307.603, 'text': 'and if you print columns, it will say four.', 'start': 303.62, 'duration': 3.983}, {'end': 312.245, 'text': 'okay, so this is how you print rows and columns.', 'start': 307.603, 'duration': 4.642}], 'summary': 'Creating data frames using python dictionary and printing dimensions: 6 rows, 4 columns', 'duration': 98.637, 'max_score': 213.608, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU213608.jpg'}], 'start': 119.83, 'title': 'Importing weather data and data frames in python', 'summary': 'Covers importing weather data using pandas 
module, creating a data frame to store the imported data, and introducing creating and exploring data frames in python with an example of 6 rows and 4 columns.', 'chapters': [{'end': 187.73, 'start': 119.83, 'title': 'Importing and reading weather data', 'summary': 'Covers importing weather data from a csv file using pandas module, specifying the full path if the file is located elsewhere, and creating a data frame to store the imported data.', 'duration': 67.9, 'highlights': ['Creating a data frame using pandas module and reading CSV file', 'Specifying the full path for CSV file located outside the notebook']}, {'end': 312.245, 'start': 187.73, 'title': 'Introduction to data frames in python', 'summary': 'Introduces creating and exploring data frames in python, including using python dictionary to create data frames and obtaining the shape of the data frame, with an example of 6 rows and 4 columns.', 'duration': 124.515, 'highlights': ['The data frame in Python resembles an Excel sheet with columns and rows, and it can be created using a Python dictionary or by reading a CSV file.', 'Obtaining the shape of the data frame, represented as a tuple, allows one to retrieve the number of rows (6) and columns (4).', 'Demonstrating the creation of a data frame using a Python dictionary, where the keys represent the columns and the values represent the rows or column values.']}], 'duration': 192.415, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU119830.jpg', 'highlights': ['Creating a data frame using pandas module and reading CSV file', 'The data frame in Python resembles an Excel sheet with columns and rows, and it can be created using a Python dictionary or by reading a CSV file', 'Obtaining the shape of the data frame, represented as a tuple, allows one to retrieve the number of rows (6) and columns (4)', 'Specifying the full path for CSV file located outside the notebook', 'Demonstrating the creation of a data 
frame using a Python dictionary, where the keys represent the columns and the values represent the rows or column values']}, {'end': 516.914, 'segs': [{'end': 428.501, 'src': 'heatmap', 'start': 312.245, 'weight': 0, 'content': [{'end': 317.123, 'text': 'now we are going to use df.head.', 'start': 312.245, 'duration': 4.878}, {'end': 325.716, 'text': "So when you say df.head, it's going to print basically initial few rows.", 'start': 317.243, 'duration': 8.473}, {'end': 330.821, 'text': 'Sometimes you might have a big Excel sheet and you might have like hundreds of rows.', 'start': 325.936, 'duration': 4.885}, {'end': 333.402, 'text': "so you don't want to print all of those rows here.", 'start': 330.821, 'duration': 2.581}, {'end': 337.905, 'text': "okay, because if you do df it's gonna print everything.", 'start': 333.402, 'duration': 4.503}, {'end': 341.967, 'text': 'a df dot head, uh, gives a convenience of printing only few rows.', 'start': 337.905, 'duration': 4.062}, {'end': 351.471, 'text': "okay, now, if you want to, let's say print only two rows, then you can say df dot head two and it will print only two rows, as you can see here.", 'start': 341.967, 'duration': 9.504}, {'end': 359.18, 'text': 'okay, You can also do df.tail and it will print last five rows.', 'start': 351.471, 'duration': 7.709}, {'end': 362.043, 'text': 'So you can see here these are last five rows.', 'start': 359.2, 'duration': 2.843}, {'end': 369.453, 'text': "Now if you want to print again let's say last one row you will pass that number as a parameter in your tail function.", 'start': 362.104, 'duration': 7.349}, {'end': 383.253, 'text': 'Now, if you have ever used indexing and slicing with Python list, then you can use the same thing with DataFrame.', 'start': 371.149, 'duration': 12.104}, {'end': 391.095, 'text': "So you can say something like let's say, I want to print row number two to four, okay?", 'start': 383.273, 'duration': 7.822}, {'end': 398.331, 'text': 'Then what you 
can do is df[2:5].', 'start': 391.475, 'duration': 6.856}, {'end': 402.773, 'text': "so it includes row number 2, but it doesn't include row number 5.", 'start': 398.331, 'duration': 4.442}, {'end': 411.616, 'text': "that's why, to print row number 2 to 4, I am saying 2 colon 5, and this is how you print it.", 'start': 402.773, 'duration': 8.843}, {'end': 421.976, 'text': 'okay, if you want to print everything, then you can use either this or this, whichever way is convenient to you.', 'start': 411.616, 'duration': 10.36}, {'end': 428.501, 'text': "Okay Now let's look into printing the columns.", 'start': 422.517, 'duration': 5.984}], 'summary': 'Presents methods to print specific rows and columns from a dataframe, like df.head and df.tail, with examples.', 'duration': 99.371, 'max_score': 312.245, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU312245.jpg'}, {'end': 516.914, 'src': 'heatmap', 'start': 485.216, 'weight': 1, 'content': [{'end': 489.819, 'text': "now let's look at the type.", 'start': 485.216, 'duration': 4.603}, {'end': 493.561, 'text': "so uh, so i'm.", 'start': 489.819, 'duration': 3.742}, {'end': 496.243, 'text': "by the way, i'm using, uh, this B shortcut.", 'start': 493.561, 'duration': 2.682}, {'end': 506.409, 'text': 'so when you press B, it is going to insert a new cell after your current cell, you can also use this plus icon to insert a new cell.', 'start': 496.243, 'duration': 10.166}, {'end': 510.692, 'text': 'okay, and you can look at all the shortcuts here.', 'start': 506.409, 'duration': 4.283}, {'end': 515.273, 'text': 'so if you say insert cell below, the shortcut was B.', 'start': 510.692, 'duration': 4.581}, {'end': 516.914, 'text': "so that's what I have been using.", 'start': 515.273, 'duration': 1.641}], 'summary': 'Demonstrating shortcut B to insert a new cell in jupyter notebook.', 'duration': 31.698, 'max_score': 485.216, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU485216.jpg'}], 'start': 312.245, 'title': 'Printing and accessing data in dataframe', 'summary': 'Explains printing initial and last rows, specific number of rows and columns in a dataframe using df.head, df.tail, and df.columns, providing practical python list analogy and keyboard shortcuts for efficient use of jupyter notebook.', 'chapters': [{'end': 516.914, 'start': 312.245, 'title': 'Printing and accessing data in dataframe', 'summary': 'Explains how to print initial and last rows, specific number of rows, and columns in a dataframe, demonstrating the use of df.head, df.tail, and df.columns, providing a practical python list analogy, along with keyboard shortcuts for efficient use of jupyter notebook.', 'duration': 204.669, 'highlights': ['The chapter explains how to print initial and last rows, specific number of rows, and columns in a DataFrame', 'Demonstrates the use of df.head, df.tail, and df.columns', 'Provides a practical Python list analogy', 'Provides keyboard shortcuts for efficient use of Jupyter Notebook']}], 'duration': 204.669, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU312245.jpg', 'highlights': ['Demonstrates the use of df.head, df.tail, and df.columns', 'Provides keyboard shortcuts for efficient use of Jupyter Notebook', 'Provides a practical Python list analogy', 'Explains how to print initial and last rows, specific number of rows, and columns in a DataFrame']}, {'end': 738.615, 'segs': [{'end': 581.365, 'src': 'embed', 'start': 516.914, 'weight': 3, 'content': [{'end': 521.436, 'text': "all right, okay, now let's do the type.", 'start': 516.914, 'duration': 4.522}, {'end': 533.032, 'text': 'so the type of my DF event column is series.', 'start': 521.436, 'duration': 11.596}, {'end': 540.616, 'text': 'so remember that the columns in your data frames are basically of type pandas series.', 
'start': 533.032, 'duration': 7.584}, {'end': 544.858, 'text': 'okay, so this is just just a reference information.', 'start': 540.616, 'duration': 4.242}, {'end': 553.082, 'text': 'now we saw that when you print, when you type df, it prints the entire table or entire data frame.', 'start': 544.858, 'duration': 8.224}, {'end': 556.571, 'text': 'What if you want to print only few columns?', 'start': 554.143, 'duration': 2.428}, {'end': 567.656, 'text': 'In order to do that you can do in bracket, one more bracket and then only mention your columns that you want to print.', 'start': 557.214, 'duration': 10.442}, {'end': 571.918, 'text': "so let's say i want to print event along with day.", 'start': 567.656, 'duration': 4.262}, {'end': 575.841, 'text': 'okay, so if you do that, it will print event and day.', 'start': 571.918, 'duration': 3.923}, {'end': 581.365, 'text': 'so this way you can print however many columns that you like.', 'start': 575.841, 'duration': 5.524}], 'summary': 'Data frame columns are of type pandas series. 
print specific columns using brackets.', 'duration': 64.451, 'max_score': 516.914, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU516914.jpg'}, {'end': 638.842, 'src': 'embed', 'start': 608.118, 'weight': 1, 'content': [{'end': 621.404, 'text': 'so when you have weather data like this, the thing that comes to your mind first is what was the maximum temperature in the given data set?', 'start': 608.118, 'duration': 13.286}, {'end': 627.871, 'text': 'okay, so if you want to do that, you will say DF temperature.', 'start': 621.404, 'duration': 6.467}, {'end': 637.2, 'text': 'so remember, DF temperature prints the entire temperature column, and if you want to find the maximum from all this number,', 'start': 627.871, 'duration': 9.329}, {'end': 638.842, 'text': 'all you do is just say max.', 'start': 637.2, 'duration': 1.642}], 'summary': 'Analyzing weather data to find maximum temperature using df temperature and max.', 'duration': 30.724, 'max_score': 608.118, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU608118.jpg'}, {'end': 703.004, 'src': 'heatmap', 'start': 676.375, 'weight': 0.748, 'content': [{'end': 679.998, 'text': 'okay, you can also print standard deviation.', 'start': 676.375, 'duration': 3.623}, {'end': 687.743, 'text': 'okay, you can also, uh, say df dot, describe.', 'start': 679.998, 'duration': 7.745}, {'end': 691.016, 'text': 'Now what describe is going to do?', 'start': 688.775, 'duration': 2.241}, {'end': 695.179, 'text': "is it's going to print the statistics on your data set?", 'start': 691.016, 'duration': 4.163}, {'end': 699.202, 'text': 'So you can see we have temperature and wind speed column.', 'start': 695.219, 'duration': 3.983}, {'end': 703.004, 'text': 'And these two columns contains integer data.', 'start': 699.802, 'duration': 3.202}], 'summary': 'The data set contains temperature and wind speed columns with integer 
data.', 'duration': 26.629, 'max_score': 676.375, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU676375.jpg'}, {'end': 738.615, 'src': 'embed', 'start': 707.607, 'weight': 0, 'content': [{'end': 709.809, 'text': 'So you can see things like count.', 'start': 707.607, 'duration': 2.202}, {'end': 717.637, 'text': 'mean, standard deviation mean, and then these are percentile.', 'start': 710.649, 'duration': 6.988}, {'end': 721.582, 'text': 'So 25 percentile, 50 percentile, and max.', 'start': 717.838, 'duration': 3.744}, {'end': 730.533, 'text': 'This function is pretty good in terms of quickly printing the statistics on your data.', 'start': 723.444, 'duration': 7.089}, {'end': 738.615, 'text': "now let's look at how can you conditionally select the data in your data frame.", 'start': 731.63, 'duration': 6.985}], 'summary': 'Describes functions to print statistics and select data in a data frame.', 'duration': 31.008, 'max_score': 707.607, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU707607.jpg'}], 'start': 516.914, 'title': 'Analyzing weather data in python', 'summary': 'Covers working with pandas dataframes and introduces how to analyze weather data in python, including understanding columns as pandas series, finding maximum temperature, calculating mean, minimum, and standard deviation, and using the describe function to obtain statistics.', 'chapters': [{'end': 608.118, 'start': 516.914, 'title': 'Working with pandas dataframes', 'summary': 'Covers working with pandas dataframes, including understanding the type of columns as pandas series, printing specific columns from a dataframe, and restricting the display of columns for analysis.', 'duration': 91.204, 'highlights': ['Understanding that columns in data frames are of type pandas series is important for working with Pandas DataFrames.', 'Printing specific columns from a DataFrame can be achieved 
by specifying the column names within brackets.', 'Restricting the display of columns in a DataFrame is useful when dealing with data analysis.']}, {'end': 738.615, 'start': 608.118, 'title': 'Analyzing weather data in python', 'summary': 'Introduces how to analyze weather data in python, including finding the maximum temperature, calculating mean, minimum, and standard deviation, and using the describe function to obtain statistics on the data set.', 'duration': 130.497, 'highlights': ["The 'max' function is used to find the maximum temperature in the given data set, providing a quick way to obtain this key metric.", "The 'describe' function prints statistics on the data set, including count, mean, standard deviation, percentiles, and max, offering a rapid overview of the data's characteristics.", 'The chapter also covers calculating the mean, minimum, and standard deviation of the data, providing a comprehensive set of statistical measures for analysis.']}], 'duration': 221.701, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU516914.jpg', 'highlights': ["The 'describe' function prints statistics on the data set, including count, mean, standard deviation, percentiles, and max, offering a rapid overview of the data's characteristics.", "The 'max' function is used to find the maximum temperature in the given data set, providing a quick way to obtain this key metric.", 'The chapter also covers calculating the mean, minimum, and standard deviation of the data, providing a comprehensive set of statistical measures for analysis.', 'Understanding that columns in data frames are of type pandas series is important for working with Pandas DataFrames.', 'Printing specific columns from a DataFrame can be achieved by specifying the column names within brackets.']}, {'end': 847.888, 'segs': [{'end': 847.888, 'src': 'heatmap', 'start': 785.378, 'weight': 0, 'content': [{'end': 795.615, 'text': 'Okay, what if you want to 
get a row where your temperature was maximum?', 'start': 785.378, 'duration': 10.237}, {'end': 805.738, 'text': 'so in that case you will say okay, give me DF where my DF dot temperature was maximum.', 'start': 795.615, 'duration': 10.123}, {'end': 810.98, 'text': 'so DF dot temperature dot max.', 'start': 805.738, 'duration': 5.242}, {'end': 814.061, 'text': 'you do that, you will see.', 'start': 812.681, 'duration': 1.38}, {'end': 824.423, 'text': "it's printing that particular row, okay, so you can use this syntax, or you can even do this, okay, whichever way you prefer.", 'start': 814.061, 'duration': 10.362}, {'end': 834.965, 'text': 'sometimes what happens is in your comma separated file, the column name might contain, might contain spaces like this, and if it contains spaces,', 'start': 824.423, 'duration': 10.542}, {'end': 839.426, 'text': 'then you will have to use this syntax.', 'start': 834.965, 'duration': 4.461}, {'end': 847.888, 'text': 'okay, So you can run operations like these to conditionally select data in your data frame.', 'start': 839.426, 'duration': 8.462}], 'summary': 'You can conditionally select data in a data frame by finding the maximum temperature and using specific syntax, even with columns containing spaces.', 'duration': 83.975, 'max_score': 785.378, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU785378.jpg'}], 'start': 738.615, 'title': 'Data frame conditional selection', 'summary': 'Explores how data frames in python support sql-like query operations, enabling users to select data based on conditions and perform operations such as selecting rows with temperatures greater than or equal to 32 and obtaining the row with the maximum temperature.', 'chapters': [{'end': 847.888, 'start': 738.615, 'title': 'Data frame conditional selection', 'summary': 'Explores how data frames in python support sql-like query operations, enabling users to select data based on conditions and 
perform operations such as selecting rows with temperatures greater than or equal to 32 and obtaining the row with the maximum temperature.', 'duration': 109.273, 'highlights': ['Data frame supports SQL-like operations for conditional data selection, allowing queries based on specific conditions, exemplified by selecting rows with temperatures greater than or equal to 32, resulting in three rows being printed.', 'Demonstrating the ability to select the row with the maximum temperature using the syntax DF.temperature.max, showcasing the practical application of conditional data selection in data frames.', 'Explaining the syntax required for column names containing spaces in comma-separated files, highlighting the necessity of using a specific syntax in such cases for operations like conditional data selection.']}], 'duration': 109.273, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU738615.jpg', 'highlights': ['Data frame supports SQL-like operations for conditional data selection, exemplified by selecting rows with temperatures greater than or equal to 32.', 'Demonstrating the ability to select the row with the maximum temperature using the syntax DF.temperature.max.', 'Explaining the syntax required for column names containing spaces in comma-separated files.']}, {'end': 1256.989, 'segs': [{'end': 946.405, 'src': 'embed', 'start': 877.264, 'weight': 0, 'content': [{'end': 889.036, 'text': 'Okay, so when you do that, it says that maximum temperature was 35 in my data set and the day was 2nd January 2017.', 'start': 877.264, 'duration': 11.772}, {'end': 893.157, 'text': 'now, as far as operations are concerned, i only covered few.', 'start': 889.036, 'duration': 4.121}, {'end': 896.998, 'text': 'i cover like min max, standard deviation and describe.', 'start': 893.157, 'duration': 3.841}, {'end': 899.619, 'text': 'but pandas has whole bunch of operations.', 'start': 896.998, 'duration': 2.621}, {'end': 
913.062, 'text': "so if you do pandas uh, operations, then you will, uh, find a, let's say, do pandas series operations.", 'start': 899.619, 'duration': 13.443}, {'end': 918.864, 'text': 'then you will find a documentation where you will see all kind of operations.', 'start': 913.982, 'duration': 4.882}, {'end': 923.767, 'text': 'so we looked at mean here, so mean median max.', 'start': 918.864, 'duration': 4.903}, {'end': 931.431, 'text': "this is the entire list of operations that you can do and you can see it's pretty intense here.", 'start': 923.767, 'duration': 7.664}, {'end': 940.055, 'text': "the list is pretty long and you can do almost all kind of statistics on panda's data frame and series.", 'start': 931.431, 'duration': 8.624}, {'end': 946.405, 'text': 'next thing we are going to look into is a set index.', 'start': 941.34, 'duration': 5.065}], 'summary': 'In the data set, the maximum temperature was 35 on 2nd january 2017. pandas offers a wide range of operations for statistics on data frames and series.', 'duration': 69.141, 'max_score': 877.264, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU877264.jpg'}, {'end': 1012.367, 'src': 'embed', 'start': 977.453, 'weight': 2, 'content': [{'end': 988.733, 'text': "okay now, if you want to, let's say, change this index to be, let's say something else, for example day.", 'start': 977.453, 'duration': 11.28}, {'end': 996.69, 'text': 'okay, then you can use DF dot set index and say day.', 'start': 988.733, 'duration': 7.957}, {'end': 1004.503, 'text': 'okay, when you do that, what happens is you notice the difference between this data frame here and this here.', 'start': 998.399, 'duration': 6.104}, {'end': 1012.367, 'text': 'your data, the index, was your integer index 0 to 5, and now the index is your actual date.', 'start': 1004.503, 'duration': 7.864}], 'summary': 'Changing index to date using df.set_index.', 'duration': 34.914, 'max_score': 977.453, 
'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU977453.jpg'}, {'end': 1114.716, 'src': 'heatmap', 'start': 1049.917, 'weight': 5, 'content': [{'end': 1055.44, 'text': 'well, the thing is, when you run the set_index command, it returns a new data frame.', 'start': 1049.917, 'duration': 5.523}, {'end': 1058.922, 'text': "it doesn't modify the original one, okay.", 'start': 1055.44, 'duration': 3.482}, {'end': 1067.676, 'text': 'so to modify the original one you have to say inplace equal to True, okay.', 'start': 1058.922, 'duration': 8.754}, {'end': 1077.102, 'text': 'and when you say inplace equal to True and then print df, you will see my df has my actual date as an index.', 'start': 1067.676, 'duration': 9.426}, {'end': 1081.525, 'text': 'okay, now I can use date as an index.', 'start': 1077.102, 'duration': 4.423}, {'end': 1082.085, 'text': 'okay, so 1, 3, 2017.', 'start': 1081.525, 'duration': 0.56}, {'end': 1083.606, 'text': 'so this will give me this particular row.', 'start': 1082.085, 'duration': 1.521}, {'end': 1095.732, 'text': 'okay, now, if you want to reset your index to be the original one,', 'start': 1089.198, 'duration': 6.534}, {'end': 1114.716, 'text': 'then you can call the df.reset_index function, and that function also requires you to pass inplace equal to True, and after that,', 'start': 1095.732, 'duration': 18.984}], 'summary': 'Running set_index returns a new data frame; use inplace=True to modify the original one.', 'duration': 32.168, 'max_score': 1049.917, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU1049917.jpg'}, {'end': 1233.669, 'src': 'embed', 'start': 1202.35, 'weight': 7, 'content': [{'end': 1205.972, 'text': 'you can get the values associated with it, okay.', 'start': 1202.35, 'duration': 3.622}, {'end': 1209.014, 'text': "so that's all I had for this tutorial.", 'start': 1205.972, 
'duration': 3.042}, {'end': 1211.876, 'text': 'this is just a brief introduction to data frame.', 'start': 1209.014, 'duration': 2.862}, {'end': 1221.601, 'text': "data frame has many features that I have not covered in this tutorial, but I'm going to cover them in my future tutorials.", 'start': 1211.876, 'duration': 9.725}, {'end': 1227.025, 'text': 'our next tutorial is going to be different ways of creating a data frame.', 'start': 1221.601, 'duration': 5.424}, {'end': 1233.669, 'text': 'in this tutorial we saw how to create a data frame using the read_csv function and a dictionary,', 'start': 1227.025, 'duration': 6.644}], 'summary': 'Introduction to data frame, more features to be covered in future tutorials.', 'duration': 31.319, 'max_score': 1202.35, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU1202350.jpg'}], 'start': 848.228, 'title': 'Pandas data operations and data frame manipulation', 'summary': 'Covers printing specific columns and statistical operations in pandas, as well as changing data frame index to date for easier manipulation. 
it also explains manipulating data frames including index setting and resetting, highlighting the importance of the in place parameter and potential errors.', 'chapters': [{'end': 946.405, 'start': 848.228, 'title': 'Pandas data operations', 'summary': 'Covers printing specific columns, such as day and temperature, and explores various operations in pandas including mean, median, and max, with a wide range of statistical operations available for data frames and series.', 'duration': 98.177, 'highlights': ['The chapter covers printing specific columns, such as day and temperature, and explores various operations in Pandas including mean, median, and max, with a wide range of statistical operations available for data frames and series.', 'The maximum temperature in the dataset was 35 degrees Celsius, recorded on 2nd January 2017.', 'Pandas offers a comprehensive list of operations including mean, median, max, and a wide range of statistical operations for data frames and series.']}, {'end': 1049.917, 'start': 946.405, 'title': 'Changing data frame index to date', 'summary': 'Explains how to change the index of a data frame from integer to date, allowing for easier data manipulation and access using the loc function, demonstrated with an example of transforming an integer index to a date index.', 'duration': 103.512, 'highlights': ['The advantage of changing the index to date is the ability to use the loc function to access specific rows based on dates, enhancing data manipulation and retrieval.', 'Demonstrating the process of changing the data frame index from integer to date using the set_index function, which allows for easy access to data based on the new index.', 'Explanation of the default integer index assigned to a data frame and the process of changing it to a more meaningful index, such as dates, for better data organization and access.']}, {'end': 1256.989, 'start': 1049.917, 'title': 'Manipulating data frames in pandas', 'summary': 'Explains how to 
manipulate data frames in pandas, including setting and resetting the index, and highlights the importance of the inplace parameter and the errors that can occur. It also briefly mentions the upcoming tutorial on different ways of creating data frames.', 'duration': 207.072, 'highlights': ['The importance of the inplace parameter when modifying the original data frame is emphasized, illustrated by setting and resetting the index, along with the errors that can result from executing the same statement multiple times.', 'The tutorial briefly mentions the upcoming topics on different ways of creating data frames, indicating the scope of future tutorials and the comprehensive coverage of data frame features.']}], 'duration': 408.761, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/F6kmIpWWEdU/pics/F6kmIpWWEdU848228.jpg', 'highlights': ['The maximum temperature in the dataset was 35 degrees Celsius, recorded on 2nd January 2017.', 'The chapter covers printing specific columns, such as day and temperature, and explores various operations in Pandas including mean, median, and max.', 'The advantage of changing the index to date is the ability to use the loc function to access specific rows based on dates, enhancing data manipulation and retrieval.', 'Pandas offers a comprehensive list of operations including mean, median, max, and a wide range of statistical operations for data frames and series.', 'Demonstrating the process of changing the data frame index from integer to date using the set_index function, which allows for easy access to data based on the new index.', 'The importance of the inplace parameter when modifying the original data frame is emphasized, illustrated by setting and resetting the index, along with the errors that can result from executing the same statement multiple times.', 'Explanation of the default integer index assigned to a data frame and the process of changing it to a more meaningful index, such as dates, for 
better data organization and access.', 'The tutorial briefly mentions the upcoming topics on different ways of creating data frames, indicating the scope of future tutorials and the comprehensive coverage of data frame features.']}], 'highlights': ["The 'describe' function prints statistics on the data set, including count, mean, standard deviation, percentiles, and max, offering a rapid overview of the data's characteristics.", 'Data frame supports SQL-like operations for conditional data selection, exemplified by selecting rows with temperatures greater than or equal to 32.', 'The maximum temperature in the dataset was 35 degrees Celsius, recorded on 2nd January 2017.', 'Creating a data frame using pandas module and reading CSV file', 'Demonstrates the use of df.head, df.tail, and df.columns', 'The tutorial emphasizes the importance of data frame in pandas and provides guidance on using Jupyter notebook and anaconda for installation.', "The 'max' function is used to find the maximum temperature in the given data set, providing a quick way to obtain this key metric.", 'Understanding that columns in data frames are of type pandas series is important for working with Pandas DataFrames.', 'The advantage of changing the index to date is the ability to use the loc function to access specific rows based on dates, enhancing data manipulation and retrieval.', 'Jupyter notebook is recommended for data visualization, and its installation is facilitated by anaconda, which also installs pandas.']}
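
The set_index / reset_index walkthrough in the transcript above can be sketched roughly as follows. This is a minimal illustration, not the tutorial's actual notebook: the three-row data frame, its values, and the '1/3/2017' date format are assumptions made up for the example, with column names (day, temperature, event) chosen to match what the video discusses.

```python
import pandas as pd

# Hypothetical sample data; the tutorial reads a weather CSV instead.
df = pd.DataFrame({
    "day": ["1/1/2017", "1/2/2017", "1/3/2017"],
    "temperature": [32, 35, 28],
    "event": ["Rain", "Sunny", "Snow"],
})

# set_index returns a NEW data frame; the original df is unchanged.
indexed_copy = df.set_index("day")
still_has_day_column = "day" in df.columns

# Passing inplace=True modifies df itself.
df.set_index("day", inplace=True)

# With the date as the index, loc can look a row up by its date label.
row = df.loc["1/3/2017"]

# Running the same in-place statement a second time fails, because
# "day" is now the index and no longer a column.
try:
    df.set_index("day", inplace=True)
    second_call_failed = False
except KeyError:
    second_call_failed = True

# reset_index moves "day" back into a column and restores the
# default integer index.
df.reset_index(inplace=True)
```

An alternative to inplace=True is to reassign the result, for example df = df.set_index("day"), which has the same effect on the variable and is safe to read at a glance.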