title
IO Basics - p.3 Data Analysis with Python and Pandas Tutorial
description
Welcome to Part 3 of Data Analysis with Pandas and Python. In this tutorial, we will begin discussing IO, or input/output, with Pandas, starting with a realistic use case. To get ample practice, a very useful website is Quandl, which hosts a plethora of free and paid data sources. What makes it great is that the data is generally normalized, it's all in one place, and every dataset is extracted the same way. If you are using Python and you access the Quandl data via their simple module, the data is automatically returned as a dataframe. For this tutorial, though, we're going to manually download a CSV file instead, since not every data source you find is going to have a nice and neat module for extracting its datasets.
Let's say we're interested in maybe purchasing or selling a home in Austin, Texas. The zip code used here is 77006 (strictly speaking a Houston zip code, though the tutorial treats it as Austin). We could go to the local housing listings and see what the current prices are, but that doesn't give us any historical information, so let's try to get some data on this. Let's query for "home value index 77006." Sure enough, we can see an index here. There's top, middle, and lower tier, three bedroom, and so on. Let's say we have a three-bedroom house and check that one out. It turns out Quandl already provides graphs, but let's grab the dataset anyway, make our own graph, and maybe do some other analysis. Go to download, and choose CSV. Pandas is capable of IO with csv, excel data, hdf, sql, json, msgpack, html, gbq, stata, clipboard, and pickle data, and the list continues to grow. Check out the IO Tools documentation for the current list. Take that CSV and move it into the local directory (the directory that you are currently working in, where this .py script is).
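Once the CSV sits next to the script, reading it should be roughly a one-liner. Here is a minimal sketch of that step. The filename `home_values.csv` and the sample rows are made up for illustration (the real download will have its own name and values), and the file is written to a temporary directory only so the example runs on its own.

```python
import os
import tempfile

import pandas as pd

# Hypothetical rows standing in for the downloaded home value index CSV.
sample_csv = (
    "Date,Value\n"
    "2015-06-30,502300\n"
    "2015-05-31,501500\n"
    "2015-04-30,500100\n"
)

with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical filename; use the name of the file you actually downloaded.
    csv_path = os.path.join(workdir, "home_values.csv")
    with open(csv_path, "w") as f:
        f.write(sample_csv)

    # The actual IO step: one call, and the data is in a dataframe.
    df = pd.read_csv(csv_path)
    print(df.head())
```

If you run the script from a shell or from another working directory, you may need to pass the full path to `pd.read_csv` rather than just the filename.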
sample code and text-based write up for this tutorial: http://pythonprogramming.net/input-output-data-analysis-python-pandas-tutorial/
http://pythonprogramming.net
https://twitter.com/sentdex
detail
{'title': 'IO Basics - p.3 Data Analysis with Python and Pandas Tutorial', 'heatmap': [{'end': 530.853, 'start': 505.46, 'weight': 1}, {'end': 981.71, 'start': 947.809, 'weight': 0.737}], 'summary': 'Tutorial covers pandas i-o for handling various data formats, exporting and downloading data, managing csv data, and includes practical examples and code demonstrations.', 'chapters': [{'end': 178.6, 'segs': [{'end': 53.58, 'src': 'embed', 'start': 23.608, 'weight': 0, 'content': [{'end': 26.449, 'text': 'And then we showed in the second video doing it with a dictionary.', 'start': 23.608, 'duration': 2.841}, {'end': 29.153, 'text': 'Again, no fancy IOs required.', 'start': 27.009, 'duration': 2.144}, {'end': 38.187, 'text': "But when you start thinking about how you're going to handle a CSV, a text, the HDF file, XLS and so on, or HTML SQL,", 'start': 29.313, 'duration': 8.874}, {'end': 48.536, 'text': 'Each file has to be handled a little different, but with pandas, typically the input output is still gonna be just one line.', 'start': 40.208, 'duration': 8.328}, {'end': 53.58, 'text': "it's gonna be into a data frame and you're ready to go no different than how you would handle anything else.", 'start': 48.536, 'duration': 5.044}], 'summary': 'Pandas simplifies handling various file types, providing ease and consistency with just one line for input and output.', 'duration': 29.972, 'max_score': 23.608, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko23608.jpg'}, {'end': 110.087, 'src': 'embed', 'start': 85.534, 'weight': 1, 'content': [{'end': 91.339, 'text': "So anyway, you might want to think about making an account with them because we're going to use them relatively quickly, but not for this tutorial.", 'start': 85.534, 'duration': 5.805}, {'end': 98.158, 'text': 'So when you go to Quandl, basically what they have is just data sets galore for everything.', 'start': 92.273, 'duration': 5.885}, {'end': 
104.062, 'text': "And what's nice is their data sets are pretty well normalized.", 'start': 100.379, 'duration': 3.683}, {'end': 110.087, 'text': 'Quandl also has an API, or a module, well an API, but also a module that works with their API.', 'start': 104.182, 'duration': 5.905}], 'summary': 'Consider making an account with quandl for accessing well-normalized data sets and their api.', 'duration': 24.553, 'max_score': 85.534, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko85534.jpg'}, {'end': 158.182, 'src': 'embed', 'start': 128.943, 'weight': 2, 'content': [{'end': 133.365, 'text': "First, let's say you're looking for a house in Austin, Texas.", 'start': 128.943, 'duration': 4.422}, {'end': 135.326, 'text': 'Maybe we would look for housing prices 77006.', 'start': 133.825, 'duration': 1.501}, {'end': 137.507, 'text': "That's the zip code for Austin.", 'start': 135.326, 'duration': 2.181}, {'end': 144.032, 'text': 'Now, at least I think it is.', 'start': 141.73, 'duration': 2.302}, {'end': 145.513, 'text': "We'll find out soon enough.", 'start': 144.592, 'duration': 0.921}, {'end': 147.394, 'text': 'So here we go.', 'start': 146.854, 'duration': 0.54}, {'end': 152.498, 'text': "We've got Zillow home value index median price for that zip code.", 'start': 147.414, 'duration': 5.084}, {'end': 155.08, 'text': "That's from 2008 onward.", 'start': 152.678, 'duration': 2.402}, {'end': 156.701, 'text': "I'm hoping we can find a longer one.", 'start': 155.1, 'duration': 1.601}, {'end': 158.182, 'text': 'Not really.', 'start': 157.642, 'duration': 0.54}], 'summary': 'Searching for housing prices in zip code 77006, austin, texas using zillow.', 'duration': 29.239, 'max_score': 128.943, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko128943.jpg'}], 'start': 1.718, 'title': 'Pandas i-o data analysis', 'summary': 'Covers pandas i-o for handling multiple data 
formats, using the example of accessing housing price data from quandl, preparing for future api usage, and exploring home value index for a specific zip code in austin, texas.', 'chapters': [{'end': 178.6, 'start': 1.718, 'title': 'Pandas i-o data analysis', 'summary': 'Covers pandas i-o for handling multiple data formats, using the example of accessing housing price data from quandl, preparing for future api usage, and exploring home value index for a specific zip code in austin, texas.', 'duration': 176.882, 'highlights': ['Pandas I-O allows input/output for multiple data formats like CSV, text, HDF file, XLS, HTML, and SQL. Pandas I-O enables handling various data formats such as CSV, text, HDF file, XLS, HTML, and SQL, with input/output typically requiring just one line.', 'Using Quandl for accessing housing price data, initially without requiring an account, and the availability of normalized data sets. The tutorial introduces using Quandl for accessing housing price data, initially without needing an account, and highlights the availability of normalized data sets.', 'Exploring home value index for a specific zip code in Austin, Texas, and observing recent significant home price increases. 
The chapter delves into exploring the home value index for a specific zip code in Austin, Texas, and notes the significant recent increases in home prices.']}], 'duration': 176.882, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko1718.jpg', 'highlights': ['Pandas I-O enables handling various data formats such as CSV, text, HDF file, XLS, HTML, and SQL, with input/output typically requiring just one line.', 'Using Quandl for accessing housing price data, initially without needing an account, and highlights the availability of normalized data sets.', 'The chapter delves into exploring the home value index for a specific zip code in Austin, Texas, and notes the significant recent increases in home prices.']}, {'end': 386.798, 'segs': [{'end': 207.393, 'src': 'embed', 'start': 178.76, 'weight': 0, 'content': [{'end': 183.164, 'text': 'Export data, this would be from any of the.', 'start': 178.76, 'duration': 4.404}, {'end': 184.606, 'text': 'First of all, you have the code here.', 'start': 183.164, 'duration': 1.442}, {'end': 190.176, 'text': "and then this would be if you're using say you're using the python library, which we'll talk about later, you don't have to worry about that now,", 'start': 185.151, 'duration': 5.025}, {'end': 196.202, 'text': "but you would click on this and bam, that's the code you would get, and it automatically stuffs it into a panda's data frame.", 'start': 190.176, 'duration': 6.026}, {'end': 198.264, 'text': 'awesome now.', 'start': 196.202, 'duration': 2.062}, {'end': 202.548, 'text': "um, we'll leave this aside for now, and what we're looking for is download.", 'start': 198.264, 'duration': 4.284}, {'end': 207.393, 'text': "we're going to go to download, and then you can choose xml, json, excel, csv.", 'start': 202.548, 'duration': 4.845}], 'summary': 'Export data using code, then download in xml, json, excel, or csv.', 'duration': 28.633, 'max_score': 178.76, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko178760.jpg'}, {'end': 267.199, 'src': 'embed', 'start': 216.495, 'weight': 2, 'content': [{'end': 223.14, 'text': "Drag over the tutorials right here and I'm just going to click and drag this into here.", 'start': 216.495, 'duration': 6.645}, {'end': 235.362, 'text': "Now, the next thing that I do want to cover actually, let's go to our friend google.com and let's say you are curious about something with pandas.", 'start': 224.26, 'duration': 11.102}, {'end': 238.543, 'text': 'Well, pandas actually has relatively decent documentation.', 'start': 235.382, 'duration': 3.161}, {'end': 244.025, 'text': 'So we could say pandas.io and here you go, you got your IO tools here.', 'start': 239.003, 'duration': 5.022}, {'end': 249.786, 'text': 'These are basically all of the data types that we can read in pickle, clipboard, strata, GBQ or Stata.', 'start': 244.465, 'duration': 5.321}, {'end': 256.471, 'text': 'HTML, message packages on SQL, HDF, Excel, CSV, and you can even put it back out.', 'start': 250.426, 'duration': 6.045}, {'end': 258.531, 'text': 'So this is to read from, this is to.', 'start': 256.551, 'duration': 1.98}, {'end': 261.094, 'text': 'So you can get all kinds of information here.', 'start': 259.593, 'duration': 1.501}, {'end': 263.496, 'text': 'Also, this is basically the documentation here.', 'start': 261.233, 'duration': 2.263}, {'end': 267.199, 'text': 'This is just all the stuff that Pandas can do for you.', 'start': 263.516, 'duration': 3.683}], 'summary': 'Tutorial demonstrates reading data types using pandas.io, including pickle, clipboard, strata, gbq, html, sql, hdf, excel, csv.', 'duration': 50.704, 'max_score': 216.495, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko216495.jpg'}, {'end': 313.095, 'src': 'embed', 'start': 287.544, 'weight': 1, 'content': [{'end': 293.927, 'text': "If you're on 
Linux and maybe you're executing from the shell or something like this, you may need to give the full path.", 'start': 287.544, 'duration': 6.383}, {'end': 299.69, 'text': "I'm going to specify just a local path, but you may need to give the full path depending on how you're executing this code.", 'start': 294.067, 'duration': 5.623}, {'end': 305.652, 'text': 'So moving this over, the first thing that we would do to read the CSV, well, first of all, we actually have to import pandas.', 'start': 300.35, 'duration': 5.302}, {'end': 308.613, 'text': 'So import pandas as pd as usual.', 'start': 305.712, 'duration': 2.901}, {'end': 313.095, 'text': "Then we're going to say df equals pd.read underscore CSV.", 'start': 308.633, 'duration': 4.462}], 'summary': 'On linux, specify full path when executing code. import pandas, read csv using pd.read_csv.', 'duration': 25.551, 'max_score': 287.544, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko287544.jpg'}, {'end': 386.798, 'src': 'embed', 'start': 360.894, 'weight': 3, 'content': [{'end': 367.444, 'text': "Boom Now, of course, we see that date is not our index, and we're being assigned this kind of meh index.", 'start': 360.894, 'duration': 6.55}, {'end': 373.649, 'text': 'So what we can do is we can close out of here, and we can do our df.setIndex.', 'start': 368.045, 'duration': 5.604}, {'end': 377.491, 'text': "And don't forget, we have to say inPlace equals true or redefine the data frame.", 'start': 373.869, 'duration': 3.622}, {'end': 381.594, 'text': 'So df.set underscore index.', 'start': 377.952, 'duration': 3.642}, {'end': 384.397, 'text': "We're going to set it to the date column.", 'start': 381.675, 'duration': 2.722}, {'end': 386.798, 'text': "And we're going to say inPlace equals true.", 'start': 384.877, 'duration': 1.921}], 'summary': 'Using df.setindex to set date as index with inplace=true.', 'duration': 25.904, 'max_score': 360.894, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko360894.jpg'}], 'start': 178.76, 'title': 'Data export, download, and data processing with pandas', 'summary': 'Covers the process of exporting and downloading data using python library pandas, and reading various data types such as pickle, clipboard, strata, gbq, stata, html, message packages, sql, hdf, excel, and csv, along with instructions on reading csv files and setting the index.', 'chapters': [{'end': 244.025, 'start': 178.76, 'title': 'Data export and download process', 'summary': 'Explains the process of exporting data and downloading it in various formats using python library pandas, and also mentions the availability of documentation.', 'duration': 65.265, 'highlights': ['The process of exporting data and downloading it in various formats using Python library pandas is explained, including the automatic conversion of code into a pandas data frame. Exporting data, downloading in various formats, automatic conversion of code into pandas data frame', 'The availability of documentation for the pandas library, particularly the IO tools, is mentioned. Mention of decent documentation for pandas library, reference to IO tools']}, {'end': 386.798, 'start': 244.465, 'title': 'Reading and processing data with pandas', 'summary': 'Covers reading various data types such as pickle, clipboard, strata, gbq, stata, html, message packages, sql, hdf, excel, and csv using pandas, with instructions on reading csv files and setting the index.', 'duration': 142.333, 'highlights': ['Pandas can read various data types including pickle, clipboard, strata, GBQ, Stata, HTML, message packages, SQL, HDF, Excel, and CSV, with the ability to output the data as well. 
Pandas can read a wide range of data types, including pickle, clipboard, strata, GBQ, Stata, HTML, message packages, SQL, HDF, Excel, and CSV, and has the capability to output the data as well.', "Instructions for reading a CSV file using Pandas, including importing the library, specifying the file path, and printing the data frame's head. The process of reading a CSV file using Pandas involves importing the library, specifying the file path, and printing the data frame's head to display the data.", 'Demonstrating how to set the index of the data frame to a specific column using df.set_index method. The tutorial illustrates how to set the index of the data frame to a specific column using the df.set_index method in Pandas.']}], 'duration': 208.038, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko178760.jpg', 'highlights': ['Covers the process of exporting and downloading data using Python library pandas in various formats, including automatic conversion of code into a pandas data frame.', "Instructions for reading a CSV file using Pandas, involving importing the library, specifying the file path, and printing the data frame's head.", 'Pandas can read various data types such as pickle, clipboard, strata, GBQ, Stata, HTML, message packages, SQL, HDF, Excel, and CSV, with the ability to output the data as well.', 'Demonstrates how to set the index of the data frame to a specific column using the df.set_index method in Pandas.', 'Mention of decent documentation for pandas library, particularly the IO tools.']}, {'end': 1031.222, 'segs': [{'end': 419.824, 'src': 'embed', 'start': 388.035, 'weight': 0, 'content': [{'end': 391.819, 'text': "And then let's go ahead and df.to underscore csv.", 'start': 388.035, 'duration': 3.784}, {'end': 398.906, 'text': "And we're going to save it to, we're just going to call it new csv2.csv.", 'start': 392.86, 'duration': 6.046}, {'end': 404.251, 'text': 'Okay So that will actually 
save to a csv our information.', 'start': 399.566, 'duration': 4.685}, {'end': 408.435, 'text': 'Let me pull up where we are.', 'start': 404.271, 'duration': 4.164}, {'end': 410.477, 'text': "Let's see the data analysis.", 'start': 408.455, 'duration': 2.022}, {'end': 411.098, 'text': 'There we go.', 'start': 410.717, 'duration': 0.381}, {'end': 412.459, 'text': "So we'll save and run that.", 'start': 411.538, 'duration': 0.921}, {'end': 419.023, 'text': 'And that outputs it to a CSV, which we can see new CSV2 here.', 'start': 414.72, 'duration': 4.303}, {'end': 419.824, 'text': 'We could open that up.', 'start': 419.063, 'duration': 0.761}], 'summary': 'Saving data to a new csv file named new_csv2.csv.', 'duration': 31.789, 'max_score': 388.035, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko388035.jpg'}, {'end': 530.853, 'src': 'heatmap', 'start': 493.677, 'weight': 3, 'content': [{'end': 501.199, 'text': "Well, HDF is a little different, but read HTML or, I don't know, Excel, CSV, whatever.", 'start': 493.677, 'duration': 7.522}, {'end': 505.3, 'text': 'You can specify the index column when you read it in.', 'start': 502.099, 'duration': 3.201}, {'end': 512.962, 'text': 'So we could take this, copy, paste, and then we could say index underscore call equals zero.', 'start': 505.46, 'duration': 7.502}, {'end': 514.063, 'text': "Now we're going to print it.", 'start': 512.982, 'duration': 1.081}, {'end': 515.624, 'text': "We'll print the old one and the new one.", 'start': 514.222, 'duration': 1.402}, {'end': 519.285, 'text': 'And then, so this is the one without it, reading from CSV2.', 'start': 516.424, 'duration': 2.861}, {'end': 520.307, 'text': 'This is the new one.', 'start': 519.586, 'duration': 0.721}, {'end': 525.851, 'text': "Bang So now we actually have the new, you know, we're specifying date as the index.", 'start': 520.626, 'duration': 5.225}, {'end': 530.853, 'text': 'Next, we have this value 
column.', 'start': 527.409, 'duration': 3.444}], 'summary': 'Using hdf, index column specified when reading in, resulting in new data with specified index and value column.', 'duration': 26.63, 'max_score': 493.677, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko493677.jpg'}, {'end': 585.096, 'src': 'embed', 'start': 561.38, 'weight': 4, 'content': [{'end': 571.886, 'text': 'If you want to rename every single column, the way that you would do that is you would say df.columns equals, and now we just have that one.', 'start': 561.38, 'duration': 10.506}, {'end': 574.208, 'text': "Don't forget, because we set the index.", 'start': 571.926, 'duration': 2.282}, {'end': 575.749, 'text': 'An index is not a column.', 'start': 574.348, 'duration': 1.401}, {'end': 578.311, 'text': "It looks like a column, but it's not a column anymore.", 'start': 575.789, 'duration': 2.522}, {'end': 580.653, 'text': "We've revoked its column status.", 'start': 578.351, 'duration': 2.302}, {'end': 585.096, 'text': 'So df.columns, now we have one column, so we just need to name one column.', 'start': 581.253, 'duration': 3.843}], 'summary': 'To rename every single column, use df.columns equals, as we have only one column now.', 'duration': 23.716, 'max_score': 561.38, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko561380.jpg'}, {'end': 644.1, 'src': 'embed', 'start': 613.439, 'weight': 1, 'content': [{'end': 617.722, 'text': 'Now, the next thing that we want to do is save it to a CSV file.', 'start': 613.439, 'duration': 4.283}, {'end': 619.644, 'text': "So let's say we want to save that.", 'start': 617.802, 'duration': 1.842}, {'end': 627.269, 'text': "So we would say df.to underscore CSV and we'll save this as new CSV3.csv.", 'start': 619.744, 'duration': 7.525}, {'end': 631.807, 'text': 'Fine and good.', 'start': 630.686, 'duration': 1.121}, {'end': 633.349, 'text': 'Cool We 
should have a new CSV.', 'start': 631.927, 'duration': 1.422}, {'end': 633.909, 'text': 'Here it is.', 'start': 633.409, 'duration': 0.5}, {'end': 635.351, 'text': 'New CSV 3.', 'start': 634.17, 'duration': 1.181}, {'end': 635.991, 'text': 'Open that up.', 'start': 635.351, 'duration': 0.64}, {'end': 638.934, 'text': 'New CSV 2 is refreshed.', 'start': 637.713, 'duration': 1.221}, {'end': 639.415, 'text': "That's okay.", 'start': 638.994, 'duration': 0.421}, {'end': 641.317, 'text': "Here's new CSV 3.", 'start': 639.715, 'duration': 1.602}, {'end': 644.1, 'text': 'And you can see it has this new column header as Austin HPI.', 'start': 641.317, 'duration': 2.783}], 'summary': "Data is saved to a new csv file named new_csv3.csv with a new column header 'austin hpi'.", 'duration': 30.661, 'max_score': 613.439, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko613439.jpg'}, {'end': 701.866, 'src': 'embed', 'start': 671.638, 'weight': 6, 'content': [{'end': 674.566, 'text': 'And now we actually have no headers there at all.', 'start': 671.638, 'duration': 2.928}, {'end': 675.549, 'text': "It's just the data.", 'start': 674.586, 'duration': 0.963}, {'end': 682.773, 'text': 'Cool Now, what if we want to read a CSV back in that has no header, right? Because the other ones have had headers.', 'start': 675.97, 'duration': 6.803}, {'end': 686.616, 'text': "So what if you're reading a CSV that has no header? 
So you would say something like this.", 'start': 682.813, 'duration': 3.803}, {'end': 693.1, 'text': 'Like, what if we have df equals pd dot read underscore CSV.', 'start': 686.796, 'duration': 6.304}, {'end': 696.562, 'text': "We're going to read from new CSV4 dot CSV.", 'start': 693.52, 'duration': 3.042}, {'end': 698.604, 'text': "And then we're going to say names.", 'start': 697.243, 'duration': 1.361}, {'end': 701.866, 'text': 'These are the names of the columns that we have.', 'start': 698.644, 'duration': 3.222}], 'summary': 'Demonstrating reading a csv with no headers using python pandas.', 'duration': 30.228, 'max_score': 671.638, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko671638.jpg'}, {'end': 745.77, 'src': 'embed', 'start': 723.955, 'weight': 2, 'content': [{'end': 732.318, 'text': 'Now we can print df.head, save and run that, pull this up, and that is from the most recent one.', 'start': 723.955, 'duration': 8.363}, {'end': 737.24, 'text': 'Without any headers, we named the columns whatever the heck we wanted, and we set the index to be zero.', 'start': 732.498, 'duration': 4.742}, {'end': 740.144, 'text': 'Cool So we brought all that information in.', 'start': 737.841, 'duration': 2.303}, {'end': 745.77, 'text': "Now, what we can do, however, is you can say you're reading a CSV.", 'start': 740.844, 'duration': 4.926}], 'summary': "Dataframe 'df.head' printed, columns named, and index set to zero.", 'duration': 21.815, 'max_score': 723.955, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko723955.jpg'}, {'end': 840.36, 'src': 'embed', 'start': 808.615, 'weight': 7, 'content': [{'end': 813.198, 'text': 'So, how would we do that? 
So, we would say df.to underscore HTML.', 'start': 808.615, 'duration': 4.583}, {'end': 817.641, 'text': 'And then we would convert that to example.html.', 'start': 813.998, 'duration': 3.643}, {'end': 819.282, 'text': 'Run that.', 'start': 818.902, 'duration': 0.38}, {'end': 821.993, 'text': 'And sure enough, there you have it.', 'start': 820.672, 'duration': 1.321}, {'end': 825.334, 'text': 'Example.html We can open that in Chrome.', 'start': 822.133, 'duration': 3.201}, {'end': 826.934, 'text': 'Where is Chrome? Here we go.', 'start': 825.754, 'duration': 1.18}, {'end': 828.095, 'text': 'Bring that over.', 'start': 827.435, 'duration': 0.66}, {'end': 828.955, 'text': 'And here it is.', 'start': 828.435, 'duration': 0.52}, {'end': 830.496, 'text': 'This is your data frame.', 'start': 829.495, 'duration': 1.001}, {'end': 832.577, 'text': "That's pretty cool.", 'start': 831.896, 'duration': 0.681}, {'end': 840.36, 'text': 'So anyway, and then if we open that, we can open that in Notepad++ so you can see it indeed converted it to HTML.', 'start': 833.157, 'duration': 7.203}], 'summary': 'Demonstration of converting a dataframe to html and opening it in chrome and notepad++.', 'duration': 31.745, 'max_score': 808.615, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko808615.jpg'}, {'end': 981.71, 'src': 'heatmap', 'start': 947.809, 'weight': 0.737, 'content': [{'end': 954.272, 'text': "Let's do 77006H underscore HPI.", 'start': 947.809, 'duration': 6.463}, {'end': 958.514, 'text': "And then we'll say in place equals true.", 'start': 955.433, 'duration': 3.081}, {'end': 961.675, 'text': 'And then finally print df.head.', 'start': 959.234, 'duration': 2.441}, {'end': 963.236, 'text': 'Save and run that.', 'start': 962.575, 'duration': 0.661}, {'end': 966.258, 'text': "boom, you've renamed a single column.", 'start': 964.076, 'duration': 2.182}, {'end': 967.439, 'text': 'We retained date.', 'start': 966.378, 
'duration': 1.061}, {'end': 974.584, 'text': "Now, of course, like I was saying, you would obviously want your date to be the index, but we only had two columns that we're working with.", 'start': 968.019, 'duration': 6.565}, {'end': 981.71, 'text': 'So I wanted to retain date just so you could see that you could rename one column at a time if you really wanted to.', 'start': 974.604, 'duration': 7.106}], 'summary': 'Renamed a single column in the dataframe and retained date.', 'duration': 33.901, 'max_score': 947.809, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko947809.jpg'}], 'start': 388.035, 'title': 'Managing csv data', 'summary': 'Covers saving and reading csv data, including setting the index column and renaming columns. it also demonstrates file conversion, handling csv files with and without headers, and converting dataframes to html format, with practical examples and code demonstrations.', 'chapters': [{'end': 543.306, 'start': 388.035, 'title': 'Saving and reading csv data', 'summary': 'Demonstrates saving data to a csv file, reading it back into a data frame, and setting the index column, with the key point that a csv file does not have the attribute index, and the method to specify the index column when reading it in.', 'duration': 155.271, 'highlights': ['Demonstrates saving data to a CSV file using df.to_csv method. The process of saving data to a CSV file is shown.', 'Shows how to read data from a CSV file using pd.read_csv method and print the dataframe head. The process of reading data from a CSV file and displaying the dataframe head is illustrated.', 'Explains the need to specify the index column when reading data from a CSV file. 
The importance of specifying the index column when reading data from a CSV file is highlighted.']}, {'end': 1031.222, 'start': 543.326, 'title': 'Renaming columns and file conversion', 'summary': 'Covers the process of renaming columns in a dataframe, saving the dataframe to a csv file, handling csv files with and without headers, and converting a dataframe to html format, providing practical examples and code demonstrations.', 'duration': 487.896, 'highlights': ["The process of renaming columns in a dataframe is demonstrated, including renaming a single column using the 'df.rename' method with 'inplace=True'. Demonstrates how to rename columns in a dataframe, and specifically how to rename a single column using the 'df.rename' method with 'inplace=True'.", "The process of saving a dataframe to a CSV file is explained, including the use of 'df.to_csv' method with different parameters like 'header=False' to exclude column headers in the saved file. Explains the process of saving a dataframe to a CSV file and demonstrates how to exclude column headers using the 'header=False' parameter in the 'df.to_csv' method.", "The handling of CSV files with and without headers is discussed, covering how to read a CSV file with no header and specifying column names using 'pd.read_csv' method with 'names' parameter. Discusses the process of handling CSV files with and without headers, including reading a CSV file with no header and specifying column names using 'pd.read_csv' method with 'names' parameter.", "The conversion of a dataframe to HTML format is demonstrated using the 'df.to_html' method, and the resulting HTML file is opened and displayed. 
Demonstrates the conversion of a dataframe to HTML format using the 'df.to_html' method and displays the resulting HTML file."]}], 'duration': 643.187, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9Z7wvippeko/pics/9Z7wvippeko388035.jpg', 'highlights': ['Covers saving and reading csv data with practical examples and code demonstrations.', 'Demonstrates saving data to a CSV file using df.to_csv method.', 'Shows how to read data from a CSV file using pd.read_csv method and print the dataframe head.', 'Explains the need to specify the index column when reading data from a CSV file.', "The process of renaming columns in a dataframe is demonstrated, including renaming a single column using the 'df.rename' method with 'inplace=True'.", "The process of saving a dataframe to a CSV file is explained, including the use of 'df.to_csv' method with different parameters like 'header=False' to exclude column headers in the saved file.", "The handling of CSV files with and without headers is discussed, covering how to read a CSV file with no header and specifying column names using 'pd.read_csv' method with 'names' parameter.", "The conversion of a dataframe to HTML format is demonstrated using the 'df.to_html' method, and the resulting HTML file is opened and displayed."]}], 'highlights': ['Pandas I-O enables handling various data formats such as CSV, text, HDF file, XLS, HTML, and SQL, with input/output typically requiring just one line.', 'Covers the process of exporting and downloading data using Python library pandas in various formats, including automatic conversion of code into a pandas data frame.', 'Covers saving and reading csv data with practical examples and code demonstrations.', 'Using Quandl for accessing housing price data, initially without needing an account, and highlights the availability of normalized data sets.', "Instructions for reading a CSV file using Pandas, involving importing the library, specifying the file path, and 
printing the data frame's head.", 'The chapter delves into exploring the home value index for a specific zip code in Austin, Texas, and notes the significant recent increases in home prices.', 'Pandas can read various data types such as pickle, clipboard, strata, GBQ, Stata, HTML, message packages, SQL, HDF, Excel, and CSV, with the ability to output the data as well.', 'Demonstrates how to set the index of the data frame to a specific column using the df.set_index method in Pandas.', 'Mention of decent documentation for pandas library, particularly the IO tools.', 'Demonstrates saving data to a CSV file using df.to_csv method.', 'Shows how to read data from a CSV file using pd.read_csv method and print the dataframe head.', 'Explains the need to specify the index column when reading data from a CSV file.', "The process of renaming columns in a dataframe is demonstrated, including renaming a single column using the 'df.rename' method with 'inplace=True'.", "The process of saving a dataframe to a CSV file is explained, including the use of 'df.to_csv' method with different parameters like 'header=False' to exclude column headers in the saved file.", "The handling of CSV files with and without headers is discussed, covering how to read a CSV file with no header and specifying column names using 'pd.read_csv' method with 'names' parameter.", "The conversion of a dataframe to HTML format is demonstrated using the 'df.to_html' method, and the resulting HTML file is opened and displayed."]}
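The steps walked through in the video (set the index, write a CSV, read it back with an index column, rename columns, write without headers, read with supplied names, export to HTML, rename a single column) can be sketched roughly as follows. This is an illustration under assumptions, not the exact code from the video: the sample rows are invented, the filenames `newcsv2.csv` and `newcsv4.csv` and the column names `Austin_HPI` and `77006_HPI` are guesses at the spellings spoken in the transcript, and everything is written to a temporary directory so the example cleans up after itself.

```python
import io
import os
import tempfile

import pandas as pd

# Hypothetical rows standing in for the downloaded CSV.
sample_csv = (
    "Date,Value\n"
    "2015-06-30,502300\n"
    "2015-05-31,501500\n"
    "2015-04-30,500100\n"
)
df = pd.read_csv(io.StringIO(sample_csv))

# Make Date the index; without inplace=True you must reassign the result.
df.set_index("Date", inplace=True)

with tempfile.TemporaryDirectory() as workdir:
    # Save to CSV. A CSV has no notion of an index, so Date is written
    # as an ordinary first column.
    path2 = os.path.join(workdir, "newcsv2.csv")  # assumed filename
    df.to_csv(path2)

    # Read it back, telling pandas that column 0 is the index.
    df2 = pd.read_csv(path2, index_col=0)

    # Rename every column at once. The index is not a column,
    # so only one name is needed here.
    df2.columns = ["Austin_HPI"]  # assumed spelling

    # Write with no header row at all, just the data.
    path4 = os.path.join(workdir, "newcsv4.csv")  # assumed filename
    df2.to_csv(path4, header=False)

    # Read a headerless CSV by supplying the column names ourselves.
    df4 = pd.read_csv(path4, names=["Date", "Austin_HPI"], index_col=0)

    # Export the dataframe as an HTML table.
    html_path = os.path.join(workdir, "example.html")
    df4.to_html(html_path)
    with open(html_path) as f:
        html_text = f.read()

    # Rename a single column, keeping Date as a regular column this time.
    df5 = pd.read_csv(path4, names=["Date", "Austin_HPI"])
    df5.rename(columns={"Austin_HPI": "77006_HPI"}, inplace=True)

print(df4.head())
print(df5.head())
```

Assigning to `df.columns` replaces every column name, so the list has to match the number of columns exactly, while `rename` with a mapping only touches the columns you list.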