title
Tutorial 7- Pandas-Reading JSON,Reading HTML, Read PICKLE, Read EXCEL Files- Part 3
description
Hello All,
Welcome to the Python Crash Course. In this video we will learn about the Pandas library and how to read JSON, HTML, pickle, and Excel files.
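As a quick sketch of the first technique covered in the video — the employee JSON below is a made-up stand-in for the file shown on screen, not the actual data used:

```python
import pandas as pd
from io import StringIO

# Hypothetical employee JSON: key/value pairs, just like in the video.
raw = '{"employee_name": ["James"], "email": ["james@gmail.com"], "job_profile": ["Team Lead"]}'

# read_json turns the top-level keys into DataFrame columns.
df = pd.read_json(StringIO(raw))
print(df)
```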
GitHub URL: https://github.com/krishnaik06/Machine-Learning-in-90-days
Support me on Patreon: https://www.patreon.com/join/2340909?
Connect with me here:
Twitter: https://twitter.com/Krishnaik06
Facebook: https://www.facebook.com/krishnaik06
Instagram: https://www.instagram.com/krishnaik06
If you like music, support my brother's channel:
https://www.youtube.com/channel/UCdupFqYIc6VMO-pXVlvmM4Q
Buy the best book on Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow from below
amazon url:
https://www.amazon.in/Hands-Machine-Learning-Scikit-Learn-Tensor/dp/9352135210/ref=as_sl_pc_qf_sp_asin_til?tag=krishnaik06-21&linkCode=w00&linkId=a706a13cecffd115aef76f33a760e197&creativeASIN=9352135210
You can buy my book on Finance with Machine Learning and Deep Learning from the URL below
amazon url: https://www.amazon.in/Hands-Python-Finance-implementing-strategies/dp/1789346371/ref=as_sl_pc_qf_sp_asin_til?tag=krishnaik06-21&linkCode=w00&linkId=ac229c9a45954acc19c1b2fa2ca96e23&creativeASIN=1789346371
Subscribe to my unboxing channel:
https://www.youtube.com/channel/UCjWY5hREA6FFYrthD0rZNIw
Below are the various playlists I have created on ML, Data Science, and Deep Learning. Please subscribe and support the channel. Happy learning!
Deep Learning Playlist: https://www.youtube.com/watch?v=DKSZHN7jftI&list=PLZoTAELRMXVPGU70ZGsckrMdr0FteeRUi
Data Science Projects playlist: https://www.youtube.com/watch?v=5Txi0nHIe0o&list=PLZoTAELRMXVNUcr7osiU7CCm8hcaqSzGw
NLP playlist: https://www.youtube.com/watch?v=6ZVf1jnEKGI&list=PLZoTAELRMXVMdJ5sqbCK2LiM0HhQVWNzm
Statistics Playlist: https://www.youtube.com/watch?v=GGZfVeZs_v4&list=PLZoTAELRMXVMhVyr3Ri9IQ-t5QPBtxzJO
Feature Engineering playlist: https://www.youtube.com/watch?v=NgoLMsaZ4HU&list=PLZoTAELRMXVPwYGE2PXD3x0bfKnR0cJjN
Computer Vision playlist: https://www.youtube.com/watch?v=mT34_yu5pbg&list=PLZoTAELRMXVOIBRx0andphYJ7iakSg3Lk
Data Science Interview Question playlist: https://www.youtube.com/watch?v=820Qr4BH0YM&list=PLZoTAELRMXVPkl7oRvzyNnyj1HS4wt2K-
🙏🙏🙏🙏🙏🙏🙏🙏
YOU JUST NEED TO DO
3 THINGS to support my channel
LIKE
SHARE
&
SUBSCRIBE
TO MY YOUTUBE CHANNEL
detail
{'title': 'Tutorial 7- Pandas-Reading JSON,Reading HTML, Read PICKLE, Read EXCEL Files- Part 3', 'heatmap': [{'end': 986.47, 'start': 944.321, 'weight': 0.775}], 'summary': "A tutorial video titled 'tutorial 7- pandas-reading json,reading html, read pickle, read excel files- part 3' covers using pandas to read and manipulate json data, emphasizing simplicity and practicality, converting json data to a dataframe and csv, using to_json function, creating json from a pandas dataframe, extracting tables from an html page, web scraping for data extraction, reading data from excel files, and pickling in pandas for data storage and its importance in machine learning algorithms.", 'chapters': [{'end': 129.312, 'segs': [{'end': 48.437, 'src': 'embed', 'start': 21.995, 'weight': 0, 'content': [{'end': 28.121, 'text': "Now we'll try to read some JSON data, or it may be a JSON file, any kind of JSON information.", 'start': 21.995, 'duration': 6.126}, {'end': 31.664, 'text': 'Let us see how we can basically read with the help of pandas again.', 'start': 28.121, 'duration': 3.543}, {'end': 34.146, 'text': 'Suppose, This is my JSON over here.', 'start': 32.104, 'duration': 2.042}, {'end': 41.152, 'text': 'Okay, So I have employee name and JSON is basically just like a key value pairs, a combination of key value pairs of JSON data.', 'start': 34.146, 'duration': 7.006}, {'end': 48.437, 'text': 'Now, here you can basically see employee underscore name is equal to James, email is equal to James at the rate gmail.com.', 'start': 41.792, 'duration': 6.645}], 'summary': 'Demonstration of reading json data using pandas for employee details.', 'duration': 26.442, 'max_score': 21.995, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/xx4Vkc5RrWY/pics/xx4Vkc5RrWY21995.jpg'}, {'end': 129.312, 'src': 'embed', 'start': 96.42, 'weight': 1, 'content': [{'end': 100.241, 'text': 'Instead, the last keyword that you see over here, it will be taken as this.', 'start': 
96.42, 'duration': 3.821}, {'end': 104.082, 'text': 'Whenever there is a nested structure, that will get displayed directly over here.', 'start': 100.721, 'duration': 3.361}, {'end': 109.943, 'text': 'But if you want to display this further down, we have to basically take up this particular data frame.', 'start': 104.682, 'duration': 5.261}, {'end': 112.744, 'text': 'read all the information, because these are just key value pairs.', 'start': 109.943, 'duration': 2.801}, {'end': 114.324, 'text': 'You can basically use dictionaries.', 'start': 112.784, 'duration': 1.54}, {'end': 119.065, 'text': 'In short, this can be also termed as dictionaries, right? Because I have key value pairs.', 'start': 114.624, 'duration': 4.441}, {'end': 121.686, 'text': 'I can read that and I can again create different columns.', 'start': 119.125, 'duration': 2.561}, {'end': 125.547, 'text': "And again, in the exploratory data analysis, I'll be showing you more examples.", 'start': 121.746, 'duration': 3.801}, {'end': 129.312, 'text': 'but understand that this is pretty much simple.', 'start': 126.207, 'duration': 3.105}], 'summary': 'Key value pairs can be displayed as nested structures or dictionaries, making data analysis simple.', 'duration': 32.892, 'max_score': 96.42, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/xx4Vkc5RrWY/pics/xx4Vkc5RrWY96420.jpg'}], 'start': 1, 'title': 'Pandas json data', 'summary': 'Covers using pandas to read and manipulate json data, detailing the process of reading json into a dataframe and handling nested structures, emphasizing simplicity and practicality.', 'chapters': [{'end': 129.312, 'start': 1, 'title': 'Pandas json data tutorial', 'summary': 'Covers how to read and manipulate json data using pandas, detailing the process of reading json data into a dataframe and handling nested structures, emphasizing the simplicity and practicality of the process.', 'duration': 128.312, 'highlights': ["The tutorial demonstrates how 
to read JSON data into a dataframe using 'pd.read_json', simplifying the process for data manipulation and analysis.", 'It also explains the handling of nested structures in JSON data, highlighting that the nested structure will be displayed directly in the dataframe, and provides insights into further manipulation using dictionaries.', 'The chapter emphasizes the simplicity of the process and hints at more examples to be covered in exploratory data analysis.']}], 'duration': 128.312, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/xx4Vkc5RrWY/pics/xx4Vkc5RrWY1000.jpg', 'highlights': ["The tutorial simplifies reading JSON into a dataframe with 'pd.read_json'.", 'It details handling nested structures, displaying them directly in the dataframe.', 'Emphasizes simplicity and hints at more examples in exploratory data analysis.']}, {'end': 690.507, 'segs': [{'end': 154.488, 'src': 'embed', 'start': 129.312, 'weight': 0, 'content': [{'end': 136.903, 'text': 'any kind of json information can be read through the help of read underscore json and it can be directly converted into a data frame.', 'start': 129.312, 'duration': 7.591}, {'end': 140.725, 'text': 'now the other thing is that suppose i want to also read.', 'start': 136.903, 'duration': 3.822}, {'end': 142.906, 'text': 'so there is something called as wine.data.', 'start': 140.725, 'duration': 2.181}, {'end': 145.586, 'text': 'i want to read from this particular data information.', 'start': 142.906, 'duration': 2.68}, {'end': 147.267, 'text': 'it is in json format.', 'start': 145.586, 'duration': 1.681}, {'end': 151.707, 'text': 'you just go to this particular url and you just download this dot data file.', 'start': 147.267, 'duration': 4.44}, {'end': 154.488, 'text': 'you will be seeing that it is basically in a json format.', 'start': 151.707, 'duration': 2.781}], 'summary': 'Json data can be read using read_json and converted into a dataframe. 
wine data is also available in json format.', 'duration': 25.176, 'max_score': 129.312, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/xx4Vkc5RrWY/pics/xx4Vkc5RrWY129312.jpg'}, {'end': 260.45, 'src': 'embed', 'start': 194.227, 'weight': 1, 'content': [{'end': 199.51, 'text': "if not, what i'll do is that in in the upcoming videos, i'm also going to show you the tutorial of mongodb.", 'start': 194.227, 'duration': 5.283}, {'end': 207.354, 'text': 'in that particular tutorial, we will be basically discussing how nested json can be basically extracted and loaded in,', 'start': 199.51, 'duration': 7.844}, {'end': 210.556, 'text': 'loaded using the pandas library itself.', 'start': 207.354, 'duration': 3.202}, {'end': 213.877, 'text': 'So let us go and still more.', 'start': 210.976, 'duration': 2.901}, {'end': 214.778, 'text': "we'll discuss about this.", 'start': 213.877, 'duration': 0.901}, {'end': 217.759, 'text': 'Now once we have read this JSON we have converted into a data frame.', 'start': 214.858, 'duration': 2.901}, {'end': 221.581, 'text': 'Now what we can do is that we can basically convert this into a CSV file.', 'start': 217.799, 'duration': 3.782}, {'end': 225.643, 'text': "So over here you can see that I'm converting this JSON data to CSV file.", 'start': 222.041, 'duration': 3.602}, {'end': 228.284, 'text': "Again I'm saying to underscore CSV wine dot CSV right.", 'start': 225.683, 'duration': 2.601}, {'end': 230.165, 'text': 'So this is my wine dot CSV file.', 'start': 228.624, 'duration': 1.541}, {'end': 238.654, 'text': 'So as soon as I execute this, you can see that if I go and open this particular location, wine.csv is actually created.', 'start': 231.049, 'duration': 7.605}, {'end': 244.257, 'text': 'Pretty much simple because two undersource csv is always present, right? 
And it is for every other information.', 'start': 239.114, 'duration': 5.143}, {'end': 252.146, 'text': 'You may also have a scenario that you need to convert your whole data frame into JSON.', 'start': 246.784, 'duration': 5.362}, {'end': 256.387, 'text': 'So at that time you can use an inbuilt function called as to underscore JSON.', 'start': 252.526, 'duration': 3.861}, {'end': 260.45, 'text': 'So if you go and press shift tab over here, again you can see that all the information are here.', 'start': 256.408, 'duration': 4.042}], 'summary': 'Tutorial on extracting and converting nested json to csv and json using pandas library.', 'duration': 66.223, 'max_score': 194.227, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/xx4Vkc5RrWY/pics/xx4Vkc5RrWY194227.jpg'}, {'end': 609.843, 'src': 'embed', 'start': 586.771, 'weight': 4, 'content': [{'end': 594.599, 'text': 'now, when i say html content, that basically means we are basically reading the tables from an html page, and this is also web scrapping technique,', 'start': 586.771, 'duration': 7.828}, {'end': 604.539, 'text': 'guys. 
so if i go and click this particular link okay, and this is pretty much famous i have actually taken this from the documentation page of Pandas.', 'start': 594.599, 'duration': 9.94}, {'end': 609.843, 'text': "If you go and hit this particular link, you'll be seeing that these are failed bank lists.", 'start': 605.219, 'duration': 4.624}], 'summary': 'Using web scraping to read tables from html content, specifically failed bank lists from pandas documentation page.', 'duration': 23.072, 'max_score': 586.771, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/xx4Vkc5RrWY/pics/xx4Vkc5RrWY586771.jpg'}], 'start': 129.312, 'title': 'Json data and pandas manipulation', 'summary': "Covers reading json data, converting to a data frame and csv, using to_json function, creating json from a pandas dataframe, emphasizing 'orient' parameter, and extracting tables from an html page using the pandas library.", 'chapters': [{'end': 303.192, 'start': 129.312, 'title': 'Reading and converting json data', 'summary': 'Covers reading json data through read_json, direct conversion to a data frame, conversion of json to csv, and using to_json function, with a focus on the ease of access and conversion process.', 'duration': 173.88, 'highlights': ['The chapter covers the process of reading JSON data through read_json and converting it directly into a data frame, demonstrating the ease of access and conversion process.', 'The tutorial demonstrates the conversion of JSON data to a CSV file, showcasing the simplicity of the process using to_csv function, resulting in the creation of a wine.csv file.', 'It emphasizes the use of to_json function to convert the entire data frame into a JSON string, allowing for easy transformation and understanding of the parameters involved.']}, {'end': 690.507, 'start': 303.252, 'title': 'Pandas data manipulation and json creation', 'summary': "Demonstrates the process of creating json from a pandas dataframe, emphasizing the use of 
the 'orient' parameter and the read_html function to extract tables from an html page using the pandas library.", 'duration': 387.255, 'highlights': ["The chapter discusses the process of creating JSON from a Pandas DataFrame, highlighting the use of the 'orient' parameter to define the structure of the JSON output and emphasizing the importance of creating JSONs with respect to records.", "The instructor explains the significance of using the 'orient' parameter, particularly the 'records' option, to ensure the creation of JSON records and demonstrates the process of executing the function to achieve the desired JSON output.", 'The lecture delves into the read_html function, illustrating its capability to extract tables from an HTML page and return a list of tables, thus showcasing its utility in web scraping and data extraction from web pages using the pandas library.', 'The instructor demonstrates the practical application of the read_html function by showcasing its usage to extract tables from a specific HTML page and explains the process of obtaining a list of tables from the web page using the function.']}], 'duration': 561.195, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/xx4Vkc5RrWY/pics/xx4Vkc5RrWY129312.jpg', 'highlights': ['The chapter covers the process of reading JSON data through read_json and converting it directly into a data frame, demonstrating the ease of access and conversion process.', 'The tutorial demonstrates the conversion of JSON data to a CSV file, showcasing the simplicity of the process using to_csv function, resulting in the creation of a wine.csv file.', 'It emphasizes the use of to_json function to convert the entire data frame into a JSON string, allowing for easy transformation and understanding of the parameters involved.', "The chapter discusses the process of creating JSON from a Pandas DataFrame, highlighting the use of the 'orient' parameter to define the structure of the JSON output and 
emphasizing the importance of creating JSONs with respect to records.", 'The lecture delves into the read_html function, illustrating its capability to extract tables from an HTML page and return a list of tables, thus showcasing its utility in web scraping and data extraction from web pages using the pandas library.']}, {'end': 1169.916, 'segs': [{'end': 733.588, 'src': 'embed', 'start': 690.547, 'weight': 0, 'content': [{'end': 699.753, 'text': 'So the first table that you actually saw is the same information right?. The bank name, the city cert acquiring institution. closing date everything.', 'start': 690.547, 'duration': 9.206}, {'end': 703.716, 'text': 'Now inside this particular URL, guys, you see they are pagination also.', 'start': 699.873, 'duration': 3.843}, {'end': 705.898, 'text': 'But this does not care about pagination.', 'start': 704.116, 'duration': 1.782}, {'end': 709.5, 'text': 'It will pick up all the details that are present within that table tag.', 'start': 705.938, 'duration': 3.562}, {'end': 711.742, 'text': 'That is the most important thing that you need to understand.', 'start': 709.74, 'duration': 2.002}, {'end': 720.424, 'text': 'right. 
so if you just scroll down and see all the details, all the table information is basically extracted here you can see that it is five, five, six rows.', 'start': 712.102, 'duration': 8.322}, {'end': 722.745, 'text': 'if you want to verify over here, see in this.', 'start': 720.424, 'duration': 2.321}, {'end': 724.085, 'text': 'it shows 25 entries.', 'start': 722.745, 'duration': 1.34}, {'end': 733.588, 'text': 'right, so 25, if i just make it as 100, let us just make it as 100, okay, and let us see.', 'start': 724.085, 'duration': 9.503}], 'summary': 'Extracted table data from url, 556 rows, pagination not considered.', 'duration': 43.041, 'max_score': 690.547, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/xx4Vkc5RrWY/pics/xx4Vkc5RrWY690547.jpg'}, {'end': 781.606, 'src': 'embed', 'start': 751.273, 'weight': 3, 'content': [{'end': 756.076, 'text': 'just like a web scrapping, but only web scrapping, with respect to our tables in a page.', 'start': 751.273, 'duration': 4.803}, {'end': 761.298, 'text': 'okay. so this is mobile country code and if you scroll down over here, you have various tables over here.', 'start': 756.076, 'duration': 5.222}, {'end': 762.098, 'text': 'again. 
see this.', 'start': 761.298, 'duration': 0.8}, {'end': 762.879, 'text': 'these all are tables.', 'start': 762.098, 'duration': 0.781}, {'end': 765.96, 'text': 'So if I want to find out this particular information, what I can do?', 'start': 763.399, 'duration': 2.561}, {'end': 776.344, 'text': 'I can basically use this URL use pd.read,html, give the URL over here and there are parameters like match.', 'start': 765.96, 'duration': 10.384}, {'end': 781.606, 'text': 'Now what this match will do, see, what this match will do, this is very important to understand.', 'start': 777.125, 'duration': 4.481}], 'summary': "Demonstrating web scraping using python's pd.read_html for tables on a page.", 'duration': 30.333, 'max_score': 751.273, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/xx4Vkc5RrWY/pics/xx4Vkc5RrWY751273.jpg'}, {'end': 826.078, 'src': 'embed', 'start': 799.105, 'weight': 2, 'content': [{'end': 802.327, 'text': "and let me just show you what matching i'm doing.", 'start': 799.105, 'duration': 3.222}, {'end': 804.948, 'text': "i'm matching with the help of country, right.", 'start': 802.327, 'duration': 2.621}, {'end': 812.112, 'text': 'so if i go over here, if i see the country, the matching country will basically be taken from somewhere here.', 'start': 804.948, 'duration': 7.164}, {'end': 813.313, 'text': 'you can see that.', 'start': 812.112, 'duration': 1.201}, {'end': 820.597, 'text': 'uh, it will at least match one of the columns, probably, so it will be able to extract the for which table we are basically assigning.', 'start': 813.313, 'duration': 7.284}, {'end': 826.078, 'text': "okay, So I don't see over here in the table tag itself, but let us see some more details.", 'start': 820.597, 'duration': 5.481}], 'summary': 'Matching based on country for table assignment.', 'duration': 26.973, 'max_score': 799.105, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/xx4Vkc5RrWY/pics/xx4Vkc5RrWY799105.jpg'}, {'end': 1026.199, 'src': 'heatmap', 'start': 944.321, 'weight': 4, 'content': [{'end': 948.927, 'text': 'Suppose in an Excel file, you have multiple sheets, right? So you can also number it.', 'start': 944.321, 'duration': 4.606}, {'end': 953.232, 'text': 'You can give it a zero sheet name or first sheet name, whatever data you want to basically pick up.', 'start': 948.967, 'duration': 4.265}, {'end': 957.997, 'text': "Over here, if you don't provide any sheet name, it is going to take up the zeroth sheet name.", 'start': 953.892, 'duration': 4.105}, {'end': 960.56, 'text': 'And here you can basically execute it.', 'start': 958.598, 'duration': 1.962}, {'end': 961.902, 'text': 'You can again see the head part.', 'start': 960.6, 'duration': 1.302}, {'end': 965.803, 'text': 'Now, there is also one more technique in this called as pickling.', 'start': 962.622, 'duration': 3.181}, {'end': 968.104, 'text': 'Now, pickling is very, very important, guys,', 'start': 966.003, 'duration': 2.101}, {'end': 974.306, 'text': "because you'll be seeing that when we'll be creating our machine learning algorithm models later on we'll be converting that into pickles.", 'start': 968.104, 'duration': 6.202}, {'end': 976.006, 'text': 'So what exactly is pickles?', 'start': 974.766, 'duration': 1.24}, {'end': 980.828, 'text': 'All Pandas objects are equipped with two underscore pickle methods,', 'start': 976.587, 'duration': 4.241}, {'end': 986.47, 'text': 'which uses Python C pickle module to save data structure to disk using the pickle format.', 'start': 980.828, 'duration': 5.642}, {'end': 992.253, 'text': 'So this in short is saving the whole data structure of any type of work you basically do.', 'start': 987.21, 'duration': 5.043}, {'end': 994.694, 'text': 'Suppose you are creating a machine learning algorithm.', 'start': 992.353, 'duration': 2.341}, {'end': 998.376, 'text': 
'Suppose you have basically created an Excel file or CSV file.', 'start': 995.174, 'duration': 3.202}, {'end': 1004.219, 'text': 'If you convert that into pickle, that whole data structure is basically created and you can also load that particular file.', 'start': 998.736, 'duration': 5.483}, {'end': 1006.52, 'text': "Again, we'll be discussing a lot about pickle,", 'start': 1004.659, 'duration': 1.861}, {'end': 1010.603, 'text': "because in the machine learning algorithm we'll be converting each and every machine learning algorithm into pickle.", 'start': 1006.52, 'duration': 4.083}, {'end': 1014.965, 'text': "And then we'll be deploying that machine learning algorithm into any cloud servers also.", 'start': 1011.143, 'duration': 3.822}, {'end': 1020.832, 'text': "by and with the help of flask again we'll be integrating it with the front and front web app.", 'start': 1015.365, 'duration': 5.467}, {'end': 1022.574, 'text': 'so this is about pickle.', 'start': 1020.832, 'duration': 1.742}, {'end': 1026.199, 'text': 'so suppose i have actually read the df underscore excel file over here.', 'start': 1022.574, 'duration': 3.625}], 'summary': 'Pickle is an important technique in saving and converting machine learning models, using python c pickle module to save data structure to disk.', 'duration': 58.095, 'max_score': 944.321, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/xx4Vkc5RrWY/pics/xx4Vkc5RrWY944321.jpg'}, {'end': 1108.308, 'src': 'embed', 'start': 1084.029, 'weight': 7, 'content': [{'end': 1090.734, 'text': 'suppose you have a huge data set and if you are doing a lot of pre-processing into that data set, you can also convert that data set into pickle,', 'start': 1084.029, 'duration': 6.705}, {'end': 1094.756, 'text': 'because it need not be like if your kernel gets restarted in the jupyter notebook.', 'start': 1090.734, 'duration': 4.022}, {'end': 1098.979, 'text': 'then you also have to execute all the instructions from starting 
in order to prevent that.', 'start': 1094.756, 'duration': 4.223}, {'end': 1105.145, 'text': 'what you can basically do is that you can convert this into a pickle, okay, and you can basically load it later on.', 'start': 1098.979, 'duration': 6.166}, {'end': 1106.126, 'text': 'so this is all about this.', 'start': 1105.145, 'duration': 0.981}, {'end': 1108.308, 'text': 'particular videos, guys, i hope you understood it.', 'start': 1106.126, 'duration': 2.182}], 'summary': 'Convert large dataset to pickle to avoid reprocessing and restarts in jupyter notebook.', 'duration': 24.279, 'max_score': 1084.029, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/xx4Vkc5RrWY/pics/xx4Vkc5RrWY1084029.jpg'}], 'start': 690.547, 'title': 'Data extraction and file handling in python', 'summary': 'Covers web scraping for extracting table data from a url using parameters like match, with an example of mobile country code data. it also discusses reading data from excel files using read_excel and the concept of pickling in pandas for data storage, emphasizing its importance in machine learning algorithms.', 'chapters': [{'end': 923.212, 'start': 690.547, 'title': 'Web scraping for table data', 'summary': 'Discusses web scraping for table data, demonstrating the extraction of table information from a url and the use of parameters like match to extract specific table details, with an example of extracting mobile country code data.', 'duration': 232.665, 'highlights': ['The chapter discusses web scraping for table data, demonstrating the extraction of table information from a URL and the use of parameters like match to extract specific table details, with an example of extracting mobile country code data.', 'The web scraping process successfully extracted 556 entries from the table, demonstrating its effectiveness in collecting large amounts of data.', 'The demonstration includes using parameters like match to extract specific table details, such as matching the 
country column to extract relevant information, showcasing the flexibility and customization offered by web scraping techniques.', 'The chapter emphasizes the importance of understanding the structure and content of the table data, including using parameters like header to identify the table headers and specifying column names for data extraction.']}, {'end': 1169.916, 'start': 923.843, 'title': 'Reading excel files and using pickle in pandas', 'summary': 'Covers the usage of read_excel to import data from excel files, the concept of pickling in pandas to save data structures, and the importance of using pickles in machine learning algorithms for data storage and deployment.', 'duration': 246.073, 'highlights': ['The chapter explains the usage of read_excel to import data from Excel files, specifying sheet names, and converting the data structure into pickles for storage and future use.', 'It emphasizes the importance of pickling in machine learning algorithms, as all Pandas objects are equipped with pickle methods to save data structures to disk.', 'The transcript discusses the significance of pickling in machine learning algorithms, highlighting its role in storing and deploying machine learning models on cloud servers.', 'It mentions the advantages of converting datasets into pickles, enabling easy loading of pre-processed data without the need to re-execute instructions from the beginning in case of system restarts.']}], 'duration': 479.369, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/xx4Vkc5RrWY/pics/xx4Vkc5RrWY690547.jpg', 'highlights': ['The web scraping process successfully extracted 556 entries from the table, demonstrating its effectiveness in collecting large amounts of data.', 'The chapter emphasizes the importance of understanding the structure and content of the table data, including using parameters like header to identify the table headers and specifying column names for data extraction.', 'The demonstration includes 
using parameters like match to extract specific table details, such as matching the country column to extract relevant information, showcasing the flexibility and customization offered by web scraping techniques.', 'The chapter discusses web scraping for table data, demonstrating the extraction of table information from a URL and the use of parameters like match to extract specific table details, with an example of extracting mobile country code data.', 'The chapter explains the usage of read_excel to import data from Excel files, specifying sheet names, and converting the data structure into pickles for storage and future use.', 'It emphasizes the importance of pickling in machine learning algorithms, as all Pandas objects are equipped with pickle methods to save data structures to disk.', 'The transcript discusses the significance of pickling in machine learning algorithms, highlighting its role in storing and deploying machine learning models on cloud servers.', 'It mentions the advantages of converting datasets into pickles, enabling easy loading of pre-processed data without the need to re-execute instructions from the beginning in case of system restarts.']}], 'highlights': ['The web scraping process successfully extracted 556 entries from the table, demonstrating its effectiveness in collecting large amounts of data.', "The tutorial simplifies reading JSON into a dataframe with 'pd.read_json'.", 'The chapter covers the process of reading JSON data through read_json and converting it directly into a data frame, demonstrating the ease of access and conversion process.', "The chapter discusses the process of creating JSON from a Pandas DataFrame, highlighting the use of the 'orient' parameter to define the structure of the JSON output and emphasizing the importance of creating JSONs with respect to records.", 'The tutorial demonstrates the conversion of JSON data to a CSV file, showcasing the simplicity of the process using to_csv function, resulting in the 
creation of a wine.csv file.', 'It emphasizes the use of to_json function to convert the entire data frame into a JSON string, allowing for easy transformation and understanding of the parameters involved.', 'The chapter emphasizes the importance of understanding the structure and content of the table data, including using parameters like header to identify the table headers and specifying column names for data extraction.', 'The demonstration includes using parameters like match to extract specific table details, such as matching the country column to extract relevant information, showcasing the flexibility and customization offered by web scraping techniques.', 'The chapter discusses web scraping for table data, demonstrating the extraction of table information from a URL and the use of parameters like match to extract specific table details, with an example of extracting mobile country code data.', 'The chapter explains the usage of read_excel to import data from Excel files, specifying sheet names, and converting the data structure into pickles for storage and future use.', 'It emphasizes the importance of pickling in machine learning algorithms, as all Pandas objects are equipped with pickle methods to save data structures to disk.', 'The transcript discusses the significance of pickling in machine learning algorithms, highlighting its role in storing and deploying machine learning models on cloud servers.', 'It mentions the advantages of converting datasets into pickles, enabling easy loading of pre-processed data without the need to re-execute instructions from the beginning in case of system restarts.']}
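The transcript above walks through `to_json` with its `orient` parameter and `to_csv`. A minimal sketch of both, using a made-up two-row employee frame (the video uses the wine data, not this one):

```python
import pandas as pd
from io import StringIO

df = pd.DataFrame({"employee_name": ["James", "Rita"],
                   "email": ["james@gmail.com", "rita@gmail.com"]})

# Default orient ('columns') nests values under each column name...
print(df.to_json())
# ...while orient='records' emits one JSON object per row, which is
# what the video recommends for record-style output.
print(df.to_json(orient="records"))

# The same frame writes to CSV via to_csv (a buffer here instead of a
# file path like 'wine.csv').
buf = StringIO()
df.to_csv(buf, index=False)
print(buf.getvalue())
```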
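The `read_html` discussion — one DataFrame per `<table>` tag, narrowed with `match` — can be sketched offline with an inline HTML string instead of the failed-bank-list or mobile-country-code URLs from the video (note `read_html` needs an HTML parser such as lxml installed):

```python
import pandas as pd
from io import StringIO

# Hypothetical two-table page standing in for a scraped web page.
html = """
<table>
  <tr><th>Country</th><th>MCC</th></tr>
  <tr><td>India</td><td>404</td></tr>
</table>
<table>
  <tr><th>City</th><th>Code</th></tr>
  <tr><td>Delhi</td><td>11</td></tr>
</table>
"""

# read_html returns a LIST of DataFrames, one per <table> tag,
# regardless of pagination on the page.
all_tables = pd.read_html(StringIO(html))

# match keeps only tables whose text matches the given string/regex.
country_tables = pd.read_html(StringIO(html), match="Country")
print(len(all_tables), len(country_tables))
```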
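Finally, the pickling workflow described in the transcript — checkpointing a DataFrame to disk so a kernel restart doesn't force re-running every preprocessing step — looks roughly like this (the file name and the tiny stand-in frame are both hypothetical):

```python
import pandas as pd

# Small stand-in for a preprocessed dataset.
df = pd.DataFrame({"alcohol": [14.23, 13.2], "ash": [2.43, 2.14]})

# to_pickle saves the complete object (values, dtypes, index) to disk;
# read_pickle restores it later, e.g. after a notebook kernel restart.
df.to_pickle("df_checkpoint.pkl")
restored = pd.read_pickle("df_checkpoint.pkl")
print(restored.equals(df))

# Excel reading works similarly but needs a workbook file and the
# openpyxl engine; sheet_name=0 (the default) picks the first sheet:
# df_xlsx = pd.read_excel("data.xlsx", sheet_name=0)
```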