title
Tutorial 5- Pandas, Data Frame and Data Series Part-1

description
Hello All, Welcome to the Python Crash Course. In this video we will understand about Pandas library, Dataframes and Data Series github url : https://github.com/krishnaik06/Machine-Learning-in-90-days

detail
{'title': 'Tutorial 5- Pandas, Data Frame and Data Series Part-1', 'heatmap': [{'end': 354.146, 'start': 324.62, 'weight': 0.883}, {'end': 868.06, 'start': 857.018, 'weight': 0.707}], 'summary': 'Tutorial covers the importance of pandas and numpy for exploratory data analysis, emphasizing the need for thorough practice. It outlines the plan to cover pandas in three parts, including creating and using data frames with pandas and numpy, focusing on 5x4 arrays and data frame distinctions, and exploring data frames in python with access and indexing columns, and using iloc and loc functions.', 'chapters': [{'end': 119.863, 'segs': [{'end': 44.291, 'src': 'embed', 'start': 18.421, 'weight': 0, 'content': [{'end': 23.264, 'text': 'remember guys, numpy and pandas are one of the backbone for exploratory data analysis.', 'start': 18.421, 'duration': 4.843}, {'end': 26.706, 'text': 'so it is very, very important you practice both these things very nicely.', 'start': 23.264, 'duration': 3.442}, {'end': 29.958, 'text': 'I want to deep dive and discuss about Pandas.', 'start': 27.836, 'duration': 2.122}, {'end': 36.984, 'text': "So I will be creating, there'll be three parts of this particular video, or probably I can, if it is possible, I'll finish in two part.", 'start': 29.978, 'duration': 7.006}, {'end': 44.291, 'text': "So in the first part, we'll understand what exactly is Pandas library, what it does, what is Data Frames, what is Data Series.", 'start': 37.505, 'duration': 6.786}], 'summary': 'Numpy and pandas are crucial for data analysis. 
the video will cover pandas in three parts and focus on its key features.', 'duration': 25.87, 'max_score': 18.421, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk18421.jpg'}, {'end': 86.973, 'src': 'embed', 'start': 59.859, 'weight': 3, 'content': [{'end': 65.744, 'text': 'because, understand, guys again, NumPy and Pandas.', 'start': 59.859, 'duration': 5.885}, {'end': 67.986, 'text': 'it is must for exploratory data analysis.', 'start': 65.744, 'duration': 2.242}, {'end': 69.668, 'text': 'So you need to be very, very good at it.', 'start': 68.006, 'duration': 1.662}, {'end': 71.608, 'text': 'okay. so let us begin.', 'start': 70.288, 'duration': 1.32}, {'end': 72.369, 'text': 'so what is pandas?', 'start': 71.608, 'duration': 0.761}, {'end': 76.61, 'text': 'pandas is an open source BSD-licensed library providing high performance,', 'start': 72.369, 'duration': 4.241}, {'end': 81.011, 'text': 'easy to use data structure and data analysis tool for python programming languages.', 'start': 76.61, 'duration': 4.401}, {'end': 86.973, 'text': 'okay, and many of the people, many, everybody, uses pandas extensively, okay.', 'start': 81.011, 'duration': 5.962}], 'summary': 'Pandas is essential for data analysis, widely used by many, offering high performance and easy to use data structure.', 'duration': 27.114, 'max_score': 59.859, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk59859.jpg'}, {'end': 129.928, 'src': 'embed', 'start': 103.231, 'weight': 2, 'content': [{'end': 107.034, 'text': 'yes, remember, have you seen any data set in machine learning language?', 'start': 103.231, 'duration': 3.803}, {'end': 109.375, 'text': 'right in machine learning or in deep learning?', 'start': 107.034, 'duration': 2.341}, {'end': 115.78, 'text': "so suppose you, you'll be basically given a csv file or an excel sheet where you'll be having different, different features.", 
'start': 109.375, 'duration': 6.405}, {'end': 118.061, 'text': "you'll be having many roles right now.", 'start': 115.78, 'duration': 2.281}, {'end': 119.863, 'text': 'if i want to read that particular data,', 'start': 118.061, 'duration': 1.802}, {'end': 129.928, 'text': 'i can basically use pandas with this and pandas will actually load that data and after loading that data it gets usually converted into a data frame.', 'start': 119.863, 'duration': 10.065}], 'summary': 'In machine learning, a dataset typically comprises various features and can be loaded into a data frame using pandas.', 'duration': 26.697, 'max_score': 103.231, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk103231.jpg'}], 'start': 0.689, 'title': 'Introduction to pandas library', 'summary': 'Discusses the importance of pandas and numpy for exploratory data analysis, emphasizing the need for thorough practice. it outlines the plan to cover pandas in three parts, including an explanation of pandas as an open-source library and the process of importing pandas and numpy. it also highlights the significance of data frames in machine learning and data analysis.', 'chapters': [{'end': 119.863, 'start': 0.689, 'title': 'Introduction to pandas library', 'summary': 'Discusses the importance of pandas and numpy for exploratory data analysis, emphasizing the need for thorough practice. it outlines the plan to cover pandas in three parts, including an explanation of pandas as an open-source library and the process of importing pandas and numpy. 
it also highlights the significance of data frames in machine learning and data analysis.', 'duration': 119.174, 'highlights': ['Pandas and Numpy are crucial for exploratory data analysis, requiring strong proficiency (mentions Pandas and Numpy importance)', 'Pandas is an open source library providing high performance, easy to use data structure and data analysis tool for python programming languages (describes Pandas as an open source library)', 'The chapter outlines the plan to cover Pandas in three parts, including an explanation of Pandas as an open-source library and the process of importing Pandas and Numpy (plan to cover Pandas in three parts)', 'The chapter emphasizes the need for thorough practice in Pandas and Numpy for exploratory data analysis (emphasizes the need for practice in Pandas and Numpy)', 'Data frames are essential in machine learning and data analysis for organizing and processing data from CSV files or Excel sheets (mentions the significance of data frames in machine learning and data analysis)']}], 'duration': 119.174, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk689.jpg', 'highlights': ['Pandas and Numpy are crucial for exploratory data analysis, requiring strong proficiency', 'The chapter emphasizes the need for thorough practice in Pandas and Numpy for exploratory data analysis', 'Data frames are essential in machine learning and data analysis for organizing and processing data from CSV files or Excel sheets', 'The chapter outlines the plan to cover Pandas in three parts, including an explanation of Pandas as an open-source library and the process of importing Pandas and Numpy', 'Pandas is an open source library providing high performance, easy to use data structure and data analysis tool for python programming languages']}, {'end': 535.645, 'segs': [{'end': 141.172, 'src': 'embed', 'start': 119.863, 'weight': 0, 'content': [{'end': 129.928, 'text': 'i can basically use pandas 
with this and pandas will actually load that data and after loading that data it gets usually converted into a data frame.', 'start': 119.863, 'duration': 10.065}, {'end': 140.011, 'text': 'so data frame is a combination of both columns and rows and it will basically show you a representation format wherein how your data exactly looks like in the extension.', 'start': 129.928, 'duration': 10.083}, {'end': 141.172, 'text': 'in the same way it will be loaded.', 'start': 140.011, 'duration': 1.161}], 'summary': 'Pandas loads and converts data into a data frame, displaying rows and columns in a representation format.', 'duration': 21.309, 'max_score': 119.863, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk119863.jpg'}, {'end': 233.983, 'src': 'embed', 'start': 203.23, 'weight': 1, 'content': [{'end': 207.593, 'text': 'Okay The third thing is column index and the data types that you can basically put.', 'start': 203.23, 'duration': 4.363}, {'end': 211.675, 'text': "So over here, what I've done is that I've used np.arange.", 'start': 208.033, 'duration': 3.642}, {'end': 217.778, 'text': "This is basically a NumPy and I've told you if I want to select or if I want to randomly.", 'start': 211.875, 'duration': 5.903}, {'end': 221.06, 'text': 'sorry if I want to create arrays with some values like.', 'start': 217.778, 'duration': 3.282}, {'end': 223.658, 'text': "I've given the range from 0 to 20", 'start': 221.06, 'duration': 2.598}, {'end': 226.399, 'text': "and I'm saying dot reshape to 5, comma 4.", 'start': 223.658, 'duration': 2.741}, {'end': 233.983, 'text': 'right, when I do this, that basically means that whatever array is basically created, it is getting created with respect to two dimensional array 5,', 'start': 226.399, 'duration': 7.584}], 'summary': 'Using np.arange to create 5x4 2d array.', 'duration': 30.753, 'max_score': 203.23, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk203230.jpg'}, {'end': 354.146, 'src': 'heatmap', 'start': 322.097, 'weight': 2, 'content': [{'end': 324.6, 'text': 'What I will do is that I will just show you that inbuilt function.', 'start': 322.097, 'duration': 2.503}, {'end': 338.302, 'text': 'So I can write df.to_csv.', 'start': 324.62, 'duration': 13.682}, {'end': 339.663, 'text': 'so this is the inbuilt function.', 'start': 338.302, 'duration': 1.361}, {'end': 345.844, 'text': "i'll just write test1.csv and once i execute it, this will get executed.", 'start': 339.663, 'duration': 6.181}, {'end': 350.525, 'text': 'if you want to see the file, just go and click on open here you will be seeing your test1 file.', 'start': 345.844, 'duration': 4.681}, {'end': 351.725, 'text': 'you can see over here.', 'start': 350.525, 'duration': 1.2}, {'end': 354.146, 'text': 'the test1.csv is basically created.', 'start': 351.725, 'duration': 2.421}], 'summary': 'Demonstrated inbuilt function to write data to test1.csv file.', 'duration': 32.049, 'max_score': 322.097, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk322097.jpg'}], 'start': 119.863, 'title': 'Creating and using pandas data frames', 'summary': 'Covers creating and using data frames with pandas and numpy, emphasizing loading data, transforming it into data frames, and creating 2d data frames with specific indexing and access methods, focusing on 5x4 arrays and data frame distinctions.', 'chapters': [{'end': 180.87, 'start': 119.863, 'title': 'Using pandas to create data frames', 'summary': 'Covers creating data frames using pandas, including the process of loading data, converting it into a data frame, and creating data frames using the pd.DataFrame function after importing pandas and numpy, with an emphasis on the combination of columns and rows and the data to be included.', 'duration': 61.007, 
'highlights': ['Pandas loads data and converts it into a data frame, which represents the data in a combination of columns and rows, showcasing its exact format and how it will be loaded.', 'To create data frames, the inbuilt function pd.DataFrame is used after importing pandas and numpy, and it requires specifying the data to be included.']}, {'end': 535.645, 'start': 181.351, 'title': 'Creating 2d data frames in pandas', 'summary': 'Explains how to create 2d data frames in pandas, using np.arange to generate a 5x4 array, specifying row and column indexes, and accessing elements through loc and iloc methods, with a focus on data frame and series distinctions.', 'duration': 354.294, 'highlights': ['The chapter discusses the creation of 2D data frames in Pandas, using np.arange to generate a 5x4 array and specifying row and column indexes, emphasizing the importance of index and column values for performing different indexing techniques.', 'The process of accessing elements in the data frame through loc and iloc methods is explained, with a demonstration of retrieving specific values using row and column indexes, highlighting the distinction between data frames and series based on the number of columns and rows in the data structure.', 'The demonstration of converting data frames into CSV files using the to_csv function and the significance of understanding the structure of data frames and series to differentiate between them, with a focus on the shape and composition of the data structures.']}], 'duration': 415.782, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk119863.jpg', 'highlights': ['Pandas loads data and converts it into a data frame, representing data in columns and rows.', 'The chapter discusses creating 2D data frames in Pandas, using np.arange to generate a 5x4 array.', 'Demonstration of converting data frames into CSV files using the to_csv function and understanding the structure of data 
frames and series.']}, {'end': 1008.395, 'segs': [{'end': 565.608, 'src': 'embed', 'start': 536.405, 'weight': 2, 'content': [{'end': 539.608, 'text': 'so let us go ahead and try to use iloc now.', 'start': 536.405, 'duration': 3.203}, {'end': 541.389, 'text': 'okay, first we have checked with loc.', 'start': 539.608, 'duration': 1.781}, {'end': 546.073, 'text': "now i'll go to use iloc in iloc also, if you remember numpy arrays right.", 'start': 541.389, 'duration': 4.684}, {'end': 547.735, 'text': 'similarly, this will also work.', 'start': 546.073, 'duration': 1.662}, {'end': 550.237, 'text': 'the left hand side you will basically be giving your rows.', 'start': 547.735, 'duration': 2.502}, {'end': 552.779, 'text': "the right hand side you'll be basically giving your column indexes.", 'start': 550.237, 'duration': 2.542}, {'end': 554.401, 'text': 'okay, the row indexes and column indexes.', 'start': 552.779, 'duration': 1.622}, {'end': 558.724, 'text': 'so once i execute over here, this colon basically means all the rows and all the columns.', 'start': 554.881, 'duration': 3.843}, {'end': 560.225, 'text': 'always remember this thing.', 'start': 558.724, 'duration': 1.501}, {'end': 565.608, 'text': 'but one thing that you need to remember, guys, one thing very, very important thing that you need to remember now.', 'start': 560.225, 'duration': 5.383}], 'summary': 'Using iloc, which indexes like NumPy arrays, to specify row and column indexes for data retrieval.', 'duration': 29.203, 'max_score': 536.405, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk536405.jpg'}, {'end': 754.945, 'src': 'embed', 'start': 731.3, 'weight': 3, 'content': [{'end': 737.761, 'text': 'in short, if we are converting that into arrays, it is going to just skip the column name or column indexes and the row indexes.', 'start': 731.3, 'duration': 6.461}, {'end': 741.222, 'text': 'so this is how you can convert data frames into array.', 'start': 
737.761, 'duration': 3.461}, {'end': 744.723, 'text': 'so you just have to write df, dot, iloc, colon, one, colon, dot values.', 'start': 741.222, 'duration': 3.501}, {'end': 750.444, 'text': 'okay, so once you execute this, dot values is basically converting these all values into an array.', 'start': 744.943, 'duration': 5.501}, {'end': 754.945, 'text': "okay, and this we will also be doing when we'll be doing our machine learning algorithms.", 'start': 750.444, 'duration': 4.501}], 'summary': 'Data frames can be converted to arrays using df.iloc[:, 1:].values for machine learning algorithms.', 'duration': 23.645, 'max_score': 731.3, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk731300.jpg'}, {'end': 818.705, 'src': 'embed', 'start': 791.691, 'weight': 0, 'content': [{'end': 798.854, 'text': 'okay? So, first of all, the first thing that you should be very, very familiar with is that how to check the null condition.', 'start': 791.691, 'duration': 7.163}, {'end': 803.642, 'text': "I'll be discussing a lot about this as we go ahead with different different operations.", 'start': 799.797, 'duration': 3.845}, {'end': 806.425, 'text': 'I want to make you familiar with some of the inbuilt function.', 'start': 804.222, 'duration': 2.203}, {'end': 811.771, 'text': 'If I write df.isnull and if I say .', 'start': 806.905, 'duration': 4.866}, {'end': 814.222, 'text': 'sum. 
Execute it.', 'start': 811.771, 'duration': 2.451}, {'end': 818.705, 'text': 'you can see that it is basically saying at okay, in column 1, you have 0 null values.', 'start': 814.222, 'duration': 4.483}], 'summary': 'Learn to check null conditions and use inbuilt functions like df.isnull to identify null values.', 'duration': 27.014, 'max_score': 791.691, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk791691.jpg'}, {'end': 860.639, 'src': 'embed', 'start': 833.697, 'weight': 1, 'content': [{'end': 837.08, 'text': 'Then after that you can also use dot value counts.', 'start': 833.697, 'duration': 3.383}, {'end': 840.143, 'text': 'now, dot value counts, suppose if I use DF dot head.', 'start': 837.08, 'duration': 3.063}, {'end': 841.244, 'text': 'Okay, so this is my DF.', 'start': 840.143, 'duration': 1.101}, {'end': 843.715, 'text': 'I want to find out.', 'start': 842.295, 'duration': 1.42}, {'end': 845.956, 'text': 'suppose in my column one I have categorical features.', 'start': 843.715, 'duration': 2.241}, {'end': 849.116, 'text': 'I want to find out how many unique categories I have.', 'start': 846.016, 'duration': 3.1}, {'end': 852.777, 'text': 'So in that case, I can basically use dot value underscore counts.', 'start': 850.097, 'duration': 2.68}, {'end': 856.998, 'text': 'And this will actually give us like, okay, 2L is present one time.', 'start': 853.377, 'duration': 3.621}, {'end': 859.218, 'text': '4 is present one time.', 'start': 857.018, 'duration': 2.2}, {'end': 860.639, 'text': 'This is present one time like this.', 'start': 859.458, 'duration': 1.181}], 'summary': 'Using dot value_counts to find unique categories in a dataframe column.', 'duration': 26.942, 'max_score': 833.697, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk833697.jpg'}, {'end': 890.022, 'src': 'heatmap', 'start': 857.018, 'weight': 0.707, 'content': [{'end': 859.218, 
'text': '4 is present one time.', 'start': 857.018, 'duration': 2.2}, {'end': 860.639, 'text': 'This is present one time like this.', 'start': 859.458, 'duration': 1.181}, {'end': 865.74, 'text': 'Okay And there is also a unique function, I guess.', 'start': 860.839, 'duration': 4.901}, {'end': 868.06, 'text': 'So there is something called as unique.', 'start': 866.14, 'duration': 1.92}, {'end': 874.742, 'text': "Here you can basically see that it took a little bit of time to execute, but I don't know why.", 'start': 870.899, 'duration': 3.843}, {'end': 878.285, 'text': 'How many unique values are there? It is basically shown.', 'start': 875.563, 'duration': 2.722}, {'end': 881.227, 'text': 'If any value is repeated, it will not show you that value.', 'start': 878.405, 'duration': 2.822}, {'end': 884.51, 'text': 'It will just capture the uniqueness in this particular column in short.', 'start': 881.247, 'duration': 3.263}, {'end': 890.022, 'text': 'so, uh, this is how it is done.', 'start': 887.56, 'duration': 2.462}], 'summary': 'The unique function returns the distinct values in a column, listing each value once.', 'duration': 33.004, 'max_score': 857.018, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk857018.jpg'}, {'end': 909.701, 'src': 'embed', 'start': 881.247, 'weight': 4, 'content': [{'end': 884.51, 'text': 'It will just capture the uniqueness in this particular column in short.', 'start': 881.247, 'duration': 3.263}, {'end': 890.022, 'text': 'so, uh, this is how it is done.', 'start': 887.56, 'duration': 2.462}, {'end': 891.724, 'text': 'one more thing about indexing, guys.', 'start': 890.022, 'duration': 1.702}, {'end': 895.728, 'text': 'i just showed you with the help of loc and iloc, right.', 'start': 891.724, 'duration': 4.004}, {'end': 898.651, 'text': 'apart from that, you can also call the column name directly.', 'start': 895.728, 'duration': 2.923}, {'end': 903.355, 'text': 'suppose, if i want to know the 
column three, uh, data frame.', 'start': 898.651, 'duration': 4.704}, {'end': 909.701, 'text': "so i can just write column three over here and i'll be getting it and remember this.", 'start': 903.355, 'duration': 6.346}], 'summary': 'Demonstrating indexing in a data frame, accessing unique column values.', 'duration': 28.454, 'max_score': 881.247, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk881247.jpg'}], 'start': 536.405, 'title': 'Exploring data frames in python', 'summary': 'Covers accessing and indexing columns, exploring data frames in python, and using iloc and loc functions with upcoming discussion on inbuilt functions like info and describe.', 'chapters': [{'end': 881.227, 'start': 536.405, 'title': 'Working with data frames and indexing in python', 'summary': 'Covers working with data frames in python, including indexing using iloc, converting data frames to arrays, and using built-in functions like isnull and value_counts for data analysis.', 'duration': 344.822, 'highlights': ['Indexing using iloc in Python allows specifying row and column indexes, such as getting specific rows or columns by index, with examples like df.iloc[0:3] for rows and df.iloc[:, 0:2] for columns.', 'Python uses zero-based indexing, while R uses one-based indexing, which is important to remember when working with data frames.', 'Converting data frames to arrays in Python can be done using df.iloc[:, 1:].values, which is useful for machine learning algorithms and provides an array representation of the data frame.', 'The built-in call df.isnull().sum() can be used to count the null values in each column of a data frame, providing a quick way to assess data quality and cleanliness.', 'Calling value_counts() on a column returns the frequency of each unique value in that column, which is useful for understanding the distribution of categorical features in the data.']}, {'end': 1008.395, 'start': 
881.247, 'title': 'Exploring data frames in python', 'summary': 'Explores accessing and indexing columns in a data frame in python, highlighting the use of iloc and loc functions, and the upcoming discussion on inbuilt functions like info and describe.', 'duration': 127.148, 'highlights': ['The chapter discusses accessing and indexing columns in a data frame using iloc and loc functions, emphasizing the use of iloc and loc to access specific columns and the need to use a list when specifying multiple columns.', 'It mentions the upcoming discussion on inbuilt functions like info and describe, and the plan to cover reading data sets from CSV files in the next part of the session.', 'The speaker encourages viewers to like and subscribe to the channel, ending the session with a positive message.']}], 'duration': 471.99, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/QUClKFFn1Vk/pics/QUClKFFn1Vk536405.jpg', 'highlights': ['The built-in call df.isnull().sum() can be used to count the null values in each column of a data frame, providing a quick way to assess data quality and cleanliness.', 'Calling value_counts() on a column returns the frequency of each unique value in that column, which is useful for understanding the distribution of categorical features in the data.', 'Indexing using iloc in Python allows specifying row and column indexes, such as getting specific rows or columns by index, with examples like df.iloc[0:3] for rows and df.iloc[:, 0:2] for columns.', 'Converting data frames to arrays in Python can be done using df.iloc[:, 1:].values, which is useful for machine learning algorithms and provides an array representation of the data frame.', 'The chapter discusses accessing and indexing columns in a data frame using iloc and loc functions, emphasizing the use of iloc and loc to access specific columns and the need to use a list when specifying multiple columns.']}], 'highlights': ['Pandas and Numpy 
are crucial for exploratory data analysis, requiring strong proficiency', 'The chapter emphasizes the need for thorough practice in Pandas and Numpy for exploratory data analysis', 'Data frames are essential in machine learning and data analysis for organizing and processing data from CSV files or Excel sheets', 'The chapter outlines the plan to cover Pandas in three parts, including an explanation of Pandas as an open-source library and the process of importing Pandas and Numpy', 'Pandas is an open source library providing high performance, easy to use data structure and data analysis tool for python programming languages', 'Pandas loads data and converts it into a data frame, representing data in columns and rows', 'The chapter discusses creating 2D data frames in Pandas, using np.arange to generate a 5x4 array', 'Demonstration of converting data frames into CSV files using the to_csv function and understanding the structure of data frames and series', 'The built-in call df.isnull().sum() can be used to count the null values in each column of a data frame, providing a quick way to assess data quality and cleanliness', 'Calling value_counts() on a column returns the frequency of each unique value in that column, which is useful for understanding the distribution of categorical features in the data', 'Indexing using iloc in Python allows specifying row and column indexes, such as getting specific rows or columns by index, with examples like df.iloc[0:3] for rows and df.iloc[:, 0:2] for columns', 'Converting data frames to arrays in Python can be done using df.iloc[:, 1:].values, which is useful for machine learning algorithms and provides an array representation of the data frame', 'The chapter discusses accessing and indexing columns in a data frame using iloc and loc functions, emphasizing the use of iloc and loc to access specific columns and the need to use a list when specifying multiple columns']}
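
example
The steps walked through in the session (building a 5x4 data frame from np.arange, indexing with loc and iloc, converting to an array, checking nulls, counting values, and exporting CSV) can be sketched roughly as follows. This is a minimal sketch, not the exact notebook from the video: the row labels (Row1..Row5) and column labels (Column1..Column4) are assumptions chosen for illustration, and to_csv is called without a path so that it returns text instead of writing the test1.csv file mentioned in the session.

```python
import numpy as np
import pandas as pd

# 5x4 array holding the values 0..19, as described in the session.
data = np.arange(0, 20).reshape(5, 4)

# Row and column labels are illustrative assumptions, not taken from the source.
df = pd.DataFrame(
    data,
    index=["Row1", "Row2", "Row3", "Row4", "Row5"],
    columns=["Column1", "Column2", "Column3", "Column4"],
)

# Label-based access with loc: a single row comes back as a Series.
row1 = df.loc["Row1"]

# Position-based access with iloc: rows on the left, columns on the right.
top_left = df.iloc[0:2, 0:2]   # two rows and two columns: still a DataFrame
first_col = df.iloc[:, 0]      # a single column: a Series

# Selecting columns by name; several columns need a list of names.
col3 = df["Column3"]
two_cols = df[["Column3", "Column4"]]

# Convert part of the frame to a plain NumPy array (row and column labels are dropped).
arr = df.iloc[:, 1:].values

# Null counts per column, then value frequencies and distinct values for one column.
null_counts = df.isnull().sum()
counts = df["Column1"].value_counts()
uniques = df["Column1"].unique()

# With no path, to_csv returns the CSV text; pass a path such as "test1.csv" to write a file.
csv_text = df.to_csv()
```

Printing type(row1) and type(top_left) is a quick way to see the Series versus DataFrame distinction the session emphasizes: one row or one column is a Series, while any selection spanning several rows and several columns stays a DataFrame.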