title
Tutorial 9- Seaborn Tutorial- Distplot, Joinplot, Pairplot Part 1

description
Hello all, welcome to the Python Crash Course. In this video we will learn about Seaborn. GitHub URL: https://github.com/krishnaik06/Machine-Learning-in-90-days Below are the various playlists created on ML, Data Science and Deep Learning. Happy learning!
Deep Learning Playlist: https://www.youtube.com/watch?v=DKSZHN7jftI&list=PLZoTAELRMXVPGU70ZGsckrMdr0FteeRUi Data Science Projects playlist: https://www.youtube.com/watch?v=5Txi0nHIe0o&list=PLZoTAELRMXVNUcr7osiU7CCm8hcaqSzGw NLP playlist: https://www.youtube.com/watch?v=6ZVf1jnEKGI&list=PLZoTAELRMXVMdJ5sqbCK2LiM0HhQVWNzm Statistics Playlist: https://www.youtube.com/watch?v=GGZfVeZs_v4&list=PLZoTAELRMXVMhVyr3Ri9IQ-t5QPBtxzJO Feature Engineering playlist: https://www.youtube.com/watch?v=NgoLMsaZ4HU&list=PLZoTAELRMXVPwYGE2PXD3x0bfKnR0cJjN Computer Vision playlist: https://www.youtube.com/watch?v=mT34_yu5pbg&list=PLZoTAELRMXVOIBRx0andphYJ7iakSg3Lk Data Science Interview Question playlist: https://www.youtube.com/watch?v=820Qr4BH0YM&list=PLZoTAELRMXVPkl7oRvzyNnyj1HS4wt2K-

detail
{'title': 'Tutorial 9- Seaborn Tutorial- Distplot, Joinplot, Pairplot Part 1', 'heatmap': [{'end': 784.218, 'start': 747.705, 'weight': 0.844}, {'end': 1212.478, 'start': 1171.691, 'weight': 0.943}], 'summary': 'Tutorial series covers the importance of seaborn for exploratory data analysis, visualization of multidimensional data, correlation analysis, and visual analysis using joint plot and pair plot for understanding data in python, emphasizing statistical tools and correlation visualization with practical examples.', 'chapters': [{'end': 39.615, 'segs': [{'end': 51.406, 'src': 'embed', 'start': 17.989, 'weight': 0, 'content': [{'end': 24.371, 'text': 'So it is very, very much important to understand this because this is the backbone of exploratory data analysis.', 'start': 17.989, 'duration': 6.382}, {'end': 32.753, 'text': 'The best use of Seaborn is that you will be able to get a lot of statistical tools, which will actually help you to understand more about the data.', 'start': 25.011, 'duration': 7.742}, {'end': 39.615, 'text': 'So before moving into Seaborn, let me just mention some of the things that are very, very important to understand.', 'start': 33.333, 'duration': 6.282}, {'end': 51.406, 'text': 'So, to begin with, guys, always remember that whenever you get a data set, whenever you get a data set in machine learning or in deep learning,', 'start': 40.657, 'duration': 10.749}], 'summary': 'Understanding seaborn is crucial for exploratory data analysis and provides statistical tools to gain insights from data.', 'duration': 33.417, 'max_score': 17.989, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o17989.jpg'}], 'start': 1.51, 'title': 'Seaborn for data visualization', 'summary': 'Discusses the importance of seaborn for exploratory data analysis in python, emphasizing its use of statistical tools and the foundation it provides for understanding data.', 'chapters': [{'end': 39.615, 'start': 
1.51, 'title': 'Seaborn for data visualization', 'summary': 'Discusses the importance of seaborn for exploratory data analysis in python, highlighting its use of statistical tools and the foundation it provides for understanding data.', 'duration': 38.105, 'highlights': ['The best use of Seaborn is that you will be able to get a lot of statistical tools, which will actually help you to understand more about the data.', 'Exploratory data analysis is the backbone of understanding data, and Seaborn plays a key role in this process.', 'The discussion focuses on the next visualization library, Seaborn, and its importance for exploratory data analysis in Python.']}], 'duration': 38.105, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o1510.jpg', 'highlights': ['Seaborn provides statistical tools for understanding data.', 'Seaborn is crucial for exploratory data analysis in Python.', 'Seaborn is emphasized for its role in understanding data.']}, {'end': 322.664, 'segs': [{'end': 86.967, 'src': 'embed', 'start': 40.657, 'weight': 0, 'content': [{'end': 51.406, 'text': 'So, to begin with, guys, always remember that whenever you get a data set, whenever you get a data set in machine learning or in deep learning,', 'start': 40.657, 'duration': 10.749}, {'end': 53.828, 'text': 'there are some things that are very, very important.', 'start': 51.406, 'duration': 2.422}, {'end': 59.112, 'text': 'In a data set, you will be having features like F1, F2, F3, F4.', 'start': 54.048, 'duration': 5.064}, {'end': 65.998, 'text': 'And based on the classification and regression problem, you will be dividing your data set.', 'start': 60.253, 'duration': 5.745}, {'end': 68.52, 'text': 'And this is with respect to supervised machine learning.', 'start': 66.038, 'duration': 2.482}, {'end': 70.762, 'text': 'supervised machine learning.', 'start': 69.462, 'duration': 1.3}, {'end': 74.743, 'text': "I'll discuss about what is supervised 
machine learning and unsupervised machine learning,", 'start': 70.782, 'duration': 3.961}, {'end': 78.304, 'text': 'as I enter into machine learning and deep learning techniques.', 'start': 74.743, 'duration': 3.561}, {'end': 86.967, 'text': 'But just understand that whenever I have this kind of features, initially you have to divide your features into independent and dependent feature.', 'start': 78.805, 'duration': 8.162}], 'summary': "In machine learning, it's important to divide features into independent and dependent features for supervised learning.", 'duration': 46.31, 'max_score': 40.657, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o40657.jpg'}, {'end': 202.4, 'src': 'embed', 'start': 175.532, 'weight': 4, 'content': [{'end': 179.253, 'text': 'I have to basically draw a 4D diagram in order to see it in a visualization.', 'start': 175.532, 'duration': 3.721}, {'end': 183.575, 'text': 'Now guys, still 3D diagram, we can definitely have a look.', 'start': 180.034, 'duration': 3.541}, {'end': 187.296, 'text': 'We have various tools to basically see the 3D diagram.', 'start': 183.695, 'duration': 3.601}, {'end': 190.516, 'text': 'But when it goes to 4D diagrams, it becomes very, very difficult.', 'start': 187.336, 'duration': 3.18}, {'end': 193.177, 'text': 'You know, we cannot just prepare a 4D diagram.', 'start': 190.676, 'duration': 2.501}, {'end': 202.4, 'text': 'It will be very, very difficult because we as a human being can basically see somewhere around 3d diagram properly, right, not a 4d diagram.', 'start': 193.197, 'duration': 9.203}], 'summary': 'Visualizing 4d diagrams is challenging for humans.', 'duration': 26.868, 'max_score': 175.532, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o175532.jpg'}, {'end': 290.407, 'src': 'embed', 'start': 217.326, 'weight': 2, 'content': [{'end': 227.633, 'text': 'okay, univariate analysis, sorry, if 
I consider just one feature, then this is basically called as univariate analysis.', 'start': 217.326, 'duration': 10.307}, {'end': 234.337, 'text': 'So suppose, if I have two features like F1 and F2, this basically is called as bivariate analysis,', 'start': 228.034, 'duration': 6.303}, {'end': 240.731, 'text': "because I'm actually seeing the way how this F1 and F2 relationship is.", 'start': 234.337, 'duration': 6.394}, {'end': 245.176, 'text': "So I'm trying to verify how F1 and F2 relationship will basically occur.", 'start': 241.212, 'duration': 3.964}, {'end': 251.083, 'text': 'And this usually will create a two-dimensional diagram to basically see how the data is basically behaving.', 'start': 246.057, 'duration': 5.026}, {'end': 253.425, 'text': 'How the data is behaving in case of F1 and F2.', 'start': 251.103, 'duration': 2.322}, {'end': 259.209, 'text': 'and this kind of univariate and bivariate analysis can be easily done with seaborn.', 'start': 254.446, 'duration': 4.763}, {'end': 266.832, 'text': 'now the question rises that if i have more than uh, two features, if i have more than three features, what do i do for that?', 'start': 259.209, 'duration': 7.623}, {'end': 272.936, 'text': 'so in that, uh, now this seaborn library will basically help us to solve this problem.', 'start': 266.832, 'duration': 6.104}, {'end': 278.799, 'text': 'it will help us to analyze if we have multiple features, many number of features in our data set.', 'start': 272.936, 'duration': 5.863}, {'end': 281.48, 'text': 'so for that we will be using various kind of plots.', 'start': 278.799, 'duration': 2.681}, {'end': 286.405, 'text': 'so we have something called as dist plot, joint plot and pair plot.', 'start': 281.48, 'duration': 4.925}, {'end': 286.785, 'text': 'over here.', 'start': 286.405, 'duration': 0.38}, {'end': 290.407, 'text': 'specifically, this joint plot will help us to find out.', 'start': 286.785, 'duration': 3.622}], 'summary': 'Univariate and bivariate 
analysis done with seaborn, for multiple features use dist plot, joint plot, and pair plot.', 'duration': 73.081, 'max_score': 217.326, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o217326.jpg'}], 'start': 40.657, 'title': 'Basics of data set in machine learning and visualization of multidimensional data', 'summary': 'Emphasizes the importance of features in a data set, the division of features into independent and dependent, and discusses challenges of visualizing data with more than three dimensions, univariate and bivariate analysis, and the use of seaborn library for analyzing multi-feature datasets.', 'chapters': [{'end': 155.303, 'start': 40.657, 'title': 'Basics of data set in machine learning', 'summary': 'Emphasizes the importance of features in a data set and the division of features into independent and dependent, highlighting the relevance of features in supervised machine learning.', 'duration': 114.646, 'highlights': ['Understanding the division of features into independent and dependent is crucial in supervised machine learning, as it impacts the prediction and behavior of the data set.', 'The significance of features like F1, F2, F3, F4 in a data set is highlighted, emphasizing their impact on the machine learning and deep learning techniques.', 'The visualization of features in a 2D plot enables the observation of their behavior and interaction, providing valuable insights for analysis.']}, {'end': 322.664, 'start': 155.303, 'title': 'Visualization of multidimensional data', 'summary': 'Discusses the challenges of visualizing data with more than three dimensions, the concepts of univariate and bivariate analysis, and the use of seaborn library for analyzing multi-feature datasets through dist plot, joint plot, and pair plot.', 'duration': 167.361, 'highlights': ['Seaborn library assists in analyzing datasets with multiple features, offering dist plot, joint plot, and pair plot for 
visualization. The seaborn library provides tools like dist plot, joint plot, and pair plot to analyze datasets with multiple features.', 'The chapter explains the challenges of visualizing data with more than three dimensions and the limitations of human perception in understanding four-dimensional diagrams. It highlights the challenges of visualizing data with more than three dimensions and the difficulty in comprehending four-dimensional diagrams due to human perceptual limitations.', 'Concepts of univariate and bivariate analysis are introduced, depicting the visualization of single and two-feature relationships through diagrams. It introduces the concepts of univariate and bivariate analysis, illustrating the visualization of single and two-feature relationships through diagrams.']}], 'duration': 282.007, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o40657.jpg', 'highlights': ['Understanding the division of features into independent and dependent is crucial in supervised machine learning, impacting the prediction and behavior of the data set.', 'The significance of features like F1, F2, F3, F4 in a data set is highlighted, emphasizing their impact on machine learning and deep learning techniques.', 'The visualization of features in a 2D plot enables the observation of their behavior and interaction, providing valuable insights for analysis.', 'Seaborn library assists in analyzing datasets with multiple features, offering dist plot, joint plot, and pair plot for visualization.', 'The chapter explains the challenges of visualizing data with more than three dimensions and the limitations of human perception in understanding four-dimensional diagrams.', 'Concepts of univariate and bivariate analysis are introduced, depicting the visualization of single and two-feature relationships through diagrams.']}, {'end': 609.662, 'segs': [{'end': 389.191, 'src': 'embed', 'start': 364.475, 'weight': 0, 'content': 
[{'end': 371.302, 'text': "But in exploratory data analysis, I'll be taking up a data set from Kaggle and I'll be using Seaborn for doing the visualization stuff.", 'start': 364.475, 'duration': 6.827}, {'end': 375.769, 'text': "So over here, I'll be loading a tips data set over here.", 'start': 372.008, 'duration': 3.761}, {'end': 378.949, 'text': 'And in this particular data frame, let me show you what all features you have.', 'start': 375.869, 'duration': 3.08}, {'end': 382.97, 'text': 'You have like total bill, tip, sex, smoker, date, time and size.', 'start': 379.349, 'duration': 3.621}, {'end': 389.191, 'text': 'Now, this is the data set from a restaurant and where many people have come to eat food.', 'start': 383.47, 'duration': 5.721}], 'summary': 'Using seaborn for visualizing a kaggle dataset on restaurant tips and customer details.', 'duration': 24.716, 'max_score': 364.475, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o364475.jpg'}, {'end': 426.982, 'src': 'embed', 'start': 405.624, 'weight': 1, 'content': [{'end': 414.832, 'text': 'we should be able to create a model wherein we should be able to assume what tip it will be based on the other features, like total bill, sex smoker,', 'start': 405.624, 'duration': 9.208}, {'end': 415.713, 'text': 'date time and size.', 'start': 414.832, 'duration': 0.881}, {'end': 416.893, 'text': 'this is what is our problem.', 'start': 415.713, 'duration': 1.18}, {'end': 418.615, 'text': 'state now in this.', 'start': 416.893, 'duration': 1.722}, {'end': 426.982, 'text': 'if you, if you just see from the data set, the major thing that you have to note over here is that your tip is actually your dependent feature,', 'start': 418.615, 'duration': 8.367}], 'summary': 'Create a model to predict tip based on total bill, sex, smoker, date time, and size.', 'duration': 21.358, 'max_score': 405.624, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o405624.jpg'}, {'end': 476.921, 'src': 'embed', 'start': 450.877, 'weight': 2, 'content': [{'end': 460.333, 'text': 'okay, so all the all the features that i have over here, this and this will be my independent feature and this will be my dependent feature.', 'start': 450.877, 'duration': 9.456}, {'end': 464.724, 'text': 'That is the first thing that you need to understand in an exploratory data analysis.', 'start': 460.534, 'duration': 4.19}, {'end': 473.521, 'text': "now what we will do is that we'll try to use this uh, joint plot, uh and other other things,", 'start': 465.839, 'duration': 7.682}, {'end': 476.921, 'text': "but before that you'll be seeing something called as correlation.", 'start': 473.521, 'duration': 3.4}], 'summary': 'Understanding the independent and dependent features is crucial for exploratory data analysis.', 'duration': 26.044, 'max_score': 450.877, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o450877.jpg'}], 'start': 322.664, 'title': 'Data visualization and correlation analysis', 'summary': 'Covers using seaborn to visualize a restaurant dataset, aiming to predict tips, and discusses correlation analysis, including the use of joint plot and heat map to find correlation between features.', 'chapters': [{'end': 450.877, 'start': 322.664, 'title': 'Pair plot and dist plot in seaborn', 'summary': 'Explains the process of using seaborn to visualize and analyze a restaurant dataset with features like total bill, tip, sex, smoker, date, time, and size, aiming to create a model to predict tips based on other features.', 'duration': 128.213, 'highlights': ["The chapter explains the process of using Seaborn to visualize and analyze a restaurant dataset with features like total bill, tip, sex, smoker, date, time, and size. 
It describes the usage of Seaborn library to load and analyze the 'tips' dataset from a restaurant, including features such as total bill, tip, sex, smoker, date, time, and size.", 'Aiming to create a model to predict tips based on other features. The goal is to create a model that can predict tips based on independent features like total bill, sex, smoker, date, time, and size, treating tip as the dependent feature.', 'Explanation of the dependent and independent features in the dataset. The tip is identified as the dependent feature, while total bill, sex, smoker, date, and time are recognized as independent features, crucial for understanding the dataset.']}, {'end': 609.662, 'start': 450.877, 'title': 'Correlation analysis in data exploration', 'summary': 'Discusses the importance of understanding independent and dependent features in exploratory data analysis, the use of joint plot and heat map to find correlation between features, and the limitations and range of correlation values.', 'duration': 158.785, 'highlights': ['Importance of understanding independent and dependent features Understanding the distinction between independent and dependent features is crucial in exploratory data analysis.', "Use of joint plot and heat map to find correlation between features The chapter introduces the use of joint plot and seaborn's heat map property to identify the correlation between different features in the data.", 'Limitations and range of correlation values Correlation can only be determined for integer or floating point values, not for categorical features, and the correlation coefficient ranges from -1 to +1, specifically known as the Pearson correlation.']}], 'duration': 286.998, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o322664.jpg', 'highlights': ['The chapter explains the process of using Seaborn to visualize and analyze a restaurant dataset with features like total bill, tip, sex, smoker, date, 
time, and size.', 'Aiming to create a model to predict tips based on other features.', 'Importance of understanding independent and dependent features.', 'Use of joint plot and heat map to find correlation between features.', 'Explanation of the dependent and independent features in the dataset.']}, {'end': 810.814, 'segs': [{'end': 649.559, 'src': 'embed', 'start': 624.997, 'weight': 0, 'content': [{'end': 630.681, 'text': 'Here you can basically find out the correlation with respect to total bill and total bill, the correlation will always be one.', 'start': 624.997, 'duration': 5.684}, {'end': 635.604, 'text': 'With respect to total bill and tip, you can see that there is 67% correlation.', 'start': 631.621, 'duration': 3.983}, {'end': 636.764, 'text': 'That basically indicates that.', 'start': 635.664, 'duration': 1.1}, {'end': 640.527, 'text': 'And I always remember, if this is increasing, this basically indicates that.', 'start': 636.804, 'duration': 3.723}, {'end': 649.559, 'text': 'This basically indicates that if your total bill is increasing, your tip will also increase.', 'start': 642.175, 'duration': 7.384}], 'summary': 'Total bill and tip have a 67% correlation, indicating that an increase in total bill leads to an increase in tip.', 'duration': 24.562, 'max_score': 624.997, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o624997.jpg'}, {'end': 790.8, 'src': 'heatmap', 'start': 747.705, 'weight': 1, 'content': [{'end': 757.312, 'text': 'Suppose, if there are two independent features and the correlation of those two independent features are same or they are higher towards 1,', 'start': 747.705, 'duration': 9.607}, {'end': 759.413, 'text': 'towards 90%.', 'start': 757.312, 'duration': 2.101}, {'end': 763.096, 'text': 'At that time, it is possible that you just use one feature instead of just using two.', 'start': 759.413, 'duration': 3.683}, {'end': 765.255, 'text': 'So that is very, very 
important.', 'start': 764.071, 'duration': 1.184}, {'end': 767.343, 'text': 'So this was the first thing about heat map.', 'start': 765.275, 'duration': 2.068}, {'end': 770.052, 'text': 'Now about the joint plot.', 'start': 767.824, 'duration': 2.228}, {'end': 774.256, 'text': 'Now, joint plot actually helps you to do the univariate analysis.', 'start': 771.035, 'duration': 3.221}, {'end': 777.737, 'text': "Now, when I say univariate, guys, again, I'm just taking two features over here.", 'start': 774.636, 'duration': 3.101}, {'end': 784.218, 'text': 'You can also call it as bivariate, okay? But make sure that you just have two features in case of joint plot.', 'start': 778.137, 'duration': 6.081}, {'end': 787.819, 'text': 'So in joint plot you just have to use sns.jointplot.', 'start': 784.338, 'duration': 3.481}, {'end': 790.8, 'text': 'x is equal to tip, y is equal to total bill, okay?', 'start': 787.819, 'duration': 2.981}], 'summary': 'Correlated independent features with correlation of 90% can be represented using just one feature, as explained in the joint plot analysis.', 'duration': 43.095, 'max_score': 747.705, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o747705.jpg'}], 'start': 609.662, 'title': 'Correlation analysis and visualization', 'summary': 'Explores the correlation analysis in restaurant data, revealing a 67% positive correlation between total bill and tip, and discusses visualizing correlation using a heat map and univariate analysis using sns.heatmap and sns.jointplot.', 'chapters': [{'end': 680.436, 'start': 609.662, 'title': 'Correlation analysis in restaurant data', 'summary': 'Explains the correlation analysis between total bill, tip, and size, revealing a 67% positive correlation between total bill and tip, indicating that as the total bill increases, the tip also increases, and also highlights the positive correlation between tip and size.', 'duration': 70.774, 'highlights': 
['The correlation between total bill and tip is about 0.67, a positive value indicating that the tip tends to increase as the total bill increases.', 'A correlation value of 0.67 means that as the total bill increases, the tip also tends to increase.', 'A negative correlation would instead indicate that as the total bill increases, the tip tends to decrease.', 'Additionally, there is a positive correlation between tip and size.']}, {'end': 810.814, 'start': 681.28, 'title': 'Visualizing correlation and bivariate analysis', 'summary': 'Discusses visualizing correlation using a heat map, highlighting the importance of correlation in feature selection, and introduces joint plot for bivariate analysis using sns.heatmap and sns.jointplot, with examples of correlation visualization and bivariate analysis.', 'duration': 129.534, 'highlights': ['Correlation visualization using heat map is important in exploratory data analysis for feature selection, where highly correlated features may be redundant, and sns.heatmap helps in visualizing the correlation matrix. The color variance in the heat map indicates the level of correlation, with highly correlated features shown in one color and less correlated features in another. Correlation values around 0.5 are highlighted, emphasizing the significance of correlation in feature selection.', 'The significance of correlation in feature selection is emphasized, with a recommendation to use only one feature if two independent features have a high correlation, close to 1 (around 0.9). High correlation between independent features, close to 1 (around 0.9), suggests the possibility of using only one feature instead of both, underscoring the importance of correlation in feature selection.', "Introduction of joint plot for bivariate analysis, using sns.jointplot with examples of comparing 'tip' and 'total bill' as variables. 
The joint plot allows for bivariate analysis by comparing two features, such as 'tip' and 'total bill', using sns.jointplot and the 'data' parameter, providing insights into the relationship between the selected features."]}], 'duration': 201.152, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o609662.jpg', 'highlights': ['Positive correlation between total bill and tip is about 0.67', 'Correlation visualization using heat map is important in exploratory data analysis', 'High correlation between independent features, close to 1 (around 0.9), suggests the possibility of using only one feature instead of both', 'Joint plot allows for bivariate analysis by comparing two features']}, {'end': 1300.922, 'segs': [{'end': 839.864, 'src': 'embed', 'start': 811.094, 'weight': 0, 'content': [{'end': 813.395, 'text': 'And then you also have a feature called as kind.', 'start': 811.094, 'duration': 2.301}, {'end': 815.575, 'text': 'In kind if you give hex.', 'start': 813.935, 'duration': 1.64}, {'end': 820.676, 'text': 'okay, if you give hex, that basically means this features that will get displayed in between, right?', 'start': 815.575, 'duration': 5.101}, {'end': 823.057, 'text': 'The structure that will show.', 'start': 821.636, 'duration': 1.421}, {'end': 827.858, 'text': 'instead of points, it will be showing in this color, basically in this hexagonal shape, okay?', 'start': 823.057, 'duration': 4.801}, {'end': 832.481, 'text': 'And here you can see that in joint plot you also have more properties.', 'start': 829.58, 'duration': 2.901}, {'end': 839.864, 'text': "You'll be able to see, suppose in the X axis, you have something like tip over here, right? 
So all the histograms will be shown over here.", 'start': 832.901, 'duration': 6.963}], 'summary': "The 'kind' feature can display data in hexagonal shape instead of points, with more properties in joint plot.", 'duration': 28.77, 'max_score': 811.094, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o811094.jpg'}, {'end': 874.126, 'src': 'embed', 'start': 847.708, 'weight': 1, 'content': [{'end': 857.993, 'text': 'So the major concentration that you see over here, where is the histogram buildings higher? You can see in this particular region.', 'start': 847.708, 'duration': 10.285}, {'end': 864.177, 'text': 'So if I see this region over here and if I see this, these two are getting combined over here.', 'start': 858.033, 'duration': 6.144}, {'end': 869.722, 'text': 'So here you can see that they are higher concentration of points.', 'start': 866.118, 'duration': 3.604}, {'end': 874.126, 'text': 'Because my histogram heights are very higher in this particular region.', 'start': 870.182, 'duration': 3.944}], 'summary': 'Higher concentration of points in a particular region with taller histogram buildings.', 'duration': 26.418, 'max_score': 847.708, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o847708.jpg'}, {'end': 1005.061, 'src': 'embed', 'start': 974.036, 'weight': 2, 'content': [{'end': 980.798, 'text': 'Again, guys, I should not call exactly this like a univariate, but this can be called as a bivariate because I have two features into this.', 'start': 974.036, 'duration': 6.762}, {'end': 989.166, 'text': 'Now, similarly, if your features are more than two, more than two independent features, at that time, we will basically be using a pair plot.', 'start': 982.079, 'duration': 7.087}, {'end': 995.191, 'text': 'Pair plot is also known as scatter plot in which one variable in the same row data is matched with another variable.', 'start': 989.626, 
'duration': 5.565}, {'end': 996.713, 'text': 'On pair plot.', 'start': 995.792, 'duration': 0.921}, {'end': 1005.061, 'text': 'how we are fixing this problem is that if you have more than two independent features, it will just combine.', 'start': 996.713, 'duration': 8.348}], 'summary': 'The pair plot is used for analyzing more than two independent features, combining them in a scatter plot.', 'duration': 31.025, 'max_score': 974.036, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o974036.jpg'}, {'end': 1212.478, 'src': 'heatmap', 'start': 1171.691, 'weight': 0.943, 'content': [{'end': 1177.895, 'text': "Suppose I'm taking the tips column and I want to see that how many people paid in what kind of?", 'start': 1171.691, 'duration': 6.204}, {'end': 1180.996, 'text': 'in the terms of histograms that I can basically use sns.distplot.', 'start': 1177.895, 'duration': 3.101}, {'end': 1182.837, 'text': "Pretty much simple, I'll just give this value.", 'start': 1181.176, 'duration': 1.661}, {'end': 1185.319, 'text': "Here it is, you'll be able to see the histogram.", 'start': 1183.498, 'duration': 1.821}, {'end': 1188.3, 'text': 'And again, guys, remember histogram.', 'start': 1185.999, 'duration': 2.301}, {'end': 1197.186, 'text': "I've explained in my previous class, also based on your x-axis, it will be showing you the density like how it is present.", 'start': 1188.3, 'duration': 8.886}, {'end': 1200.828, 'text': 'By default, if the KD is present, it will show you with respect to density.', 'start': 1197.506, 'duration': 3.322}, {'end': 1204.691, 'text': 'okay, if kd is not present, it will basically show you the count.', 'start': 1201.388, 'duration': 3.303}, {'end': 1206.733, 'text': 'okay, if i make the kd is false.', 'start': 1204.691, 'duration': 2.042}, {'end': 1210.896, 'text': "see this i'm making over here as kd is false and i'm making bins h10.", 'start': 1206.733, 'duration': 4.163}, {'end': 1212.478, 
'text': 'now it will show you the count.', 'start': 1210.896, 'duration': 1.582}], 'summary': 'Using sns.distplot to visualize payment distribution in histograms with count and density', 'duration': 40.787, 'max_score': 1171.691, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o1171691.jpg'}, {'end': 1260.008, 'src': 'embed', 'start': 1232.243, 'weight': 3, 'content': [{'end': 1236.606, 'text': 'okay, that is pretty much important to understand kernel density estimation.', 'start': 1232.243, 'duration': 4.363}, {'end': 1240.15, 'text': 'so this is how our diagram gets displayed over here.', 'start': 1236.606, 'duration': 3.544}, {'end': 1242.091, 'text': 'by default, the kd is true.', 'start': 1240.15, 'duration': 1.941}, {'end': 1250.737, 'text': "so you are getting this, uh, this kind of diagram, and when kd is true here, instead of count, you'll be getting in percentage.", 'start': 1242.091, 'duration': 8.646}, {'end': 1253.881, 'text': 'okay, percentage like point one percent, point two percent.', 'start': 1250.737, 'duration': 3.144}, {'end': 1260.008, 'text': 'if i go and see in this region from this to this, how much percentage is present somewhere around this person like that,', 'start': 1253.881, 'duration': 6.127}], 'summary': 'Understanding kernel density estimation and displaying diagrams with percentage data.', 'duration': 27.765, 'max_score': 1232.243, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o1232243.jpg'}], 'start': 811.094, 'title': 'Visual analysis in joint plot and seaborn data visualization', 'summary': "Explains features of joint plot: histograms, concentration of points, outliers, and parameters like 'kind'. 
It also discusses pair plot for visualizing correlation between multiple independent features, use of scatter plots, classification of points, and the importance of understanding parameters for creating histograms using distplot.", 'chapters': [{'end': 973.976, 'start': 811.094, 'title': 'Visual analysis in joint plot', 'summary': "Explains the features of joint plot, including the display of histograms, concentration of points, outliers, and the use of different parameters like 'kind' to visualize data, with an emphasis on the importance of understanding the concentration of points and the presence of outliers.", 'duration': 162.882, 'highlights': ['The joint plot shows a scatter plot together with marginal histograms, revealing the concentration of points in specific regions, with a majority of the bills falling between 10 and 20 and very few outliers with bills greater than 50 and tips of more than 10.', "The 'kind' parameter in joint plot can be used to change how the joint distribution is drawn, such as hexagonal bins or a regression line, providing insights into the probability density function and the best fit line for the data.", 'Understanding the concentration of points and identifying outliers is crucial for analyzing the data effectively and drawing meaningful insights from the visualizations.']}, {'end': 1300.922, 'start': 974.036, 'title': 'Seaborn data visualization', 'summary': 'Discusses the use of pair plot for visualizing the correlation between multiple independent features, including the use of scatter plots to show how features correlate, classification of points, and the importance of understanding the parameters for creating histograms using distplot, as well as a mention of upcoming topics on category plots.', 'duration': 326.886, 'highlights': ['
The pair plot is used to visualize the correlation between multiple independent features, and scatter plots help in showing how features correlate and in classifying points based on features such as sex.', 'Understanding the parameters for creating histograms using distplot is crucial, including the use of kernel density estimation and its influence on the display of data, such as showing counts or densities.', 'Mention of upcoming topics on category plots, including box plot, violin plot, count plot, and bar plot, which will be discussed in the next video.']}], 'duration': 489.828, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/UsglokDLa2o/pics/UsglokDLa2o811094.jpg', 'highlights': ["The 'kind' parameter in joint plot can be used to change how the joint distribution is drawn, such as hexagonal bins or a regression line, providing insights into the probability density function and the best fit line for the data.", 'Understanding the concentration of points and identifying outliers is crucial for analyzing the data effectively and drawing meaningful insights from the visualizations.', 'Pair plot is used to visualize correlation between multiple independent features, and scatter plots help in showing how features correlate and in classifying points based on features such as sex.', 'Understanding the parameters for creating histograms using distplot is crucial, including the use of kernel density estimation and its influence on the display of data, such as showing counts or densities.']}], 'highlights': ['Seaborn provides statistical tools for understanding data.', 'Seaborn is crucial for exploratory data analysis in Python.', 'Seaborn is emphasized for its role in understanding data.', 
'Understanding the division of features into independent and dependent is crucial in supervised machine learning, impacting the prediction and behavior of the data set.', 'The significance of features like F1, F2, F3, F4 in a data set is highlighted, emphasizing their impact on machine learning and deep learning techniques.', 'The visualization of features in a 2D plot enables the observation of their behavior and interaction, providing valuable insights for analysis.', 'Seaborn library assists in analyzing datasets with multiple features, offering dist plot, joint plot, and pair plot for visualization.', 'The chapter explains the challenges of visualizing data with more than three dimensions and the limitations of human perception in understanding four-dimensional diagrams.', 'Concepts of univariate and bivariate analysis are introduced, depicting the visualization of single and two-feature relationships through diagrams.', 'The chapter explains the process of using Seaborn to visualize and analyze a restaurant dataset with features like total bill, tip, sex, smoker, day, time, and size.', 'Aiming to create a model to predict tips based on other features.', 'Importance of understanding independent and dependent features.', 'Use of joint plot and heat map to find correlation between features.', 'Explanation of the dependent and independent features in the dataset.', 'The correlation between total bill and tip is positive, about 0.67 (67%).', 'Correlation visualization using a heat map is important in exploratory data analysis.', 'High correlation between independent features, close to 1 (for example 0.9), suggests the possibility of using only one feature instead of both.', 'Joint plot allows for bivariate analysis by comparing two features.', "The 'kind' parameter in joint plot can be used to change how the joint distribution is drawn, such as hexagonal bins or a regression line, providing insights into the probability density function and the best fit line for the data.", 'Understanding the 
concentration of points and identifying outliers is crucial for analyzing the data effectively and drawing meaningful insights from the visualizations.', 'Pair plot is used to visualize correlation between multiple independent features, and scatter plots help in showing how features correlate and in classifying points based on features such as sex.', 'Understanding the parameters for creating histograms using distplot is crucial, including the use of kernel density estimation and its influence on the display of data, such as showing counts or densities.']}