title

How to Learn Statistics for Data Science As A Self Starter- Follow My Way

description

Statistical methods help ensure that your data are interpreted correctly and that apparent relationships are genuinely “significant” or meaningful rather than the result of chance. In short, statistical analysis helps you find meaning in otherwise meaningless numbers.
Python AI Tech news : https://www.youtube.com/playlist?list=PLTDARY42LDV6iFI3Dd3oNf6oIvuJs_EKX
Feature Engineering Playlist: https://www.youtube.com/playlist?list=PLZoTAELRMXVPwYGE2PXD3x0bfKnR0cJjN
Stats Playlist: https://www.youtube.com/playlist?list=PLZoTAELRMXVMhVyr3Ri9IQ-t5QPBtxzJO
Complete ML Playlist: https://www.youtube.com/playlist?list=PLZoTAELRMXVPBTrWtJkn3wWQxZkmTXGwe
Stats Syllabus: https://drive.google.com/file/d/1gVUdA2rqnlEiRWQHwryh9VI3Dj52GWe7/view?usp=sharing
Please join my channel as a member to get additional benefits such as Data Science materials, members-only live streams, and more:
https://www.youtube.com/channel/UCNU_lfiiWBdtULKOw6X0Dig/join
Please subscribe to my other channel too:
https://www.youtube.com/channel/UCjWY5hREA6FFYrthD0rZNIw
Connect with me here:
Twitter: https://twitter.com/Krishnaik06
Facebook: https://www.facebook.com/krishnaik06
Instagram: https://www.instagram.com/krishnaik06

detail

{'title': 'How to Learn Statistics for Data Science As A Self Starter- Follow My Way', 'heatmap': [{'end': 201.168, 'start': 184.259, 'weight': 1}], 'summary': 'Learn statistics for data science with 70-80% coverage of basic, intermediate, and advanced concepts, emphasizing practical implementation alongside theoretical understanding for machine learning and deep learning. also, cover foundational statistical concepts, and the importance of statistics in machine learning with 80% coverage in uploaded videos.', 'chapters': [{'end': 240.787, 'segs': [{'end': 43.985, 'src': 'embed', 'start': 14.292, 'weight': 1, 'content': [{'end': 16.933, 'text': 'hello all my name is krishnayak and welcome to my youtube channel.', 'start': 14.292, 'duration': 2.641}, {'end': 22.396, 'text': 'so guys, today, in this video we are going to understand how to learn statistics for data science.', 'start': 16.933, 'duration': 5.463}, {'end': 27.158, 'text': 'as a self-starter now i have been uploading many videos on statistics.', 'start': 22.396, 'duration': 4.762}, {'end': 37.463, 'text': 'i have been taking many live streams in my youtube channel where i am doing different kind of feature engineering stuff and explaining you different kind of projects and all and always.', 'start': 27.158, 'duration': 10.305}, {'end': 43.985, 'text': 'i hope everybody knows that what role statistics actually play right, it is very, very important to understand,', 'start': 37.463, 'duration': 6.522}], 'summary': "Krishnayak's youtube channel focuses on statistics for data science, with numerous videos and live streams.", 'duration': 29.693, 'max_score': 14.292, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ14292.jpg'}, {'end': 67.589, 'src': 'embed', 'start': 43.985, 'weight': 0, 'content': [{'end': 50.386, 'text': 'because all these use cases that we actually work on right, let it be with respect to machine learning or deep learning,', 'start': 
43.985, 'duration': 6.401}, {'end': 57.228, 'text': 'it deals with data and in order to understand the data we probably will be requiring statistical knowledge a lot.', 'start': 50.386, 'duration': 6.842}, {'end': 60.247, 'text': 'Now there are many people who knows that.', 'start': 58.186, 'duration': 2.061}, {'end': 61.787, 'text': 'okay, statistics is important.', 'start': 60.247, 'duration': 1.54}, {'end': 63.247, 'text': 'But how do we start?', 'start': 62.347, 'duration': 0.9}, {'end': 65.147, 'text': 'What is the process that we need to start?', 'start': 63.568, 'duration': 1.579}, {'end': 67.589, 'text': 'First of all, what topics we should have a look?', 'start': 65.248, 'duration': 2.341}], 'summary': 'Understanding data for machine learning and deep learning requires statistical knowledge and a structured process.', 'duration': 23.604, 'max_score': 43.985, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ43985.jpg'}, {'end': 132.771, 'src': 'embed', 'start': 107.689, 'weight': 4, 'content': [{'end': 112.715, 'text': 'So theoretical and practical is pretty much important, because once you understand how it works,', 'start': 107.689, 'duration': 5.026}, {'end': 116.179, 'text': 'implementation with Python or R programming language is pretty much important.', 'start': 112.715, 'duration': 3.464}, {'end': 119.762, 'text': 'Even in my playlist I have explained in that specific way.', 'start': 117.16, 'duration': 2.602}, {'end': 122.184, 'text': "Right So I'll go through that at the last.", 'start': 120.302, 'duration': 1.882}, {'end': 124.285, 'text': "But let's start over here.", 'start': 122.624, 'duration': 1.661}, {'end': 130.589, 'text': 'So what I have done is that in this presentation I have divided this whole statistic concepts into three main parts.', 'start': 124.705, 'duration': 5.884}, {'end': 132.771, 'text': 'And again guys statistics is huge.', 'start': 130.628, 'duration': 2.143}], 
'summary': 'Understanding theoretical and practical aspects of statistics is essential for implementation with python or r programming languages, as highlighted in the presentation dividing statistical concepts into three main parts.', 'duration': 25.082, 'max_score': 107.689, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ107689.jpg'}, {'end': 213.551, 'src': 'heatmap', 'start': 184.259, 'weight': 2, 'content': [{'end': 189.404, 'text': 'so here i have divided this whole status, statistical topics into three main parts.', 'start': 184.259, 'duration': 5.145}, {'end': 196.826, 'text': 'one is basic stats, then you have intermediate stats and then you have advanced stats.', 'start': 190.384, 'duration': 6.442}, {'end': 201.168, 'text': 'so here in the basic stats, this is pretty much important for beginners.', 'start': 196.826, 'duration': 4.342}, {'end': 207.329, 'text': "again, i'm telling you, guys, just don't learn by just understanding how it works.", 'start': 201.168, 'duration': 6.161}, {'end': 213.551, 'text': "basically, i'm saying that, theoretically, you should not try to understand, just don't see the definition and you think that, okay,", 'start': 207.329, 'duration': 6.222}], 'summary': 'The statistical topics are divided into basic, intermediate, and advanced stats, emphasizing the importance for beginners to not just learn by understanding how it works.', 'duration': 29.292, 'max_score': 184.259, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ184259.jpg'}, {'end': 248.29, 'src': 'embed', 'start': 224.31, 'weight': 3, 'content': [{'end': 231.177, 'text': 'Even in the future when you will be doing machine learning algorithms and we will be doing multi classification problems.', 'start': 224.31, 'duration': 6.867}, {'end': 233.76, 'text': 'At that time probabilities will be definitely coming.', 'start': 231.737, 'duration': 2.023}, {'end': 
237.884, 'text': 'So basic knowledge on probability is very very important guys.', 'start': 234.16, 'duration': 3.724}, {'end': 240.787, 'text': 'So probability you have to actually be good.', 'start': 238.124, 'duration': 2.663}, {'end': 248.29, 'text': 'Okay, then, coming to the terms that I have written over here, the topics I have written over here, introduction to basic terms like variables,', 'start': 241.249, 'duration': 7.041}], 'summary': 'Understanding probability is crucial for machine learning algorithms and multi-classification problems.', 'duration': 23.98, 'max_score': 224.31, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ224310.jpg'}], 'start': 14.292, 'title': 'Learning statistics for data science', 'summary': 'Emphasizes the importance of statistics in data science, covering basic, intermediate, and advanced concepts with 70-80% topics, and highlighting the practical implementation alongside theoretical understanding for machine learning and deep learning.', 'chapters': [{'end': 65.147, 'start': 14.292, 'title': 'Learning statistics for data science', 'summary': 'Discusses the importance of statistics in data science and the process of learning it, emphasizing its crucial role in understanding and working with data for machine learning and deep learning.', 'duration': 50.855, 'highlights': ['Statistics plays a crucial role in understanding and working with data for machine learning or deep learning, highlighting the importance of statistical knowledge. (Relevance: 5)', 'The video aims to help self-starters understand how to learn statistics for data science, indicating the target audience and purpose of the content. (Relevance: 4)', 'The speaker has been actively creating content on statistics, including videos and live streams, demonstrating expertise and experience in the field. 
(Relevance: 3)']}, {'end': 240.787, 'start': 65.248, 'title': 'Statistics concepts for data science', 'summary': 'Discusses the importance of understanding statistical concepts for data scientists, emphasizing the relevance of basic, intermediate, and advanced statistics, with 70-80% of the topics covered, and highlights the significance of practical implementation alongside theoretical understanding.', 'duration': 175.539, 'highlights': ['The chapter emphasizes the importance of understanding basic, intermediate, and advanced statistics for data scientists, with 70-80% of the topics already covered, and highlights the significance of practical implementation alongside theoretical understanding.', 'Probability is highlighted as the foundational concept for beginners, essential for understanding future machine learning algorithms and multi-classification problems.', 'The speaker stresses the importance of practical implementation alongside theoretical understanding when learning statistical concepts.', 'The speaker notes the regular occurrence of the discussed statistical topics in various data science projects based on their experience.']}], 'duration': 226.495, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ14292.jpg', 'highlights': ['Statistics plays a crucial role in understanding and working with data for machine learning or deep learning, highlighting the importance of statistical knowledge. (Relevance: 5)', 'The video aims to help self-starters understand how to learn statistics for data science, indicating the target audience and purpose of the content. 
(Relevance: 4)', 'The chapter emphasizes the importance of understanding basic, intermediate, and advanced statistics for data scientists, with 70-80% of the topics already covered, and highlights the significance of practical implementation alongside theoretical understanding.', 'Probability is highlighted as the foundational concept for beginners, essential for understanding future machine learning algorithms and multi-classification problems.', 'The speaker stresses the importance of practical implementation alongside theoretical understanding when learning statistical concepts.']}, {'end': 732.337, 'segs': [{'end': 348.829, 'src': 'embed', 'start': 277.562, 'weight': 0, 'content': [{'end': 280.583, 'text': 'is basically variance and standard deviation.', 'start': 277.562, 'duration': 3.021}, {'end': 288.486, 'text': 'All the topics that I have mentioned over here, guys, this will be actually forming the base for the intermediate and the advanced level.', 'start': 280.903, 'duration': 7.583}, {'end': 290.987, 'text': 'To begin with, you can actually start with this.', 'start': 288.826, 'duration': 2.161}, {'end': 294.028, 'text': 'Again, it is not compulsory that you just have to follow my playlist.', 'start': 291.227, 'duration': 2.801}, {'end': 295.669, 'text': 'Just go and search in the Google.', 'start': 294.468, 'duration': 1.201}, {'end': 297.95, 'text': 'Just type the same term what I have actually written over here.', 'start': 295.709, 'duration': 2.241}, {'end': 305.457, 'text': "and all these things i'll be putting up in the notebook notepad file and i'll also be sharing the link of my google drive in the description of this video,", 'start': 298.65, 'duration': 6.807}, {'end': 307.599, 'text': 'if you really want to check out all the topics.', 'start': 305.457, 'duration': 2.142}, {'end': 314.022, 'text': "okay, I'm just trying to show you how I learned statistics, how it was good.", 'start': 307.599, 'duration': 6.423}, {'end': 319.267, 'text': 
'Because initially when I started, I used to randomly pick up some of the other topics when I was actually working.', 'start': 314.062, 'duration': 5.205}, {'end': 324.272, 'text': 'But yes, there is a different procedure also where you can learn stats,', 'start': 319.948, 'duration': 4.324}, {'end': 328.096, 'text': 'which will actually make your task easier as you go towards the advanced level.', 'start': 324.272, 'duration': 3.824}, {'end': 330.618, 'text': 'Okay, so this is pretty much important.', 'start': 328.957, 'duration': 1.661}, {'end': 332.659, 'text': 'You should try to focus in this specific way.', 'start': 330.858, 'duration': 1.801}, {'end': 336.481, 'text': 'Again, understand over here what all things, we have covered some basic things.', 'start': 333.36, 'duration': 3.121}, {'end': 338.643, 'text': 'We have covered variables, random variables.', 'start': 336.862, 'duration': 1.781}, {'end': 339.563, 'text': 'what is population?', 'start': 338.643, 'duration': 0.92}, {'end': 340.324, 'text': 'what is sample?', 'start': 339.563, 'duration': 0.761}, {'end': 348.128, 'text': 'mean population mean population distributions, measure of central tendency, like mean median mode range, measure of dispersion variance,', 'start': 340.324, 'duration': 7.804}, {'end': 348.829, 'text': 'standard deviation.', 'start': 348.128, 'duration': 0.701}], 'summary': 'Learning statistics, covering topics like variance, standard deviation, mean, median, mode, and range, forms the base for intermediate and advanced levels.', 'duration': 71.267, 'max_score': 277.562, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ277562.jpg'}, {'end': 447.405, 'src': 'embed', 'start': 423.448, 'weight': 8, 'content': [{'end': 430.032, 'text': 'There is something called as kernel density estimation because of which you are actually able to create a curve in the form of a PDF.', 'start': 423.448, 'duration': 6.584}, {'end': 436.617, 
'text': 'What is CDF, cumulative density functions? This terminology you just not need to understand theoretically.', 'start': 430.633, 'duration': 5.984}, {'end': 439.779, 'text': 'You should also try to implement with the help of Python or R programming language.', 'start': 436.637, 'duration': 3.142}, {'end': 441.34, 'text': 'and all these things.', 'start': 440.339, 'duration': 1.001}, {'end': 447.405, 'text': 'guys, literally, i have taken each and everything from my playlist in feature engineering and my live stream playlist.', 'start': 441.34, 'duration': 6.065}], 'summary': 'Kernel density estimation and cdf are important for creating pdf curves. implement with python or r.', 'duration': 23.957, 'max_score': 423.448, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ423448.jpg'}, {'end': 503.188, 'src': 'embed', 'start': 476.184, 'weight': 1, 'content': [{'end': 480.088, 'text': "Suppose I'll tell you an example in my machine learning itself, right?", 'start': 476.184, 'duration': 3.904}, {'end': 483.671, 'text': 'And one of the live session I had actually done with respect to skewness of data, right?', 'start': 480.148, 'duration': 3.523}, {'end': 493.98, 'text': 'Usually, in some of the regression and the classification problem statement it says that some of the input features you know most of the input features should be normally distributed.', 'start': 484.552, 'duration': 9.428}, {'end': 499.045, 'text': 'Now suppose, if your data is not normally distributed, if it is having some right skew or left skew,', 'start': 494.661, 'duration': 4.384}, {'end': 501.026, 'text': 'how do you convert that into normally distributed?', 'start': 499.045, 'duration': 1.981}, {'end': 503.188, 'text': 'So that is the importance.', 'start': 501.767, 'duration': 1.421}], 'summary': 'Importance of handling skewness in machine learning for normally distributed features.', 'duration': 27.004, 'max_score': 476.184, 
'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ476184.jpg'}, {'end': 629.693, 'src': 'embed', 'start': 601.761, 'weight': 3, 'content': [{'end': 606.024, 'text': 'If the independent features are highly correlated with the output features, that are important.', 'start': 601.761, 'duration': 4.263}, {'end': 608.206, 'text': 'You have to keep that independent features.', 'start': 606.505, 'duration': 1.701}, {'end': 616.352, 'text': 'But what if, if those 40 independent features are highly correlated to each other? What do you do? You can answer me the comment guys.', 'start': 608.666, 'duration': 7.686}, {'end': 617.033, 'text': 'I will give you a hint.', 'start': 616.372, 'duration': 0.661}, {'end': 618.254, 'text': 'You can do two things.', 'start': 617.473, 'duration': 0.781}, {'end': 625.99, 'text': 'Okay Now when I say correlation suppose if I say that 44 features are highly correlated more than 90% with each other.', 'start': 618.942, 'duration': 7.048}, {'end': 629.693, 'text': 'Okay More than 90% basically means 0.9.', 'start': 626.89, 'duration': 2.803}], 'summary': 'Highly correlated independent features need to be kept. 
if 40 features are highly correlated with each other (>90%), two options exist.', 'duration': 27.932, 'max_score': 601.761, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ601761.jpg'}, {'end': 709.536, 'src': 'embed', 'start': 680.943, 'weight': 4, 'content': [{'end': 683.684, 'text': 'Then you have to try to understand what is Pearson correlation coefficient.', 'start': 680.943, 'duration': 2.741}, {'end': 687.565, 'text': 'you have to understand what is Spearman rank correlation coefficient.', 'start': 683.684, 'duration': 3.881}, {'end': 688.445, 'text': 'hypothesis testing.', 'start': 687.565, 'duration': 0.88}, {'end': 691.726, 'text': 'hypothesis testing is a null hypothesis, alternate hypothesis.', 'start': 688.445, 'duration': 3.281}, {'end': 693.746, 'text': 'but apart from this, you also have to see that.', 'start': 691.726, 'duration': 2.02}, {'end': 695.947, 'text': 'how do you perform hypothesis testing?', 'start': 693.746, 'duration': 2.201}, {'end': 696.907, 'text': 'with the help of statistics?', 'start': 695.947, 'duration': 0.96}, {'end': 700.708, 'text': 'So there is a concept of p-value, chi-square p-test, sorry t-test.', 'start': 697.187, 'duration': 3.521}, {'end': 703.47, 'text': 'all those things has been implemented.', 'start': 701.628, 'duration': 1.842}, {'end': 709.536, 'text': 'in my playlist, i will be showing you where those videos are actually present, but this is important.', 'start': 703.47, 'duration': 6.066}], 'summary': 'Learn about pearson and spearman correlation, hypothesis testing, and statistics implementation with p-value, chi-square and t-test.', 'duration': 28.593, 'max_score': 680.943, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ680943.jpg'}], 'start': 241.249, 'title': 'Statistics fundamentals and learning approach', 'summary': 'Covers foundational statistical concepts including variables, measures of 
central tendency and dispersion, emphasizing the importance of focusing on specific topics for easier learning and sharing additional resources via google drive.', 'chapters': [{'end': 332.659, 'start': 241.249, 'title': 'Statistics fundamentals and learning approach', 'summary': 'Covers foundational statistical concepts including variables, measures of central tendency and dispersion, forming the base for intermediate and advanced levels, emphasizing the importance of focusing on specific topics for easier learning and sharing additional resources via google drive.', 'duration': 91.41, 'highlights': ['The chapter emphasizes foundational statistical concepts such as variables, measures of central tendency and dispersion, which form the base for intermediate and advanced levels of learning.', 'The importance of focusing on specific topics for easier learning and a different procedure to learn statistics which makes the task easier as one progresses towards the advanced level.', 'The speaker plans to share additional resources via Google Drive, providing further learning materials for the mentioned topics.', "Encourages learners to search for the mentioned terms on Google and not to follow the speaker's playlist compulsorily, promoting independent learning."]}, {'end': 732.337, 'start': 333.36, 'title': 'Stats & distributions', 'summary': 'Covers key statistical concepts including normal distribution, central limit theorem, skewness, covariance, correlation, and hypothesis testing, emphasizing practical applications and implementation in python or r.', 'duration': 398.977, 'highlights': ['Understanding normal distribution and its practical application in data analysis and feature engineering, emphasizing its significance in representing data distribution and its relevance to real-life examples. 
Importance of normal distribution in representing data distribution, practical application in data analysis and feature engineering.', 'Importance of practical implementation of standard normal distribution, Z-score, probability density function, cumulative distribution function, and plotting graphs using Python or R programming language. Significance of practical implementation in understanding statistical concepts, usage of Python or R programming for implementation.', 'Practical understanding of skewness in data and its impact on machine learning problems, emphasizing the need to convert skewed data into a normal distribution for better model performance. Impact of skewed data on machine learning models, significance of converting skewed data into normal distribution for model performance.', 'Explaining the significance of covariance and correlation in feature selection, emphasizing the importance of identifying and handling highly correlated independent features to improve model performance. Significance of covariance and correlation in feature selection, importance of handling highly correlated independent features for improved model performance.', 'Explanation of hypothesis testing, including null hypothesis, alternate hypothesis, p-value, and t-test, with an emphasis on practical implementation using statistics and the importance of following a structured learning path for comprehensive understanding. 
Importance of structured learning path for hypothesis testing, practical implementation using statistics, emphasis on null hypothesis, alternate hypothesis, p-value, and t-test.']}], 'duration': 491.088, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ241249.jpg', 'highlights': ['The chapter emphasizes foundational statistical concepts such as variables, measures of central tendency and dispersion, which form the base for intermediate and advanced levels of learning.', 'Understanding normal distribution and its practical application in data analysis and feature engineering, emphasizing its significance in representing data distribution and its relevance to real-life examples.', 'Practical understanding of skewness in data and its impact on machine learning problems, emphasizing the need to convert skewed data into a normal distribution for better model performance.', 'Explaining the significance of covariance and correlation in feature selection, emphasizing the importance of identifying and handling highly correlated independent features to improve model performance.', 'Explanation of hypothesis testing, including null hypothesis, alternate hypothesis, p-value, and t-test, with an emphasis on practical implementation using statistics and the importance of following a structured learning path for comprehensive understanding.', 'The importance of focusing on specific topics for easier learning and a different procedure to learn statistics which makes the task easier as one progresses towards the advanced level.', 'The speaker plans to share additional resources via Google Drive, providing further learning materials for the mentioned topics.', "Encourages learners to search for the mentioned terms on Google and not to follow the speaker's playlist compulsorily, promoting independent learning.", 'Importance of practical implementation of standard normal distribution, Z-score, probability density function, 
cumulative distribution function, and plotting graphs using Python or R programming language. Significance of practical implementation in understanding statistical concepts, usage of Python or R programming for implementation.']}, {'end': 1181.336, 'segs': [{'end': 792.011, 'src': 'embed', 'start': 732.337, 'weight': 0, 'content': [{'end': 738.823, 'text': 'then, when you follow my machine learning algorithms, like how i have taught, how i have implemented or how i have done the live stream there,', 'start': 732.337, 'duration': 6.486}, {'end': 744.207, 'text': 'you will be able to understand that where we have actually used this statistical concepts itself,', 'start': 738.823, 'duration': 5.384}, {'end': 747.61, 'text': 'mostly in the exploratory data analysis feature engineering feature selection part.', 'start': 744.207, 'duration': 3.403}, {'end': 750.132, 'text': 'guys, we may be using this mostly.', 'start': 747.61, 'duration': 2.522}, {'end': 750.753, 'text': 'trust me in that.', 'start': 750.132, 'duration': 0.621}, {'end': 758.406, 'text': 'Hardly it will be used in some of the machine learning algorithms, but internally, machine learning algorithm works with different kind of maths.', 'start': 752.643, 'duration': 5.763}, {'end': 765.789, 'text': 'But statistics will definitely be useful in your exploratory data analysis, feature engineering and feature selection part.', 'start': 759.566, 'duration': 6.223}, {'end': 771.151, 'text': 'So correlation is one of the process where it will be very very helpful to remove your unwanted features.', 'start': 766.149, 'duration': 5.002}, {'end': 773.332, 'text': 'again right now.', 'start': 772.131, 'duration': 1.201}, {'end': 775.355, 'text': 'coming to the next part, that is advanced stat.', 'start': 773.332, 'duration': 2.023}, {'end': 777.517, 'text': 'now here i have wrote about qq plot.', 'start': 775.355, 'duration': 2.162}, {'end': 778.138, 'text': 'first topic.', 'start': 777.517, 'duration': 0.621}, 
{'end': 784.184, 'text': 'qq plot basically means that you basically need to understand whether your random variable that you have selected, whether it follows, uh,', 'start': 778.138, 'duration': 6.046}, {'end': 785.305, 'text': 'normal distribution or not.', 'start': 784.184, 'duration': 1.121}, {'end': 792.011, 'text': 'if it does not follow normal distribution or not, then what you do is that you try to convert that into Law,', 'start': 785.305, 'duration': 6.706}], 'summary': 'Understanding statistical concepts for exploratory data analysis and feature selection in machine learning.', 'duration': 59.674, 'max_score': 732.337, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ732337.jpg'}, {'end': 989.51, 'src': 'embed', 'start': 961.317, 'weight': 3, 'content': [{'end': 965.038, 'text': "okay, now, the next thing is that you'll say that, krish, this is fine.", 'start': 961.317, 'duration': 3.721}, {'end': 965.619, 'text': 'you all say.', 'start': 965.038, 'duration': 0.581}, {'end': 969.58, 'text': 'all the topics over here mentioned this, but from where to study?', 'start': 965.619, 'duration': 3.961}, {'end': 971.781, 'text': "right? 
i'll give you that, guys.", 'start': 969.58, 'duration': 2.201}, {'end': 978.283, 'text': "from this all i've uploaded around 80 percentage of my videos right.", 'start': 971.781, 'duration': 6.502}, {'end': 980.704, 'text': 'so here is my playlist with respect to feature engineering.', 'start': 978.283, 'duration': 2.421}, {'end': 982.865, 'text': "so i'll start with statistics.", 'start': 981.304, 'duration': 1.561}, {'end': 989.51, 'text': 'in statistics you can see that population, sample, gaussian or normal distribution, log, normal distribution, covariance, mean median mode,', 'start': 982.865, 'duration': 6.645}], 'summary': 'Around 80% videos uploaded on feature engineering, covering statistics topics like population, sample, gaussian distribution, and more.', 'duration': 28.193, 'max_score': 961.317, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ961317.jpg'}], 'start': 732.337, 'title': 'Statistics and qq plot in machine learning', 'summary': 'Emphasizes the importance of statistics in machine learning, covering exploratory data analysis, feature engineering, and feature selection. 
it also explores the concept of qq plot for assessing normal distribution, providing practical applications in feature engineering and machine learning with 80% coverage in uploaded videos.', 'chapters': [{'end': 773.332, 'start': 732.337, 'title': 'Statistics in machine learning', 'summary': 'Highlights the importance of statistics in machine learning, particularly in exploratory data analysis, feature engineering, and feature selection, emphasizing its usefulness in removing unwanted features through correlation.', 'duration': 40.995, 'highlights': ['Statistics is crucial in exploratory data analysis, feature engineering, and feature selection, especially in the process of removing unwanted features through correlation.', 'Machine learning algorithms mostly utilize statistical concepts in exploratory data analysis and feature engineering.']}, {'end': 1181.336, 'start': 773.332, 'title': 'Advanced statistics and qq plot', 'summary': 'Covers the concept of qq plot for assessing the normal distribution of random variables, transforming non-gaussian distributions into normal distributions using techniques such as log-normal distribution and box-cox transform, and the practical applications of statistics in feature engineering and machine learning, with around 80% of the topics already covered in the uploaded videos.', 'duration': 408.004, 'highlights': ['The chapter covers the concept of QQ plot for assessing the normal distribution of random variables. QQ plot is used to determine whether a selected random variable follows a normal distribution.', 'The transcript discusses techniques for transforming non-Gaussian distributions into normal distributions, such as log-normal distribution and box-cox transform. 
Methods like log-normal distribution and box-cox transform are presented as ways to convert non-Gaussian distributions into normal distributions.', 'The practical applications of statistics in feature engineering and machine learning are emphasized, with around 80% of the topics already covered in the uploaded videos. The practical applications of statistics in feature engineering and machine learning are highlighted, with approximately 80% of the topics already covered in the available videos.']}], 'duration': 448.999, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/zRUliXuwJCQ/pics/zRUliXuwJCQ732337.jpg', 'highlights': ['Statistics is crucial in exploratory data analysis, feature engineering, and feature selection, especially in the process of removing unwanted features through correlation.', 'Machine learning algorithms mostly utilize statistical concepts in exploratory data analysis and feature engineering.', 'The chapter covers the concept of QQ plot for assessing the normal distribution of random variables. QQ plot is used to determine whether a selected random variable follows a normal distribution.', 'The practical applications of statistics in feature engineering and machine learning are emphasized, with around 80% of the topics already covered in the uploaded videos.']}], 'highlights': ['Statistics plays a crucial role in understanding and working with data for machine learning or deep learning, highlighting the importance of statistical knowledge. (Relevance: 5)', 'The video aims to help self-starters understand how to learn statistics for data science, indicating the target audience and purpose of the content. 
(Relevance: 4)', 'The chapter emphasizes foundational statistical concepts such as variables, measures of central tendency and dispersion, which form the base for intermediate and advanced levels of learning.', 'Understanding normal distribution and its practical application in data analysis and feature engineering, emphasizing its significance in representing data distribution and its relevance to real-life examples.', 'Practical understanding of skewness in data and its impact on machine learning problems, emphasizing the need to convert skewed data into a normal distribution for better model performance.', 'Explaining the significance of covariance and correlation in feature selection, emphasizing the importance of identifying and handling highly correlated independent features to improve model performance.', 'Explanation of hypothesis testing, including null hypothesis, alternate hypothesis, p-value, and t-test, with an emphasis on practical implementation using statistics and the importance of following a structured learning path for comprehensive understanding.', 'The importance of focusing on specific topics for easier learning and a different procedure to learn statistics which makes the task easier as one progresses towards the advanced level.', 'Statistics is crucial in exploratory data analysis, feature engineering, and feature selection, especially in the process of removing unwanted features through correlation.', 'Machine learning algorithms mostly utilize statistical concepts in exploratory data analysis and feature engineering.', 'The chapter covers the concept of QQ plot for assessing the normal distribution of random variables. QQ plot is used to determine whether a selected random variable follows a normal distribution.', 'The practical applications of statistics in feature engineering and machine learning are emphasized, with around 80% of the topics already covered in the uploaded videos.']}
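
The transcript asks what to do when independent features are correlated with each other above 0.9. A minimal sketch of one possible answer follows. The speaker only hints that there are two options and does not name them, so dropping one feature from each highly correlated pair is an assumption here, not the method stated in the video. The helper names `pearson` and `drop_correlated`, the toy data, and the feature names are all hypothetical; only the 0.9 threshold comes from the transcript.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.9):
    """Greedy filter: keep a feature only if its absolute correlation
    with every feature kept so far is at or below the threshold."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data (hypothetical): "a_copy" is almost a linear copy of "a",
# while "b" is generated independently of both.
random.seed(0)
a = [random.gauss(0, 1) for _ in range(200)]
features = {
    "a": a,
    "a_copy": [2 * v + random.gauss(0, 0.05) for v in a],
    "b": [random.gauss(0, 1) for _ in range(200)],
}

selected = drop_correlated(features)
print(selected)
```

The filter is order-dependent: whichever feature of a correlated pair appears first is the one that survives. In practice the same idea is usually applied to the correlation matrix of a data frame, and which feature of a pair to keep is often decided by its correlation with the output feature.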