title

Complete Statistics For Data Science In 6 hours By Krish Naik

description

Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied.
#statistics #statsfordatascience #krishnaik
All the materials are available at the link below:
https://github.com/krishnaik06/The-Grand-Complete-Data-Science-Materials/tree/main
Timestamps
0:00:00 Introduction
0:00:35 Descriptive Statistics
0:02:45 Inferential Statistics
0:04:31 What Is Statistics?
0:06:54 Types of Statistics
0:11:22 Population and Sample
0:14:33 Sampling Techniques
0:24:33 What Are Variables?
0:30:54 Variable Measurement Scales
0:42:55 Mean, Median, Mode
0:57:10 Measures of Dispersion: Variance and Standard Deviation
1:08:05 Percentiles and Quartiles
1:15:35 Five-Number Summary and Box Plot
1:29:12 Gaussian (Normal) Distribution
1:56:40 Stats Interview Question 1
2:17:10 Finding Outliers in Python
2:32:00 Probability: Additive and Multiplicative Rules
2:51:26 Permutation and Combination
2:56:22 P-Value
2:59:19 Hypothesis Testing, Confidence Intervals, Significance Values
3:12:22 Type 1 and Type 2 Errors
3:25:55 Confidence Intervals
3:46:45 One-Sample Z-Test
3:59:11 One-Sample T-Test
4:06:32 Chi-Square Test
4:21:45 Inferential Stats with Python
4:24:37 Covariance, Pearson Correlation, Spearman Rank Correlation
4:54:59 Deriving P-Values and Significance Values
5:13:41 Other Types of Distributions
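Several of the descriptive-statistics topics listed above (mean/median/mode, variance and standard deviation, the five-number summary, and IQR-based outlier detection) can be sketched with Python's standard library alone. This is a minimal illustration, not the course's own notebook, and the sample data is made up:

```python
import statistics

# Made-up sample data for illustration; 99 is planted as an outlier
data = [32, 36, 37, 38, 38, 39, 40, 41, 42, 45, 47, 99]

# Measures of central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of dispersion (sample variance / std dev, n-1 denominator)
variance = statistics.variance(data)
sd = statistics.stdev(data)

# Five-number summary: statistics.quantiles with n=4 returns the
# three quartile cut points (Q1, Q2, Q3), "exclusive" method by default
q1, q2, q3 = statistics.quantiles(data, n=4)
five_number = (min(data), q1, q2, q3, max(data))

# IQR rule for outliers: anything outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < lower or x > upper]

print("mean:", mean, "median:", median, "mode:", mode)
print("five-number summary:", five_number)
print("outliers:", outliers)
```

Note that `statistics.variance`/`statistics.stdev` use the sample (n-1) formulas; the population versions are `statistics.pvariance`/`statistics.pstdev`, a distinction the course covers under population vs. sample.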
-----------------------------------------------------------------------------------------------------
►Data Science Projects:
https://www.youtube.com/watch?v=S_F_c9e2bz4&list=PLZoTAELRMXVPS-dOaVbAux22vzqdgoGhG&pp=iAQB
►Learn In One Tutorials
Statistics In 6 Hours: https://www.youtube.com/watch?v=LZzq1zSL1bs&t=9522s&pp=ygUVa3Jpc2ggbmFpayBzdGF0aXN0aWNz
Machine Learning In 6 Hours: https://www.youtube.com/watch?v=JxgmHe2NyeY&t=4733s&pp=ygUba3Jpc2ggbmFpayBtYWNoaW5lIGxlYXJuaW5n
Deep Learning In 5 Hours: https://www.youtube.com/watch?v=d2kxUVwWWwU&t=1210s&pp=ygUYa3Jpc2ggbmFpayBkZWVwIGxlYXJuaW5n
►Learn In a Week Playlists
Statistics: https://www.youtube.com/watch?v=11unm2hmvOQ&list=PLZoTAELRMXVMgtxAboeAx-D9qbnY94Yay
Machine Learning: https://www.youtube.com/watch?v=z8sxaUw_f-M&list=PLZoTAELRMXVPjaAzURB77Kz0YXxj65tYz
Deep Learning: https://www.youtube.com/watch?v=8arGWdq_KL0&list=PLZoTAELRMXVPiyueAqA_eQnsycC_DSBns
NLP: https://www.youtube.com/watch?v=w3coRFpyddQ&list=PLZoTAELRMXVNNrHSKv36Lr3_156yCo6Nn
►Detailed Playlists:
Stats For Data Science In Hindi: https://www.youtube.com/watch?v=7y3XckjaVOw&list=PLTDARY42LDV6YHSRo669_uDDGmUEmQnDJ&pp=gAQB
Machine Learning In English: https://www.youtube.com/watch?v=bPrmA1SEN2k&list=PLZoTAELRMXVPBTrWtJkn3wWQxZkmTXGwe
Machine Learning In Hindi: https://www.youtube.com/watch?v=7uwa9aPbBRU&list=PLTDARY42LDV7WGmlzZtY-w9pemyPrKNUZ&pp=gAQB
Complete Deep Learning: https://www.youtube.com/watch?v=YFNKnUhm_-s&list=PLZoTAELRMXVPGU70ZGsckrMdr0FteeRUi

detail

{'title': 'Complete Statistics For Data Science In 6 hours By Krish Naik', 'heatmap': [{'end': 2170.173, 'start': 1971.494, 'weight': 0.792}, {'end': 2566.452, 'start': 2363.036, 'weight': 0.757}, {'end': 4145.706, 'start': 3742.098, 'weight': 0.762}, {'end': 4536.117, 'start': 4336.212, 'weight': 0.856}, {'end': 5126.735, 'start': 4729.128, 'weight': 0.752}, {'end': 5914.234, 'start': 5714.83, 'weight': 0.856}, {'end': 6508.544, 'start': 6109.771, 'weight': 0.801}, {'end': 7890.905, 'start': 7687.85, 'weight': 0.918}, {'end': 10646.199, 'start': 10446.707, 'weight': 0.721}, {'end': 12226.42, 'start': 12015.982, 'weight': 0.726}, {'end': 12623.491, 'start': 12410.841, 'weight': 0.768}, {'end': 13602.138, 'start': 12813.377, 'weight': 0.748}, {'end': 14789.744, 'start': 14389.332, 'weight': 0.811}, {'end': 15577.74, 'start': 15376.125, 'weight': 0.983}, {'end': 16558.152, 'start': 16359.469, 'weight': 0.84}], 'summary': 'This 6-hour video course covers statistics basics and advanced topics for data science, including measures of central tendency and dispersion, hypothesis testing, data visualization techniques, probability, permutation, chi-square and z-test analysis, relationship between variables, and various statistical analysis with practical applications demonstrated using python programming.', 'chapters': [{'end': 207.678, 'segs': [{'end': 55.727, 'src': 'embed', 'start': 6.24, 'weight': 0, 'content': [{'end': 7.041, 'text': 'hello, guys.', 'start': 6.24, 'duration': 0.801}, {'end': 12.624, 'text': 'what are we basically going to cover, from basics to advanced?', 'start': 7.041, 'duration': 5.583}, {'end': 22.731, 'text': 'uh, this will be specifically related to positions like data scientist, data analyst, related to business intelligence tool.', 'start': 12.624, 'duration': 10.107}, {'end': 24.272, 'text': 'everything will get covered over here.', 'start': 22.731, 'duration': 1.541}, {'end': 31.577, 'text': 'we need to understand the basic differences 
between descriptive statistics and, second one is inferential stats,', 'start': 24.272, 'duration': 7.305}, {'end': 35.48, 'text': 'the differences between descriptive stats and inferential stats,', 'start': 31.577, 'duration': 3.903}, {'end': 40.622, 'text': 'because The entire statistics with respect to data science is divided into these two concepts.', 'start': 35.48, 'duration': 5.142}, {'end': 49.885, 'text': 'In descriptive stats, some of the topics that I really want to mention is measure of central tendency, measure of dispersions.', 'start': 41.042, 'duration': 8.843}, {'end': 52.086, 'text': 'These are some of the examples.', 'start': 50.305, 'duration': 1.781}, {'end': 55.727, 'text': 'Anything that is related to summarizing the data.', 'start': 52.506, 'duration': 3.221}], 'summary': 'Covers basics to advanced related to data science positions, including descriptive and inferential statistics.', 'duration': 49.487, 'max_score': 6.24, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs6240.jpg'}, {'end': 103.354, 'src': 'embed', 'start': 75.752, 'weight': 1, 'content': [{'end': 79.133, 'text': 'by what techniques we create this PDF CDF, everything?', 'start': 75.752, 'duration': 3.381}, {'end': 84.514, 'text': "We'll also be understanding some topics like probability permutations,", 'start': 80.013, 'duration': 4.501}, {'end': 92.881, 'text': 'which are pretty much probability is very much important in terms for data science mean median mode.', 'start': 84.514, 'duration': 8.367}, {'end': 96.145, 'text': 'so you also have variance, standard deviation.', 'start': 92.881, 'duration': 3.264}, {'end': 99.169, 'text': 'we are going to cover many distributions.', 'start': 96.145, 'duration': 3.024}, {'end': 103.354, 'text': 'let me name the distributions over here, like gaussian distribution.', 'start': 99.169, 'duration': 4.185}], 'summary': 'Covering probability, permutations, variance, and distributions 
including gaussian.', 'duration': 27.602, 'max_score': 75.752, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs75752.jpg'}, {'end': 186.978, 'src': 'embed', 'start': 142.525, 'weight': 2, 'content': [{'end': 145.466, 'text': "We'll discuss about something called as qqplot.", 'start': 142.525, 'duration': 2.941}, {'end': 152.589, 'text': "We'll try to find out how to determine whether a distribution is a normal distribution or not.", 'start': 146.107, 'duration': 6.482}, {'end': 154.47, 'text': 'That all things we will try to discuss.', 'start': 152.87, 'duration': 1.6}, {'end': 155.971, 'text': 'These are some of the topics that I have written.', 'start': 154.49, 'duration': 1.481}, {'end': 161.114, 'text': 'There is also very something very much important which is called as inferential stats.', 'start': 156.971, 'duration': 4.143}, {'end': 170.699, 'text': 'Now in inferential stats, our main focus is basically like Z test, T test, ANOVA test, chi-square test.', 'start': 161.194, 'duration': 9.505}, {'end': 178.687, 'text': 'If I just consider some example with respect to Z test, there are multiple ways to actually perform z test.', 'start': 171.259, 'duration': 7.428}, {'end': 186.978, 'text': 'so in z test probably you will be having different ways, and this i will also try to show you by executing in python t test.', 'start': 178.687, 'duration': 8.291}], 'summary': 'Discussion on qqplot, normal distribution, and inferential stats including z test, t test, anova test, and chi-square test with examples in python.', 'duration': 44.453, 'max_score': 142.525, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs142525.jpg'}], 'start': 6.24, 'title': 'Statistics in data science', 'summary': 'Covers statistics basics and advanced topics for data science roles, including measures of central tendency and dispersion, as well as a wide range of statistical topics 
and inferential statistics demonstrated with python programming.', 'chapters': [{'end': 55.727, 'start': 6.24, 'title': 'Statistics for data science roles', 'summary': 'Covers the basics and advanced topics related to data science roles like data scientist and data analyst, including the differences between descriptive and inferential statistics, focusing on measures of central tendency and dispersion.', 'duration': 49.487, 'highlights': ['The chapter covers the basics and advanced topics related to data science roles like data scientist and data analyst, including the differences between descriptive and inferential statistics.', 'The entire statistics with respect to data science is divided into descriptive and inferential statistics, with a focus on measures of central tendency and dispersion.', 'The topics covered include measure of central tendency and measure of dispersion which are related to summarizing the data.']}, {'end': 207.678, 'start': 56.007, 'title': 'Statistics fundamentals for data science', 'summary': "Covers a wide range of statistical topics including histograms, pdf, cdf, probability permutations, distributions like gaussian, log normal, binomial, bernoulli's, pareto, and standard normal distributions, as well as inferential statistics like z test, t test, anova test, and chi-square test, all demonstrated with python programming.", 'duration': 151.671, 'highlights': ["Covering various statistical concepts such as histograms, PDF, CDF, probability permutations, and distributions like gaussian, log normal, binomial, bernoulli's, Pareto, and standard normal distributions.", 'Exploring inferential statistics including Z test, T test, ANOVA test, and chi-square test, with demonstrations using Python programming language.', 'Discussion of important topics such as hypothesis testing, transformation, standardization, and qqplot to determine normal distribution, all illustrated with Python.', 'Emphasizing the practical application of statistical 
techniques through demonstrations in Python for Z test, T test, chi-square test, and ANOVA test.']}], 'duration': 201.438, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs6240.jpg', 'highlights': ['The chapter covers the basics and advanced topics related to data science roles like data scientist and data analyst, including the differences between descriptive and inferential statistics.', "Covering various statistical concepts such as histograms, PDF, CDF, probability permutations, and distributions like gaussian, log normal, binomial, bernoulli's, Pareto, and standard normal distributions.", 'Exploring inferential statistics including Z test, T test, ANOVA test, and chi-square test, with demonstrations using Python programming language.', 'The entire statistics with respect to data science is divided into descriptive and inferential statistics, with a focus on measures of central tendency and dispersion.', 'Discussion of important topics such as hypothesis testing, transformation, standardization, and qqplot to determine normal distribution, all illustrated with Python.']}, {'end': 1434.196, 'segs': [{'end': 250.42, 'src': 'embed', 'start': 228.408, 'weight': 0, 'content': [{'end': 236.452, 'text': "then i'll also teach you how to see Z-table you know, which is a kind of sheet where you can directly get the values over there.", 'start': 228.408, 'duration': 8.044}, {'end': 238.073, 'text': 'Similarly, T-table is there.', 'start': 236.652, 'duration': 1.421}, {'end': 239.594, 'text': 'Chi-square table is there.', 'start': 238.453, 'duration': 1.141}, {'end': 241.275, 'text': 'Many things will basically be there.', 'start': 239.974, 'duration': 1.301}, {'end': 242.615, 'text': "Let's start the first topic.", 'start': 241.415, 'duration': 1.2}, {'end': 250.42, 'text': 'The first topic that obviously anybody needs to understand is that what is statistics?', 'start': 243.396, 'duration': 7.024}], 'summary': 
'Learn about z-table, t-table, and chi-square table in statistics.', 'duration': 22.012, 'max_score': 228.408, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs228408.jpg'}, {'end': 294.31, 'src': 'embed', 'start': 268.904, 'weight': 1, 'content': [{'end': 277.491, 'text': 'statistics. many people have different kind of definition with statistics, but i really want to give a very simple definition, which is from wikipedia.', 'start': 268.904, 'duration': 8.587}, {'end': 285.627, 'text': "so i'm going to say statistics is the science of collecting, organizing and analyzing data.", 'start': 277.491, 'duration': 8.136}, {'end': 294.31, 'text': 'now you know, based on the amount of data that is getting generated, now you can just understand directly, like how important stats is.', 'start': 285.627, 'duration': 8.683}], 'summary': 'Statistics is the science of collecting, organizing, and analyzing data, crucial due to data volume.', 'duration': 25.406, 'max_score': 268.904, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs268904.jpg'}, {'end': 608.498, 'src': 'embed', 'start': 578.957, 'weight': 2, 'content': [{'end': 582.499, 'text': "I've told you the definition what inferential stats basically consist of.", 'start': 578.957, 'duration': 3.542}, {'end': 588.062, 'text': 'It is a technique wherein we use the data that we have measured to form conclusions.', 'start': 582.899, 'duration': 5.163}, {'end': 596.867, 'text': 'I may say that are the ages of the students of this classroom similar to the age of the college?', 'start': 588.542, 'duration': 8.325}, {'end': 605.496, 'text': "Similar, I'll not say age of the college, but age of the maths classroom in the college.", 'start': 597.167, 'duration': 8.329}, {'end': 608.498, 'text': 'so this is specifically my question.', 'start': 605.496, 'duration': 3.002}], 'summary': 'Inferential statistics uses measured 
data to draw conclusions, such as comparing ages in a classroom to those in the college.', 'duration': 29.541, 'max_score': 578.957, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs578957.jpg'}, {'end': 789.697, 'src': 'embed', 'start': 762.345, 'weight': 3, 'content': [{'end': 765.886, 'text': 'they basically say based on that they actually create their exit poll.', 'start': 762.345, 'duration': 3.541}, {'end': 771.909, 'text': 'Now, in this particular case, what is my population data? My population data is this entire population of Goa.', 'start': 766.126, 'duration': 5.783}, {'end': 775.05, 'text': 'So this specific thing is my population data.', 'start': 772.129, 'duration': 2.921}, {'end': 779.852, 'text': 'And this round circles that I have actually done is basically my sample data.', 'start': 775.33, 'duration': 4.522}, {'end': 784.154, 'text': 'So I hope you have basically got some examples with respect to that.', 'start': 780.352, 'duration': 3.802}, {'end': 786.615, 'text': 'Guys, I hope everybody is clear with this.', 'start': 784.674, 'duration': 1.941}, {'end': 789.697, 'text': "I basically told age over here, so don't get confused.", 'start': 786.875, 'duration': 2.822}], 'summary': 'Population data is the entire population of goa, while the sample data is represented by round circles.', 'duration': 27.352, 'max_score': 762.345, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs762345.jpg'}, {'end': 984.906, 'src': 'embed', 'start': 932.328, 'weight': 4, 'content': [{'end': 935.69, 'text': "i can basically say that i'll just give you a small definition over here.", 'start': 932.328, 'duration': 3.362}, {'end': 946.077, 'text': 'when performing simple random sampling, every member of the population has an equal chance of being selected for your sample.', 'start': 935.69, 'duration': 10.387}, {'end': 948.64, 'text': 'Now coming to the second 
type.', 'start': 947.418, 'duration': 1.222}, {'end': 955.307, 'text': 'The second type of sampling is called as stratified sampling.', 'start': 948.94, 'duration': 6.367}, {'end': 957.55, 'text': 'Let me give you a definition.', 'start': 956.128, 'duration': 1.422}, {'end': 968.86, 'text': 'Stratified sampling is a technique where the population, that is capital N, is split into non-overlapping groups.', 'start': 957.71, 'duration': 11.15}, {'end': 971.781, 'text': "So one example, I'll be talking about it, don't worry.", 'start': 969.44, 'duration': 2.341}, {'end': 974.022, 'text': 'This is also called as strata.', 'start': 972.161, 'duration': 1.861}, {'end': 978.523, 'text': 'Strata basically means layering, stratified layering, like that we basically say.', 'start': 974.082, 'duration': 4.441}, {'end': 981.685, 'text': 'This is what a stratified sampling basically means.', 'start': 978.724, 'duration': 2.961}, {'end': 983.385, 'text': 'Let me give you one example.', 'start': 982.265, 'duration': 1.12}, {'end': 984.906, 'text': "Let's consider gender.", 'start': 983.665, 'duration': 1.241}], 'summary': 'Simple random sampling ensures equal chance for selection. stratified sampling splits population into non-overlapping groups for selection, e.g. 
gender.', 'duration': 52.578, 'max_score': 932.328, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs932328.jpg'}, {'end': 1176.333, 'src': 'embed', 'start': 1149.933, 'weight': 6, 'content': [{'end': 1153.716, 'text': 'So in systematic sampling, You consider any 8th person.', 'start': 1149.933, 'duration': 3.783}, {'end': 1156.138, 'text': "I'm just saying as an example every 8th person.", 'start': 1154.076, 'duration': 2.062}, {'end': 1163.405, 'text': 'I may take every 1st person that I see or every 5th person that I see or every 10th person that I see in front of my eyes.', 'start': 1156.519, 'duration': 6.886}, {'end': 1164.646, 'text': "I'll just tell him to do the survey.", 'start': 1163.405, 'duration': 1.241}, {'end': 1167.249, 'text': 'So this is what systematic sampling is all about.', 'start': 1164.887, 'duration': 2.362}, {'end': 1172.051, 'text': "In systematic sampling, there is no reason why you're selecting the 8th or the 9th person.", 'start': 1167.609, 'duration': 4.442}, {'end': 1174.792, 'text': 'You just said that, okay, it is my personal duty.', 'start': 1172.391, 'duration': 2.401}, {'end': 1176.333, 'text': "What I'm actually going to do.", 'start': 1175.132, 'duration': 1.201}], 'summary': 'Systematic sampling selects every 8th person for surveying, without specific reason.', 'duration': 26.4, 'max_score': 1149.933, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs1149933.jpg'}, {'end': 1240.724, 'src': 'embed', 'start': 1211.516, 'weight': 7, 'content': [{'end': 1213.718, 'text': "I'll say it as convenience sampling.", 'start': 1211.516, 'duration': 2.202}, {'end': 1217.462, 'text': "This kind of samples, suppose let's consider that I'm doing a survey.", 'start': 1214.039, 'duration': 3.423}, {'end': 1227.854, 'text': 'Only those people who are a domain expertise in that particular survey will be participating in that particular 
survey.', 'start': 1218.043, 'duration': 9.811}, {'end': 1232.297, 'text': "Suppose let's say consider that I am doing a survey related to data science.", 'start': 1228.494, 'duration': 3.803}, {'end': 1240.724, 'text': 'I will say that any person who is probably interested in data science and has the knowledge of data science, if you consider only those people,', 'start': 1232.638, 'duration': 8.086}], 'summary': 'Convenience sampling involves selecting domain experts for specific surveys, such as data science.', 'duration': 29.208, 'max_score': 1211.516, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs1211516.jpg'}, {'end': 1424.63, 'src': 'embed', 'start': 1395.091, 'weight': 8, 'content': [{'end': 1400.294, 'text': 'based on that, you will do, and it is not like we will just be dependent on one kind of data.', 'start': 1395.091, 'duration': 5.203}, {'end': 1405.878, 'text': 'we try to use different, different sampling techniques and finally we try to come to a conclusion on the same.', 'start': 1400.294, 'duration': 5.584}, {'end': 1407.519, 'text': 'let me give you one more example.', 'start': 1405.878, 'duration': 1.641}, {'end': 1412.863, 'text': 'a drug needs to be tested, so for this, what kind of samples we may take?', 'start': 1407.519, 'duration': 5.344}, {'end': 1415.224, 'text': 'now? 
here i can bring up multiple use case.', 'start': 1412.863, 'duration': 2.361}, {'end': 1417.786, 'text': 'first of all, to whom this drug needs to be tested.', 'start': 1415.224, 'duration': 2.562}, {'end': 1424.63, 'text': 'If I get that specific information, I will basically do the age groupings and then I may probably apply.', 'start': 1418.647, 'duration': 5.983}], 'summary': 'Using various sampling techniques to draw conclusions from diverse data sources for drug testing.', 'duration': 29.539, 'max_score': 1395.091, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs1395091.jpg'}], 'start': 208.639, 'title': 'Statistics fundamentals', 'summary': 'Covers hypothesis testing, p values, confidence intervals, z-table, t-table, chi-square table, importance of statistics in decision-making, sampling techniques including simple random sampling and stratified sampling, and various sampling technique examples and use cases.', 'chapters': [{'end': 268.904, 'start': 208.639, 'title': 'Hypothesis testing and statistics basics', 'summary': 'Covers hypothesis testing, p values, confidence intervals, z-table, t-table, chi-square table, and emphasizes the importance of understanding statistics for interviews.', 'duration': 60.265, 'highlights': ['The chapter covers hypothesis testing, p values, confidence intervals, Z-table, T-table, Chi-square table.', 'The importance of understanding statistics for interviews is emphasized.', 'The chapter provides an introduction to statistics, which is crucial for interviews.']}, {'end': 802.664, 'start': 268.904, 'title': 'Understanding statistics: importance and types', 'summary': 'Discusses the importance of statistics in making better decisions through organizing and analyzing data, and explains descriptive and inferential statistics with examples.', 'duration': 533.76, 'highlights': ['Statistics is the science of collecting, organizing and analyzing data, and is crucial due to the 
vast amount of data generated for improving products and business goals. Statistics is defined as the science of collecting, organizing and analyzing data, which is crucial for making better decisions based on the vast amount of data generated for improving products and business goals.', 'Descriptive statistics involves organizing and summarizing data, demonstrated through an example of finding the average marks of a math class to understand the types of questions that may arise. Descriptive statistics consist of organizing and summarizing data, illustrated by finding the average marks of a math class to understand the types of questions that may arise.', 'Inferential statistics involves using measured data to form conclusions, exemplified by comparing the ages of students in a math classroom to the ages of students in the entire college. Inferential statistics involves using measured data to form conclusions, illustrated by comparing the ages of students in a math classroom to the ages of students in the entire college.', 'Population and sample data are explained using the example of conducting an exit poll in Goa to understand the distinction between the entire population and sample data. The distinction between population and sample data is explained using the example of conducting an exit poll in Goa to understand the difference between the entire population and sample data.']}, {'end': 1103.216, 'start': 802.784, 'title': 'Sampling techniques and notations', 'summary': 'Discusses different sampling techniques including simple random sampling and stratified sampling, along with the notation for population (capital n) and sample (small n). 
it emphasizes the importance of understanding non-overlapping groups in stratified sampling and the equal chance of selection in simple random sampling.', 'duration': 300.432, 'highlights': ['The chapter emphasizes the importance of understanding non-overlapping groups in stratified sampling and the equal chance of selection in simple random sampling. It explains the concept of stratified sampling, where the population is split into non-overlapping groups (strata) and highlights the equal chance of selection for every member of the population in simple random sampling.', 'It discusses different sampling techniques including simple random sampling and stratified sampling. The chapter covers the concept of simple random sampling, where every member of the population has an equal chance of being selected, and stratified sampling, which involves splitting the population into non-overlapping groups.', 'It introduces the notation for population (capital N) and sample (small n). The transcript introduces the notation for population (capital N) and sample (small n) and emphasizes the significance of understanding these notations in the context of sampling techniques.']}, {'end': 1434.196, 'start': 1103.316, 'title': 'Sampling techniques and examples', 'summary': 'Discusses stratified, systematic, convenience, and random sampling techniques, providing examples and use cases while emphasizing the importance of using different sampling techniques for various use cases and considerations.', 'duration': 330.88, 'highlights': ['In systematic sampling, individuals are selected at regular intervals from the population, exemplified by surveying every 7th or 8th person outside a mall for a COVID survey. 
Systematic sampling involves selecting every Nth individual from the population, such as surveying every 7th or 8th person for a specific survey.', 'Convenience sampling involves selecting only domain experts or individuals with specific knowledge or interest in the survey topic, such as surveying only those with expertise in data science for a related survey. Convenience sampling entails selecting individuals with specific domain expertise or interest in the survey topic, as seen in surveying only those knowledgeable in data science for a related survey.', 'The chapter emphasizes the need to use different sampling techniques based on the specific use case and not relying solely on one type, indicating the importance of considering various factors and conditions in sampling. The chapter highlights the importance of using different sampling techniques based on the specific use case, emphasizing the need to consider various factors and conditions in sampling.']}], 'duration': 1225.557, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs208639.jpg', 'highlights': ['The chapter covers hypothesis testing, p values, confidence intervals, Z-table, T-table, Chi-square table.', 'Statistics is the science of collecting, organizing and analyzing data, crucial for making better decisions based on the vast amount of data generated for improving products and business goals.', 'Inferential statistics involves using measured data to form conclusions, illustrated by comparing the ages of students in a math classroom to the ages of students in the entire college.', 'The distinction between population and sample data is explained using the example of conducting an exit poll in Goa to understand the difference between the entire population and sample data.', 'It explains the concept of stratified sampling, where the population is split into non-overlapping groups (strata) and highlights the equal chance of selection for every 
member of the population in simple random sampling.', 'The chapter covers the concept of simple random sampling, where every member of the population has an equal chance of being selected, and stratified sampling, which involves splitting the population into non-overlapping groups.', 'Systematic sampling involves selecting every Nth individual from the population, such as surveying every 7th or 8th person for a specific survey.', 'Convenience sampling entails selecting individuals with specific domain expertise or interest in the survey topic, as seen in surveying only those knowledgeable in data science for a related survey.', 'The chapter highlights the importance of using different sampling techniques based on the specific use case, emphasizing the need to consider various factors and conditions in sampling.']}, {'end': 2528.355, 'segs': [{'end': 1589.687, 'src': 'embed', 'start': 1563.507, 'weight': 0, 'content': [{'end': 1570.911, 'text': 'We can perform a lot of operations like add, subtract, divide, multiply, right? 
We can perform any kind of operations that we want.', 'start': 1563.507, 'duration': 7.404}, {'end': 1578.363, 'text': 'One example of this is, I may consider age, I may consider weight, I may consider height.', 'start': 1571.741, 'duration': 6.622}, {'end': 1581.985, 'text': 'Some of the examples with respect to quantitative variable.', 'start': 1578.864, 'duration': 3.121}, {'end': 1584.526, 'text': 'If I say that, okay, age is a quantitative variable.', 'start': 1582.245, 'duration': 2.281}, {'end': 1589.687, 'text': "In qualitative and categorical variables, if I specifically take an example, let's consider gender.", 'start': 1584.806, 'duration': 4.881}], 'summary': 'Perform various operations on quantitative and qualitative variables, such as age, weight, height, and gender.', 'duration': 26.18, 'max_score': 1563.507, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs1563507.jpg'}, {'end': 1710.673, 'src': 'embed', 'start': 1682.002, 'weight': 1, 'content': [{'end': 1685.604, 'text': 'Obviously, we know quantitative basically means we have some numerical values.', 'start': 1682.002, 'duration': 3.602}, {'end': 1688.285, 'text': 'Here I am going to divide this into two one.', 'start': 1686.264, 'duration': 2.021}, {'end': 1692.186, 'text': 'One is the discrete variables and one is the continuous variable.', 'start': 1688.885, 'duration': 3.301}, {'end': 1694.426, 'text': 'So discrete variables and continuous variable.', 'start': 1692.426, 'duration': 2}, {'end': 1698.208, 'text': 'In discrete variable, you will specifically have a whole number.', 'start': 1694.526, 'duration': 3.682}, {'end': 1700.448, 'text': 'Let me just talk about some of the examples.', 'start': 1698.688, 'duration': 1.76}, {'end': 1702.349, 'text': 'Number of bank accounts.', 'start': 1700.908, 'duration': 1.441}, {'end': 1705.01, 'text': 'of a person in this particular case.', 'start': 1703.209, 'duration': 1.801}, {'end': 1710.673, 
'text': "the example is that you'll say that i have two bank account, three bank account, four, five, six bank account, seven bank account.", 'start': 1705.01, 'duration': 5.663}], 'summary': 'Quantitative data includes discrete variables (e.g. bank accounts) and continuous variables.', 'duration': 28.671, 'max_score': 1682.002, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs1682002.jpg'}, {'end': 1791.215, 'src': 'embed', 'start': 1761.533, 'weight': 2, 'content': [{'end': 1770.118, 'text': 'Suppose I say it is 1.1 inches, 1.25 inches, 1.35 inches, right? All these things are basically there.', 'start': 1761.533, 'duration': 8.585}, {'end': 1773.12, 'text': 'So this was an example with respect to continuous variables.', 'start': 1770.218, 'duration': 2.902}, {'end': 1774.721, 'text': "I'll give you some examples.", 'start': 1773.48, 'duration': 1.241}, {'end': 1784.287, 'text': 'What kind of variable gender is, what kind of variable marital status is, what kind of variable river length is.', 'start': 1775.421, 'duration': 8.866}, {'end': 1788.131, 'text': 'What kind of variable the population of a state is.', 'start': 1784.647, 'duration': 3.484}, {'end': 1791.215, 'text': 'What kind of variable song length is.', 'start': 1788.612, 'duration': 2.603}], 'summary': 'Discussed examples of continuous variables, including lengths and populations.', 'duration': 29.682, 'max_score': 1761.533, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs1761533.jpg'}, {'end': 1909.46, 'src': 'embed', 'start': 1879.685, 'weight': 3, 'content': [{'end': 1882.327, 'text': 'because your data set will also have this kind of variables.', 'start': 1879.685, 'duration': 2.642}, {'end': 1889.051, 'text': "You'll have nominal data, you'll have ordinal data, you'll have interval data, you'll have ratio data,", 'start': 1882.807, 'duration': 6.244}, {'end': 1891.593,
'text': "so that you'll be able to do a good data analysis.", 'start': 1889.051, 'duration': 2.542}, {'end': 1894.215, 'text': 'So you basically use this kind of variables.', 'start': 1891.953, 'duration': 2.262}, {'end': 1904.397, 'text': 'So if I talk about nominal variable, so nominal data also I can say, these are specifically categorical or qualitative data.', 'start': 1894.355, 'duration': 10.042}, {'end': 1909.46, 'text': 'So whenever I say categorical data, you know that it is split into different different classes.', 'start': 1904.517, 'duration': 4.943}], 'summary': 'Data analysis involves various types of variables, including nominal, ordinal, interval, and ratio data, which are categorical and qualitative.', 'duration': 29.775, 'max_score': 1879.685, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs1879685.jpg'}, {'end': 2187.868, 'src': 'heatmap', 'start': 1971.494, 'weight': 4, 'content': [{'end': 1975.475, 'text': 'This data that we specifically have is my ordinal data.', 'start': 1971.494, 'duration': 3.981}, {'end': 1978.176, 'text': 'Here we focus more on the order.', 'start': 1975.535, 'duration': 2.641}, {'end': 1980.558, 'text': 'not on the values.', 'start': 1979.116, 'duration': 1.442}, {'end': 1982.721, 'text': 'here we mostly focus on the ranks.', 'start': 1980.558, 'duration': 2.163}, {'end': 1986.826, 'text': 'we are not worried like what marks that particular person has got.', 'start': 1982.721, 'duration': 4.105}, {'end': 1988.728, 'text': 'yes, he has got the first rank.', 'start': 1986.826, 'duration': 1.902}, {'end': 1991.091, 'text': 'so this was with respect to the ordinal data.', 'start': 1988.728, 'duration': 2.363}, {'end': 1993.634, 'text': "now let me come towards the next one.", 'start': 1991.091, 'duration': 2.543}, {'end': 2004.188, 'text': 'So over here you can basically say that ordinal data will be present and we also use a different technique to analyze those data
and probably we try to.', 'start': 1993.634, 'duration': 10.554}, {'end': 2008.213, 'text': "probably, when we'll be seeing some data set in the future, we will probably try to see that.", 'start': 2004.188, 'duration': 4.025}, {'end': 2009.414, 'text': 'okay scenarios also.', 'start': 2008.213, 'duration': 1.201}, {'end': 2011.196, 'text': 'now, interval interval data.', 'start': 2009.414, 'duration': 1.782}, {'end': 2013.379, 'text': 'here the order matters.', 'start': 2011.196, 'duration': 2.183}, {'end': 2015.962, 'text': 'here The value also matters.', 'start': 2013.379, 'duration': 2.583}, {'end': 2019.327, 'text': 'And one thing is that your natural zero is not present.', 'start': 2016.343, 'duration': 2.984}, {'end': 2022.151, 'text': 'What is this natural zero? Your order also matters.', 'start': 2019.507, 'duration': 2.644}, {'end': 2023.052, 'text': 'Values also matter.', 'start': 2022.171, 'duration': 0.881}, {'end': 2027.858, 'text': "So if I take an example of interval, let's say that I have an interval of temperatures.", 'start': 2023.072, 'duration': 4.786}, {'end': 2029.34, 'text': "And let's consider Fahrenheit.", 'start': 2028.059, 'duration': 1.281}, {'end': 2031.283, 'text': "Fahrenheit temperature I'm just talking about.", 'start': 2029.46, 'duration': 1.823}, {'end': 2039.305, 'text': 'I may have values like this, 70 to 80 Fahrenheit, 80 to 90 Fahrenheit, or I may have 70 to 80 Fahrenheit, 80 to 90 Fahrenheit.', 'start': 2031.483, 'duration': 7.822}, {'end': 2040.905, 'text': 'Here, interval is there.', 'start': 2039.365, 'duration': 1.54}, {'end': 2044.286, 'text': 'Definitely some kind of values are there, 90 to 100 Fahrenheit.', 'start': 2041.065, 'duration': 3.221}, {'end': 2050.527, 'text': "But if I say 0 Fahrenheit, it won't basically make a useful meaning in this.", 'start': 2044.586, 'duration': 5.941}, {'end': 2053.467, 'text': 'So definitely this is basically called as an interval.', 'start': 2050.608, 'duration': 2.859}, {'end': 
2058.248, 'text': 'You have some range of values between them, and the order also basically matters a lot.', 'start': 2053.808, 'duration': 4.44}, {'end': 2059.63, 'text': 'I may also have distance.', 'start': 2058.53, 'duration': 1.1}, {'end': 2065.254, 'text': '10 to 20, 20 to 30, 30 to 40, where probably this interval data may be used.', 'start': 2060.77, 'duration': 4.484}, {'end': 2068.397, 'text': 'In OLA, I think you have probably booked cabs.', 'start': 2065.574, 'duration': 2.823}, {'end': 2071.94, 'text': "You book the cab for, let's say you're booking the cab for six hours.", 'start': 2068.938, 'duration': 3.002}, {'end': 2076.244, 'text': "There they'll be saying that you can actually go till 0 to 60.", 'start': 2072.42, 'duration': 3.824}, {'end': 2080.966, 'text': 'And then you can probably, if you are more than 60, that time you have to pay more money.', 'start': 2076.244, 'duration': 4.722}, {'end': 2081.687, 'text': 'Natural zero.', 'start': 2081.005, 'duration': 0.682}, {'end': 2084.668, 'text': 'Zero will not be present, right? 
Zero Fahrenheit will not make any difference.', 'start': 2081.726, 'duration': 2.942}, {'end': 2086.849, 'text': 'Now, ratio data will be an assignment for you.', 'start': 2084.967, 'duration': 1.882}, {'end': 2091.952, 'text': 'Let me go ahead and let me take another topic, which is called as frequency distribution.', 'start': 2087.049, 'duration': 4.903}, {'end': 2096.833, 'text': "Now, this is pretty much important because in the later stages, you'll be understanding about histogram and all.", 'start': 2092.132, 'duration': 4.701}, {'end': 2098.915, 'text': "Let's say that I have a sample data set.", 'start': 2097.094, 'duration': 1.821}, {'end': 2106.558, 'text': 'And suppose in this particular data set I have three types of flowers: rose, lily and sunflower.', 'start': 2099.115, 'duration': 7.443}, {'end': 2113.34, 'text': 'now, similarly, in this particular data set i have lot of flowers like rose, lily, sunflower.', 'start': 2106.558, 'duration': 6.782}, {'end': 2118.262, 'text': 'then again i have rose, then again i have lily, then again i have lily.', 'start': 2113.34, 'duration': 4.922}, {'end': 2122.583, 'text': "okay, so suppose let's consider that this is my entire data set.", 'start': 2118.262, 'duration': 4.321}, {'end': 2129.526, 'text': 'Now, usually for showcasing this data set in some kind of visualized manner,', 'start': 2123.303, 'duration': 6.223}, {'end': 2135.549, 'text': 'we can basically use this frequency distribution table based on the flower type and how much is the frequency?', 'start': 2129.526, 'duration': 6.023}, {'end': 2137.89, 'text': 'Okay, and this will be very much important.', 'start': 2136.289, 'duration': 1.601}, {'end': 2139.511, 'text': 'Suppose if I say rose.', 'start': 2138.13, 'duration': 1.381}, {'end': 2143.792, 'text': 'For rose, how many do I have?
1, 2, 3.', 'start': 2139.871, 'duration': 3.921}, {'end': 2145.293, 'text': 'So 3 is the count of rose.', 'start': 2143.793, 'duration': 1.5}, {'end': 2149.53, 'text': 'If I consider lily, what is the basic count?', 'start': 2145.453, 'duration': 4.077}, {'end': 2153.339, 'text': 'i am basically having one, two, three, four, so four is the count.', 'start': 2149.53, 'duration': 3.809}, {'end': 2156.761, 'text': 'if i consider sunflower, What is the count?', 'start': 2153.339, 'duration': 3.422}, {'end': 2157.942, 'text': '1 and 2.', 'start': 2156.781, 'duration': 1.161}, {'end': 2165.329, 'text': 'So this is the frequency of these particular values, of this particular data set with respect to different, different categories.', 'start': 2157.942, 'duration': 7.387}, {'end': 2170.173, 'text': 'Okay? So here you can see that this is entirely a frequency distribution table.', 'start': 2165.549, 'duration': 4.624}, {'end': 2176.198, 'text': 'And from this table you can derive bar charts, you can derive pie charts, you can derive different, different things.', 'start': 2170.633, 'duration': 5.565}, {'end': 2177.479, 'text': 'Now, one more topic.', 'start': 2176.318, 'duration': 1.161}, {'end': 2183.304, 'text': 'Now this you know that it is a frequency distribution, but there is something called as cumulative frequency.', 'start': 2177.539, 'duration': 5.765}, {'end': 2187.868, 'text': 'cumulative frequency basically says that initially i have rose three flowers.', 'start': 2183.304, 'duration': 4.564}], 'summary': 'The transcript covers ordinal, interval, and ratio data, as well as frequency distribution and cumulative frequency.', 'duration': 216.374, 'max_score': 1971.494, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs1971494.jpg'}, {'end': 2248.932, 'src': 'embed', 'start': 2204.082, 'weight': 5, 'content': [{'end': 2206.483, 'text': 'This is basically the cumulative frequency.', 'start':
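The rose/lily/sunflower counting described in this segment can be sketched with Python's `collections.Counter`; the flower list below is reconstructed from the counts in the example (rose 3, lily 4, sunflower 2):

```python
from collections import Counter

# Flower list reconstructed from the example's counts:
# rose -> 3, lily -> 4, sunflower -> 2
flowers = ["rose", "lily", "sunflower", "rose", "lily",
           "lily", "rose", "lily", "sunflower"]

# Frequency distribution table: category -> count
freq = Counter(flowers)
print(freq["rose"], freq["lily"], freq["sunflower"])  # 3 4 2
```

From such a frequency table you can then derive bar charts or pie charts, as the lecture notes.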
2204.082, 'duration': 2.401}, {'end': 2211.365, 'text': "The frequency is getting added and finally you'll be able to see the cumulative frequency over here.", 'start': 2207.003, 'duration': 4.362}, {'end': 2215.006, 'text': "Now what we can basically derive from this, I'll just show you an example.", 'start': 2211.465, 'duration': 3.541}, {'end': 2219.668, 'text': "There's something called as bar graphs and pie charts.", 'start': 2215.606, 'duration': 4.062}, {'end': 2222.609, 'text': "So that particular part now we'll try to draw from this.", 'start': 2220.148, 'duration': 2.461}, {'end': 2224.729, 'text': "And we'll try to see that how does it look like.", 'start': 2222.869, 'duration': 1.86}, {'end': 2228.611, 'text': 'In the case of discrete variables, we can definitely draw a bar chart.', 'start': 2224.81, 'duration': 3.801}, {'end': 2233.456, 'text': 'if the variable is continuous, at that point of time we can draw a continuous.', 'start': 2229.191, 'duration': 4.265}, {'end': 2234.918, 'text': 'we can draw a histogram.', 'start': 2233.456, 'duration': 1.462}, {'end': 2236.74, 'text': 'so let me just talk about bar graph.', 'start': 2234.918, 'duration': 1.822}, {'end': 2238.662, 'text': 'so first one is the bar graph.', 'start': 2236.74, 'duration': 1.922}, {'end': 2244.208, 'text': 'in bar graph, in the x-axis, i will probably have all my flowers.', 'start': 2238.662, 'duration': 5.546}, {'end': 2248.932, 'text': 'so this is rose, this is lily and this is sunflower.', 'start': 2244.208, 'duration': 4.724}], 'summary': 'Teaching about cumulative frequency, bar graphs, and pie charts with an example.', 'duration': 44.85, 'max_score': 2204.082, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs2204082.jpg'}, {'end': 2382.795, 'src': 'embed', 'start': 2346.433, 'weight': 7, 'content': [{'end': 2348.134, 'text': 'You can basically use histograms.', 'start': 2346.433, 'duration': 1.701}, {'end': 2349.715, 
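The running total described here (rose 3, then 3 + 4 = 7 after lily, then 7 + 2 = 9 after sunflower) is exactly what `itertools.accumulate` computes:

```python
from itertools import accumulate

# Frequencies from the flower example, in table order: rose, lily, sunflower
frequencies = [3, 4, 2]

# Cumulative frequency: each entry is the running total of the column so far
cumulative = list(accumulate(frequencies))
print(cumulative)  # [3, 7, 9]
```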
'text': 'So the histogram will have like this.', 'start': 2348.154, 'duration': 1.561}, {'end': 2352.016, 'text': 'Now understand one very important thing.', 'start': 2350.015, 'duration': 2.001}, {'end': 2355.417, 'text': 'In histogram, we make something called as bins.', 'start': 2352.116, 'duration': 3.301}, {'end': 2358.059, 'text': 'Bins basically means we make some kind of grouping.', 'start': 2355.798, 'duration': 2.261}, {'end': 2363.036, 'text': 'By default, the bin size is usually 10.', 'start': 2358.971, 'duration': 4.065}, {'end': 2369.604, 'text': "Now if I really want to make these bins, what I'll do, in the y-axis I will be having the frequency, obviously you'll know this.", 'start': 2363.036, 'duration': 6.568}, {'end': 2371.026, 'text': "Now let's make the bin.", 'start': 2370.005, 'duration': 1.021}, {'end': 2373.529, 'text': 'I told you, 10 will be the bin size.', 'start': 2371.566, 'duration': 1.963}, {'end': 2382.795, 'text': '30, 40, 50, 60, 70, 80, 90.', 'start': 2375.872, 'duration': 6.923}], 'summary': 'Histograms use bins with default size 10 to group data for frequency counting.', 'duration': 36.362, 'max_score': 2346.433, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs2346433.jpg'}, {'end': 2490.592, 'src': 'embed', 'start': 2460.467, 'weight': 8, 'content': [{'end': 2463.81, 'text': 'And in this histograms, your values will be continuous.', 'start': 2460.467, 'duration': 3.343}, {'end': 2467.974, 'text': 'Now, one amazing thing, because people ask about what is PDF.', 'start': 2464.671, 'duration': 3.303}, {'end': 2472.334, 'text': 'I say that PDF is smoothening of histogram.', 'start': 2469.21, 'duration': 3.124}, {'end': 2474.818, 'text': "So I'll just tell you one example.", 'start': 2472.475, 'duration': 2.343}, {'end': 2479.023, 'text': 'If I smoothen this histogram, my PDF function will look something like this.', 'start': 2474.898, 'duration': 4.125}, {'end': 2483.269, 
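Binning with a width of 10, as in the lecture's histogram example with edges 30, 40, ..., 90, can be sketched in plain Python; the `values` list is a hypothetical set of continuous measurements, not data from the lecture:

```python
# Hypothetical continuous values grouped into bins of width 10
values = [32, 35, 41, 44, 47, 52, 58, 63, 71, 85]
edges = list(range(30, 100, 10))  # [30, 40, 50, 60, 70, 80, 90]

# Frequency of values falling in each half-open bin [edge, edge + 10)
counts = [sum(e <= v < e + 10 for v in values) for e in edges[:-1]]
print(counts)  # [2, 3, 2, 1, 1, 1]
```

The y-axis of the histogram is this `counts` list, one bar per bin.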
'text': 'Now you may be considering, Krish, how is this basically getting created?', 'start': 2479.524, 'duration': 3.745}, {'end': 2485.851, 'text': 'Okay, how is this basically getting created?', 'start': 2484.21, 'duration': 1.641}, {'end': 2490.592, 'text': "I'll say that there is something called as kernel density estimator.", 'start': 2486.451, 'duration': 4.141}], 'summary': 'Histograms represent continuous values, and pdf is a smoothing of histograms using kernel density estimator.', 'duration': 30.125, 'max_score': 2460.467, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs2460467.jpg'}], 'start': 1434.196, 'title': 'Types of variables and data visualization techniques', 'summary': 'Distinguishes between quantitative and qualitative variables, discusses types of variables including measured variables, and emphasizes the importance of frequency distribution and visualization techniques in data analysis.', 'chapters': [{'end': 1760.933, 'start': 1434.196, 'title': 'Understanding variables in data analysis', 'summary': 'Introduces the concept of variables, distinguishing between quantitative and qualitative variables, and further categorizing quantitative variables into discrete and continuous, highlighting their distinct properties and examples.', 'duration': 326.737, 'highlights': ['The chapter introduces the concept of variables and differentiates between quantitative and qualitative variables, providing examples such as age, weight, and height for quantitative variables and gender for qualitative variables. 
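The "smoothening of histogram" idea can be illustrated with a toy Gaussian kernel density estimator written from scratch; the bandwidth `h` and the data points are illustrative assumptions, not values from the lecture:

```python
import math

# Toy kernel density estimator: the PDF estimate at x is the average of
# Gaussian "bumps" centred on the data points.
def kde(data, x, h=5.0):
    norm = h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) / norm for xi in data) / len(data)

data = [32, 35, 41, 44, 47, 52, 58, 63, 71, 85]
print(kde(data, 45.0))   # relatively high: near the bulk of the data
print(kde(data, 120.0))  # near zero: far from every point
```

Summing smooth bumps instead of counting hard bin membership is what turns the stepped histogram into a smooth PDF curve.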
The concept of variables is introduced, distinguishing between quantitative and qualitative variables and providing examples such as age, weight, height, and gender.', 'The chapter further categorizes quantitative variables into discrete and continuous, explaining that discrete variables consist of whole numbers, illustrated with examples like the number of bank accounts and children in a family, while continuous variables can take any value, demonstrated through examples like height, weight, and amount of rainfall. The quantitative variables are further categorized into discrete and continuous, with discrete variables consisting of whole numbers and continuous variables being able to take any value, as demonstrated through examples such as number of bank accounts, number of children in a family, height, weight, and amount of rainfall.']}, {'end': 2076.244, 'start': 1761.533, 'title': 'Types of variables and measured variables', 'summary': 'Discusses the types of variables including continuous, discrete, and categorical variables, and the four types of measured variables - nominal, ordinal, interval, and ratio - with examples and significance in data analysis.', 'duration': 314.711, 'highlights': ['The chapter explains the types of variables - continuous, discrete, and categorical - using examples such as gender, marital status, river length, population, and song length.', 'It details the significance of measured variables - nominal, ordinal, interval, and ratio - in data analysis, providing examples and emphasizing their presence in datasets for good data analysis.', 'The chapter provides examples and explanations for nominal, ordinal, and interval measured variables, illustrating their significance and impact on data analysis techniques.']}, {'end': 2528.355, 'start': 2076.244, 'title': 'Frequency distribution and visualization techniques', 'summary': 'Discusses the importance of frequency distribution and visualization techniques such as bar charts and 
histograms in descriptive statistics, exemplifying the usage of frequency distribution tables and explaining the process for creating bar charts and histograms for discrete and continuous variables.', 'duration': 452.111, 'highlights': ['Frequency distribution is important for showcasing data visually, and can be used to derive bar charts and pie charts. Frequency distribution is crucial for visualizing data and can be utilized to derive bar charts and pie charts for data analysis.', 'Cumulative frequency allows for understanding the total number of occurrences and can be used to create cumulative frequency diagrams. Cumulative frequency provides insight into the total occurrences and can be used to generate cumulative frequency diagrams.', 'Bar charts are suitable for discrete variables, where the x-axis represents categories and the y-axis represents frequency. Bar charts are appropriate for discrete variables, with the x-axis indicating categories and the y-axis indicating frequency.', 'Histograms are used for representing continuous data and involve creating bins to group data, with the y-axis representing frequency. Histograms are utilized for continuous data, involving the creation of bins to group data, and the y-axis indicating frequency.', 'The Probability Density Function (PDF) is a smoothed version of the histogram, created using a Kernel Density Estimator (KDE). 
The Probability Density Function (PDF) is a smoothed version of the histogram, produced using a Kernel Density Estimator (KDE).']}], 'duration': 1094.159, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs1434196.jpg', 'highlights': ['The chapter introduces the concept of variables and differentiates between quantitative and qualitative variables, providing examples such as age, weight, and height for quantitative variables and gender for qualitative variables.', 'The chapter further categorizes quantitative variables into discrete and continuous, explaining that discrete variables consist of whole numbers, illustrated with examples like the number of bank accounts and children in a family, while continuous variables can take any value, demonstrated through examples like height, weight, and amount of rainfall.', 'The chapter explains the types of variables - continuous, discrete, and categorical - using examples such as gender, marital status, river length, population, and song length.', 'It details the significance of measured variables - nominal, ordinal, interval, and ratio - in data analysis, providing examples and emphasizing their presence in datasets for good data analysis.', 'Frequency distribution is important for showcasing data visually, and can be used to derive bar charts and pie charts.', 'Cumulative frequency allows for understanding the total number of occurrences and can be used to create cumulative frequency diagrams.', 'Bar charts are suitable for discrete variables, where the x-axis represents categories and the y-axis represents frequency.', 'Histograms are used for representing continuous data and involve creating bins to group data, with the y-axis representing frequency.', 'The Probability Density Function (PDF) is a smoothed version of the histogram, created using a Kernel Density Estimator (KDE).']}, {'end': 3321.912, 'segs': [{'end': 2594.811, 'src': 'embed', 'start': 2547.025, 
'weight': 0, 'content': [{'end': 2557.007, 'text': 'we are basically going to cover measure of central tendency, measure of central tendency, measure of dispersions, gaussian distribution.', 'start': 2547.025, 'duration': 9.982}, {'end': 2559.628, 'text': 'then, fourth, we are going to understand z score.', 'start': 2557.007, 'duration': 2.621}, {'end': 2566.452, 'text': 'Then we are going to understand standard normal distribution, standard normal distribution.', 'start': 2560.968, 'duration': 5.484}, {'end': 2568.553, 'text': 'and there are some more topics that we really need to cover.', 'start': 2566.452, 'duration': 2.101}, {'end': 2578.78, 'text': 'So the first topic that probably we are going to discuss is something called as arithmetic mean for population and sample.', 'start': 2569.274, 'duration': 9.506}, {'end': 2583.103, 'text': 'Mean basically means over here specifically we are talking about average.', 'start': 2579.24, 'duration': 3.863}, {'end': 2594.811, 'text': 'Now with population and with sample, we really need to understand the formulas of mean and we will try to understand in this specific way.', 'start': 2584.167, 'duration': 10.644}], 'summary': 'Covering central tendency, dispersion, z-score, and normal distribution with emphasis on mean for population and sample.', 'duration': 47.786, 'max_score': 2547.025, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs2547025.jpg'}, {'end': 2733.734, 'src': 'embed', 'start': 2709.16, 'weight': 1, 'content': [{'end': 2715.645, 'text': "because I want, because in the real world industry, when you are working, when you're explaining someone as a data scientist,", 'start': 2709.16, 'duration': 6.485}, {'end': 2718.467, 'text': 'you really need to use this well-known notation.', 'start': 2715.645, 'duration': 2.822}, {'end': 2720.448, 'text': 'You can use your own notation, whatever you like.', 'start': 2718.487, 'duration': 1.961}, {'end': 2724.01, 
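The population mean (mu) and sample mean (x-bar) discussed here share the same arithmetic, the sum of the observations divided by their count (capital N for the population, small n for the sample); a minimal sketch with made-up numbers:

```python
# Population mean (mu) and sample mean (x-bar): same arithmetic,
# different notation. Numbers are illustrative.
population = [1, 2, 3, 4, 5, 6, 7, 8]
sample = [2, 4, 6]  # a hypothetical sample drawn from the population

mu = sum(population) / len(population)  # divides by capital N
x_bar = sum(sample) / len(sample)       # divides by small n
print(mu, x_bar)  # 4.5 4.0
```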
'text': 'but think of a larger point of view here.', 'start': 2720.928, 'duration': 3.082}, {'end': 2729.832, 'text': 'you really need to make sure that whatever standards is being followed, we need to try to follow in that specific way.', 'start': 2724.01, 'duration': 5.822}, {'end': 2733.734, 'text': 'so this was the basic things with respect to mean.', 'start': 2729.832, 'duration': 3.902}], 'summary': 'In industry, using well-known notation is crucial for data scientists to adhere to standards.', 'duration': 24.574, 'max_score': 2709.16, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs2709160.jpg'}, {'end': 2927.854, 'src': 'embed', 'start': 2899.459, 'weight': 3, 'content': [{'end': 2904.842, 'text': 'outliers really have a adverse impact on the entire distribution.', 'start': 2899.459, 'duration': 5.383}, {'end': 2908.764, 'text': 'so that is the reason why we should be very much careful with outliers.', 'start': 2904.842, 'duration': 3.922}, {'end': 2913.886, 'text': 'in data science also, in statistics also, we use different techniques to remove the outliers,', 'start': 2908.764, 'duration': 5.122}, {'end': 2918.528, 'text': "which also i'll be discussing today when we are going to discuss about percentiles and all.", 'start': 2913.886, 'duration': 4.642}, {'end': 2921.209, 'text': 'so remember, outliers have a major impact,', 'start': 2918.528, 'duration': 2.681}, {'end': 2927.854, 'text': 'because here you can see that the entire Distribution of the central data is basically moving and the difference is quite huge.', 'start': 2921.209, 'duration': 6.645}], 'summary': 'Outliers have a major impact on data distribution, requiring careful handling and removal techniques in data science and statistics.', 'duration': 28.395, 'max_score': 2899.459, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs2899459.jpg'}, {'end': 3197.575, 'src': 'embed', 
'start': 3164.598, 'weight': 4, 'content': [{'end': 3171.783, 'text': "And finally, you'll be able to see that when I probably used two outliers and then probably I got the median as 3.5.", 'start': 3164.598, 'duration': 7.185}, {'end': 3179.407, 'text': 'Now here you can basically see that there is less difference, right, less difference when compared to this.', 'start': 3171.783, 'duration': 7.624}, {'end': 3186.47, 'text': 'if i talk about median, it works well with outlier, so this is the proper statement that i want to consider.', 'start': 3179.407, 'duration': 7.063}, {'end': 3197.575, 'text': 'so in the case of mode the third topic now, suppose if i have a specific data set like this one, two, three, four, five, six, six, six, seven,', 'start': 3186.47, 'duration': 11.105}], 'summary': 'Median of 3.5 shows less difference with outliers compared to mode in the given dataset.', 'duration': 32.977, 'max_score': 3164.598, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs3164598.jpg'}, {'end': 3312.508, 'src': 'embed', 'start': 3284.318, 'weight': 5, 'content': [{'end': 3287.519, 'text': 'Or let me just change this data set and make it in a simpler way.', 'start': 3284.318, 'duration': 3.201}, {'end': 3293.901, 'text': 'Why specifically we use mode? 
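The point that the mean shifts badly under an outlier while the median barely moves (landing at 3.5, as in the example) can be reproduced with the `statistics` module; here a single extreme value is appended to 1 through 5:

```python
from statistics import mean, median

data = [1, 2, 3, 4, 5]
with_outlier = data + [100]  # one extreme value appended

# The mean jumps from 3 to about 19.2, while the median only
# moves from 3 to 3.5 -- the median "works well with outliers".
print(mean(data), mean(with_outlier))
print(median(data), median(with_outlier))  # 3 3.5
```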
In mode also we use it in both integer and categorical variables.', 'start': 3288.039, 'duration': 5.862}, {'end': 3296.502, 'text': 'But it works well with categorical variables.', 'start': 3293.941, 'duration': 2.561}, {'end': 3298.643, 'text': "Let's say that this is a type of flower.", 'start': 3296.942, 'duration': 1.701}, {'end': 3301.964, 'text': 'Type of flower and this is petal length and petal width.', 'start': 3299.003, 'duration': 2.961}, {'end': 3309.007, 'text': 'Now, over here you will be able to see different different flowers like rose, lily, sunflower and you have some flowers.', 'start': 3302.364, 'duration': 6.643}, {'end': 3312.508, 'text': "Let's consider that you have some missing data over here.", 'start': 3309.047, 'duration': 3.461}], 'summary': 'Mode is useful for categorical variables. works well with flower types.', 'duration': 28.19, 'max_score': 3284.318, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs3284318.jpg'}], 'start': 2529.576, 'title': 'Statistics for data science', 'summary': 'Covers intermediate statistics for data science, including measures of central tendency and dispersion, z score, and standard normal distribution, emphasizing practical application in real-world industry. it also discusses the impact of mean, median, and mode on data distribution, focusing on outliers and the suitability of each measure in different scenarios.', 'chapters': [{'end': 2729.832, 'start': 2529.576, 'title': 'Intermediate stats for data science', 'summary': 'Covers intermediate statistics for data science, including topics like measures of central tendency, dispersion, z score, and standard normal distribution, with an emphasis on notation and practical application in real-world industry.', 'duration': 200.256, 'highlights': ['The chapter covers measures of central tendency, dispersion, z score, and standard normal distribution for data science. 
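Filling missing categorical entries with the mode, as described for the flower-type column, can be sketched as follows; the column contents are hypothetical:

```python
from collections import Counter

# Hypothetical "type of flower" column with missing entries (None)
flower_type = ["rose", "lily", "lily", None, "sunflower", "lily", None]

# Mode = most frequent category among the observed values
observed = [f for f in flower_type if f is not None]
mode_value = Counter(observed).most_common(1)[0][0]

# Impute: replace each missing entry with the mode
imputed = [f if f is not None else mode_value for f in flower_type]
print(mode_value, imputed)
```

This is why the mode is the natural choice for categorical columns: unlike the mean or median, it needs no numeric ordering.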
It includes topics like measure of central tendency, measure of dispersion, z score, and standard normal distribution.', 'The importance of notation in statistics is emphasized, with practical application in real-world industry as a data scientist. Emphasizes the importance of using well-known notation in statistics, highlighting its practical application in the real-world industry as a data scientist.', 'The arithmetic mean for both population and sample is explained with detailed formulas and practical examples. Explains the arithmetic mean for population and sample, providing detailed formulas and a practical example of calculating the average.']}, {'end': 3321.912, 'start': 2729.832, 'title': 'Central measure of tendency & its impact', 'summary': 'Discusses central measure of tendency, covering mean, median, and mode, and their impact on data distribution, with emphasis on the effect of outliers and the suitability of each measure in different scenarios.', 'duration': 592.08, 'highlights': ['Impact of Outliers Outliers have a significant impact on the distribution, causing a substantial change in mean, making it crucial to handle them in data analysis and statistics.', 'Suitability of Median with Outliers Median demonstrates its effectiveness by exhibiting minimal deviation even with the presence of outliers, making it a suitable measure of central tendency in such scenarios.', 'Determining Mode for Categorical Variables Mode is particularly useful for categorical variables, such as identifying the most frequent type of flower in a dataset, making it valuable for such data analysis.']}], 'duration': 792.336, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs2529576.jpg', 'highlights': ['The chapter covers measures of central tendency, dispersion, z score, and standard normal distribution for data science.', 'The importance of notation in statistics is emphasized, with practical application in real-world 
industry as a data scientist.', 'The arithmetic mean for both population and sample is explained with detailed formulas and practical examples.', 'Impact of Outliers: Outliers have a significant impact on the distribution, causing a substantial change in mean, making it crucial to handle them in data analysis and statistics.', 'Suitability of Median with Outliers: Median demonstrates its effectiveness by exhibiting minimal deviation even with the presence of outliers, making it a suitable measure of central tendency in such scenarios.', 'Determining Mode for Categorical Variables: Mode is particularly useful for categorical variables, such as identifying the most frequent type of flower in a dataset, making it valuable for such data analysis.']}, {'end': 5002.628, 'segs': [{'end': 3405.506, 'src': 'embed', 'start': 3349.747, 'weight': 0, 'content': [{'end': 3354.209, 'text': 'so we can definitely say that most frequent element.', 'start': 3349.747, 'duration': 4.462}, {'end': 3363.694, 'text': 'you can actually get it by using mode which is most frequently used and this specifically works well with categorical variable.', 'start': 3354.209, 'duration': 9.485}, {'end': 3365.595, 'text': "now let's take another example.", 'start': 3363.694, 'duration': 1.901}, {'end': 3367.697, 'text': 'suppose i have a feature age, age.', 'start': 3365.595, 'duration': 2.102}, {'end': 3373.762, 'text': 'i have values like 25, 26, dash, dash, dash, dash 32, 34, 38.', 'start': 3367.697, 'duration': 6.065}, {'end': 3379.467, 'text': 'now, in this particular case, what do you think, what may be a suitable thing?', 'start': 3373.762, 'duration': 5.705}, {'end': 3383.23, 'text': "suppose let's say that these are my ages of students.", 'start': 3379.467, 'duration': 3.763}, {'end': 3389.021, 'text': 'should i apply mean median or mode?', 'start': 3383.23, 'duration': 5.791}, {'end': 3395.523, 'text': 'which do you think, based on this scenario, that is, ages of students, we should 
definitely apply?', 'start': 3389.021, 'duration': 6.502}, {'end': 3396.704, 'text': 'just tell me this answer.', 'start': 3395.523, 'duration': 1.181}, {'end': 3405.506, 'text': "in this particular case, definitely, i would suggest let's go with mean, because i know students age will basically range from one value to one value.", 'start': 3396.704, 'duration': 8.802}], 'summary': 'Use mean for age feature with missing values in student dataset.', 'duration': 55.759, 'max_score': 3349.747, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs3349747.jpg'}, {'end': 3598.11, 'src': 'embed', 'start': 3563.691, 'weight': 3, 'content': [{'end': 3568.034, 'text': 'So these two things, why I am teaching you with respect to population sample, it will all make sense.', 'start': 3563.691, 'duration': 4.343}, {'end': 3573.957, 'text': 'So usually population variance is given by something called as sigma square.', 'start': 3568.494, 'duration': 5.463}, {'end': 3578.64, 'text': 'Here you basically use as summation of i is equal to 1 to capital N.', 'start': 3574.777, 'duration': 3.863}, {'end': 3585.363, 'text': 'x of i minus mu whole square divided by n.', 'start': 3579.88, 'duration': 5.483}, {'end': 3598.11, 'text': 'Sample variance is basically given by small s square summation of i is equal to 1 to small n x of i minus x bar.', 'start': 3585.363, 'duration': 12.747}], 'summary': 'Teaching about population and sample variance calculation.', 'duration': 34.419, 'max_score': 3563.691, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs3563691.jpg'}, {'end': 4145.706, 'src': 'heatmap', 'start': 3742.098, 'weight': 0.762, 'content': [{'end': 3747.101, 'text': 'where do you think the variance is more variance?', 'start': 3742.098, 'duration': 5.003}, {'end': 3748.262, 'text': 'understand variance.', 'start': 3747.101, 'duration': 1.161}, {'end': 3753.226, 'text': 'variance 
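The mode-for-categorical, mean-for-numerical imputation rule discussed in the lecture can be sketched in plain Python (the flower and age values below are hypothetical stand-ins for the lecture's examples; `None` marks a missing entry):

```python
from statistics import mean, mode

# Hypothetical features with missing values (None), echoing the lecture's examples.
flowers = ["rose", "lily", "rose", None, "tulip", "rose"]   # categorical feature
ages = [25, 26, None, 32, 34, 38]                           # numerical feature in a narrow range

# Categorical: replace missing values with the mode (most frequent category).
flower_fill = mode([f for f in flowers if f is not None])
flowers = [flower_fill if f is None else f for f in flowers]

# Numerical: replace missing values with the mean (prefer the median if outliers exist).
age_fill = mean([a for a in ages if a is not None])
ages = [age_fill if a is None else a for a in ages]

print(flowers)  # the missing flower becomes "rose"
print(ages)     # the missing age becomes 31, the mean of the observed ages
```

The same idea is usually written with `df[col].fillna(df[col].mode()[0])` or `.fillna(df[col].mean())` in pandas; the stdlib version just makes the rule explicit.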
whenever your things comes into mind it should be talking about spread.', 'start': 3748.262, 'duration': 4.964}, {'end': 3757.99, 'text': 'so over here in the second picture, definitely variance will be higher.', 'start': 3753.226, 'duration': 4.764}, {'end': 3759.992, 'text': "Let's consider that.", 'start': 3758.61, 'duration': 1.382}, {'end': 3761.414, 'text': "I'm just going to take this example.", 'start': 3759.992, 'duration': 1.422}, {'end': 3764.018, 'text': 'Here my variance is 1.81.', 'start': 3761.855, 'duration': 2.163}, {'end': 3766.281, 'text': "Let's consider that this is 1.81.", 'start': 3764.018, 'duration': 2.263}, {'end': 3770.29, 'text': 'And tomorrow, if I probably get 5.45.', 'start': 3766.281, 'duration': 4.009}, {'end': 3774.632, 'text': 'Can I say that it may belong to this particular distribution? Yes.', 'start': 3770.29, 'duration': 4.342}, {'end': 3779.274, 'text': 'So the variance will be definitely higher because the spread is quite high.', 'start': 3775.192, 'duration': 4.082}, {'end': 3787.377, 'text': 'Spread, when we say spread is basically high, that basically means the elements that are present away from the central region are more.', 'start': 3779.574, 'duration': 7.803}, {'end': 3791.679, 'text': 'Whenever I talk about more variance, that basically means the data is more dispersed.', 'start': 3787.817, 'duration': 3.862}, {'end': 3796.441, 'text': 'Let me talk about this also to you so that you can understand.', 'start': 3791.999, 'duration': 4.442}, {'end': 3801.34, 'text': "Okay Now let's forget about standard deviation for right now.", 'start': 3797.001, 'duration': 4.339}, {'end': 3803.902, 'text': "Now in this particular image, let's see.", 'start': 3801.821, 'duration': 2.081}, {'end': 3811.586, 'text': 'In this particular image, what do you see over here? 
You can see over here standard deviation is 10, standard deviation is 50.', 'start': 3804.622, 'duration': 6.964}, {'end': 3816.768, 'text': 'Now if you see standard deviation formula, it is nothing but root of variance.', 'start': 3811.586, 'duration': 5.182}, {'end': 3822.711, 'text': "Now here you can see when the standard deviation is smaller, that basically means you're having a very huge curve.", 'start': 3817.128, 'duration': 5.583}, {'end': 3826.433, 'text': 'That basically means the data is not that much distributed.', 'start': 3822.831, 'duration': 3.602}, {'end': 3833.4, 'text': 'When you have a big standard deviation like 50, 60 and all, you can see your data is highly distributed.', 'start': 3827.094, 'duration': 6.306}, {'end': 3836.823, 'text': 'So this is very much important to understand.', 'start': 3833.62, 'duration': 3.203}, {'end': 3839.165, 'text': 'Why variance is more for dispersed data?', 'start': 3837.103, 'duration': 2.062}, {'end': 3840.727, 'text': 'Because over here you can see right guys?', 'start': 3839.205, 'duration': 1.522}, {'end': 3848.534, 'text': "Okay, when you probably calculate, I'll show you some of the problem statements over here, but just understand this graphically, okay?", 'start': 3841.227, 'duration': 7.307}, {'end': 3853.038, 'text': "Later on I'll just show you one example where probably I will talk about it,", 'start': 3849.114, 'duration': 3.924}, {'end': 3857.963, 'text': "and let's try to solve that particular example and then we can definitely understand it.", 'start': 3853.038, 'duration': 4.925}, {'end': 3863.889, 'text': 'But some idea you basically got, because obviously the variance needs to be spreaded high.', 'start': 3858.284, 'duration': 5.605}, {'end': 3865.571, 'text': 'if the variance is high, right?', 'start': 3863.889, 'duration': 1.682}, {'end': 3870.076, 'text': 'The dispersion becomes high because you have more number of values inside it.', 'start': 3865.611, 'duration': 4.465}, {'end': 
3872.297, 'text': "Now, let's go ahead and let's try to see.", 'start': 3870.496, 'duration': 1.801}, {'end': 3874.739, 'text': 'Now I got my variance as 1.81.', 'start': 3872.357, 'duration': 2.382}, {'end': 3878.962, 'text': 'Now my standard deviation is nothing but root of variance.', 'start': 3874.739, 'duration': 4.223}, {'end': 3880.003, 'text': 'Root of variance.', 'start': 3879.382, 'duration': 0.621}, {'end': 3883.045, 'text': 'That basically means it is nothing but root of 1.81.', 'start': 3880.043, 'duration': 3.002}, {'end': 3889.189, 'text': "So if I go and open my calculator, I'll just say root of 1.81.", 'start': 3883.045, 'duration': 6.144}, {'end': 3891.711, 'text': "And there I'm actually getting is nothing but 1.345.", 'start': 3889.189, 'duration': 2.522}, {'end': 3892.172, 'text': 'So 1.345.', 'start': 3891.711, 'duration': 0.461}, {'end': 3898.316, 'text': 'Now see what does standard deviation basically mean?', 'start': 3892.172, 'duration': 6.144}, {'end': 3900.802, 'text': 'What is the mean in this particular case?', 'start': 3899.159, 'duration': 1.643}, {'end': 3902.364, 'text': 'What is the mean?', 'start': 3901.623, 'duration': 0.741}, {'end': 3904.247, 'text': 'Mean is nothing but 2.83, right?', 'start': 3902.545, 'duration': 1.702}, {'end': 3906.391, 'text': "Let's consider this one.", 'start': 3905.449, 'duration': 0.942}, {'end': 3907.352, 'text': 'The mean is 2.83.', 'start': 3906.852, 'duration': 0.5}, {'end': 3912.325, 'text': 'Now from this mean your data will be distributed,', 'start': 3907.352, 'duration': 4.973}, {'end': 3920.67, 'text': 'because mean is basically specifying your measure of central tendency basically says that where the center is there for that specific distribution.', 'start': 3912.325, 'duration': 8.345}, {'end': 3929.976, 'text': 'so from here, if i go one step right, one standard deviation to the right, you have seen standard deviation formula.', 'start': 3920.67, 'duration': 9.306}, {'end': 3940.367, 
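The two variance formulas just stated, sigma squared dividing by N for a population and s squared dividing by n minus 1 for a sample, and standard deviation as the root of variance, can be verified with a short sketch (the data list is hypothetical, not the lecture's exact numbers):

```python
from math import sqrt
from statistics import pvariance, variance

data = [1, 2, 2, 3, 3, 4, 5]   # hypothetical sample
n = len(data)
mu = sum(data) / n

# Population variance: sigma^2 = sum((x_i - mu)^2) / N
sigma_sq = sum((x - mu) ** 2 for x in data) / n
# Sample variance: s^2 = sum((x_i - x_bar)^2) / (n - 1), Bessel's correction
s_sq = sum((x - mu) ** 2 for x in data) / (n - 1)
# Standard deviation is nothing but the square root of the variance.
sd = sqrt(sigma_sq)

# The stdlib agrees with the hand-rolled formulas.
assert abs(sigma_sq - pvariance(data)) < 1e-12
assert abs(s_sq - variance(data)) < 1e-12
```

Note that s squared is always a little larger than sigma squared on the same data, since the numerator is identical and the denominator shrinks from n to n minus 1.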
'text': "the next element that may probably fall between the one standard deviation will range between let's consider that this is my first standard deviation to the right.", 'start': 3929.976, 'duration': 10.391}, {'end': 3942.369, 'text': 'then it will basically have 2.83 plus 1.34.', 'start': 3940.367, 'duration': 2.002}, {'end': 3948.45, 'text': 'so this is nothing but 4.17.', 'start': 3942.369, 'duration': 6.081}, {'end': 3951.01, 'text': 'That basically means in this distribution,', 'start': 3948.45, 'duration': 2.56}, {'end': 3958.892, 'text': 'whatever elements are basically present between 2.83 to 4.17 will be falling within the first standard deviation.', 'start': 3951.01, 'duration': 7.882}, {'end': 3966.713, 'text': "And if I consider the same thing towards the left, that basically is one standard deviation towards the left then what I'll do?", 'start': 3959.332, 'duration': 7.381}, {'end': 3968.194, 'text': "I'll just subtract 1.34.", 'start': 3966.713, 'duration': 1.481}, {'end': 3969.234, 'text': 'So this will basically be 2.83 minus 1.34.', 'start': 3968.194, 'duration': 1.04}, {'end': 3975.595, 'text': 'So it will basically become 1.49.', 'start': 3969.234, 'duration': 6.361}, {'end': 3976.156, 'text': 'now here.', 'start': 3975.595, 'duration': 0.561}, {'end': 3983.943, 'text': 'it basically says that any elements that falls between 1.49 to 2.83 will be falling in this region.', 'start': 3976.156, 'duration': 7.787}, {'end': 3986.285, 'text': 'that is one standard deviation to the left.', 'start': 3983.943, 'duration': 2.342}, {'end': 3988.988, 'text': "similarly, we'll go with the second standard deviation.", 'start': 3986.285, 'duration': 2.703}, {'end': 3995.974, 'text': 'now, in this particular case it will be 4.17 plus 1.345, which is 5.51.', 'start': 3988.988, 'duration': 6.986}, {'end': 3997.416, 'text': 'similarly, you go and calculate similarly.', 'start': 3995.974, 'duration': 1.442}, {'end': 
3998.577, 'text': 'you go and calculate similarly.', 'start': 3997.416, 'duration': 1.161}, {'end': 4003.157, 'text': 'Now your standard deviation is a very small number.', 'start': 3999.955, 'duration': 3.202}, {'end': 4004.817, 'text': "Still, I'll say that this is a small number.", 'start': 4003.197, 'duration': 1.62}, {'end': 4009.239, 'text': 'And if I probably try to construct a graph, it will look something like this.', 'start': 4005.278, 'duration': 3.961}, {'end': 4015.002, 'text': 'The tip, right, this region that you probably will see, this is basically called as a bell curve.', 'start': 4009.64, 'duration': 5.362}, {'end': 4020.485, 'text': "And based on the standard deviation and variance, you'll be able to decide two important things.", 'start': 4015.362, 'duration': 5.123}, {'end': 4024.667, 'text': "With the help of variance, definitely you'll be able to understand how the data is spread.", 'start': 4020.585, 'duration': 4.082}, {'end': 4032.332, 'text': 'And with standard deviation, you will be able to understand that between one standard deviation to the right and the left,', 'start': 4025.567, 'duration': 6.765}, {'end': 4034.914, 'text': 'what may be the range of data that may be falling.', 'start': 4032.332, 'duration': 2.582}, {'end': 4039.297, 'text': 'So standard deviation is nothing but it is the square root of variance.', 'start': 4035.394, 'duration': 3.903}, {'end': 4044.5, 'text': 'That basically means from the mean, right, how far an element can be.', 'start': 4039.977, 'duration': 4.523}, {'end': 4045.901, 'text': "Let's consider that if I consider 5.", 'start': 4044.56, 'duration': 1.341}, {'end': 4051.022, 'text': 'Now for 5, If you try to calculate, it may fall somewhere here.', 'start': 4045.901, 'duration': 5.121}, {'end': 4059.85, 'text': 'So how are you going to represent 5? 
You will say that it falls in 1.5 standard deviation from the mean.', 'start': 4051.683, 'duration': 8.167}, {'end': 4063.433, 'text': 'So this kind of definition, you will be able to tell them.', 'start': 4060.33, 'duration': 3.103}, {'end': 4069.078, 'text': 'So that basically means from the mean, how far a specific number is with respect to standard deviation.', 'start': 4063.633, 'duration': 5.445}, {'end': 4073.919, 'text': "you're calculating, you're using a unit called as standard deviation for saying that.", 'start': 4069.438, 'duration': 4.481}, {'end': 4076.84, 'text': 'And variance specifically talk about spread.', 'start': 4074.4, 'duration': 2.44}, {'end': 4083.022, 'text': 'If the variance is high, the values, the data spread that is there is very, very high.', 'start': 4077.24, 'duration': 5.782}, {'end': 4089.364, 'text': "Now let's understand some amazing basic things, which is called as percentile and quartiles.", 'start': 4083.362, 'duration': 6.002}, {'end': 4095.593, 'text': 'This is the first step to find outliers.', 'start': 4089.884, 'duration': 5.709}, {'end': 4097.273, 'text': 'how do we find an outlier?', 'start': 4095.593, 'duration': 1.68}, {'end': 4105.774, 'text': 'so probably we are going to discuss in this the first and with the help of code also, you can basically do now with respect to percentiles.', 'start': 4097.273, 'duration': 8.501}, {'end': 4108.736, 'text': "let's try to understand what is percentiles.", 'start': 4105.774, 'duration': 2.962}, {'end': 4110.696, 'text': 'and how do you find out percentile now?', 'start': 4108.736, 'duration': 1.96}, {'end': 4115.256, 'text': 'before understanding percentile, you basically need to understand about percentage.', 'start': 4110.696, 'duration': 4.56}, {'end': 4118.738, 'text': 'suppose, if i have a distribution, i say one, two, three, four, five.', 'start': 4115.256, 'duration': 3.482}, {'end': 4125.126, 'text': 'Now, my question is that what is the percentage of numbers that are 
odd?', 'start': 4119.377, 'duration': 5.749}, {'end': 4127.89, 'text': 'So how do you basically apply a formula over here?', 'start': 4125.466, 'duration': 2.424}, {'end': 4135.841, 'text': 'So I can basically say percentage is equal to number of numbers that are odd, divided by total numbers.', 'start': 4128.371, 'duration': 7.47}, {'end': 4139.176, 'text': 'So if I really try to calculate how many numbers are odd, 1, 2, 3.', 'start': 4136.254, 'duration': 2.922}, {'end': 4145.706, 'text': 'So 3 divided by 5 is nothing but how much? 0.6, which is nothing but 60 percentage.', 'start': 4139.178, 'duration': 6.528}], 'summary': 'Explains variance, standard deviation, and distribution spread in data analysis using numerical examples and graphical representations.', 'duration': 403.608, 'max_score': 3742.098, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs3742098.jpg'}, {'end': 4044.5, 'src': 'embed', 'start': 4015.362, 'weight': 2, 'content': [{'end': 4020.485, 'text': "And based on the standard deviation and variance, you'll be able to decide two important things.", 'start': 4015.362, 'duration': 5.123}, {'end': 4024.667, 'text': "With the help of variance, definitely you'll be able to understand how the data is spread.", 'start': 4020.585, 'duration': 4.082}, {'end': 4032.332, 'text': 'And with standard deviation, you will be able to understand that between one standard deviation to the right and the left,', 'start': 4025.567, 'duration': 6.765}, {'end': 4034.914, 'text': 'what may be the range of data that may be falling.', 'start': 4032.332, 'duration': 2.582}, {'end': 4039.297, 'text': 'So standard deviation is nothing but it is the square root of variance.', 'start': 4035.394, 'duration': 3.903}, {'end': 4044.5, 'text': 'That basically means from the mean, right, how far an element can be.', 'start': 4039.977, 'duration': 4.523}], 'summary': 'Variance helps understand data spread; standard deviation shows 
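The "how far is a value from the mean, measured in standard deviations" idea from the walkthrough above is the z-score. Using the lecture's numbers (mean 2.83, variance 1.81), a minimal sketch:

```python
from math import sqrt

mean_ = 2.83        # mean from the walkthrough above
sd = sqrt(1.81)     # standard deviation = root of variance, ~1.345

def sds_from_mean(x):
    """Distance of x from the mean, in units of standard deviation (a z-score)."""
    return (x - mean_) / sd

one_sd_right = mean_ + sd   # ~4.17: right edge of the first standard deviation
one_sd_left = mean_ - sd    # ~1.48, which the lecture rounds to 1.49
z = sds_from_mean(5)        # ~1.61: the value 5 sits roughly 1.5 sd to the right
```

So saying "5 falls about 1.5 standard deviations from the mean" is just this ratio, rounded.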
range around mean.', 'duration': 29.138, 'max_score': 4015.362, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs4015362.jpg'}, {'end': 4250.666, 'src': 'embed', 'start': 4215.508, 'weight': 5, 'content': [{'end': 4217.769, 'text': 'So this is the definition of percentile.', 'start': 4215.508, 'duration': 2.261}, {'end': 4223.733, 'text': 'It is basically saying, it is a value, if I say, okay, this number is the 25 percentile.', 'start': 4218.53, 'duration': 5.203}, {'end': 4230.296, 'text': 'this basically says that 25 percentage of the entire distribution is less than that particular value.', 'start': 4225.154, 'duration': 5.142}, {'end': 4236.059, 'text': 'so percentile is a value below which a certain percentage of observation rely.', 'start': 4230.296, 'duration': 5.763}, {'end': 4238.94, 'text': 'let me take a very good example and show it to you.', 'start': 4236.059, 'duration': 2.881}, {'end': 4250.666, 'text': 'suppose i have a data set and inside this data set i have elements like 2, comma 2, 3, comma 4, comma 5, comma 5, 6, comma 7, comma 8, comma 8,', 'start': 4238.94, 'duration': 11.726}], 'summary': 'Percentile indicates value below which a certain percentage of observations lie.', 'duration': 35.158, 'max_score': 4215.508, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs4215508.jpg'}, {'end': 4297.614, 'src': 'embed', 'start': 4272.864, 'weight': 6, 'content': [{'end': 4280.087, 'text': 'My question is, What is the percentile ranking of 10? So this is my question.', 'start': 4272.864, 'duration': 7.223}, {'end': 4282.788, 'text': 'We solve this problem by using a simple formula.', 'start': 4280.467, 'duration': 2.321}, {'end': 4294.473, 'text': "I want to find out the percentile rank of 10, right? 
So my formula, let's consider this x is equal to 10.", 'start': 4283.188, 'duration': 11.285}, {'end': 4297.614, 'text': 'Okay, so here I am specifically going to write x.', 'start': 4294.473, 'duration': 3.141}], 'summary': 'Finding percentile rank of 10 using a simple formula.', 'duration': 24.75, 'max_score': 4272.864, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs4272864.jpg'}, {'end': 4536.117, 'src': 'heatmap', 'start': 4336.212, 'weight': 0.856, 'content': [{'end': 4338.533, 'text': 'So how many number of values? X is 10.', 'start': 4336.212, 'duration': 2.321}, {'end': 4343.371, 'text': 'How many number of values are below X? 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16.', 'start': 4338.533, 'duration': 4.838}, {'end': 4345.938, 'text': 'So this will basically become 16.', 'start': 4343.377, 'duration': 2.561}, {'end': 4347.299, 'text': '16 divided by 20 multiplied by 100.', 'start': 4345.938, 'duration': 1.361}, {'end': 4356.944, 'text': 'In short, this will become 16.', 'start': 4347.299, 'duration': 9.645}, {'end': 4365.43, 'text': '4. 
4s are 16, 4, 5s are 1s are 20s are so 80 percentile will basically be my answer for this.', 'start': 4356.944, 'duration': 8.486}, {'end': 4372.554, 'text': 'that basically means, if i really want to find out what this 10 Value percentile is, it is 80..', 'start': 4365.43, 'duration': 7.124}, {'end': 4374.834, 'text': 'Now understand what is the main meaning out of it.', 'start': 4372.554, 'duration': 2.28}, {'end': 4382.037, 'text': 'The main meaning is that 80%, please listen to me very, very carefully.', 'start': 4375.395, 'duration': 6.642}, {'end': 4385.278, 'text': '80% of the entire distribution is less than 10.', 'start': 4382.057, 'duration': 3.221}, {'end': 4388.839, 'text': 'This is the real meaning that you can probably understand from it.', 'start': 4385.278, 'duration': 3.561}, {'end': 4399.702, 'text': "Now quickly, what is the percentile ranking of 11, of value 11? So how many elements are present below 11? I'll say 17.", 'start': 4389.259, 'duration': 10.443}, {'end': 4401.803, 'text': 'Divide by 20 multiplied by 100.', 'start': 4399.702, 'duration': 2.101}, {'end': 4402.064, 'text': '1 za, 5 za, 85%.', 'start': 4401.803, 'duration': 0.261}, {'end': 4404.685, 'text': "Let's do the reverse of this.", 'start': 4402.064, 'duration': 2.621}, {'end': 4419.453, 'text': 'So, from this particular distribution, what value exists at percentile ranking of 25%?', 'start': 4408.188, 'duration': 11.265}, {'end': 4420.833, 'text': 'So how do you calculate this?', 'start': 4419.453, 'duration': 1.38}, {'end': 4423.395, 'text': 'For this you use a very simple formula.', 'start': 4420.953, 'duration': 2.442}, {'end': 4425.715, 'text': 'And the formula is something like this.', 'start': 4424.035, 'duration': 1.68}, {'end': 4434.6, 'text': 'Value is equal to percentile divided by 100, multiplied by n plus 1.', 'start': 4426.076, 'duration': 8.524}, {'end': 4442.104, 'text': "now see, guys, i'm not going to derive the formula why it is n plus 1, why it is n minus 1, 
why it is this for sample variance.", 'start': 4434.6, 'duration': 7.504}, {'end': 4443.886, 'text': "i'll discuss about why n minus 1.", 'start': 4442.104, 'duration': 1.782}, {'end': 4449.549, 'text': 'but understand, we really need to understand what things we are doing and how we are using it in some specific purpose.', 'start': 4443.886, 'duration': 5.663}, {'end': 4455.211, 'text': 'so percentile over here is 25 by 100, multiplied by 21..', 'start': 4449.549, 'duration': 5.662}, {'end': 4459.752, 'text': 'Now understand this, this 5.25 is the index position.', 'start': 4455.211, 'duration': 4.541}, {'end': 4461.713, 'text': 'It is very much important to understand.', 'start': 4460.212, 'duration': 1.501}, {'end': 4464.493, 'text': 'This is not the value, the index position.', 'start': 4461.733, 'duration': 2.76}, {'end': 4468.634, 'text': 'Now I will go and find out which is 5.25.', 'start': 4465.073, 'duration': 3.561}, {'end': 4476.916, 'text': 'So this is my first element, first index, second index, third index, fourth index, fifth index, and 5.25 will be in between this.', 'start': 4468.634, 'duration': 8.282}, {'end': 4482.745, 'text': "But right now, I don't see any element between this.", 'start': 4479.4, 'duration': 3.345}, {'end': 4489.193, 'text': 'So what we do is that we take 5th and 6th index and then we do the average and we calculate the value.', 'start': 4483.165, 'duration': 6.028}, {'end': 4492.277, 'text': 'In this particular case, my answer will be 5.', 'start': 4489.654, 'duration': 2.623}, {'end': 4494.821, 'text': 'So 5 is the value for 25 percentile.', 'start': 4492.277, 'duration': 2.544}, {'end': 4500.026, 'text': 'Try to find out what is 75 percentile.', 'start': 4496.943, 'duration': 3.083}, {'end': 4506.812, 'text': 'So if I use 75 divided by 100 multiplied by 21 15.75 is the index position.', 'start': 4500.286, 'duration': 6.526}, {'end': 4508.373, 'text': 'Now go and count which is 15.75 from the top.', 'start': 4506.872, 
'duration': 1.501}, {'end': 4520.993, 'text': '1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15.', 'start': 4508.393, 'duration': 12.6}, {'end': 4523.146, 'text': 'At index 15.75 we take the average of these two numbers.', 'start': 4521.004, 'duration': 2.142}, {'end': 4524.848, 'text': 'So my answer is 9.', 'start': 4523.567, 'duration': 1.281}, {'end': 4528.01, 'text': '15.75 is the index position.', 'start': 4524.848, 'duration': 3.162}, {'end': 4530.892, 'text': "So here I'm basically getting the 9 answer.", 'start': 4528.47, 'duration': 2.422}, {'end': 4536.117, 'text': "Now let's go and discuss about a new topic, which is called as 5 number summary.", 'start': 4531.293, 'duration': 4.824}], 'summary': '80% of the distribution is less than 10. calculating percentiles and 5 number summary.', 'duration': 199.905, 'max_score': 4336.212, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs4336212.jpg'}, {'end': 4506.812, 'src': 'embed', 'start': 4479.4, 'weight': 7, 'content': [{'end': 4482.745, 'text': "But right now, I don't see any element between this.", 'start': 4479.4, 'duration': 3.345}, {'end': 4489.193, 'text': 'So what we do is that we take 5th and 6th index and then we do the average and we calculate the value.', 'start': 4483.165, 'duration': 6.028}, {'end': 4492.277, 'text': 'In this particular case, my answer will be 5.', 'start': 4489.654, 'duration': 2.623}, {'end': 4494.821, 'text': 'So 5 is the value for 25 percentile.', 'start': 4492.277, 'duration': 2.544}, {'end': 4500.026, 'text': 'Try to find out what is 75 percentile.', 'start': 4496.943, 'duration': 3.083}, {'end': 4506.812, 'text': 'So if I use 75 divided by 100 multiplied by 21, 15.75 is the index position.', 'start': 4500.286, 'duration': 6.526}], 'summary': 'Calculate 25th percentile as 5, try to find 75th percentile at index position 15.75.', 'duration': 27.412, 'max_score': 4479.4, 'thumbnail': 
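Both directions of the percentile calculation above can be sketched in Python. The 20-element dataset below is reconstructed to be consistent with the lecture's counts (16 values below 10, 17 below 11, 25th percentile 5, 75th percentile 9); the exact list in the video may differ:

```python
def percentile_rank(data, x):
    """Percentage of observations strictly below x."""
    return 100 * sum(v < x for v in data) / len(data)

def value_at_percentile(data, p):
    """Value at percentile p via the index = (p/100) * (n + 1) rule;
    a fractional index averages the two surrounding (1-based) positions."""
    s = sorted(data)
    idx = p / 100 * (len(s) + 1)
    lo = int(idx)                    # floor of the 1-based index position
    if lo == idx:
        return s[lo - 1]
    return (s[lo - 1] + s[lo]) / 2   # average of the two neighbouring values

# Reconstructed dataset matching the lecture's counts (illustrative only).
data = [2, 2, 3, 4, 5, 5, 6, 7, 8, 8, 8, 8, 9, 9, 9, 9, 10, 11, 11, 12]

print(percentile_rank(data, 10))      # 80.0 -> 80% of the distribution is below 10
print(percentile_rank(data, 11))      # 85.0
print(value_at_percentile(data, 25))  # 5: index 5.25, average of the 5th and 6th values
print(value_at_percentile(data, 75))  # 9.0: index 15.75, average of the 15th and 16th values
```

Library routines such as `numpy.percentile` use related but not identical interpolation rules, so small differences from this hand formula are expected.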
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs4479400.jpg'}, {'end': 4562.571, 'src': 'embed', 'start': 4536.317, 'weight': 8, 'content': [{'end': 4541.901, 'text': 'In 5 number summary, we need to discuss about something called as, first one is something called as minimum.', 'start': 4536.317, 'duration': 5.584}, {'end': 4547.426, 'text': 'The second topic that we should discuss about is something called as first quartile, which is also denoted by Q1.', 'start': 4542.381, 'duration': 5.045}, {'end': 4552.287, 'text': 'The third topic that we must discuss about is something called as median.', 'start': 4548.706, 'duration': 3.581}, {'end': 4559.15, 'text': 'The fourth topic that we should discuss about third quartile, which is also read as Q3.', 'start': 4553.768, 'duration': 5.382}, {'end': 4562.571, 'text': 'And the fifth topic we basically discuss about maximum.', 'start': 4559.91, 'duration': 2.661}], 'summary': 'Discuss the 5 number summary including minimum, q1, median, q3, and maximum.', 'duration': 26.254, 'max_score': 4536.317, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs4536317.jpg'}, {'end': 4633.324, 'src': 'embed', 'start': 4607.866, 'weight': 9, 'content': [{'end': 4614.57, 'text': 'always understand, guys, whenever we need to remove an outlier, we really need to define a lower fence.', 'start': 4607.866, 'duration': 6.704}, {'end': 4620.694, 'text': "let's consider that i'm going to define a lower fence and then i'm going to define a higher fence.", 'start': 4614.57, 'duration': 6.124}, {'end': 4626.358, 'text': 'the values that you have over here will be between lower fence to higher fence.', 'start': 4620.694, 'duration': 5.664}, {'end': 4633.324, 'text': 'That basically means after a greater number, all the numbers above that number will be an outlier.', 'start': 4626.398, 'duration': 6.926}], 'summary': 'Define lower and higher fences to 
remove outliers.', 'duration': 25.458, 'max_score': 4607.866, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs4607866.jpg'}, {'end': 4801.145, 'src': 'embed', 'start': 4770.302, 'weight': 10, 'content': [{'end': 4772.422, 'text': 'You will get the 15th index for Q3.', 'start': 4770.302, 'duration': 2.12}, {'end': 4775.504, 'text': 'So you are basically going to get 7.', 'start': 4773.143, 'duration': 2.361}, {'end': 4782.75, 'text': 'Now if I go and compute the interquartile range, what is interquartile range? 7 minus 3, which is nothing but 4.', 'start': 4775.504, 'duration': 7.246}, {'end': 4784.672, 'text': 'Now you have calculated the IQR.', 'start': 4782.75, 'duration': 1.922}, {'end': 4789.676, 'text': 'So what all things we have calculated? The IQR, Q3, Q1, everything has been computed.', 'start': 4784.732, 'duration': 4.944}, {'end': 4792.618, 'text': "Now let's go ahead and compute the lower fence.", 'start': 4790.196, 'duration': 2.422}, {'end': 4801.145, 'text': 'Now the lower fence basically say Q1 minus 1.5 multiplied by IQR, right? This is what lower fence formula is.', 'start': 4793.139, 'duration': 8.006}], 'summary': 'Calculated q3, iqr, q1, and lower fence with specific values.', 'duration': 30.843, 'max_score': 4770.302, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs4770302.jpg'}, {'end': 4893.358, 'src': 'embed', 'start': 4861.134, 'weight': 11, 'content': [{'end': 4864.617, 'text': 'Anything lesser than minus 3 is considered as an outlier.', 'start': 4861.134, 'duration': 3.483}, {'end': 4868.901, 'text': 'So which number should we remove? We should remove 27.', 'start': 4865.078, 'duration': 3.823}, {'end': 4872.624, 'text': 'Why? 
27 is greater than 13, which is from the higher fence.', 'start': 4868.901, 'duration': 3.723}, {'end': 4876.347, 'text': 'Now let me write the distributions once again for all of you.', 'start': 4873.024, 'duration': 3.323}, {'end': 4879.619, 'text': 'Let me write the distribution after removing the 27.', 'start': 4876.747, 'duration': 2.872}, {'end': 4893.358, 'text': 'So the remaining data what I have 1, 2, 2, 3, 3, 4, 5, 5, 5, 6, 6, 6, 6, 7, 8, 8, 9.', 'start': 4879.619, 'duration': 13.739}], 'summary': 'An outlier was identified as 27, greater than the higher fence of 13, leading to its removal from the distribution.', 'duration': 32.224, 'max_score': 4861.134, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs4861134.jpg'}, {'end': 4956.956, 'src': 'embed', 'start': 4922.054, 'weight': 12, 'content': [{'end': 4925.056, 'text': 'So here you are getting your 5 number summary.', 'start': 4922.054, 'duration': 3.002}, {'end': 4927.777, 'text': 'Now quickly compute median and tell me.', 'start': 4925.536, 'duration': 2.241}, {'end': 4930.958, 'text': 'What is median? Median is nothing but 5.', 'start': 4928.157, 'duration': 2.801}, {'end': 4935.541, 'text': "Now let's draw a plot which is called as box plot.", 'start': 4930.958, 'duration': 4.583}, {'end': 4940.809, 'text': 'By this specific data, you can definitely draw a box plot.', 'start': 4936.607, 'duration': 4.202}, {'end': 4946.591, 'text': 'Now how does a box plot basically get drawn? 
So you will be having x-axis.', 'start': 4941.569, 'duration': 5.022}, {'end': 4956.956, 'text': "And let's consider that in this particular x-axis, you have values like minus 2, 0, 2, 4, 6, 8, 10.", 'start': 4947.512, 'duration': 9.444}], 'summary': 'Compute median as 5 and draw a box plot with specific data.', 'duration': 34.902, 'max_score': 4922.054, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs4922054.jpg'}], 'start': 3322.152, 'title': 'Descriptive statistics', 'summary': 'Covers handling missing values using mean, median, and mode, emphasizing mode for categorical variables and mean for numerical variables. it also discusses variance, standard deviation as measures of dispersion, explains percentiles, and introduces the five number summary and outlier removal process.', 'chapters': [{'end': 3425.5, 'start': 3322.152, 'title': 'Handling missing values using mean, median, and mode', 'summary': 'Illustrates the use of mean, median, and mode to handle missing values, emphasizing the use of mode for replacing missing values in categorical variables and mean for numerical variables, with emphasis on domain knowledge.', 'duration': 103.348, 'highlights': ['Using mode to replace missing values works well with categorical variables as it represents the most frequent element and is specifically suited for this data type.', 'Applying mean for numerical variables, such as ages, is suitable when the values are within a specific range, and domain knowledge plays a crucial role in deciding the appropriate method.', 'The example of handling missing values in a dataset of student ages demonstrates the choice of mean due to the range of ages within a specific value, highlighting the consideration of domain knowledge in making the decision.']}, {'end': 4083.022, 'start': 3425.981, 'title': 'Measure of dispersion: variance and standard deviation', 'summary': 'Discusses the concepts of variance and standard deviation as 
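The whole fence-and-outlier procedure, Q1, Q3, IQR, lower fence Q1 - 1.5*IQR, upper fence Q3 + 1.5*IQR, can be sketched end to end. The dataset below is reconstructed so that it reproduces the lecture's Q1 = 3, Q3 = 7, fences -3 and 13, and outlier 27; the exact list in the video may differ slightly:

```python
def value_at_percentile(data, p):
    """Percentile via index = (p/100) * (n + 1); a fractional index averages neighbours."""
    s = sorted(data)
    idx = p / 100 * (len(s) + 1)
    lo = int(idx)
    return s[lo - 1] if lo == idx else (s[lo - 1] + s[lo]) / 2

# Reconstructed dataset consistent with the lecture's quartiles and outlier.
data = [1, 2, 2, 3, 3, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 8, 8, 9, 27]

q1 = value_at_percentile(data, 25)    # 3
q3 = value_at_percentile(data, 75)    # 7
iqr = q3 - q1                         # 4
lower_fence = q1 - 1.5 * iqr          # -3.0
upper_fence = q3 + 1.5 * iqr          # 13.0

outliers = [x for x in data if x < lower_fence or x > upper_fence]   # [27]
cleaned = [x for x in data if lower_fence <= x <= upper_fence]

# Five number summary of the cleaned data: min, Q1, median, Q3, max.
five = (min(cleaned), q1, value_at_percentile(cleaned, 50), q3, max(cleaned))
print(five)   # (1, 3, 5.0, 7, 9) -- exactly the anchors a box plot is drawn from
```

`matplotlib.pyplot.boxplot(cleaned)` draws the box from Q1 to Q3 with the median line inside and whiskers out to the fences, which is why the box plot is a quick visual outlier check.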
measures of dispersion, highlighting their formulas and significance in understanding data spread and distribution.', 'duration': 657.041, 'highlights': ['Variance and standard deviation are key topics in measure of dispersion, essential for understanding data spread and differences in distributions. The chapter emphasizes the importance of variance and standard deviation in understanding the spread and differences in distributions.', "The formulas for population and sample variance are provided, clarifying the distinction and significance of the 'n' and 'n-1' terms in the equations. The chapter explains the formulas for population and sample variance, highlighting the significance of 'n' and 'n-1' in the equations.", 'A practical example is used to demonstrate the calculation of variance and standard deviation for a specific dataset, emphasizing the quantitative analysis of data spread. A practical example is used to demonstrate the quantitative analysis of data spread through the calculation of variance and standard deviation for a specific dataset.', 'The concept of standard deviation is explained as a measure of how far an element can be from the mean, with relevance to data distribution and representation. The chapter explains the concept of standard deviation as a measure of how far an element can be from the mean, emphasizing its relevance to data distribution and representation.']}, {'end': 4536.117, 'start': 4083.362, 'title': 'Understanding percentiles and quartiles', 'summary': 'Explains the concept of percentiles, including their definition and calculation, using examples and formulas. it also demonstrates how to find the percentile ranking of specific values and introduces the 5 number summary.', 'duration': 452.755, 'highlights': ['Percentile is a value below which a certain percentage of observations lie, calculated using the formula (number of values below x / sample size) * 100. 
The definition and calculation of percentiles are thoroughly explained, emphasizing the importance of understanding the formula and using it to derive the percentile ranking of specific values.', 'Demonstrates the calculation of percentile ranking for the value 10, yielding a result of 80%, indicating that 80% of the entire distribution is less than 10. A detailed example of calculating the percentile ranking for the value 10 is provided, emphasizing the interpretation of the result in the context of the distribution.', 'Illustrates the calculation of the value at the 25th percentile using the formula (percentile / 100) * (sample size + 1), followed by finding the average of the values at the calculated index positions. The process of calculating the value at the 25th percentile is demonstrated, emphasizing the step-by-step application of the formula and the use of averaging for index positions.', 'Introduces the concept of the 5 number summary, indicating a forthcoming discussion on this topic. The chapter introduces the 5 number summary as a new topic, setting the stage for further exploration of this concept.']}, {'end': 5002.628, 'start': 4536.317, 'title': 'Five number summary & outlier removal', 'summary': 'Covers the five number summary (minimum, q1, median, q3, maximum) and the process of removing outliers using lower and upper fences, with the formula for iqr (interquartile range) and box plot representation.', 'duration': 466.311, 'highlights': ['The process of removing outliers involves defining lower and upper fences, where values outside this range are considered outliers. The chapter emphasizes the importance of defining lower and upper fences to identify and remove outliers from a dataset.', 'The formula for IQR (interquartile range) is given by Q3 minus Q1, with Q1 and Q3 representing the 25th and 75th percentiles, respectively. 
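The two spoken formulas above — percentile rank as (number of values below x / sample size) × 100, and the value at a percentile via the (percentile / 100) × (n + 1) position rule with averaging of neighbouring values — can be sketched in Python. The `marks` list below is illustrative, not the dataset used in the video:

```python
def percentile_rank(data, x):
    """Percentage of observations strictly below x:
    (number of values below x / sample size) * 100."""
    data = sorted(data)
    below = sum(1 for v in data if v < x)
    return below / len(data) * 100

def percentile_value(data, p):
    """Value at the p-th percentile via the (p / 100) * (n + 1) position rule,
    averaging the two neighbouring values when the position is fractional."""
    data = sorted(data)
    pos = (p / 100) * (len(data) + 1)    # 1-based position in the sorted data
    lo = int(pos)
    if lo < 1:
        return data[0]
    if lo >= len(data):
        return data[-1]
    if pos == lo:                        # exact integer position
        return data[lo - 1]
    return (data[lo - 1] + data[lo]) / 2 # average of the two neighbours

marks = [2, 2, 3, 4, 5, 5, 5, 6, 7, 8, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12]
print(percentile_rank(marks, 10))    # % of this sample lying below 10 -> 70.0
print(percentile_value(marks, 25))   # value at the 25th percentile -> 5.0
```

With this made-up sample the rank of 10 comes out to 70%, not the 80% quoted for the video's own (different) dataset.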
The formula for IQR (interquartile range) is explained as Q3 minus Q1, where Q1 and Q3 correspond to the 25th and 75th percentiles, respectively.', 'The 5 number summary for the given dataset is minimum: 1, Q1: 3, median: 5, Q3: 7, maximum: 9, after removing the outlier 27. The 5 number summary (minimum, Q1, median, Q3, maximum) is provided for the dataset, with the outlier 27 removed.', 'The process of drawing a box plot involves plotting the minimum, Q1, median, Q3, and maximum on the x-axis and joining the respective values to form the box plot representation. The chapter explains the process of drawing a box plot by plotting the minimum, Q1, median, Q3, and maximum on the x-axis and joining the respective values to form the box plot representation.', 'Calculation of the lower fence is performed using the formula Q1 minus 1.5 multiplied by IQR, while the upper fence is calculated as Q3 plus 1.5 multiplied by IQR. The calculation of the lower fence and upper fence is explained, with the lower fence formula as Q1 minus 1.5 multiplied by IQR and the upper fence formula as Q3 plus 1.5 multiplied by IQR.']}], 'duration': 1680.476, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs3322152.jpg', 'highlights': ['Using mode for categorical variables is specifically suited for this data type.', 'Applying mean for numerical variables is suitable when values are within a specific range.', 'Variance and standard deviation are essential for understanding data spread and differences in distributions.', "The formulas for population and sample variance are provided, clarifying the distinction and significance of 'n' and 'n-1' terms.", 'A practical example is used to demonstrate the calculation of variance and standard deviation for a specific dataset.', 'Percentile is a value below which a certain percentage of observations lie, calculated using a specific formula.', 'Demonstrates the calculation of percentile ranking for 
the value 10, yielding a result of 80%.', 'Illustrates the calculation of the value at the 25th percentile using a specific formula, followed by finding the average of the values at the calculated index positions.', 'Introduces the concept of the 5 number summary, indicating a forthcoming discussion on this topic.', 'The process of removing outliers involves defining lower and upper fences to identify and remove outliers from a dataset.', 'The formula for IQR (interquartile range) is given by Q3 minus Q1, with Q1 and Q3 representing the 25th and 75th percentiles, respectively.', 'The 5 number summary for the given dataset is provided, with the outlier 27 removed.', 'The process of drawing a box plot is explained by plotting the minimum, Q1, median, Q3, and maximum on the x-axis and joining the respective values to form the box plot representation.', 'Calculation of the lower fence and upper fence is explained using specific formulas.']}, {'end': 6467.67, 'segs': [{'end': 5102.2, 'src': 'embed', 'start': 5078.441, 'weight': 2, 'content': [{'end': 5085.748, 'text': 'Box plots can be used to determine outliers, because I told you that if I was giving 27 over here, my element would have come over here.', 'start': 5078.441, 'duration': 7.307}, {'end': 5092.435, 'text': 'So box plot actually gives you a visualization way to basically see where an outlier is actually present.', 'start': 5086.148, 'duration': 6.287}, {'end': 5099.758, 'text': 'if someone asks you how do you create or how do you determine an outlier, you can explain this entire concepts,', 'start': 5093.095, 'duration': 6.663}, {'end': 5102.2, 'text': 'whatever i have explained with respect to percentiles.', 'start': 5099.758, 'duration': 2.442}], 'summary': 'Box plots help identify outliers by visualizing data distribution and percentiles.', 'duration': 23.759, 'max_score': 5078.441, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs5078441.jpg'}, 
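The fence-based outlier removal and five-number summary described above can be reproduced with NumPy. The dataset below is a guess in the spirit of the lecture's example (the values 1 through 9 plus the outlier 27); note that NumPy's default percentile interpolation differs slightly from the (n + 1) hand method, but the fences still flag 27, and the cleaned data gives the same 1, 3, 5, 7, 9 summary quoted above:

```python
import numpy as np

# illustrative data in the spirit of the lecture's example; 27 is the outlier
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 27])

q1, q3 = np.percentile(data, [25, 75])   # NumPy's default (linear) method
iqr = q3 - q1                            # interquartile range = Q3 - Q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = data[(data < lower_fence) | (data > upper_fence)]
cleaned = data[(data >= lower_fence) & (data <= upper_fence)]

# five-number summary of the cleaned data: min, Q1, median, Q3, max
summary = (cleaned.min(), *np.percentile(cleaned, [25, 50, 75]), cleaned.max())
print(outliers)   # the fence test flags 27
print(summary)    # (1, 3.0, 5.0, 7.0, 9)
```

A box plot of `cleaned` (e.g. `plt.boxplot(cleaned)`) would then show exactly these five values as the whiskers and box edges.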
{'end': 5425.841, 'src': 'embed', 'start': 5397.258, 'weight': 3, 'content': [{'end': 5403.705, 'text': 'and one important property of this bell curve is that this side is exactly symmetrical to this side.', 'start': 5397.258, 'duration': 6.447}, {'end': 5409.991, 'text': 'So there are many inferential statistics that we will probably be discussing about in the future.', 'start': 5404.045, 'duration': 5.946}, {'end': 5417.316, 'text': 'about this bell curve, about this entire distribution or Gaussian distribution, here you can see that, it is exactly similar.', 'start': 5410.772, 'duration': 6.544}, {'end': 5419.757, 'text': 'I mean it is exactly symmetrical.', 'start': 5417.696, 'duration': 2.061}, {'end': 5425.841, 'text': 'The right part of the curve, when I say consider this particular part, is equal to this part.', 'start': 5420.338, 'duration': 5.503}], 'summary': 'Bell curve is symmetrical, representing a gaussian distribution.', 'duration': 28.583, 'max_score': 5397.258, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs5397258.jpg'}, {'end': 5682.526, 'src': 'embed', 'start': 5651.465, 'weight': 0, 'content': [{'end': 5661.469, 'text': 'within the two standard deviation region, which is this specific region, around 95% of the entire data lies in this region.', 'start': 5651.465, 'duration': 10.004}, {'end': 5667.752, 'text': 'And similarly, if I go and consider with respect to the third standard deviation, which is from here to here,', 'start': 5661.95, 'duration': 5.802}, {'end': 5675.983, 'text': 'around 99.7% of the entire distribution will fall in this region.', 'start': 5668.76, 'duration': 7.223}, {'end': 5682.526, 'text': 'So that is the reason why it is basically called the 68, 95 and 99.7 percentile rule.', 'start': 5676.283, 'duration': 6.243}], 'summary': 'Within 2 standard deviations, 95% of data lies; within 3 standard deviations, 99.7% lies; known as the 68, 95, and 99.7 rule.', 'duration': 31.061, 'max_score': 5651.465, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs5651465.jpg'}, {'end': 5914.234, 'src': 'heatmap', 'start': 5714.83, 'weight': 0.856, 'content': [{'end': 5716.952, 'text': 'The domain expert is basically saying it.', 'start': 5714.83, 'duration': 2.122}, {'end': 5723.738, 'text': 'Now who is the domain expert in this particular case? In this particular case, the domain expert is a doctor.', 'start': 5717.352, 'duration': 6.386}, {'end': 5733.266, 'text': 'The doctor has taken various samples from different places, and whenever the doctor was constructing this bell curve,', 'start': 5723.998, 'duration': 9.268}, {'end': 5734.727, 'text': 'it was forming something like this.', 'start': 5733.266, 'duration': 1.461}, {'end': 5738.451, 'text': 'And from that he was able to understand, he was able to derive.', 'start': 5735.248, 'duration': 3.203}, {'end': 5745.455, 'text': 'He or she was able to derive that within the first standard deviation, how much data is basically falling, within the second standard deviation,', 'start': 5739.391, 'duration': 6.064}, {'end': 5748.597, 'text': 'how much data is falling, and within the third standard deviation, how much data is falling?', 'start': 5745.455, 'duration': 3.142}, {'end': 5753.761, 'text': 'Second example, if you consider weight, weight will also follow a Gaussian distribution.', 'start': 5748.757, 'duration': 5.004}, {'end': 5757.143, 'text': 'Third, I hope everybody knows about the Iris dataset.', 'start': 5754.221, 'duration': 2.922}, {'end': 5764.41, 'text': 'In the Iris dataset, if you talk about petal length, sepal length, it actually follows a Gaussian distribution.', 'start': 5758.268, 'duration': 6.142}, {'end': 5765.71, 'text': 'I will show you practically.', 'start': 5764.57, 'duration': 1.14}, {'end': 5766.691, 'text': "Don't worry about that.", 'start': 5765.77,
'duration': 0.921}, {'end': 5771.212, 'text': 'Does that, following the empirical rule, necessarily imply that it is normally distributed?', 'start': 5767.131, 'duration': 4.081}, {'end': 5777.354, 'text': 'See, whenever you have Gaussian distributed data, at that time it will follow this 68, 95, 99.7 percentile rule.', 'start': 5771.572, 'duration': 5.782}, {'end': 5783.597, 'text': 'So this was the thing with respect to Gaussian or normally distributed data.', 'start': 5779.875, 'duration': 3.722}, {'end': 5785.698, 'text': "Now let's go ahead and try to see this.", 'start': 5783.777, 'duration': 1.921}, {'end': 5787.118, 'text': "Let's take an example.", 'start': 5785.798, 'duration': 1.32}, {'end': 5792.661, 'text': 'Suppose I have a data set where my mean is 4 and my standard deviation is 1.', 'start': 5787.459, 'duration': 5.202}, {'end': 5796.703, 'text': 'If I have these two pieces of information, can I construct a distribution?', 'start': 5792.661, 'duration': 4.042}, {'end': 5799.964, 'text': 'Suppose this is 4, then in the next step, what will come?', 'start': 5797.103, 'duration': 2.861}, {'end': 5800.464, 'text': '5, 6, 7, 8, right?', 'start': 5799.984, 'duration': 0.48}, {'end': 5809.389, 'text': 'And then 3, 2, 1 and 0.', 'start': 5800.484, 'duration': 8.905}, {'end': 5811.57, 'text': 'So I will be able to create this.', 'start': 5809.389, 'duration': 2.181}, {'end': 5815.873, 'text': "And let's consider that this is basically following this kind of distribution.", 'start': 5811.75, 'duration': 4.123}, {'end': 5818.615, 'text': 'So this basically follows this kind of distribution.', 'start': 5816.013, 'duration': 2.602}, {'end': 5823.378, 'text': 'Now understand, this middle one is basically your mean and standard deviation.', 'start': 5818.895, 'duration': 4.483}, {'end': 5826.82, 'text': 'Sorry, the mean is 4 and the standard deviation is 1.', 'start': 5823.478, 'duration': 3.342}, {'end': 5828.222, 'text': 'Now see one thing, guys.', 'start': 5826.82, 'duration': 1.402}, {'end': 5836.489, 'text': 'if I talk about 4.5, my question is that where does 4.5 fall in terms of standard deviation?', 'start': 5828.222, 'duration': 8.267}, {'end': 5839.812, 'text': 'So you may be thinking okay, 4.5, where exactly is it?', 'start': 5836.709, 'duration': 3.103}, {'end': 5840.712, 'text': 'It is somewhere here.', 'start': 5839.872, 'duration': 0.84}, {'end': 5851.505, 'text': 'Obviously, when I say 5 is one standard deviation to the right, that basically means 4.5 will be plus 0.5 standard deviations to the right.', 'start': 5841.393, 'duration': 10.112}, {'end': 5853.506, 'text': 'Understand, 0.5 standard deviation.', 'start': 5851.826, 'duration': 1.68}, {'end': 5857.108, 'text': 'If you say 1 standard deviation, it is basically coming to 5.', 'start': 5853.626, 'duration': 3.482}, {'end': 5858.908, 'text': 'It is 0.5 standard deviation.', 'start': 5857.108, 'duration': 1.8}, {'end': 5869.219, 'text': 'Now similarly if I say, where does 4.75 fall? Then how will you be able to see it? 
See, the standard deviation was 1.', 'start': 5859.168, 'duration': 10.051}, {'end': 5870.599, 'text': 'I told 4.5.', 'start': 5869.219, 'duration': 1.38}, {'end': 5874.78, 'text': 'So, 4.5 will be something falling over here and this is like 0.5 standard deviation.', 'start': 5870.599, 'duration': 4.181}, {'end': 5881.122, 'text': 'But in the case of 4.75, it will be very much difficult for you to do the calculation.', 'start': 5875.38, 'duration': 5.742}, {'end': 5885.603, 'text': 'So, that is the reason what we can do is that we can use a concept which is called as Z-score.', 'start': 5881.422, 'duration': 4.181}, {'end': 5896.033, 'text': 'Now z-score will basically help you find out whenever I talk about a value, how much standard deviation away it is from the mean.', 'start': 5886.343, 'duration': 9.69}, {'end': 5902.058, 'text': 'So this formula is x of i minus mu divided by standard deviation.', 'start': 5896.533, 'duration': 5.525}, {'end': 5903.9, 'text': 'Now I need to find out for 4.75.', 'start': 5902.419, 'duration': 1.481}, {'end': 5911.571, 'text': 'I will just write 4.75 minus mu is what? 
Mu is 4.', 'start': 5903.9, 'duration': 7.671}, {'end': 5914.234, 'text': '4 divided by standard deviation is 1.', 'start': 5911.571, 'duration': 2.663}], 'summary': 'A doctor uses gaussian distribution to analyze data, including weight and iris dataset, and explains z-scores for standard deviation calculations.', 'duration': 199.404, 'max_score': 5714.83, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs5714830.jpg'}, {'end': 6417.167, 'src': 'embed', 'start': 6387.654, 'weight': 1, 'content': [{'end': 6393.253, 'text': 'If I want to probably shift this between minus 1 to plus 1, I can basically apply this.', 'start': 6387.654, 'duration': 5.599}, {'end': 6401.297, 'text': 'So normalization gives you a process where you can basically define the lower bound and upper bound and you can convert your data between them.', 'start': 6393.653, 'duration': 7.644}, {'end': 6406.561, 'text': 'Now very important thing, where do we use normalization? 
I hope everybody knows about deep learning.', 'start': 6401.518, 'duration': 5.043}, {'end': 6408.201, 'text': 'In CNN.', 'start': 6406.621, 'duration': 1.58}, {'end': 6417.167, 'text': 'whenever you are doing image training, image classification or object detection, in this particular case, understand every image has pixels.', 'start': 6408.201, 'duration': 8.966}], 'summary': 'Normalization allows data conversion between defined bounds for deep learning applications like cnn image training.', 'duration': 29.513, 'max_score': 6387.654, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs6387654.jpg'}], 'start': 5002.809, 'title': 'Data visualization and normal distribution in ml', 'summary': 'Covers data visualization using box plots, outlier detection, and variance, understanding normal distribution, z scores, and histograms, explaining the 68, 95, 99.7% rule, and the practical application of standardization and normalization in machine learning, especially in deep learning for image training and classification.', 'chapters': [{'end': 5099.758, 'start': 5002.809, 'title': 'Understanding data visualization techniques', 'summary': 'Introduces the concept of using box plots for visualizing data, including the calculation of quartiles, removal of outliers, and the application of variance, providing insights into data visualization techniques and outlier detection.', 'duration': 96.949, 'highlights': ['Box plots can be used to determine outliers, providing a visualization to identify the presence of outliers in a dataset.', 'The technique of removing an outlier involves defining lower and upper fences, and utilizing the interquartile range (IQR) for outlier detection.', "The calculation of sample variance uses the formula: the summation from i = 1 to n of (x of i minus x bar) squared, divided by n minus 1, where the n minus 1 term is known as Bessel's correction and reflects the degrees of freedom.", 'Understanding the concept of 
quartiles, including Q1 (first quartile), median, Q3 (third quartile), and the application of these values in creating box plots for data visualization.']}, {'end': 5573.866, 'start': 5099.758, 'title': 'Understanding distributions and normal distribution', 'summary': 'Provides an in-depth explanation of distributions, with a focus on gaussian or normal distribution, z scores, and the empirical formula, emphasizing the importance of visualizing data through histograms and understanding the symmetrical nature of the bell curve, and the significance of standard deviation in deriving conclusions.', 'duration': 474.108, 'highlights': ['The chapter emphasizes the importance of visualizing data through histograms and understanding the symmetrical nature of the bell curve. ', 'The significance of standard deviation in deriving conclusions is highlighted, with a specific focus on the empirical formula. ', 'Explanation of Z scores and their application in understanding data distribution and the concept of standard normal distribution. ']}, {'end': 6200.77, 'start': 5574.066, 'title': 'Understanding 68, 95, 99.7% rule and z-score', 'summary': 'Explains the 68, 95, 99.7% rule, indicating the percentage of data within 1, 2, and 3 standard deviations, and illustrates the use of z-score to find the standard deviation of elements, transforming a normal distribution into a standard normal distribution.', 'duration': 626.704, 'highlights': ['The 68, 95, 99.7% rule explains the percentage of data within 1, 2, and 3 standard deviations, with 68% within the first standard deviation, 95% within the second, and 99.7% within the third, exemplified by distribution data and its relevance to Gaussian or normally distributed data. 
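The 68/95/99.7 rule and the z-score transformation described in these chapters are easy to check empirically. The synthetic `heights` sample below is illustrative (any normally distributed data would do):

```python
import numpy as np

rng = np.random.default_rng(0)
heights = rng.normal(loc=170, scale=10, size=100_000)  # illustrative sample

# z-score standardization: subtract the mean, divide by the standard deviation;
# the result is a standard normal distribution with mean 0 and sd 1
z = (heights - heights.mean()) / heights.std()

# fraction of the data within 1, 2 and 3 standard deviations of the mean
within = {k: np.mean(np.abs(z) <= k) for k in (1, 2, 3)}
for k, frac in within.items():
    print(f"within {k} sd: {frac:.1%}")   # roughly 68%, 95%, 99.7%
```

The printed fractions land very close to the empirical-rule values precisely because the sample was drawn from a Gaussian; skewed data would not obey the rule.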
68% of the data lies within the first standard deviation, 95% within the second, and 99.7% within the third standard deviation, illustrated using distribution data and its association with Gaussian or normally distributed data.', 'The use of Z-score is demonstrated to find the standard deviation of elements, transforming a normal distribution into a standard normal distribution with a mean of 0 and standard deviation of 1. Z-score is used to find the standard deviation of elements, transforming a normal distribution into a standard normal distribution with a mean of 0 and standard deviation of 1.']}, {'end': 6467.67, 'start': 6200.951, 'title': 'Standardization and normalization in ml', 'summary': 'Discusses the importance of standardization and normalization in machine learning, emphasizing the process of converting data into standard normal distribution and defining the difference between standardization and normalization. it also highlights the practical application of normalization in deep learning, specifically in image training and classification.', 'duration': 266.719, 'highlights': ['The process of converting data into standard normal distribution by applying Z-score and standardizing mean to 0 and standard deviation to 1 is highlighted as a crucial step in standardization in machine learning. Conversion of data into standard normal distribution, application of Z-score, standardizing mean to 0 and standard deviation to 1', 'The difference between standardization and normalization is explained, with emphasis on the mean and standard deviation for standardization and the optional range shift for normalization. 
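The distinction drawn here between standardization and normalization can be shown in a few lines; the feature and pixel arrays below are made-up examples:

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])   # made-up feature values

# standardization (z-score): result has mean 0 and standard deviation 1
standardized = (x - x.mean()) / x.std()

# normalization (min-max): squeeze values into a chosen range, here [0, 1]
normalized = (x - x.min()) / (x.max() - x.min())
# the optional range shift mentioned above, e.g. to [-1, 1]
shifted = normalized * 2 - 1

# the image case from the lecture: pixel values 0..255 scaled to 0..1
pixels = np.array([0.0, 64.0, 128.0, 255.0])
scaled = pixels / 255.0

print(standardized.mean(), standardized.std())  # ~0.0 and 1.0
print(normalized.min(), normalized.max())       # 0.0 and 1.0
```

In scikit-learn the same two operations are usually done with `StandardScaler` and `MinMaxScaler`; the plain-NumPy form above just makes the formulas explicit.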
Explanation of the difference between standardization and normalization, emphasis on mean and standard deviation for standardization, optional range shift for normalization', 'The practical application of normalization in deep learning, specifically in image training and classification, is highlighted, with a focus on using a min-max scaler to convert pixel values from 0 to 255 to a range between 0 and 1. Practical application of normalization in deep learning, usage in image training and classification, conversion of pixel values using a min-max scaler']}], 'duration': 1464.861, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs5002809.jpg', 'highlights': ['Understanding the 68, 95, 99.7% rule for data distribution within 1, 2, and 3 standard deviations.', 'The practical application of standardization and normalization in machine learning, especially in deep learning for image training and classification.', 'Box plots for outlier detection and visualization of data distribution.', 'The importance of visualizing data through histograms and understanding the symmetrical nature of the bell curve.']}, {'end': 7781.486, 'segs': [{'end': 6495.235, 'src': 'embed', 'start': 6467.67, 'weight': 0, 'content': [{'end': 6474.273, 'text': 'So when we do this specific division by 255, all your values will be getting changed to between 0 and 1.', 'start': 6467.67, 'duration': 6.603}, {'end': 6478.015, 'text': 'And this is another type of normalization process.', 'start': 6474.273, 'duration': 3.742}, {'end': 6484.799, 'text': 'So till here we have discussed the min-max scaler, we have discussed normalization, standardization.', 'start': 6478.736, 'duration': 6.063}, {'end': 6490.472, 'text': 'Now let us solve one practical example for the Z score.', 'start': 6485.319, 'duration': 5.153}, {'end': 6495.235, 'text': 'Okay, recently India versus South Africa, where India lost it obviously.', 'start': 6491.092, 'duration': 
4.143}], 'summary': 'Data values are divided by 255, resulting in a range of 0 to 1. discussion includes min-max scalar, normalization, and standardization. practical example of z score is presented.', 'duration': 27.565, 'max_score': 6467.67, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs6467670.jpg'}, {'end': 6560.583, 'src': 'embed', 'start': 6527.57, 'weight': 1, 'content': [{'end': 6530.751, 'text': 'Now similarly, I have a data for 2020 series.', 'start': 6527.57, 'duration': 3.181}, {'end': 6539.568, 'text': "Let's say the series average In 2020, let's say that the series average is a little bit different.", 'start': 6531.451, 'duration': 8.117}, {'end': 6546.633, 'text': 'In 2020, the series average of the team scoring in 2020 was 260.', 'start': 6539.708, 'duration': 6.925}, {'end': 6554.018, 'text': 'The standard deviation of the score of all the matches is 12.', 'start': 6546.633, 'duration': 7.385}, {'end': 6560.583, 'text': "And then over here, probably Rishabh Pant's final score is 68.", 'start': 6554.018, 'duration': 6.565}], 'summary': "In 2020, the team's series average score was 260 with a standard deviation of 12, and rishabh pant's final score was 68.", 'duration': 33.013, 'max_score': 6527.57, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs6527570.jpg'}, {'end': 7097.41, 'src': 'embed', 'start': 7065.096, 'weight': 2, 'content': [{'end': 7066.777, 'text': 'I am basically interested in this region.', 'start': 7065.096, 'duration': 1.681}, {'end': 7074.084, 'text': 'I am saying that what is the percentage of the scores that are greater than 4.25? 
This is my question.', 'start': 7066.798, 'duration': 7.286}, {'end': 7076.686, 'text': 'Okay Simple question is this.', 'start': 7075.165, 'duration': 1.521}, {'end': 7080.97, 'text': 'And now we will try to understand how we can use Z-score in this.', 'start': 7077.307, 'duration': 3.663}, {'end': 7083.653, 'text': 'So everybody knows about Z-score formula.', 'start': 7081.351, 'duration': 2.302}, {'end': 7087.306, 'text': 'x of i minus mu divided by standard deviation.', 'start': 7084.765, 'duration': 2.541}, {'end': 7091.048, 'text': 'Here my mu is 4, standard deviation is 1.', 'start': 7088.046, 'duration': 3.002}, {'end': 7097.41, 'text': 'What is my x of i? x of i is nothing but 4.25 minus 4 divided by 1.', 'start': 7091.048, 'duration': 6.362}], 'summary': 'Analyzing scores to find percentage above 4.25 using z-score.', 'duration': 32.314, 'max_score': 7065.096, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs7065096.jpg'}, {'end': 7332.784, 'src': 'embed', 'start': 7304.379, 'weight': 3, 'content': [{'end': 7309.261, 'text': 'And remember this Z table will be giving me the area of the body curve.', 'start': 7304.379, 'duration': 4.882}, {'end': 7314.019, 'text': 'A Z table shows the area to the right hand side of the curve.', 'start': 7310.814, 'duration': 3.205}, {'end': 7319.146, 'text': 'Use these values to find the area between Z is equal to 0 and any positive value.', 'start': 7314.219, 'duration': 4.927}, {'end': 7325.295, 'text': 'For area in the left table, look at the left tail Z table instead.', 'start': 7319.607, 'duration': 5.688}, {'end': 7332.784, 'text': 'Okay? 
If you want to find out the area in the left tail, search for it guys.', 'start': 7326.737, 'duration': 6.047}], 'summary': 'Use z table to find area under the curve.', 'duration': 28.405, 'max_score': 7304.379, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs7304379.jpg'}, {'end': 7682.104, 'src': 'embed', 'start': 7649.946, 'weight': 4, 'content': [{'end': 7651.147, 'text': 'how do you compute the z score?', 'start': 7649.946, 'duration': 1.201}, {'end': 7653.549, 'text': 'the same example that what we have done over here.', 'start': 7651.147, 'duration': 2.402}, {'end': 7655.971, 'text': 'here, in this particular case, uh 4.25.', 'start': 7653.549, 'duration': 2.422}, {'end': 7659.653, 'text': 'uh Falls over.', 'start': 7655.971, 'duration': 3.682}, {'end': 7662.354, 'text': 'we are just taking IQ lower than 85..', 'start': 7659.653, 'duration': 2.701}, {'end': 7667.937, 'text': 'So what is IQ lower than 85? So it will become 85 minus 100 divided by 15.', 'start': 7662.354, 'duration': 5.583}, {'end': 7671.539, 'text': 'What it is? 
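The z-table lookups walked through here can be reproduced with Python's standard library, where `statistics.NormalDist` plays the role of the table: the CDF gives the left-tail area, and one minus the CDF gives the right-tail area. Below are the two worked examples from this section, P(X > 4.25) for mean 4 and sd 1, and P(IQ < 85) for mean 100 and sd 15:

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, sd 1 -- the standard normal of the z-table

# P(X > 4.25) when mean = 4, sd = 1: z = 0.25, take the right-tail area
z_right = (4.25 - 4) / 1
p_greater = 1 - std_normal.cdf(z_right)

# P(IQ < 85) when mean = 100, sd = 15: z = -1, take the left-tail area
z_left = (85 - 100) / 15
p_lower_iq = std_normal.cdf(z_left)

print(round(p_greater, 4))   # area to the right of z = 0.25, about 0.4013
print(round(p_lower_iq, 4))  # area to the left of z = -1, about 0.1587
```

These match the printed-table values quoted in the lecture to four decimal places.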
Minus 15 by 15, it is minus 1.', 'start': 7667.937, 'duration': 3.602}, {'end': 7672.899, 'text': 'So one standard deviation.', 'start': 7671.539, 'duration': 1.36}, {'end': 7673.64, 'text': 'This is my mean.', 'start': 7672.939, 'duration': 0.701}, {'end': 7677.241, 'text': 'This is my minus 1 standard deviation.', 'start': 7674.66, 'duration': 2.581}, {'end': 7682.104, 'text': 'Now this is the area that I want to find out.', 'start': 7677.962, 'duration': 4.142}], 'summary': 'Compute z-score for iq lower than 85: (85-100)/15 = -1', 'duration': 32.158, 'max_score': 7649.946, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs7649946.jpg'}], 'start': 6467.67, 'title': 'Analyzing sports performance', 'summary': "Covers data normalization, z score calculation, rishabh pant's score comparison, mean and standard deviation analysis in cricket, and using z-scores to calculate percentage of scores above a certain value.", 'chapters': [{'end': 6617.934, 'start': 6467.67, 'title': 'Z score calculation example', 'summary': "Discusses data normalization techniques such as division by 255 and practical application of z score using a cricket example, comparing rishabh pant's scores in 2020 and 2021 odi series using z score calculation, with 2020 series having a better score.", 'duration': 150.264, 'highlights': ['The chapter discusses data normalization techniques like division by 255 and practical application of Z score using a cricket example.', "Comparison of Rishabh Pant's final scores in the 2020 and 2021 ODI series, with 2020 series having a better score.", "Explanation of Z score calculation for Rishabh Pant's scores in both 2020 and 2021 ODI series."]}, {'end': 6786.618, 'start': 6619.423, 'title': "Rishabh pant's average score analysis", 'summary': "Discusses the analysis of rishabh pant's average score in a series, calculating the standard deviation and comparing it with team average score, emphasizing the 
importance of understanding statistical concepts in cricket.", 'duration': 167.195, 'highlights': ['The speaker emphasizes the significance of understanding statistical concepts in cricket, using examples to explain the calculation of standard deviation and its impact on average scores.', "The transcript includes the calculation of Rishabh Pant's average score and the resulting standard deviation, demonstrating the application of statistical concepts in cricket analysis.", "The chapter provides an example of Aakash Singh's potential score and relates it to the average score discussion, highlighting the practical application of statistical analysis in cricket."]}, {'end': 7006.378, 'start': 6786.618, 'title': 'Analyzing mean and standard deviation in sports performance', 'summary': 'Discusses the mean and standard deviation of sports performance in 2020 and 2021, highlighting the significance of these statistical measures in evaluating team performance and pitch conditions.', 'duration': 219.76, 'highlights': ['The mean score for 2021 is 250 with a standard deviation of 10, while for 2020, the mean is 260 with a standard deviation of 12, indicating differences in team performance and pitch conditions.', 'The discussion of z-scores and their implications in evaluating team performance and pitch conditions provides valuable insights for statistical analysis and assessment of sports performance.', 'The practical example of z-scores is emphasized as an important concept in statistics, relevant for interviews and practical application in sports analysis.']}, {'end': 7166.341, 'start': 7006.378, 'title': 'Percentage of scores above 4.25', 'summary': 'Discusses how to calculate the percentage of scores falling above 4.25 using z-score and a symmetrical bell curve, with a mean of 4 and a standard deviation of 1.', 'duration': 159.963, 'highlights': ['The Z-score formula (x of i minus mu divided by standard deviation) is used to calculate that 4.25 falls 0.25 standard 
deviation from the mean, and the entire area under the symmetrical bell curve is considered as 1.', 'Understanding the Z-score is crucial to determining the percentage of scores falling above 4.25, as it allows the calculation of how much the value falls within the standard deviation from the mean.']}, {'end': 7781.486, 'start': 7167.042, 'title': 'Understanding z-scores and area calculation', 'summary': 'Explains how to use z-scores to calculate the area of the body curve and provides an example of finding the percentage of iq scores below a certain value, while demonstrating the process using specific z-score calculations and table references.', 'duration': 614.444, 'highlights': ['The Z-score is used to find the area of the body curve, with a specific example demonstrating how to find the percentage of IQ scores below a certain value using Z-score calculations and table references.', 'The process for finding the area of the body curve using Z-scores is demonstrated step by step, including specific calculations and references to Z tables for both left and right tail areas.', 'The example of finding the percentage of IQ scores below a certain value illustrates the practical application of Z-scores and area calculation, with detailed step-by-step computations and explanations provided.', 'The explanation covers the importance of understanding Z-scores and their relevance to solving practical problems, such as determining the percentage of scores falling within specific ranges.', 'The chapter provides detailed insights into the process of using Z-scores and Z tables to calculate the area of the body curve, emphasizing the significance of these calculations in practical scenarios and problem-solving.']}], 'duration': 1313.816, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs6467670.jpg', 'highlights': ['The chapter discusses data normalization techniques like division by 255 and practical application of Z score 
using a cricket example.', 'The mean score for 2021 is 250 with a standard deviation of 10, while for 2020, the mean is 260 with a standard deviation of 12, indicating differences in team performance and pitch conditions.', 'Understanding the Z-score is crucial to determining the percentage of scores falling above 4.25, as it allows the calculation of how much the value falls within the standard deviation from the mean.', 'The process for finding the area of the body curve using Z-scores is demonstrated step by step, including specific calculations and references to Z tables for both left and right tail areas.', 'The example of finding the percentage of IQ scores below a certain value illustrates the practical application of Z-scores and area calculation, with detailed step-by-step computations and explanations provided.']}, {'end': 9140.794, 'segs': [{'end': 8193.757, 'src': 'embed', 'start': 8162.868, 'weight': 0, 'content': [{'end': 8167.135, 'text': 'The second topic we are going to discuss about is probability.', 'start': 8162.868, 'duration': 4.267}, {'end': 8174.988, 'text': 'The third thing that we are going to discuss about is something called as permutation and combination.', 'start': 8168.925, 'duration': 6.063}, {'end': 8184.332, 'text': 'Once we finish this up, the fourth thing that we are going to discuss about is something called as confidence intervals.', 'start': 8176.008, 'duration': 8.324}, {'end': 8193.757, 'text': 'So in confidence intervals, then probably if we get time, we will cover up p-value and then we will start with hypothesis testing.', 'start': 8185.013, 'duration': 8.744}], 'summary': 'Discussing probability, permutation, combination, confidence intervals, p-value, and hypothesis testing.', 'duration': 30.889, 'max_score': 8162.868, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs8162868.jpg'}, {'end': 8259.342, 'src': 'embed', 'start': 8232.367, 'weight': 2, 'content': 
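The z-score and "area of the body curve" lookup described above can be sketched in Python. This is a minimal sketch, not code from the video: the mean 250 and standard deviation 10 are the 2021 figures quoted above, the score of 270 is an assumed illustrative value, and the Z-table lookup is replaced by the closed-form normal CDF via `math.erf`.

```python
from math import erf, sqrt

def z_score(x, mean, std):
    """Standardize a raw score: how many standard deviations x lies from the mean."""
    return (x - mean) / std

def area_left_of(z):
    """Area under the standard normal curve to the left of z (the 'body' area for positive z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Cricket example figures from the session: 2021 scores have mean 250, sd 10.
# 270 is an assumed score for illustration: it sits 2 standard deviations above the mean.
z = z_score(270, mean=250, std=10)
pct_below = area_left_of(z) * 100   # percentage of scores falling below 270
print(z, round(pct_below, 1))       # 2.0 97.7
```

Looking up z = 2.0 in a Z-table gives the same ~0.9772 left-tail area the video reads off; the right-tail area is just 1 minus that.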
[{'end': 8236.549, 'text': 'So in this session, we are going to, first of all, discuss about outliers.', 'start': 8232.367, 'duration': 4.182}, {'end': 8241.45, 'text': "Now, first of all, what I'm actually going to do over here is that I'm going to import some libraries.", 'start': 8237.009, 'duration': 4.441}, {'end': 8244.272, 'text': 'Import numpy as np.', 'start': 8242.332, 'duration': 1.94}, {'end': 8249.535, 'text': 'Okay Import matplotlib.pyplot.', 'start': 8245.433, 'duration': 4.102}, {'end': 8259.342, 'text': "as plt, and then I'm just going to use %matplotlib inline, so I'll be executing this now.", 'start': 8251.191, 'duration': 8.151}], 'summary': 'Discussion on outliers and library imports for data visualization.', 'duration': 26.975, 'max_score': 8232.367, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs8232367.jpg'}, {'end': 8334.755, 'src': 'embed', 'start': 8304.602, 'weight': 1, 'content': [{'end': 8308.046, 'text': 'Till now, we have discussed so many things in normal distribution.', 'start': 8304.602, 'duration': 3.444}, {'end': 8309.666, 'text': 'We know that this is the mean of the distribution.', 'start': 8308.106, 'duration': 1.56}, {'end': 8315.869, 'text': '1st standard deviation, 2nd standard deviation, 3rd standard deviation, and the 1st, 2nd and 3rd standard deviations to the left.', 'start': 8310.107, 'duration': 5.762}, {'end': 8322.531, 'text': 'You know that they cover 68% of data, 95% of data and 99.7% of data.', 'start': 8315.968, 'duration': 6.563}, {'end': 8334.755, 'text': 'Can I consider that in some scenarios, if my data is normally distributed, the data beyond the 3rd standard deviation are probably outliers,', 'start': 8323.191, 'duration': 11.564}], 'summary': 'Discussed normal distribution, with 68%, 95%, and 99.7% of data within the 1st, 2nd, and 3rd standard deviations; outliers beyond the 3rd deviation.', 'duration': 30.153, 'max_score': 8304.602, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs8304602.jpg'}, {'end': 8418.196, 'src': 'embed', 'start': 8373.665, 'weight': 4, 'content': [{'end': 8376.047, 'text': "Okay, so here I'm just saying it is outliers.", 'start': 8373.665, 'duration': 2.382}, {'end': 8381.15, 'text': "I'm going to basically create it as a list and put all the outliers inside.", 'start': 8376.207, 'duration': 4.943}, {'end': 8385.253, 'text': "Let's define it. And how do you find out the outliers? By using standard deviation,", 'start': 8381.41, 'duration': 3.843}, {'end': 8387.615, 'text': 'or by using the z-score?', 'start': 8385.253, 'duration': 2.362}, {'end': 8397.044, 'text': 'Right. With the help of the z-score, we can definitely find out how many data points actually fall beyond the third standard deviation.', 'start': 8387.615, 'duration': 9.429}, {'end': 8403.73, 'text': "So here I'm actually going to create a function which says define detect underscore outliers.", 'start': 8397.524, 'duration': 6.206}, {'end': 8404.971, 'text': 'So this will be my function.', 'start': 8403.89, 'duration': 1.081}, {'end': 8407.333, 'text': 'And here I am going to give my data.', 'start': 8405.772, 'duration': 1.561}, {'end': 8409.833, 'text': 'Now the first thing is that I will create a threshold.', 'start': 8407.713, 'duration': 2.12}, {'end': 8412.534, 'text': 'My threshold will basically be 3 standard deviations.', 'start': 8409.934, 'duration': 2.6}, {'end': 8418.196, 'text': 'Anything that falls beyond 3 standard deviations, I will basically treat as an outlier.', 'start': 8413.415, 'duration': 4.781}], 'summary': 'Creating a function to detect outliers using 3 standard deviations.', 'duration': 44.531, 'max_score': 8373.665, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs8373665.jpg'}, {'end': 8904.929, 'src': 'embed', 'start': 8876.518, 'weight': 3, 'content': [{'end': 8881.865, 'text': 'So these are 
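The `detect_outliers` function described in this part of the session can be sketched as follows. This is a minimal sketch rather than the video's exact code; the dataset is an assumed example of the kind used there (small values plus three extreme points), with the z-score threshold set at 3 standard deviations.

```python
import numpy as np

def detect_outliers(data, threshold=3):
    """Return points whose z-score magnitude exceeds the threshold (3 standard deviations here)."""
    data = np.asarray(data, dtype=float)
    mean = np.mean(data)
    std = np.std(data)
    z_scores = (data - mean) / std
    return data[np.abs(z_scores) > threshold].tolist()

# Mostly values around 10-19, with three extreme points that should be flagged.
dataset = [11, 10, 12, 14, 12, 15, 14, 13, 15, 102, 12, 14, 17, 19, 107,
           10, 13, 12, 14, 12, 108, 12, 11, 14, 13, 15, 10, 15, 12, 10,
           14, 13, 15, 10]
print(detect_outliers(dataset))   # [102.0, 107.0, 108.0]
```

Note that extreme outliers inflate the mean and standard deviation themselves, which is why the IQR-based method covered next is often more robust.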
the steps that I will be performing in order to find the outliers with the help of IQR.', 'start': 8876.518, 'duration': 5.347}, {'end': 8887.945, 'text': 'Now, first of all, if I really want to find out the sorted dataset, how do I find out the sorted dataset?', 'start': 8882.324, 'duration': 5.621}, {'end': 8889.385, 'text': 'Sorted dataset.', 'start': 8888.505, 'duration': 0.88}, {'end': 8893.366, 'text': 'I will just say this will be my dataset and I can use sorted function.', 'start': 8889.385, 'duration': 3.981}, {'end': 8900.048, 'text': 'And in sorted function, if I give you the dataset, this will basically be my sorted dataset.', 'start': 8893.906, 'duration': 6.142}, {'end': 8904.929, 'text': 'So sorted is an inbuilt function, which will actually help you to sort all the numbers.', 'start': 8900.068, 'duration': 4.861}], 'summary': 'Using iqr to find outliers in a sorted dataset.', 'duration': 28.411, 'max_score': 8876.518, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs8876518.jpg'}], 'start': 7781.666, 'title': 'Data analysis techniques', 'summary': 'Covers data analysis and visualization techniques including mean, median, mode, box plot, histograms, gaussian distribution, percentiles, and upcoming topics on probability, permutation, combination, confidence intervals, and hypothesis testing.', 'chapters': [{'end': 8282.486, 'start': 7781.666, 'title': 'Data analysis and visualization', 'summary': 'Covers data analysis and visualization techniques including mean, median, mode, box plot, histograms, gaussian distribution, percentiles, and upcoming topics on probability, permutation, combination, confidence intervals, and hypothesis testing.', 'duration': 500.82, 'highlights': ["The chapter covers upcoming topics on probability, permutation, combination, confidence intervals, and hypothesis testing. 
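The IQR steps listed above (sort, find Q1 and Q3, compute the IQR, then the lower and higher fences) can be sketched as one small function. A minimal sketch, not the video's exact code; the dataset is an assumed example in the same spirit, and the conventional 1.5 × IQR fence multiplier is used.

```python
import numpy as np

def iqr_outliers(data):
    """Follow the session's steps: sort, find Q1/Q3, compute IQR, then flag points outside the fences."""
    sorted_data = sorted(data)                      # sorted() is the inbuilt sorting function
    q1, q3 = np.percentile(sorted_data, [25, 75])   # first and third quartiles
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    higher_fence = q3 + 1.5 * iqr
    return [x for x in sorted_data if x < lower_fence or x > higher_fence]

dataset = [11, 10, 12, 14, 12, 15, 14, 13, 15, 102, 12, 14, 17, 19, 107,
           10, 13, 12, 14, 12, 108, 12, 11, 14, 13, 15, 10, 15, 12, 10,
           14, 13, 15, 10]
print(iqr_outliers(dataset))   # [102, 107, 108]
```

A box plot (e.g. `seaborn.boxplot(x=dataset)`) visualizes exactly these fences: the whiskers end at them, and points beyond are drawn individually as outliers.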
This indicates the upcoming topics to be covered, providing a comprehensive overview of the chapter's content.", 'The chapter demonstrates data analysis techniques including mean, median, mode, box plot, histograms, Gaussian distribution, and percentiles. This showcases the practical application of various data analysis and visualization techniques covered in the chapter.', 'The chapter emphasizes the use of Python libraries like numpy, matplotlib, and seaborn for data analysis and visualization. This highlights the practical implementation of data analysis and visualization using Python libraries, demonstrating a hands-on approach to the subject.']}, {'end': 8741.176, 'start': 8283.147, 'title': 'Detecting outliers with z-score', 'summary': 'Delves into detecting outliers in a dataset using the z-score method and explains the process of identifying outliers beyond the third standard deviation, resulting in the detection of three outliers in the dataset.', 'duration': 458.029, 'highlights': ['The chapter explains the process of identifying outliers beyond the third standard deviation. The discussion emphasizes that data beyond the third standard deviation in a normal distribution can be considered as outliers, with the detection process implemented using the Z-score method.', "The function 'detect_outliers' is created to identify outliers using the Z-score method. A function named 'detect_outliers' is established to compute the Z-score for each data point and identify outliers falling beyond the third standard deviation, with a specific threshold set at 3 standard deviations.", 'The process involves the computation of mean and standard deviation for the dataset. 
The process includes the computation of mean and standard deviation for the dataset using the numpy library, facilitating the calculation of Z-scores for each data point.']}, {'end': 9140.794, 'start': 8741.296, 'title': 'Iqr and outliers detection', 'summary': 'Discusses the process of using iqr to detect outliers, including steps such as sorting the data, calculating q1 and q3, finding the iqr, lower and higher fences, and using box plots for outlier visualization, with a focus on probability in machine learning and deep learning.', 'duration': 399.498, 'highlights': ['The process of using IQR to detect outliers involves steps such as sorting the data, calculating Q1 and Q3, finding the IQR, lower and higher fences, and using box plots for outlier visualization. IQR process steps, outlier detection techniques', 'Probability is a crucial concept in machine learning and deep learning, with applications in creating best fit lines for different categories of datasets. Importance of probability in ML and DL, application in creating best fit lines', 'Discuss the importance of probability in machine learning and deep learning with an example of creating a best fit line for different categories of datasets. Importance of probability, practical example in creating best fit lines']}], 'duration': 1359.128, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs7781666.jpg', 'highlights': ["The chapter covers upcoming topics on probability, permutation, combination, confidence intervals, and hypothesis testing. This indicates the upcoming topics to be covered, providing a comprehensive overview of the chapter's content.", 'The chapter demonstrates data analysis techniques including mean, median, mode, box plot, histograms, Gaussian distribution, and percentiles. 
This showcases the practical application of various data analysis and visualization techniques covered in the chapter.', 'The chapter emphasizes the use of Python libraries like numpy, matplotlib, and seaborn for data analysis and visualization. This highlights the practical implementation of data analysis and visualization using Python libraries, demonstrating a hands-on approach to the subject.', 'The process of using IQR to detect outliers involves steps such as sorting the data, calculating Q1 and Q3, finding the IQR, lower and higher fences, and using box plots for outlier visualization. IQR process steps, outlier detection techniques', 'The chapter explains the process of identifying outliers beyond the third standard deviation. The discussion emphasizes that data beyond the third standard deviation in a normal distribution can be considered as outliers, with the detection process implemented using the Z-score method.', "The function 'detect_outliers' is created to identify outliers using the Z-score method. A function named 'detect_outliers' is established to compute the Z-score for each data point and identify outliers falling beyond the third standard deviation, with a specific threshold set at 3 standard deviations."]}, {'end': 10546.044, 'segs': [{'end': 9288.554, 'src': 'embed', 'start': 9223.961, 'weight': 0, 'content': [{'end': 9228.406, 'text': "If this is my question, then how probability you'll be able to calculate?", 'start': 9223.961, 'duration': 4.445}, {'end': 9229.427, 'text': 'What is the answer??', 'start': 9228.626, 'duration': 0.801}, {'end': 9232.27, 'text': "Obviously, you'll say 1 by 6, right?", 'start': 9229.707, 'duration': 2.563}, {'end': 9233.351, 'text': "It's very simple.", 'start': 9232.79, 'duration': 0.561}, {'end': 9239.837, 'text': "So how do we define probability? 
I'll say that number of ways, number of ways an event can occur.", 'start': 9234.012, 'duration': 5.825}, {'end': 9245.262, 'text': 'An event can occur divided by number of possible outcomes.', 'start': 9240.658, 'duration': 4.604}, {'end': 9247.344, 'text': 'So this is the exact definition of this.', 'start': 9245.382, 'duration': 1.962}, {'end': 9250.867, 'text': 'Now, in this particular scenario, number of ways an event can occur.', 'start': 9247.604, 'duration': 3.263}, {'end': 9255.391, 'text': 'Over here, I am trying to find out what is the probability when I roll a dice, I get a six.', 'start': 9251.047, 'duration': 4.344}, {'end': 9258.193, 'text': 'So how many events can occur? It can only occur as one.', 'start': 9255.851, 'duration': 2.342}, {'end': 9261.556, 'text': 'And what is the number of total possible outcomes? It is 6.', 'start': 9258.954, 'duration': 2.602}, {'end': 9263.357, 'text': 'So this is how we basically find out.', 'start': 9261.556, 'duration': 1.801}, {'end': 9268.941, 'text': "Similarly, if I give one more example, let's say that I want to toss a coin.", 'start': 9263.497, 'duration': 5.444}, {'end': 9271.823, 'text': 'Obviously, I know what are my sample space, head and tail.', 'start': 9269.141, 'duration': 2.682}, {'end': 9281.469, 'text': "What is the probability of getting head? 
You'll just say that it is 1 by 2, because the sample space is 2 and the number of ways the event can occur is 1.", 'start': 9272.463, 'duration': 9.006}, {'end': 9284.791, 'text': 'So you basically say the probability of head is 1 by 2.', 'start': 9281.469, 'duration': 3.322}, {'end': 9288.554, 'text': "Now let's go one step above probability, to something called the addition rule.", 'start': 9284.791, 'duration': 3.763}], 'summary': 'Probability calculation: 1/6 for rolling a six, 1/2 for getting a head in a coin toss.', 'duration': 64.593, 'max_score': 9223.961, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs9223961.jpg'}, {'end': 9539.49, 'src': 'embed', 'start': 9514.314, 'weight': 3, 'content': [{'end': 9520.902, 'text': 'So, whenever you have a mutually exclusive event, at that point of time you can use this specific definition,', 'start': 9514.314, 'duration': 6.588}, {'end': 9524.907, 'text': 'which is also called the addition rule for mutually exclusive events.', 'start': 9520.902, 'duration': 4.005}, {'end': 9529.562, 'text': 'Now here, what is the probability of A? 
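The definition above (number of ways an event can occur divided by the number of possible outcomes) is directly expressible in code. A minimal sketch using `fractions.Fraction` so the dice and coin answers come out exactly as the 1/6 and 1/2 quoted in the session:

```python
from fractions import Fraction

def probability(ways_event_occurs, possible_outcomes):
    """Number of ways an event can occur divided by the number of possible outcomes."""
    return Fraction(ways_event_occurs, possible_outcomes)

print(probability(1, 6))   # rolling a six on a fair die   -> 1/6
print(probability(1, 2))   # getting a head on a coin toss -> 1/2
```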
You know that it is 1 by 2.', 'start': 9525.708, 'duration': 3.854}, {'end': 9532.144, 'text': 'Plus 1 by 2, so the answer will be 1.', 'start': 9529.562, 'duration': 2.582}, {'end': 9537.328, 'text': 'So probability of A or B to come is basically 1.', 'start': 9532.144, 'duration': 5.184}, {'end': 9539.49, 'text': 'These are some very very important things.', 'start': 9537.328, 'duration': 2.162}], 'summary': 'When two events are mutually exclusive, the probability of a occurring is 1/2, and the probability of a or b occurring is 1.', 'duration': 25.176, 'max_score': 9514.314, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs9514314.jpg'}, {'end': 9885.238, 'src': 'embed', 'start': 9855.002, 'weight': 5, 'content': [{'end': 9857.143, 'text': 'In the third instance, I may get 2, I may get any number.', 'start': 9855.002, 'duration': 2.141}, {'end': 9861.986, 'text': 'So one event is not at all dependent on the other event.', 'start': 9857.664, 'duration': 4.322}, {'end': 9869.39, 'text': 'Because anytime we roll, every possibilities or every outcomes has an equal probability to come.', 'start': 9862.806, 'duration': 6.584}, {'end': 9879.055, 'text': 'So over here what you can understand is that each and every events are independent.', 'start': 9869.41, 'duration': 9.645}, {'end': 9885.238, 'text': 'If one comes or if two comes or if any events come, it is not going to impact any other event.', 'start': 9879.815, 'duration': 5.423}], 'summary': 'Events in the rolling scenario are independent, with every outcome having an equal probability.', 'duration': 30.236, 'max_score': 9855.002, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs9855002.jpg'}, {'end': 10159.209, 'src': 'embed', 'start': 10132.081, 'weight': 6, 'content': [{'end': 10135.442, 'text': 'First of all, again, you need to find out whether this is an independent or dependent event.', 'start': 
10132.081, 'duration': 3.361}, {'end': 10139.824, 'text': 'Obviously, in this case, this will be a dependent event because the deck of cards will get reduced.', 'start': 10135.962, 'duration': 3.862}, {'end': 10147.007, 'text': 'So in this particular case, I am asking what is the probability of A and B in the case of a dependent event.', 'start': 10140.184, 'duration': 6.823}, {'end': 10153.966, 'text': 'So here I can basically write probability of A multiplied by probability of B given A.', 'start': 10147.728, 'duration': 6.238}, {'end': 10159.209, 'text': 'Now what does this mean? This term is basically called conditional probability.', 'start': 10153.966, 'duration': 5.243}], 'summary': 'Probability calculation for dependent events using conditional probability.', 'duration': 27.128, 'max_score': 10132.081, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs10132081.jpg'}, {'end': 10472.11, 'src': 'embed', 'start': 10424.188, 'weight': 7, 'content': [{'end': 10431.274, 'text': 'Now what is this 120? It is all the possible permutations with respect to the chocolate names that he may see.', 'start': 10424.188, 'duration': 7.086}, {'end': 10433.296, 'text': 'All the possible permutations.', 'start': 10431.995, 'duration': 1.301}, {'end': 10438.781, 'text': 'Like he may see it in this way: dairy milk, gems, milky bar.', 'start': 10433.516, 'duration': 5.265}, {'end': 10443.405, 'text': 'He may also see it in a different way: milky bar, gems, dairy milk.', 'start': 10439.121, 'duration': 4.284}, {'end': 10446.707, 'text': 'So all the possible options come to 120.', 'start': 10443.945, 'duration': 2.762}, {'end': 10452.252, 'text': 'Now when I say 120, okay, these are all the possible options.', 'start': 10446.707, 'duration': 5.545}, {'end': 10454.674, 'text': 'Now, this is what permutation is.', 'start': 10452.633, 'duration': 2.041}, {'end': 10458.558, 'text': "Permutation formula, how do you write it? 
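The multiplication rule for a dependent event, P(A and B) = P(A) · P(B | A), can be made concrete with a worked deck-of-cards example. This is an assumed illustration in the spirit of the session's reduced-deck discussion (drawing two aces without replacement), not a calculation taken from the video:

```python
from fractions import Fraction

# Dependent event: drawing two aces from a 52-card deck without replacement.
# The second draw is conditioned on the first, because the deck has been reduced.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)   # one ace and one card fewer remain
p_both_aces = p_first_ace * p_second_ace_given_first
print(p_both_aces)   # 1/221

# Independent events: two dice rolls, where the rule reduces to P(A) * P(B).
p_two_sixes = Fraction(1, 6) * Fraction(1, 6)
print(p_two_sixes)   # 1/36
```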
Now, let's go back to school days.", 'start': 10455.095, 'duration': 3.463}, {'end': 10460.66, 'text': 'We are directly used to ratify all the formulas.', 'start': 10458.658, 'duration': 2.002}, {'end': 10466.165, 'text': 'NPR is equal to N factorial divided by N minus R factorial.', 'start': 10460.98, 'duration': 5.185}, {'end': 10468.967, 'text': 'Over here, N is nothing but the total number of chocolates.', 'start': 10466.525, 'duration': 2.442}, {'end': 10472.11, 'text': 'R is nothing but how many names I have told that person to write.', 'start': 10469.467, 'duration': 2.643}], 'summary': '120 possible permutations of chocolate names, explained using permutation formula npr = n! / (n-r)!', 'duration': 47.922, 'max_score': 10424.188, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs10424188.jpg'}, {'end': 10519.518, 'src': 'embed', 'start': 10494.605, 'weight': 9, 'content': [{'end': 10499.748, 'text': 'Now, in combination, always understand permutation if I have the same element like this.', 'start': 10494.605, 'duration': 5.143}, {'end': 10501.209, 'text': 'I have dairy milk.', 'start': 10500.429, 'duration': 0.78}, {'end': 10503.611, 'text': 'I have gems.', 'start': 10501.93, 'duration': 1.681}, {'end': 10505.012, 'text': 'I have probably eclairs.', 'start': 10503.831, 'duration': 1.181}, {'end': 10515.231, 'text': "If I've used this element once, this combination, I cannot use the same element and probably make a different combination.", 'start': 10506.218, 'duration': 9.013}, {'end': 10519.518, 'text': 'So combination will be unique with respect to the elements that is used.', 'start': 10515.772, 'duration': 3.746}], 'summary': 'Unique combinations are formed from elements like dairy milk, gems, and eclairs.', 'duration': 24.913, 'max_score': 10494.605, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs10494605.jpg'}], 'start': 9140.894, 'title': 
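The permutation formula quoted above, nPr = n! / (n − r)!, can be checked directly against the 120 arrangements from the chocolate example (6 chocolates on display, 3 names written down in order). A minimal sketch:

```python
from math import factorial

def npr(n, r):
    """nPr = n! / (n - r)!  -- ordered arrangements, where order matters."""
    return factorial(n) // factorial(n - r)

# 6 chocolates, write 3 names in order: 6 * 5 * 4 = 120 permutations.
print(npr(6, 3))   # 120
```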
'Probability and permutation', 'summary': 'Covers the concepts of probability, including calculations for rolling a dice and tossing a coin, addition rule for mutually exclusive and non-mutually exclusive events, and permutation and combination with practical examples and problem-solving approaches.', 'chapters': [{'end': 9464.504, 'start': 9140.894, 'title': 'Understanding probability and the addition rule', 'summary': 'Explains the concept of probability, including the calculation of probabilities for rolling a dice and tossing a coin, along with the addition rule for mutually exclusive and non-mutually exclusive events.', 'duration': 323.61, 'highlights': ['Probability Definition Probability is defined as a measure of the likelihood of an event, calculated as the number of ways an event can occur divided by the number of possible outcomes.', 'Calculating Probability for Rolling a Dice The probability of getting a 6 when rolling a dice is 1/6, illustrating the concept of probability calculation for specific events.', 'Calculating Probability for Tossing a Coin The probability of getting a head when tossing a coin is 1/2, demonstrating the application of probability calculation for binary events.', 'Understanding Mutually Exclusive Events Mutually exclusive events, such as rolling a dice or tossing a coin, illustrate scenarios where events cannot occur simultaneously, providing key insights into probability concepts.', 'Understanding Non-Mutually Exclusive Events Non-mutually exclusive events, exemplified by drawing cards from a deck, demonstrate situations where multiple events can occur simultaneously, contributing to a comprehensive understanding of probability concepts.']}, {'end': 10281.149, 'start': 9464.504, 'title': 'Probability rules and events', 'summary': 'Covers the concepts of mutually exclusive and non-mutually exclusive events, the addition rule for mutually exclusive events, and the addition and multiplication rules for non-mutually exclusive events, with examples and 
calculations, as well as the concepts of independent and dependent events, and conditional probability, with practical examples and problem-solving approaches.', 'duration': 816.645, 'highlights': ['The addition rule for mutually exclusive events: probability of A or B is equal to probability of A plus probability of B. Explains the addition rule for mutually exclusive events and its application in calculating the probability of outcomes like coin toss and dice roll.', 'The addition rule for non-mutually exclusive events: probability of A or B is equal to probability of A plus probability of B minus probability of A and B (intersection). Illustrates the addition rule for non-mutually exclusive events and its application in calculating the probability of outcomes like drawing cards from a deck, including the consideration of intersecting events.', 'The concepts of independent and dependent events, with examples: independent events have equal probabilities for each outcome, while dependent events involve outcomes impacting each other. Describes the concepts of independent and dependent events, providing examples such as dice rolls and marble picking to illustrate the differences between the two types of events.', 'Conditional probability and its application in dependent events: the probability of an event given that another event has already occurred, impacting the probabilities of subsequent events. Explains the concept of conditional probability in the context of dependent events and how it affects the probabilities of subsequent events, with practical examples and problem-solving approaches.', 'Application of multiplication rule for independent and dependent events: multiplication rule for independent events involves multiplying the probabilities of individual events, while for dependent events, it includes conditional probabilities. 
Clarifies the application of the multiplication rule in calculating probabilities for both independent and dependent events, emphasizing the differences in approach for each type of event.']}, {'end': 10546.044, 'start': 10281.89, 'title': 'Permutation and combination', 'summary': 'Explains the concepts of permutation and combination using a scenario of choosing chocolates from a factory, with permutation resulting in 120 possible arrangements and permutation formula npr = n factorial / (n - r) factorial, and then differentiates between permutation and combination by highlighting the uniqueness of elements in combination using the formula ncr = n factorial / (r factorial * (n - r) factorial).', 'duration': 264.154, 'highlights': ['Permutation results in 120 possible arrangements The scenario of choosing chocolates from a factory demonstrates that there are 120 possible arrangements, and the calculation is based on the number of choices available at each step (6 * 5 * 4).', 'Permutation formula NPR = N factorial / (N - R) factorial The formula NPR = N factorial / (N - R) factorial is used to calculate permutations, where N represents the total number of elements and R represents the number of elements to be arranged.', 'Differentiates between permutation and combination by highlighting the uniqueness of elements in combination In combination, the uniqueness of elements is emphasized, and once an element is used, it cannot be reused in a different order, unlike in permutation.', 'Combination formula NCR = N factorial / (R factorial * (N - R) factorial) The formula NCR = N factorial / (R factorial * (N - R) factorial) is used to calculate combinations, focusing on the uniqueness of the elements being selected.']}], 'duration': 1405.15, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs9140894.jpg', 'highlights': ['Probability is defined as a measure of the likelihood of an event, calculated as the number of ways 
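The combination formula above, nCr = n! / (r! · (n − r)!), pairs naturally with the permutation example: choosing 3 of the 6 chocolates without regard to order collapses the 120 ordered arrangements down to 20 unique selections. A minimal sketch:

```python
from math import factorial

def ncr(n, r):
    """nCr = n! / (r! * (n - r)!)  -- order does not matter, each selection is unique."""
    return factorial(n) // (factorial(r) * factorial(n - r))

# The same 6 chocolates, choosing 3 with no regard to order:
print(ncr(6, 3))                  # 20 unique selections
# Each selection can be ordered in 3! ways, which recovers the 120 permutations:
print(ncr(6, 3) * factorial(3))   # 120
```

This identity, nPr = nCr · r!, is exactly the "uniqueness of elements" distinction the chapter draws between the two formulas.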
an event can occur divided by the number of possible outcomes.', 'The probability of getting a 6 when rolling a dice is 1/6, illustrating the concept of probability calculation for specific events.', 'The probability of getting a head when tossing a coin is 1/2, demonstrating the application of probability calculation for binary events.', 'The addition rule for mutual exclusive events: probability of A or B is equal to probability of A plus probability of B.', 'The addition rule for non-mutual exclusive events: probability of A or B is equal to probability of A plus probability of B minus probability of A and B (intersection).', 'The concepts of independent and dependent events, with examples: independent events have equal probabilities for each outcome, while dependent events involve outcomes impacting each other.', 'Conditional probability and its application in dependent events: the probability of an event given that another event has already occurred, impacting the probabilities of subsequent events.', 'Permutation results in 120 possible arrangements The scenario of choosing chocolates from a factory demonstrates that there are 120 possible arrangements, and the calculation is based on the number of choices available at each step (6 * 5 * 4).', 'Permutation formula NPR = N factorial / (N - R) factorial The formula NPR = N factorial / (N - R) factorial is used to calculate permutations, where N represents the total number of elements and R represents the number of elements to be arranged.', 'Differentiates between permutation and combination by highlighting the uniqueness of elements in combination In combination, the uniqueness of elements is emphasized, and once an element is used, it cannot be reused in a different order, unlike in permutation.']}, {'end': 11814.623, 'segs': [{'end': 10668.427, 'src': 'embed', 'start': 10642.23, 'weight': 0, 'content': [{'end': 10646.199, 'text': 'This area is less because over here hardly you will be touching over here.', 
'start': 10642.23, 'duration': 3.969}, {'end': 10650.447, 'text': "Now let's consider that I say my P value for this position is..", 'start': 10646.379, 'duration': 4.068}, {'end': 10655.218, 'text': 'My P value for this position is 0.8.', 'start': 10652.116, 'duration': 3.102}, {'end': 10666.085, 'text': "Now, here, what I am actually going to do, what does this 0.8 basically means that? Let's say I am doing 100 times I'm touching this mouse pad.", 'start': 10655.218, 'duration': 10.867}, {'end': 10668.427, 'text': "100 times I'm touching.", 'start': 10667.446, 'duration': 0.981}], 'summary': 'P value for the position is 0.8 after 100 touches.', 'duration': 26.197, 'max_score': 10642.23, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs10642230.jpg'}, {'end': 10910.797, 'src': 'embed', 'start': 10884.903, 'weight': 1, 'content': [{'end': 10890.347, 'text': 'First of all, in this particular scenario, we have to focus on something called as hypothesis testing.', 'start': 10884.903, 'duration': 5.444}, {'end': 10894.425, 'text': 'we have to focus on hypothesis testing.', 'start': 10892.844, 'duration': 1.581}, {'end': 10899.728, 'text': 'In hypothesis testing, the first thing is that we need to define our null hypothesis.', 'start': 10895.185, 'duration': 4.543}, {'end': 10906.714, 'text': 'The null hypothesis is usually given in the problem statement.', 'start': 10902.329, 'duration': 4.385}, {'end': 10910.797, 'text': 'We want to test whether the coin is a fair coin or not.', 'start': 10906.754, 'duration': 4.043}], 'summary': 'Focus on hypothesis testing to determine if the coin is fair.', 'duration': 25.894, 'max_score': 10884.903, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs10884903.jpg'}, {'end': 11140.052, 'src': 'embed', 'start': 11109.48, 'weight': 3, 'content': [{'end': 11113.684, 'text': 'Now this significance value is basically given by 
alpha.', 'start': 11109.48, 'duration': 4.204}, {'end': 11118.408, 'text': "Suppose let's consider that I am considering alpha as 0.05.", 'start': 11114.004, 'duration': 4.404}, {'end': 11121.311, 'text': 'Now this 0.05, what exactly is it?', 'start': 11118.408, 'duration': 2.903}, {'end': 11123.514, 'text': 'What exactly does it mean?', 'start': 11121.371, 'duration': 2.143}, {'end': 11131.983, 'text': "This means that if I do 1 minus 0.05, this answer, let's say, how much will it come to?", 'start': 11123.994, 'duration': 7.989}, {'end': 11135.75, 'text': 'It will basically come to 0.95.', 'start': 11133.129, 'duration': 2.621}, {'end': 11140.052, 'text': 'I have taken my significance value as 0.05, so my confidence level is 0.95.', 'start': 11135.75, 'duration': 4.302}], 'summary': 'The significance value alpha is 0.05, indicating a 5% significance level and a 95% confidence level.', 'duration': 30.572, 'max_score': 11109.48, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs11109480.jpg'}, {'end': 11284.31, 'src': 'embed', 'start': 11256.5, 'weight': 2, 'content': [{'end': 11263.485, 'text': "Now tell me, let's say that this number that you are seeing is 20, let's say, and this number that you are seeing is 75.", 'start': 11256.5, 'duration': 6.985}, {'end': 11265.326, 'text': '20 to 75 is my confidence interval.', 'start': 11263.485, 'duration': 1.841}, {'end': 11266.887, 'text': 'Now I perform the experiment.', 'start': 11265.546, 'duration': 1.341}, {'end': 11281.209, 'text': 'If I get only 10 heads out of 100 experiments, should we accept or reject the null hypothesis? 
The null hypothesis is basically the coin is fair.', 'start': 11267.888, 'duration': 13.321}, {'end': 11284.31, 'text': 'The alternate hypothesis coin is unfair.', 'start': 11282.089, 'duration': 2.221}], 'summary': 'Confidence interval from 20 to 75, 10 heads out of 100 experiments raise null hypothesis rejection for fair coin.', 'duration': 27.81, 'max_score': 11256.5, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs11256500.jpg'}, {'end': 11547.756, 'src': 'embed', 'start': 11499.772, 'weight': 5, 'content': [{'end': 11505.037, 'text': 'okay, that is what we are going to see now confidence interval, how to calculate this?', 'start': 11499.772, 'duration': 5.265}, {'end': 11506.798, 'text': 'probably when an alpha value is given.', 'start': 11505.037, 'duration': 1.761}, {'end': 11512.724, 'text': 'i told you we need to define some confidence interval in order to solve uh, you know some problems.', 'start': 11506.798, 'duration': 5.926}, {'end': 11519.727, 'text': 'The fourth topic that we will try to see after confidence interval is something called as Z-test, T-test.', 'start': 11513.104, 'duration': 6.623}, {'end': 11523.648, 'text': 'And if we get time, we will also finish up chi-square test.', 'start': 11520.687, 'duration': 2.961}, {'end': 11524.668, 'text': "So let's start.", 'start': 11524.068, 'duration': 0.6}, {'end': 11529.87, 'text': 'The first topic that we are probably going to discuss about is something called as type 1 and type 2 error.', 'start': 11524.868, 'duration': 5.002}, {'end': 11537.473, 'text': 'Always understand whenever we do any kind of hypothesis testing, one very important thing I told you that what we have.', 'start': 11531.011, 'duration': 6.462}, {'end': 11543.015, 'text': 'The first topic that we are probably going to discuss about is type 1.', 'start': 11538.153, 'duration': 4.862}, {'end': 11544.415, 'text': 'and type 2 error.', 'start': 11543.015, 'duration': 1.4}, 
{'end': 11547.756, 'text': 'type 1 and type 2 error, always understand.', 'start': 11544.415, 'duration': 3.341}], 'summary': 'The transcript covers topics on confidence intervals, z-test, t-test, and chi-square test with a focus on type 1 and type 2 errors.', 'duration': 47.984, 'max_score': 11499.772, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs11499772.jpg'}], 'start': 10546.284, 'title': 'Statistical hypothesis testing', 'summary': 'Covers hypothesis testing concepts, p-value, significance value, confidence interval, type 1 and type 2 errors, with practical applications and examples. it emphasizes the importance of decision-making and provides insights into inferential statistics, with a focus on determining fairness in experiments and understanding the significance of alpha and confidence intervals.', 'chapters': [{'end': 11073.679, 'start': 10546.284, 'title': 'Hypothesis testing and p-value in statistics', 'summary': 'Discusses the concepts of p-value and hypothesis testing in statistics, emphasizing their importance and practical application, with a focus on determining the fairness of a coin through 100 tosses and understanding the significance of p-values in specific regions of distribution, providing insights into inferential statistics.', 'duration': 527.395, 'highlights': ['The concept of P-value and its practical application is emphasized, with an example illustrating the probability of touching specific regions on a laptop mouse pad, where a P-value of 0.8 implies touching a specific region 80% of the time out of 100 touches, providing practical insights into statistical significance. 
Practical application of P-value, example illustrating probability of touching specific regions on a laptop mouse pad, P-value of 0.8 implies touching a specific region 80% of the time out of 100 touches.', 'The significance of hypothesis testing in determining the fairness of a coin through 100 tosses is explained, focusing on defining the null and alternate hypotheses, performing experiments, and assessing whether the coin is fair based on the number of times heads appear, providing a practical example of hypothesis testing in statistics. Significance of hypothesis testing in determining fairness of a coin through 100 tosses, focus on defining null and alternate hypotheses, practical example of hypothesis testing in statistics.']}, {'end': 11450.31, 'start': 11075.191, 'title': 'Understanding significance value and confidence interval', 'summary': 'Explains the significance value, denoted by alpha, as a measure of confidence that determines the fairness of an experiment. the 95% confidence interval is crucial in accepting or rejecting the null hypothesis, illustrating the relationship between alpha, confidence interval, and hypothesis decisions.', 'duration': 375.119, 'highlights': ['The 95% confidence interval is crucial in accepting or rejecting the null hypothesis The 95% confidence interval, determined by a domain expert, plays a key role in determining the fairness of an experiment, with values falling within the interval leading to acceptance of the null hypothesis, and values outside the interval leading to rejection.', 'Explanation of the significance value alpha and its impact on confidence interval The significance value alpha, exemplified by 0.05 or 0.20, influences the confidence interval percentage, with a higher alpha leading to a wider confidence interval and vice versa, impacting the decision to accept or reject the null hypothesis based on experiment results.', 'Illustration of using experiment results to determine fairness and accept/reject the 
null hypothesis Experimental results, such as obtaining 10 heads out of 100 experiments falling outside the 95% confidence interval, lead to the rejection of the null hypothesis, while obtaining 95 heads leads to acceptance, demonstrating the practical application of significance value and confidence interval in hypothesis decisions.']}, {'end': 11814.623, 'start': 11450.791, 'title': 'Hypothesis testing errors', 'summary': 'Discusses the concepts of type 1 and type 2 errors in hypothesis testing, with examples and implications, and also covers topics like confidence intervals and z-test, t-test, and chi-square test, emphasizing the importance of decision-making and implications of rejecting null hypothesis.', 'duration': 363.832, 'highlights': ['The chapter covers the concepts of type 1 and type 2 errors in hypothesis testing, emphasizing the importance of correct decision-making, with an example of a person being wrongly convicted as an illustration of type 1 error.', 'It also includes discussions on topics like confidence intervals and Z-test, T-test, and chi-square test, highlighting the significance of these concepts in statistical analysis and decision-making processes.']}], 'duration': 1268.339, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs10546284.jpg', 'highlights': ['Practical application of P-value, example illustrating probability of touching specific regions on a laptop mouse pad, P-value of 0.8 implies touching a specific region 80% of the time out of 100 touches.', 'Significance of hypothesis testing in determining fairness of a coin through 100 tosses, focus on defining null and alternate hypotheses, practical example of hypothesis testing in statistics.', '95% confidence interval, determined by a domain expert, plays a key role in determining the fairness of an experiment, with values falling within the interval leading to acceptance of the null hypothesis, and values outside the interval 
leading to rejection.', 'The significance value alpha, exemplified by 0.05 or 0.20, influences the confidence interval percentage, with a higher alpha leading to a wider confidence interval and vice versa, impacting the decision to accept or reject the null hypothesis based on experiment results.', 'Experimental results, such as obtaining 10 heads out of 100 experiments falling outside the 95% confidence interval, lead to the rejection of the null hypothesis, while obtaining 95 heads leads to acceptance, demonstrating the practical application of significance value and confidence interval in hypothesis decisions.', 'The chapter covers the concepts of type 1 and type 2 errors in hypothesis testing, emphasizing the importance of correct decision-making, with an example of a person being wrongly convicted as an illustration of type 1 error.', 'It also includes discussions on topics like confidence intervals and Z-test, T-test, and chi-square test, highlighting the significance of these concepts in statistical analysis and decision-making processes.']}, {'end': 14734.548, 'segs': [{'end': 11892.349, 'src': 'embed', 'start': 11862.748, 'weight': 0, 'content': [{'end': 11866.029, 'text': 'So definitely this error is basically called as type 2 error.', 'start': 11862.748, 'duration': 3.281}, {'end': 11869.97, 'text': "Okay, I hope everybody's got is clear, right? Now let's go to outcome 4.", 'start': 11866.049, 'duration': 3.921}, {'end': 11876.791, 'text': 'Outcome 4 is that we accept the null hypothesis when in reality it is true.', 'start': 11869.97, 'duration': 6.821}, {'end': 11886.067, 'text': 'So this is obviously a good case, right? 
So here I can say that fine, this decision, And this decision are perfectly fine.', 'start': 11878.052, 'duration': 8.015}, {'end': 11892.349, 'text': 'But whenever we have this scenarios, we basically have to consider it as type 1 and type 2 error.', 'start': 11887.027, 'duration': 5.322}], 'summary': 'Type 2 error occurs when accepting null hypothesis when it is true, leading to type 1 and type 2 error consideration.', 'duration': 29.601, 'max_score': 11862.748, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs11862748.jpg'}, {'end': 12003.634, 'src': 'embed', 'start': 11978.338, 'weight': 1, 'content': [{'end': 11983.482, 'text': 'this is also very much super important one tail and two tail test.', 'start': 11978.338, 'duration': 5.144}, {'end': 11985.466, 'text': 'So one tail and two tail test.', 'start': 11983.963, 'duration': 1.503}, {'end': 11988.993, 'text': "Now let's go ahead and let's try to understand what is one tail and two tail test.", 'start': 11985.526, 'duration': 3.467}, {'end': 11998.73, 'text': 'Now, already you have seen that I have probably drawn a curve, a bell curve, and in that I basically define a kind of one-tailed and two-tailed test.', 'start': 11989.884, 'duration': 8.846}, {'end': 12003.634, 'text': 'Still, you have seen it, but let me give you a good example, okay?', 'start': 11999.511, 'duration': 4.123}], 'summary': 'The transcript discusses the significance of one-tail and two-tail tests in statistical analysis.', 'duration': 25.296, 'max_score': 11978.338, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs11978338.jpg'}, {'end': 12226.42, 'src': 'heatmap', 'start': 12015.982, 'weight': 0.726, 'content': [{'end': 12027.985, 'text': 'in Karnataka have an 85% placement rate in the placements type.', 'start': 12015.982, 'duration': 12.003}, {'end': 12047.759, 'text': 'A new college was recently opened and it was found that 
a sample of 150 students had a placement rate of 88% with a standard deviation 4%.', 'start': 12028.826, 'duration': 18.933}, {'end': 12059.941, 'text': 'Does this college have or has a different placement rate than the other colleges? So understand this question very much importantly.', 'start': 12047.759, 'duration': 12.182}, {'end': 12062.642, 'text': 'Oops, sorry guys, I made one mistake.', 'start': 12060.561, 'duration': 2.081}, {'end': 12065.582, 'text': 'This should not be type 2.', 'start': 12063.222, 'duration': 2.36}, {'end': 12070.023, 'text': 'False negative should be type 2, right? True positive and true negative are perfectly fine.', 'start': 12065.582, 'duration': 4.441}, {'end': 12072.263, 'text': 'This should be type 2.', 'start': 12070.903, 'duration': 1.36}, {'end': 12074.704, 'text': 'True positive and true negative are always right.', 'start': 12072.263, 'duration': 2.441}, {'end': 12077.104, 'text': "Let's try to understand some very important thing.", 'start': 12074.984, 'duration': 2.12}, {'end': 12085.469, 'text': 'Now, what does this question basically say? 
Whether, see there are colleges in Karnataka which has 85% placement rate.', 'start': 12077.426, 'duration': 8.043}, {'end': 12093.391, 'text': 'A new college was recently opened and it was found out that a sample of 150 students had a placement rate of 88%.', 'start': 12086.469, 'duration': 6.922}, {'end': 12095.192, 'text': 'With a standard deviation 4%.', 'start': 12093.391, 'duration': 1.801}, {'end': 12097.872, 'text': 'Thus the college has a different placement rate.', 'start': 12095.192, 'duration': 2.68}, {'end': 12098.833, 'text': 'Thus this college.', 'start': 12097.992, 'duration': 0.841}, {'end': 12100.493, 'text': 'This basically means the new college.', 'start': 12099.153, 'duration': 1.34}, {'end': 12104.835, 'text': 'Now in this particular case, first of all think about the question.', 'start': 12100.773, 'duration': 4.062}, {'end': 12113.585, 'text': 'Now, Over here it says, does this college has a different placement rate? What is the placement rate of the entire college? 85%.', 'start': 12105.495, 'duration': 8.09}, {'end': 12124.248, 'text': 'So does it have a different rate than 85%? That is what we really need to check, right? Now in this particular case, this becomes a two-tailed test.', 'start': 12113.585, 'duration': 10.663}, {'end': 12126.608, 'text': "Why? 
We'll think over it.", 'start': 12125.048, 'duration': 1.56}, {'end': 12130.829, 'text': "Let's say that here the significance value is given as 0.05.", 'start': 12126.668, 'duration': 4.161}, {'end': 12131.529, 'text': "Let's consider.", 'start': 12130.829, 'duration': 0.7}, {'end': 12138.546, 'text': "Let's consider that over here, the significance value is given as 0.05.", 'start': 12131.969, 'duration': 6.577}, {'end': 12142.489, 'text': 'Now what we do over here is that we will try to create a graph.', 'start': 12138.546, 'duration': 3.943}, {'end': 12153.197, 'text': 'Now when we have 0.05 that basically means if it is a two tail test, two tail test basically means right now I have a placement rate of 85 percent.', 'start': 12143.309, 'duration': 9.888}, {'end': 12158.821, 'text': 'So 85 percent is you can just consider that 85 percent will be what in this particular case.', 'start': 12153.217, 'duration': 5.604}, {'end': 12163.295, 'text': '85% placement rate.', 'start': 12161.854, 'duration': 1.441}, {'end': 12165.296, 'text': 'So 85%.', 'start': 12163.675, 'duration': 1.621}, {'end': 12169.058, 'text': 'But we need to find out over here when alpha is given, 2.5 will be here.', 'start': 12165.296, 'duration': 3.762}, {'end': 12172.92, 'text': 'And this will be my 95% confidence interval.', 'start': 12169.578, 'duration': 3.342}, {'end': 12174.141, 'text': 'So 95 will basically be here.', 'start': 12172.94, 'duration': 1.201}, {'end': 12176.002, 'text': 'If I combine all these things, it will become 1.', 'start': 12174.261, 'duration': 1.741}, {'end': 12181.624, 'text': 'Now you need to understand whether this will become a two-tailed test or a one-tailed test.', 'start': 12176.002, 'duration': 5.622}, {'end': 12183.565, 'text': 'This is what is very much simple.', 'start': 12181.965, 'duration': 1.6}, {'end': 12185.366, 'text': 'Now this 85% will be my mean.', 'start': 12184.306, 'duration': 1.06}, {'end': 12192.365, 'text': 'My value can be greater than 
85, it can be less than 85.', 'start': 12187.236, 'duration': 5.129}, {'end': 12199.257, 'text': 'It can be greater than 85, it can be less than 85 because we are just checking whether it has a different placement rate.', 'start': 12192.365, 'duration': 6.892}, {'end': 12202.705, 'text': 'It can be greater, it can be less also.', 'start': 12200.484, 'duration': 2.221}, {'end': 12206.948, 'text': 'So that is the reason this entire test becomes a two-tailed test.', 'start': 12203.346, 'duration': 3.602}, {'end': 12213.092, 'text': 'Because the new college that gets added, it may fall in this region also, it may fall in this region.', 'start': 12207.549, 'duration': 5.543}, {'end': 12221.297, 'text': "Right now, you'll be able to see that we are just trying to check whether it is greater than 85 or whether it is less than 85.", 'start': 12213.672, 'duration': 7.625}, {'end': 12223.559, 'text': 'So this becomes a two-tailed test.', 'start': 12221.297, 'duration': 2.262}, {'end': 12226.42, 'text': 'Now let me just make a little bit change into the question.', 'start': 12223.699, 'duration': 2.721}], 'summary': 'A new college in karnataka with 150 students has an 88% placement rate, with a standard deviation of 4%, prompting a two-tailed test to determine if it has a different placement rate than the other colleges with an 85% rate.', 'duration': 210.438, 'max_score': 12015.982, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs12015982.jpg'}, {'end': 12130.829, 'src': 'embed', 'start': 12099.153, 'weight': 2, 'content': [{'end': 12100.493, 'text': 'This basically means the new college.', 'start': 12099.153, 'duration': 1.34}, {'end': 12104.835, 'text': 'Now in this particular case, first of all think about the question.', 'start': 12100.773, 'duration': 4.062}, {'end': 12113.585, 'text': 'Now, Over here it says, does this college has a different placement rate? What is the placement rate of the entire college? 
85%.', 'start': 12105.495, 'duration': 8.09}, {'end': 12124.248, 'text': 'So does it have a different rate than 85%? That is what we really need to check, right? Now in this particular case, this becomes a two-tailed test.', 'start': 12113.585, 'duration': 10.663}, {'end': 12126.608, 'text': "Why? We'll think over it.", 'start': 12125.048, 'duration': 1.56}, {'end': 12130.829, 'text': "Let's say that here the significance value is given as 0.05.", 'start': 12126.668, 'duration': 4.161}], 'summary': "Analyzing if the new college's placement rate differs from 85% using a two-tailed test with a significance value of 0.05.", 'duration': 31.676, 'max_score': 12099.153, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs12099153.jpg'}, {'end': 12374.805, 'src': 'embed', 'start': 12345.813, 'weight': 5, 'content': [{'end': 12349.394, 'text': 'I told you, right? See, this is very much important.', 'start': 12345.813, 'duration': 3.581}, {'end': 12352.775, 'text': 'confidence intervals with respect to means.', 'start': 12350.174, 'duration': 2.601}, {'end': 12357.357, 'text': 'I told you, right? In confidence interval, what we do? We basically have this graph.', 'start': 12353.335, 'duration': 4.022}, {'end': 12362.339, 'text': 'When I say my alpha value is 0.05, then this becomes a two-tailed test, suppose.', 'start': 12357.857, 'duration': 4.482}, {'end': 12366.1, 'text': 'I need to find this value, right? I need to find these two values.', 'start': 12362.879, 'duration': 3.221}, {'end': 12369.622, 'text': 'How do I find out these two values? 
That is what we are going to see.', 'start': 12366.561, 'duration': 3.061}, {'end': 12374.805, 'text': 'We are going to do some kind of calculations which will actually help me to understand.', 'start': 12370.422, 'duration': 4.383}], 'summary': 'Calculating confidence intervals for means with alpha value 0.05.', 'duration': 28.992, 'max_score': 12345.813, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs12345813.jpg'}, {'end': 12623.491, 'src': 'heatmap', 'start': 12410.841, 'weight': 0.768, 'content': [{'end': 12430.782, 'text': 'Point estimate can be defined as a value of any statistic that estimates the value of a parameter is called a point estimate.', 'start': 12410.841, 'duration': 19.941}, {'end': 12439.629, 'text': 'So a simple definition I have written over here, I will define about what is this statistic and which estimates the value of a parameter.', 'start': 12431.242, 'duration': 8.387}, {'end': 12442.752, 'text': 'So two things, one is statistics and one is parameter.', 'start': 12439.649, 'duration': 3.103}, {'end': 12449.057, 'text': 'So what exactly is point estimate? 
A value of any statistics that estimates the value of a parameter.', 'start': 12443.412, 'duration': 5.645}, {'end': 12457.487, 'text': 'Now understand one thing guys, in inferential statistics, Any work that we will be doing, first of all we will be considering a sample data.', 'start': 12449.697, 'duration': 7.79}, {'end': 12463.076, 'text': 'Based on this sample data, we will be estimating something for the population data.', 'start': 12458.749, 'duration': 4.327}, {'end': 12471.566, 'text': "In this particular example, let's consider that I will try to, if I have the sample mean, I'll try to estimate the population.", 'start': 12464.644, 'duration': 6.922}, {'end': 12475.707, 'text': 'And usually this only thing happens in inferential stats.', 'start': 12472.526, 'duration': 3.181}, {'end': 12480.189, 'text': "You'll just have the sample information, probably population, standard deviation.", 'start': 12475.967, 'duration': 4.222}, {'end': 12483.75, 'text': 'you may know, but you really need to find out or estimate the population.', 'start': 12480.189, 'duration': 3.561}, {'end': 12487.291, 'text': "And as you know, like let's say that I'll give one example.", 'start': 12484.11, 'duration': 3.181}, {'end': 12488.371, 'text': 'This is my X bar.', 'start': 12487.551, 'duration': 0.82}, {'end': 12497.335, 'text': 'This x bar, we will try to estimate the value of mu bar, right? 
Because if I have a population with the help of sample, I can definitely estimate mu.', 'start': 12488.971, 'duration': 8.364}, {'end': 12502.777, 'text': 'But always remember, this value may be approximately equal to this.', 'start': 12497.575, 'duration': 5.202}, {'end': 12504.358, 'text': 'It may be also less.', 'start': 12503.457, 'duration': 0.901}, {'end': 12506.038, 'text': 'it may be also greater, right?', 'start': 12504.358, 'duration': 1.68}, {'end': 12513.962, 'text': 'In one case I may say that if my x bar is 2.9 and probably my population mean is mu is equal to 3, right?', 'start': 12506.959, 'duration': 7.003}, {'end': 12515.162, 'text': 'This may be equal.', 'start': 12514.242, 'duration': 0.92}, {'end': 12515.902, 'text': 'this may be less.', 'start': 12515.162, 'duration': 0.74}, {'end': 12517.303, 'text': 'this may be little bit greater also.', 'start': 12515.902, 'duration': 1.401}, {'end': 12519.825, 'text': 'This is what point estimate is all about.', 'start': 12517.723, 'duration': 2.102}, {'end': 12523.95, 'text': 'So this is the point estimate which will be estimating the mu value.', 'start': 12520.186, 'duration': 3.764}, {'end': 12528.194, 'text': 'So in this particular case, I hope you have understood what exactly is point estimate.', 'start': 12524.17, 'duration': 4.024}, {'end': 12532.999, 'text': 'So point estimate is the value of any statistic that estimates the value of a parameter.', 'start': 12529.135, 'duration': 3.864}, {'end': 12536.342, 'text': 'So through this, we are basically estimating the mean.', 'start': 12533.419, 'duration': 2.923}, {'end': 12538.405, 'text': 'So at least get this specific knowledge.', 'start': 12536.723, 'duration': 1.682}, {'end': 12542.469, 'text': 'Now, in most of the problem statement, I will be given this.', 'start': 12539.005, 'duration': 3.464}, {'end': 12544.491, 'text': 'And I really need to estimate this.', 'start': 12542.729, 'duration': 1.762}, {'end': 12549.858, 'text': 'How will I be able 
to do this? So for that specific case, we will try to see a problem statement.', 'start': 12544.972, 'duration': 4.886}, {'end': 12553.462, 'text': 'And here, we will use something called as confidence interval.', 'start': 12550.198, 'duration': 3.264}, {'end': 12558.688, 'text': 'Now understand, I told you that this value will be approximately equal to mean.', 'start': 12553.502, 'duration': 5.186}, {'end': 12561.47, 'text': 'It may be less than mean, it may be greater than mean.', 'start': 12559.268, 'duration': 2.202}, {'end': 12570.458, 'text': 'So in this particular scenario, we define something called as confidence intervals so that we will be able to come towards the population mean.', 'start': 12561.911, 'duration': 8.547}, {'end': 12579.222, 'text': 'So confidence interval is usually given by the formula Which is nothing but point estimate plus or minus margin of error.', 'start': 12570.658, 'duration': 8.564}, {'end': 12580.962, 'text': 'So there is some margin of error.', 'start': 12579.422, 'duration': 1.54}, {'end': 12582.743, 'text': 'There is some margin of error.', 'start': 12581.403, 'duration': 1.34}, {'end': 12584.323, 'text': 'Because over here you can see 2.9.', 'start': 12583.163, 'duration': 1.16}, {'end': 12586.504, 'text': 'This is obviously less.', 'start': 12584.323, 'duration': 2.181}, {'end': 12588.124, 'text': 'It can also be greater.', 'start': 12586.944, 'duration': 1.18}, {'end': 12590.925, 'text': 'So I have written plus or minus of margin of error.', 'start': 12588.404, 'duration': 2.521}, {'end': 12594.945, 'text': "Because obviously we will not know the exact population mean, right? 
We don't know.", 'start': 12591.125, 'duration': 3.82}, {'end': 12601.687, 'text': 'So obviously the point estimate plus margin of error will actually help us to get the same mean.', 'start': 12595.526, 'duration': 6.161}, {'end': 12604.707, 'text': 'And this is how we determine the confidence interval.', 'start': 12602.367, 'duration': 2.34}, {'end': 12606.488, 'text': "Now let's see one problem statement.", 'start': 12604.968, 'duration': 1.52}, {'end': 12611.709, 'text': 'By this, you will basically be able to understand what I am actually saying.', 'start': 12607.088, 'duration': 4.621}, {'end': 12617.63, 'text': 'From this formula, you will be able to understand that how close we are near to the population mean.', 'start': 12612.209, 'duration': 5.421}, {'end': 12623.491, 'text': 'The second thing is that suppose, if you are given the population standard deviation at that point of time,', 'start': 12618.13, 'duration': 5.361}], 'summary': 'Point estimate is a value that estimates the population parameter in inferential statistics using sample data and confidence intervals to determine the accuracy.', 'duration': 212.65, 'max_score': 12410.841, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs12410841.jpg'}, {'end': 12471.566, 'src': 'embed', 'start': 12443.412, 'weight': 4, 'content': [{'end': 12449.057, 'text': 'So what exactly is point estimate? 
A value of any statistics that estimates the value of a parameter.', 'start': 12443.412, 'duration': 5.645}, {'end': 12457.487, 'text': 'Now understand one thing guys, in inferential statistics, Any work that we will be doing, first of all we will be considering a sample data.', 'start': 12449.697, 'duration': 7.79}, {'end': 12463.076, 'text': 'Based on this sample data, we will be estimating something for the population data.', 'start': 12458.749, 'duration': 4.327}, {'end': 12471.566, 'text': "In this particular example, let's consider that I will try to, if I have the sample mean, I'll try to estimate the population.", 'start': 12464.644, 'duration': 6.922}], 'summary': 'Point estimate is a statistical value used to estimate a parameter from sample data.', 'duration': 28.154, 'max_score': 12443.412, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs12443412.jpg'}, {'end': 12977.05, 'src': 'embed', 'start': 12945.325, 'weight': 3, 'content': [{'end': 12949.105, 'text': 'Similarly lower bound of confidence interval I will try to find out.', 'start': 12945.325, 'duration': 3.78}, {'end': 12960.089, 'text': 'That is x bar minus z 0.05 divided by 2 100 divided by root 25.', 'start': 12949.385, 'duration': 10.704}, {'end': 12968.887, 'text': 'Now here I will write 0.05 by 2 is nothing but z is nothing but 0.025.', 'start': 12960.089, 'duration': 8.798}, {'end': 12970.15, 'text': 'I hope everybody is getting this.', 'start': 12968.887, 'duration': 1.263}, {'end': 12977.05, 'text': 'Now, how do I find out this particular value for this? 
Go and open your browser and open Z table.', 'start': 12970.908, 'duration': 6.142}], 'summary': 'Finding lower bound confidence interval using z-score and sample size of 25.', 'duration': 31.725, 'max_score': 12945.325, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs12945325.jpg'}, {'end': 13635.803, 'src': 'heatmap', 'start': 12813.377, 'weight': 7, 'content': [{'end': 12815.36, 'text': 'This is for my confidence interval formula.', 'start': 12813.377, 'duration': 1.983}, {'end': 12817.963, 'text': 'Now, point estimate is obviously your X bar.', 'start': 12815.66, 'duration': 2.303}, {'end': 12827.203, 'text': 'Now plus or minus, whenever you have this population standard deviation, you apply a Z test.', 'start': 12819.013, 'duration': 8.19}, {'end': 12830.723, 'text': 'So here you will write Z alpha by 2.', 'start': 12827.484, 'duration': 3.239}, {'end': 12834.646, 'text': 'And the formula will be standard deviation divided by root n.', 'start': 12830.723, 'duration': 3.923}, {'end': 12835.907, 'text': 'Now this is your formula.', 'start': 12834.646, 'duration': 1.261}, {'end': 12838.529, 'text': "This term, I'll talk about this term.", 'start': 12836.167, 'duration': 2.362}, {'end': 12842.552, 'text': 'This term that you see is called as standard error.', 'start': 12839.13, 'duration': 3.422}, {'end': 12850.219, 'text': 'So in this particular case, one more second point is that when we should use this formula to find out the confidence interval.', 'start': 12843.353, 'duration': 6.866}, {'end': 12856.844, 'text': 'The next thing is that over here you will be able to see that I have taken a sample of 25.', 'start': 12850.919, 'duration': 5.925}, {'end': 12860.847, 'text': 'But usually the sample size will be greater than or equal to 30.', 'start': 12856.844, 'duration': 4.003}, {'end': 12864.73, 'text': 'But just for an example I have taken as 25.', 'start': 12860.847, 'duration': 3.883}, {'end': 
12866.531, 'text': "So it's okay.", 'start': 12864.73, 'duration': 1.801}, {'end': 12869.753, 'text': "Now don't fight with me Krish why I have taken 25.", 'start': 12867.011, 'duration': 2.742}, {'end': 12871.895, 'text': 'Take it 30 also we have to do the calculation.', 'start': 12869.753, 'duration': 2.142}, {'end': 12873.877, 'text': 'But these two conditions suits well.', 'start': 12872.115, 'duration': 1.762}, {'end': 12877.079, 'text': 'for this kind of problem statement, okay.', 'start': 12874.918, 'duration': 2.161}, {'end': 12883.163, 'text': 'So for a Z test to happen, most of the time this two condition needs to be approved.', 'start': 12877.74, 'duration': 5.423}, {'end': 12886.465, 'text': 'Now this Z test is nothing but Z score, okay.', 'start': 12883.663, 'duration': 2.802}, {'end': 12890.368, 'text': 'Z score to find out the Z score, that is what Z test is basically used.', 'start': 12886.505, 'duration': 3.863}, {'end': 12894.07, 'text': 'Now understand over here what this alpha is, okay.', 'start': 12890.848, 'duration': 3.222}, {'end': 12906.058, 'text': 'So this is the entire formula to find out the confidence interval if your Population standard deviation is given and when your sample size is greater than or equal to 30..', 'start': 12894.09, 'duration': 11.968}, {'end': 12908.219, 'text': 'Now, let us go and solve this particular problem.', 'start': 12906.058, 'duration': 2.161}, {'end': 12913.823, 'text': 'Now, when I go and solve this particular problem, the first thing is that I will split this equation into two parts.', 'start': 12908.78, 'duration': 5.043}, {'end': 12917.245, 'text': 'One is I will get one higher confidence interval.', 'start': 12914.624, 'duration': 2.621}, {'end': 12921.088, 'text': 'Alpha value is 0.05 divided by 2.', 'start': 12918.026, 'duration': 3.062}, {'end': 12930.222, 'text': 'Standard deviation is, what is standard deviation over here? 
It is nothing but 100 divided by root 25.', 'start': 12921.088, 'duration': 9.134}, {'end': 12934.263, 'text': 'Now you understood why I have taken 25 because my calculation will become easier.', 'start': 12930.222, 'duration': 4.041}, {'end': 12936.003, 'text': "Don't fight with me guys.", 'start': 12934.943, 'duration': 1.06}, {'end': 12937.203, 'text': "I don't have energy to fight.", 'start': 12936.123, 'duration': 1.08}, {'end': 12939.464, 'text': 'Nowadays I fight with lot of people.', 'start': 12937.903, 'duration': 1.561}, {'end': 12942.624, 'text': 'So this will basically be my upper bound.', 'start': 12940.304, 'duration': 2.32}, {'end': 12944.925, 'text': 'Upper bound of confidence interval.', 'start': 12943.304, 'duration': 1.621}, {'end': 12949.105, 'text': 'Similarly lower bound of confidence interval I will try to find out.', 'start': 12945.325, 'duration': 3.78}, {'end': 12960.089, 'text': 'That is x bar minus z 0.05 divided by 2 100 divided by root 25.', 'start': 12949.385, 'duration': 10.704}, {'end': 12968.887, 'text': 'Now here I will write 0.05 by 2 is nothing but z is nothing but 0.025.', 'start': 12960.089, 'duration': 8.798}, {'end': 12970.15, 'text': 'I hope everybody is getting this.', 'start': 12968.887, 'duration': 1.263}, {'end': 12977.05, 'text': 'Now, how do I find out this particular value for this? 
Go and open your browser and open Z table.', 'start': 12970.908, 'duration': 6.142}, {'end': 12988.014, 'text': "So if I go and open Z table, if I open Z table, let me just open a Z table, another Z table, I'll try to open just a second.", 'start': 12977.07, 'duration': 10.944}, {'end': 12991.055, 'text': 'Point, here all minus are basically shown.', 'start': 12988.714, 'duration': 2.341}, {'end': 12997.977, 'text': "So I'll not use this Z table, which I'll use the other one because there are only negative values given.", 'start': 12991.975, 'duration': 6.002}, {'end': 13000.018, 'text': "Here probably I'll be able to find out.", 'start': 12998.557, 'duration': 1.461}, {'end': 13014.837, 'text': 'Okay Now in Z table, always understand, always understand, over here when I say 0.025, okay, my entire area is how much? So my entire area is 1.', 'start': 13000.158, 'duration': 14.679}, {'end': 13023.381, 'text': 'If I subtract 1 with 0.025, that basically means this part, the entire area will become 0.975.', 'start': 13014.837, 'duration': 8.544}, {'end': 13026.924, 'text': 'So 0.975, I have to check in the Z table.', 'start': 13023.381, 'duration': 3.543}, {'end': 13030.486, 'text': 'So for this, what I will do is that I will go to my browser.', 'start': 13027.604, 'duration': 2.882}, {'end': 13037.537, 'text': 'And go and check it, where is 0.975? 0.975 is nothing but this specific area.', 'start': 13031.507, 'duration': 6.03}, {'end': 13041.363, 'text': 'Go and check this, 0.975.', 'start': 13038.398, 'duration': 2.965}, {'end': 13042.465, 'text': 'I hope you are able to see this.', 'start': 13041.363, 'duration': 1.102}, {'end': 13048.93, 'text': 'So what is this value, 1.9? 
And if I go on top, it is 0.06.', 'start': 13042.605, 'duration': 6.325}, {'end': 13053.331, 'text': 'That basically means the Z value is 1.96.', 'start': 13048.93, 'duration': 4.401}, {'end': 13057.113, 'text': 'So go down over here, you will be able to see 0.9750.', 'start': 13053.331, 'duration': 3.782}, {'end': 13061.974, 'text': 'It is nothing but 1.9 and this is 0.06.', 'start': 13057.113, 'duration': 4.861}, {'end': 13064.175, 'text': 'So this becomes my Z score.', 'start': 13061.974, 'duration': 2.201}, {'end': 13068.536, 'text': 'So finally, I get my value as 1.96.', 'start': 13064.395, 'duration': 4.141}, {'end': 13071.317, 'text': 'Now go and calculate it.', 'start': 13068.536, 'duration': 2.781}, {'end': 13081.483, 'text': 'So what is my X bar for the upper bound? I will say my x bar is nothing but what is the mean of the sample? It is nothing but 520.', 'start': 13071.817, 'duration': 9.666}, {'end': 13089.108, 'text': 'Okay So it is 520 plus 1.96 multiplied by 20.', 'start': 13081.483, 'duration': 7.625}, {'end': 13097.314, 'text': 'Similarly, the lower bound, it is nothing but 520 minus 1.96 multiplied by 20.', 'start': 13089.108, 'duration': 8.206}, {'end': 13098.575, 'text': 'Now go ahead and compute this.', 'start': 13097.314, 'duration': 1.261}, {'end': 13103.797, 'text': '559.2, 480.8.', 'start': 13098.595, 'duration': 5.202}, {'end': 13105.799, 'text': 'So this is my lower bound and upper bound.', 'start': 13103.798, 'duration': 2.001}, {'end': 13126.324, 'text': 'that basically means whenever I am defining my confidence interval for this distribution with alpha is 0.05 and this this value will be 559.2 and this value will be 480.8 and my mean will basically be 520 right.', 'start': 13105.799, 'duration': 20.525}, {'end': 13130.126, 'text': 'So, one stats interview question that I told right.', 'start': 13126.885, 'duration': 3.241}, {'end': 13139.301, 'text': 'average size of the sharks, sharks throughout the world.', 'start': 13133.235, 
'duration': 6.066}, {'end': 13148.149, 'text': 'Can you solve this by taking your own example? Because one of my student solved this particular problem and he gave some confidence interval.', 'start': 13139.941, 'duration': 8.208}, {'end': 13153.794, 'text': "He said that let's assume this, this, this, this, this and try to solve in this particular way.", 'start': 13148.709, 'duration': 5.085}, {'end': 13157.364, 'text': "He said that, okay, let's consider.", 'start': 13155.543, 'duration': 1.821}, {'end': 13161.565, 'text': 'Oh, there the interviewer said, you know the population standard deviation.', 'start': 13157.604, 'duration': 3.961}, {'end': 13163.446, 'text': 'You know the X bar value.', 'start': 13162.065, 'duration': 1.381}, {'end': 13164.666, 'text': 'You know the N value.', 'start': 13163.506, 'duration': 1.16}, {'end': 13168.587, 'text': 'Try to solve it with alpha as 0.05.', 'start': 13165.286, 'duration': 3.301}, {'end': 13178.35, 'text': 'Ayush Nautial understand that over here when my alpha value is 0.025, I am just worried about one tail, right? 
This side.', 'start': 13168.587, 'duration': 9.763}, {'end': 13180.031, 'text': 'This entire area is 1.', 'start': 13178.991, 'duration': 1.04}, {'end': 13183.752, 'text': 'So 1 minus 0.025 is 0.975.', 'start': 13180.031, 'duration': 3.721}, {'end': 13185.854, 'text': 'Now, after performing any experiment.', 'start': 13183.752, 'duration': 2.102}, {'end': 13193.841, 'text': 'if my value falls between these two at that point of time, I will assume that it is.', 'start': 13185.854, 'duration': 7.987}, {'end': 13197.765, 'text': 'we need to accept the null hypothesis and we can go ahead with it.', 'start': 13193.841, 'duration': 3.924}, {'end': 13202.669, 'text': 'If it does not fall within this range, then it is going to fall away from the hypothesis.', 'start': 13197.925, 'duration': 4.744}, {'end': 13204.411, 'text': 'Basically, we need to reject the null hypothesis.', 'start': 13202.93, 'duration': 1.481}, {'end': 13211.006, 'text': 'Now, the next question that we are probably going to see is that what if the population standard deviation is not given?', 'start': 13205.14, 'duration': 5.866}, {'end': 13214.27, 'text': 'Now, in that particular scenario, what will you do?', 'start': 13211.827, 'duration': 2.443}, {'end': 13219.195, 'text': 'For that particular case, you really need to use something called as t-test.', 'start': 13215.291, 'duration': 3.904}, {'end': 13223.68, 'text': 'So, let me just show you one very good example and that also we will try to solve.', 'start': 13220.316, 'duration': 3.364}, {'end': 13228.224, 'text': 'Let us say that the same question this standard deviation is not given.', 'start': 13223.82, 'duration': 4.404}, {'end': 13229.606, 'text': 'standard deviation is not given.', 'start': 13228.224, 'duration': 1.382}, {'end': 13233.41, 'text': 'population standard deviation is not given, but sample standard deviation is given.', 'start': 13229.606, 'duration': 3.804}, {'end': 13239.256, 'text': 'So, I will write down the question over 
here for you, but I hope you are able to understand it.', 'start': 13233.53, 'duration': 5.726}, {'end': 13250.727, 'text': 'So the question is that on the quant test of CAT exam, on the quant test of a CAT exam,', 'start': 13240.037, 'duration': 10.69}, {'end': 13263.02, 'text': 'a sample of 25 test takers has a mean of 520 score with a standard deviation.', 'start': 13250.727, 'duration': 12.293}, {'end': 13270.525, 'text': 'now this standard deviation that is given is basically your sample standard deviation has a standard deviation of 80.', 'start': 13263.02, 'duration': 7.505}, {'end': 13278.029, 'text': 'construct 95 percent confidence interval about the mean.', 'start': 13270.525, 'duration': 7.504}, {'end': 13280.63, 'text': 'so this is basically my question, right?', 'start': 13278.029, 'duration': 2.601}, {'end': 13282.591, 'text': 'so this is my question.', 'start': 13280.63, 'duration': 1.961}, {'end': 13286.213, 'text': 'so over here you can see that population standard deviation is not given.', 'start': 13282.591, 'duration': 3.622}, {'end': 13291.171, 'text': 'So in this particular case I definitely have to use Z test.', 'start': 13287.589, 'duration': 3.582}, {'end': 13293.012, 'text': 'So, over here, sorry, T test.', 'start': 13291.271, 'duration': 1.741}, {'end': 13295.433, 'text': 'Condition, I will write, okay.', 'start': 13293.612, 'duration': 1.821}, {'end': 13298.414, 'text': 'First of all, we will try to see what all things are given.', 'start': 13296.233, 'duration': 2.181}, {'end': 13300.555, 'text': 'Your n value is given, which is 25.', 'start': 13298.914, 'duration': 1.641}, {'end': 13303.136, 'text': 'Your x bar is given, which is nothing but 520.', 'start': 13300.555, 'duration': 2.581}, {'end': 13308.119, 'text': 'Right, your sample standard deviation is given, that is 80.', 'start': 13303.136, 'duration': 4.983}, {'end': 13310.5, 'text': 'And your alpha is 0.05.', 'start': 13308.119, 'duration': 2.381}, {'end': 13315.602, 'text': 
'So, when you see over here, your values have not been given over here.', 'start': 13310.5, 'duration': 5.102}, {'end': 13325.098, 'text': 'That basically means, you know, I will not say conditions, but here your population standard deviation is not given.', 'start': 13315.682, 'duration': 9.416}, {'end': 13332.844, 'text': 'So, I can write a condition saying that here population standard deviation is not given.', 'start': 13325.278, 'duration': 7.566}, {'end': 13337.227, 'text': 'So, in this particular case, we use something called the t-test.', 'start': 13334.025, 'duration': 3.202}, {'end': 13342.411, 'text': 'If population standard deviation is not given, at that point of time you use the t-test.', 'start': 13338.068, 'duration': 4.343}, {'end': 13345.459, 'text': "Let's go and try to compute it.", 'start': 13343.918, 'duration': 1.541}, {'end': 13347.441, 'text': 'Here also the same formula will be used.', 'start': 13345.6, 'duration': 1.841}, {'end': 13352.327, 'text': 'Point estimate plus or minus margin of error.', 'start': 13348.623, 'duration': 3.704}, {'end': 13355.85, 'text': 'Here your margin of error formula will change.', 'start': 13352.927, 'duration': 2.923}, {'end': 13361.056, 'text': 'Now what kind of formula it will have? 
That you need to understand.', 'start': 13357.432, 'duration': 3.624}, {'end': 13371.222, 'text': 'The formula will be something like x bar plus or minus, instead of writing z alpha by 2, here you will be writing t alpha by 2.', 'start': 13361.636, 'duration': 9.586}, {'end': 13374.204, 'text': 'And then you will be using s by root n.', 'start': 13371.222, 'duration': 2.982}, {'end': 13375.886, 'text': 'This is your standard error.', 'start': 13374.204, 'duration': 1.682}, {'end': 13378.047, 'text': 'Now go ahead and substitute it.', 'start': 13376.546, 'duration': 1.501}, {'end': 13380.389, 'text': 'So two things you will be basically having.', 'start': 13378.507, 'duration': 1.882}, {'end': 13381.549, 'text': 'One is upper bound.', 'start': 13380.649, 'duration': 0.9}, {'end': 13387.553, 'text': 'It will be x bar plus t 0.05 by 2, s by root n right?', 'start': 13382.37, 'duration': 5.183}, {'end': 13395.397, 'text': 'Now, first thing, first, always understand.', 'start': 13391.856, 'duration': 3.541}, {'end': 13402.2, 'text': 'to calculate the t, okay, to calculate the t value, you need to find out something called the degree of freedom.', 'start': 13395.397, 'duration': 6.803}, {'end': 13405.601, 'text': 'Because in the t table, you will be asked this.', 'start': 13402.7, 'duration': 2.901}, {'end': 13415.685, 'text': "And the degree of freedom formula is just like your sample variance problem, that is n minus 1, which we also use with respect to Bessel's correction.", 'start': 13406.461, 'duration': 9.224}, {'end': 13419.577, 'text': 'So, this will be 25 minus 1, which is nothing but 24.', 'start': 13415.725, 'duration': 3.852}, {'end': 13420.838, 'text': 'Now I will go to my browser.', 'start': 13419.577, 'duration': 1.261}, {'end': 13424.562, 'text': 'I will open over here T-table.', 'start': 13420.858, 'duration': 3.704}, {'end': 13426.965, 'text': 'So T-table I am having here.', 'start': 13425.063, 'duration': 1.902}, {'end': 13433.051, 'text': 'Now first thing 
first you need to understand with respect to degree of freedom.', 'start': 13426.985, 'duration': 6.066}, {'end': 13435.814, 'text': 'What is degree of freedom? 24.', 'start': 13433.151, 'duration': 2.663}, {'end': 13436.755, 'text': 'Degree of freedom is 24.', 'start': 13435.814, 'duration': 0.941}, {'end': 13437.396, 'text': "25 Let's see this.", 'start': 13436.755, 'duration': 0.641}, {'end': 13447.592, 'text': 'This is 24, right? I hope everybody is able to see the degree of freedom over here.', 'start': 13442.642, 'duration': 4.95}, {'end': 13449.637, 'text': 'Try to have a look onto this table.', 'start': 13447.973, 'duration': 1.664}, {'end': 13457.192, 'text': '0.025, 0.025.', 'start': 13449.637, 'duration': 7.555}, {'end': 13459.944, 'text': '0.025 is nothing but this one right.', 'start': 13457.192, 'duration': 2.752}, {'end': 13462.046, 'text': 'this is what 0.975.', 'start': 13459.944, 'duration': 2.102}, {'end': 13467.229, 'text': 'so if i see with respect to 2.2, sorry, 24, it is nothing but 2.064.', 'start': 13462.046, 'duration': 5.183}, {'end': 13469.971, 'text': 'is everybody getting it?', 'start': 13467.229, 'duration': 2.742}, {'end': 13474.735, 'text': 'we have to see in this line 24 degree of freedom on the left hand side, on the right hand side.', 'start': 13469.971, 'duration': 4.764}, {'end': 13476.276, 'text': 'you can see on top it is 0.025, 0.05.', 'start': 13474.735, 'duration': 1.541}, {'end': 13476.776, 'text': 'so the answer is 2.06, 2.064.', 'start': 13476.276, 'duration': 0.5}, {'end': 13491.854, 'text': "so here i'm basically going to find your t 0.05, divided by 2 is equal to nothing but 2.064.", 'start': 13476.776, 'duration': 15.078}, {'end': 13493.416, 'text': 'now the next step.', 'start': 13491.854, 'duration': 1.562}, {'end': 13505.168, 'text': 'once you get this, i will go and see what is my x bar 520, 520 plus 2.064, multiplied by s.', 'start': 13493.416, 'duration': 11.752}, {'end': 13505.969, 'text': 'what is s?', 
'start': 13505.168, 'duration': 0.801}, {'end': 13506.429, 'text': 'over here?', 'start': 13505.969, 'duration': 0.46}, {'end': 13510.392, 'text': 'it is nothing but 80 by 5.', 'start': 13506.429, 'duration': 3.963}, {'end': 13512.314, 'text': '5 is nothing but root 25.', 'start': 13510.392, 'duration': 1.922}, {'end': 13516.776, 'text': 'root 25 is 5, so I get 553.024.', 'start': 13512.314, 'duration': 4.462}, {'end': 13522.459, 'text': 'and then, if i go and compute the lower bound 520 minus 2.064, 80 by 5, so this minus 520.', 'start': 13516.776, 'duration': 5.683}, {'end': 13525.821, 'text': "so here i'm actually getting 486 point.", 'start': 13522.459, 'duration': 3.362}, {'end': 13541.154, 'text': 'So my lower bound is nothing but 486.97.', 'start': 13536.43, 'duration': 4.724}, {'end': 13544.557, 'text': 'the upper bound of the confidence interval is nothing but 553.02.', 'start': 13541.154, 'duration': 3.403}, {'end': 13552.584, 'text': 'So with this we have done, wow I have written so much today.', 'start': 13544.557, 'duration': 8.027}, {'end': 13555.086, 'text': 'We have finished confidence interval.', 'start': 13553.445, 'duration': 1.641}, {'end': 13557.608, 'text': 'Congratulations everybody.', 'start': 13556.387, 'duration': 1.221}, {'end': 13560.151, 'text': 'We have successfully completed.', 'start': 13558.669, 'duration': 1.482}, {'end': 13568.736, 'text': 'Congratulations. Why is this not 2 tail? This is 2 tail only, no? I told you, no? This is 2 tail.', 'start': 13560.171, 'duration': 8.565}, {'end': 13570.837, 'text': 'Why are you getting confused? 
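The t-interval just computed by hand (520 ± 2.064 × 80/√25, giving roughly 486.98 and 553.02) can be cross-checked the same way. A small sketch; the critical value 2.064 is hardcoded from the t-table row for 24 degrees of freedom so the snippet stays dependency-free (`scipy.stats.t.ppf(0.975, 24)` would return the same value).

```python
from math import sqrt

x_bar, s, n = 520, 80, 25

df = n - 1          # degrees of freedom, the same n - 1 as in Bessel's correction
t_crit = 2.064      # t-table entry for df = 24 and upper-tail area 0.025

margin = t_crit * s / sqrt(n)              # t times the standard error s/sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(round(lower, 2), round(upper, 2))    # 486.98 553.02
```

Up to rounding, these match the 486.97 and 553.02 bounds stated in the lecture.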
See over here.', 'start': 13569.316, 'duration': 1.521}, {'end': 13577.8, 'text': 'If I see over here, 0.025 for 1 tail, for 2 tail this is 0.05.', 'start': 13572.238, 'duration': 5.562}, {'end': 13580.562, 'text': "Now let's go ahead and try to do the first Z test.", 'start': 13577.8, 'duration': 2.762}, {'end': 13584.991, 'text': 'I hope everybody has understood why we use the Z test.', 'start': 13582.347, 'duration': 2.644}, {'end': 13592.263, 'text': 'So the first question that we are going to solve is one sample Z test. now we will perform hypothesis testing.', 'start': 13586.394, 'duration': 5.869}, {'end': 13599.916, 'text': 'So the first problem that we are going to solve is one sample Z test.', 'start': 13592.724, 'duration': 7.192}, {'end': 13602.138, 'text': 'now we are going to perform hypothesis testing.', 'start': 13599.916, 'duration': 2.222}, {'end': 13604.019, 'text': 'what exactly is one sample Z test?', 'start': 13602.138, 'duration': 1.881}, {'end': 13607.321, 'text': 'first of all, I told you two conditions with respect to the Z test.', 'start': 13604.019, 'duration': 3.302}, {'end': 13612.745, 'text': 'the first condition is that the population standard deviation is given.', 'start': 13607.321, 'duration': 5.424}, {'end': 13614.746, 'text': 'at that time you use the Z test.', 'start': 13612.745, 'duration': 2.001}, {'end': 13618.768, 'text': 'the second thing is that your sample size should be.', 'start': 13614.746, 'duration': 4.022}, {'end': 13621.21, 'text': 'the sample size should have a size of at least', 'start': 13618.768, 'duration': 2.442}, {'end': 13624.232, 'text': 'n greater than or equal to 30.', 'start': 13621.21, 'duration': 3.022}, {'end': 13634.802, 'text': 'just to make the calculation easier, I just put it at n is equal to 25, because root of 25 was 5; so because of that I put it, do not fight with me.', 'start': 13624.232, 'duration': 10.57}, {'end': 13635.803, 'text': 'I have no energy.', 'start': 13634.802, 'duration': 1.001}], 
'summary': 'The transcript discusses confidence interval formulas, z test, and t test for hypothesis testing with examples and calculations.', 'duration': 53.456, 'max_score': 12813.377, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs12813377.jpg'}, {'end': 13855.686, 'src': 'embed', 'start': 13836.075, 'weight': 8, 'content': [{'end': 13847.422, 'text': 'in the decision rule you need to specify this graph and here you will basically say that since my alpha is 0.05, what kind of test this will be?', 'start': 13836.075, 'duration': 11.347}, {'end': 13851.825, 'text': 'Did the medication affect the intelligence? The question.', 'start': 13848.303, 'duration': 3.522}, {'end': 13852.745, 'text': 'understand this question.', 'start': 13851.825, 'duration': 0.92}, {'end': 13855.686, 'text': 'Did the medication affect the intelligence?', 'start': 13853.305, 'duration': 2.381}], 'summary': 'Decision rule: specify graph, alpha 0.05, test impact of medication on intelligence.', 'duration': 19.611, 'max_score': 13836.075, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs13836075.jpg'}, {'end': 13961.02, 'src': 'embed', 'start': 13932.782, 'weight': 10, 'content': [{'end': 13938.007, 'text': 'See this? We got 1.96.', 'start': 13932.782, 'duration': 5.225}, {'end': 13942.39, 'text': 'Okay, so 1.96 plus minus 1.96.', 'start': 13938.007, 'duration': 4.383}, {'end': 13949.514, 'text': 'Now I know my decision rule, my whatever experiment I will perform later on, whatever Z score value I will be getting,', 'start': 13942.39, 'duration': 7.124}, {'end': 13954.257, 'text': 'I should be getting within this minus 1.96 to plus 1.96..', 'start': 13949.514, 'duration': 4.743}, {'end': 13957.079, 'text': 'Now here I will use my test statistics.', 'start': 13954.257, 'duration': 2.822}, {'end': 13961.02, 'text': "Now, what will be my test statistics over here? 
It's very simple.", 'start': 13957.799, 'duration': 3.221}], 'summary': 'Experiment requires z score within ±1.96 for test statistics.', 'duration': 28.238, 'max_score': 13932.782, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs13932782.jpg'}, {'end': 14533.986, 'src': 'embed', 'start': 14502.683, 'weight': 6, 'content': [{'end': 14506.346, 'text': 'So it is T table, degree of freedom 29.', 'start': 14502.683, 'duration': 3.663}, {'end': 14512.014, 'text': 'So 2.045, so 2.045.', 'start': 14506.346, 'duration': 5.668}, {'end': 14519.078, 'text': 'So here you will be able to see plus 2.0, what was that? 2.045, sorry.', 'start': 14512.014, 'duration': 7.064}, {'end': 14524.801, 'text': 'I am minus 2.045, right? So this is your decision rule.', 'start': 14519.238, 'duration': 5.563}, {'end': 14528.883, 'text': 'Now your t value that you should be getting should be between this.', 'start': 14525.581, 'duration': 3.302}, {'end': 14532.145, 'text': 'If it is greater or lesser than this, you reject the null hypothesis.', 'start': 14528.963, 'duration': 3.182}, {'end': 14533.986, 'text': 'That is what you have to probably do.', 'start': 14532.325, 'duration': 1.661}], 'summary': 'T table, df 29, t value 2.045. 
decision rule: reject if t > 2.045 or t < -2.045.', 'duration': 31.303, 'max_score': 14502.683, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs14502683.jpg'}, {'end': 14652.844, 'src': 'embed', 'start': 14617.356, 'weight': 11, 'content': [{'end': 14621.157, 'text': 'the null hypothesis, accept the null hypothesis.', 'start': 14617.356, 'duration': 3.801}, {'end': 14625.398, 'text': 'sorry, reject the null hypothesis, accept the alternate.', 'start': 14621.157, 'duration': 4.241}, {'end': 14630.999, 'text': 'so from my teaching, did your IQ increase or not?', 'start': 14625.398, 'duration': 5.601}, {'end': 14637.921, 'text': "now let's see a real world problem, and probably you can do this by yourself.", 'start': 14630.999, 'duration': 6.922}, {'end': 14645.963, 'text': 'a bank wants to open an ATM machine in a specific area.', 'start': 14637.921, 'duration': 8.042}, {'end': 14652.844, 'text': 'So this problem you have to formulate, and you have to think over how we can apply hypothesis testing.', 'start': 14647.281, 'duration': 5.563}], 'summary': 'Teaching on rejecting the null hypothesis, applying hypothesis testing to opening an ATM in a specific area.', 'duration': 35.488, 'max_score': 14617.356, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs14617356.jpg'}], 'start': 11815.343, 'title': 'Statistical analysis', 'summary': 'Covers concepts of type 1 and type 2 errors, one-tailed and two-tailed tests, analysis of college placement rates, confidence interval calculation, point estimate, t-test, one sample z test for hypothesis testing, and test statistics, with real-world scenarios and specific examples, such as comparing college placement rates in Karnataka and computing confidence intervals and test scores with given data.', 'chapters': [{'end': 11998.73, 'start': 11815.343, 'title': 'Understanding type 1 and type 2 errors', 'summary': 'Explains the 
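The two-tailed decision rule just described, reject the null hypothesis when the t statistic falls outside ±2.045 at 29 degrees of freedom, can be sketched as a small helper. The sample figures in the example call (mean 104 against a hypothesized 100, s = 9, n = 30) are hypothetical placeholders, not numbers from the lecture.

```python
from math import sqrt

def one_sample_t_decision(x_bar, mu0, s, n, t_crit):
    """Two-tailed one-sample t test: reject H0 when |t| exceeds the critical value."""
    t_stat = (x_bar - mu0) / (s / sqrt(n))   # test statistic
    return t_stat, abs(t_stat) > t_crit      # True means reject the null hypothesis

# Hypothetical IQ-style numbers: n = 30 gives df = 29, for which the t-table gives 2.045
t_stat, reject = one_sample_t_decision(x_bar=104, mu0=100, s=9, n=30, t_crit=2.045)
print(round(t_stat, 3), reject)              # 2.434 True
```

Here the statistic lands outside ±2.045, so under these made-up numbers the null hypothesis would be rejected.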
concepts of type 1 and type 2 errors, using a confusion matrix and real-world scenarios. it also introduces the topic of one-tailed and two-tailed tests.', 'duration': 183.387, 'highlights': ['The chapter focuses on explaining type 1 and type 2 errors using real-world scenarios and the concept of a confusion matrix, emphasizing the importance of making the correct decisions in hypothesis testing.', 'The concept of type 1 and type 2 errors is clearly explained, with a specific focus on understanding the implications of rejecting the null hypothesis when it is true (type 1 error) and accepting the null hypothesis when it is false (type 2 error).', 'Introduction of the topic of one-tailed and two-tailed tests is highlighted as an important concept, indicating a shift in the discussion towards this new topic.']}, {'end': 12308.933, 'start': 11999.511, 'title': 'College placement rate analysis', 'summary': 'Discusses the analysis of college placement rates in Karnataka, comparing the placement rate of a new college with a sample of 150 students showing an 88% placement rate with a standard deviation of 4%, to the 85% placement rate of other colleges, determining whether it has a different placement rate using one-tailed and two-tailed tests with a significance level of 0.05.', 'duration': 309.422, 'highlights': ["Comparing the placement rate of a new college with a sample of 150 students showing an 88% placement rate with a standard deviation of 4%, to the 85% placement rate of other colleges The new college's placement rate of 88% with a sample of 150 students is compared to the 85% placement rate of other colleges in Karnataka.", "Determining whether the new college has a different placement rate using one-tailed and two-tailed tests with a significance level of 0.05 The analysis involves using both one-tailed and two-tailed tests with a significance level of 0.05 to determine if the new college's placement rate is different from the 85% rate of other colleges."]}, 
{'end': 13193.841, 'start': 12309.413, 'title': 'Confidence interval calculation and understanding point estimate', 'summary': 'Explains the concept of confidence interval, point estimate, and the calculation of a 95% confidence interval for a given problem statement with a sample size of 25 and a population standard deviation of 100, using relevant formulas and z-scores.', 'duration': 884.428, 'highlights': ['The chapter discusses the calculation of a 95% confidence interval for a given problem statement with a sample size of 25 and a population standard deviation of 100. The chapter delves into the practical application of confidence interval calculation, demonstrating how to compute the upper and lower bounds for a 95% confidence interval using the formula and Z-scores.', 'The concept of point estimate is explained, emphasizing the estimation of a population parameter using sample statistics. The chapter provides a clear definition of point estimate as a value of any statistic that estimates the value of a parameter, with a focus on inferring population data from sample data through the estimation of population mean.', 'The relationship between alpha value, one-tailed test, and confidence interval is highlighted, demonstrating the significance of alpha in determining the confidence level and the corresponding Z-scores. 
The chapter elucidates the interplay between alpha value, one-tailed test, and confidence interval, showcasing how the alpha value dictates the confidence level and the subsequent determination of Z-scores for confidence interval computation.']}, {'end': 13552.584, 'start': 13193.841, 'title': 'T-test and confidence interval example', 'summary': 'Discusses the use of t-test when population standard deviation is not given, and demonstrates the computation of a 95 percent confidence interval about the mean using a sample standard deviation of 80 and a sample of 25 test takers with a mean of 520 score.', 'duration': 358.743, 'highlights': ['The population standard deviation is not given, requiring the use of t-test, and a 95 percent confidence interval about the mean is to be constructed using a sample standard deviation of 80 and a sample of 25 test takers with a mean of 520 score.', 'The degree of freedom for the t-test is calculated as n-1, resulting in a value of 24, and a t-value of 2.064 is obtained from the t-table for a significance level of 0.05.', 'The upper bound of the confidence interval is calculated as 553.02 and the lower bound as 486.97, demonstrating the computation of the 95 percent confidence interval about the mean.']}, {'end': 13836.075, 'start': 13553.445, 'title': 'One sample z test hypothesis', 'summary': 'Covers the completion of confidence intervals and the application of one sample z test for hypothesis testing, including conditions for using z test, defining null and alternate hypotheses, and setting the alpha value and decision rule.', 'duration': 282.63, 'highlights': ['The chapter discusses the completion of confidence intervals and the transition to one sample Z test for hypothesis testing, emphasizing the conditions for using Z test and the importance of population standard deviation and sample size, with the criteria of n greater than or equal to 30 for easier calculations.', "The instructor provides a detailed example of a problem 
statement involving testing a new medication's effect on intelligence using one sample Z test, reinforcing the process of defining null and alternate hypotheses, setting the alpha value, and establishing the decision rule with a specific alpha value of 0.05."]}, {'end': 14734.548, 'start': 13836.075, 'title': 'Hypothesis testing and test statistics', 'summary': 'Covers hypothesis testing using z-test and one-sample t-test with detailed explanations, providing insights into decision rules, test statistics, and practical applications.', 'duration': 898.473, 'highlights': ['The chapter provides detailed explanations on hypothesis testing using Z-test and one-sample t-test. The transcript includes in-depth discussions on hypothesis testing using Z-test and one-sample t-test, covering decision rules, test statistics, and practical applications.', 'Detailed examples and calculations are provided for Z-test decision rule and test statistics. The transcript offers detailed examples and calculations for Z-test decision rule, including the determination of critical values and computation of test statistics.', 'Clear explanations and practical implementation guidance for hypothesis testing are provided. 
The transcript includes clear explanations and practical implementation guidance for hypothesis testing, along with real-world problem formulation and application of hypothesis testing in a banking scenario.']}], 'duration': 2919.205, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs11815343.jpg', 'highlights': ['The chapter emphasizes the importance of making correct decisions in hypothesis testing, focusing on type 1 and type 2 errors using real-world scenarios.', 'The chapter explains the concept of one-tailed and two-tailed tests, shifting the discussion towards this new topic.', "The analysis involves using both one-tailed and two-tailed tests with a significance level of 0.05 to determine if the new college's placement rate is different from the 85% rate of other colleges.", 'The chapter delves into the practical application of confidence interval calculation, demonstrating how to compute the upper and lower bounds for a 95% confidence interval using the formula and Z-scores.', 'The chapter provides a clear definition of point estimate as a value of any statistic that estimates the value of a parameter, with a focus on inferring population data from sample data through the estimation of population mean.', 'The chapter elucidates the interplay between alpha value, one-tailed test, and confidence interval, showcasing how the alpha value dictates the confidence level and the subsequent determination of Z-scores for confidence interval computation.', 'The chapter demonstrates the computation of the 95 percent confidence interval about the mean using t-test, resulting in a value of 24 for the degree of freedom and a t-value of 2.064 obtained from the t-table for a significance level of 0.05.', 'The chapter discusses the transition to one sample Z test for hypothesis testing, emphasizing the conditions for using Z test and the importance of population standard deviation and sample size, with the criteria of n 
greater than or equal to 30 for easier calculations.', "The instructor provides a detailed example of a problem statement involving testing a new medication's effect on intelligence using one sample Z test, reinforcing the process of defining null and alternate hypotheses, setting the alpha value, and establishing the decision rule with a specific alpha value of 0.05.", 'The chapter provides detailed explanations on hypothesis testing using Z-test and one-sample t-test, covering decision rules, test statistics, and practical applications.', 'The transcript offers detailed examples and calculations for Z-test decision rule, including the determination of critical values and computation of test statistics.', 'The transcript includes clear explanations and practical implementation guidance for hypothesis testing, along with real-world problem formulation and application of hypothesis testing in a banking scenario.']}, {'end': 16056, 'segs': [{'end': 14761.706, 'src': 'embed', 'start': 14735.029, 'weight': 6, 'content': [{'end': 14744.257, 'text': 'Now in this practical implementation, we will try to perform Z-test, T-test and probably also see how to perform chi-square test.', 'start': 14735.029, 'duration': 9.228}, {'end': 14751.921, 'text': 'As the last topic, we will also see the F test, which is also called the ANOVA test.', 'start': 14744.818, 'duration': 7.103}, {'end': 14761.706, 'text': 'The reason why I have kept the F test last is that its calculation is quite complex.', 'start': 14752.482, 'duration': 9.224}], 'summary': 'Practical implementation includes z-test, t-test, chi-square test, and anova test with focus on complex f test calculation.', 'duration': 26.677, 'max_score': 14735.029, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs14735029.jpg'}, {'end': 14818.13, 'src': 'embed', 'start': 14789.744, 'weight': 0, 'content':
[{'end': 14798.751, 'text': 'that basically means, if someone asks you in the interview why the chi-square test is used,', 'start': 14789.744, 'duration': 9.007}, {'end': 14814.368, 'text': 'you can just say that it is a non-parametric test that is performed on categorical variables, which can be nominal or ordinal data.', 'start': 14798.751, 'duration': 15.617}, {'end': 14818.13, 'text': 'so this is how you basically define a chi-square test.', 'start': 14814.368, 'duration': 3.762}], 'summary': 'Chi-square test is a non-parametric test for categorical variables, both nominal and ordinal data.', 'duration': 28.386, 'max_score': 14789.744, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs14789744.jpg'}, {'end': 14911.314, 'src': 'embed', 'start': 14864.119, 'weight': 4, 'content': [{'end': 14865.159, 'text': 'So this is my question.', 'start': 14864.119, 'duration': 1.04}, {'end': 14889.216, 'text': 'In the 2000 Indian census, the ages of the individuals in a small town were found to be the following.', 'start': 14866.039, 'duration': 23.177}, {'end': 14891.059, 'text': 'Now over here.', 'start': 14889.677, 'duration': 1.382}, {'end': 14901.23, 'text': 'you have three categories: less than 18 years, 18 to 35 years and greater than 35 years.', 'start': 14891.059, 'duration': 10.171}, {'end': 14906.972, 'text': 'so you had this information in the 2000 census.', 'start': 14901.23, 'duration': 5.742}, {'end': 14911.314, 'text': 'that basically means less than 18 years were basically 20 percent.', 'start': 14906.972, 'duration': 4.342}], 'summary': 'In the 2000 Indian census, less than 18 years accounted for 20% in a small town.', 'duration': 47.195, 'max_score': 14864.119, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs14864119.jpg'}, {'end': 15082.199, 'src': 'embed', 'start': 15057.729,
'weight': 2, 'content': [{'end': 15065.793, 'text': 'Whenever you are given some kind of proportions of data, at that point of time, you cannot specifically use a kind of parametric test.', 'start': 15057.729, 'duration': 8.064}, {'end': 15068.133, 'text': 'So you have to go with non-parametric test.', 'start': 15065.993, 'duration': 2.14}, {'end': 15074.416, 'text': 'Now, here you can actually see that this is the original data with respect to the population.', 'start': 15068.294, 'duration': 6.122}, {'end': 15076.677, 'text': 'Then you sample the data and you found it out right?', 'start': 15074.436, 'duration': 2.241}, {'end': 15079.198, 'text': 'And then we are just trying to see that.', 'start': 15077.037, 'duration': 2.161}, {'end': 15082.199, 'text': 'what is the difference between this to this, this to this?', 'start': 15079.198, 'duration': 3.001}], 'summary': 'Non-parametric tests are used when dealing with proportions of data and comparing original population data with the sampled data.', 'duration': 24.47, 'max_score': 15057.729, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs15057729.jpg'}, {'end': 15577.74, 'src': 'heatmap', 'start': 15376.125, 'weight': 0.983, 'content': [{'end': 15378.867, 'text': 'Meets the distribution, meets the distribution.', 'start': 15376.125, 'duration': 2.742}, {'end': 15383.838, 'text': 'This is the data, right? 
This is the observation data; it meets the distribution of the 2010 census.', 'start': 15378.927, 'duration': 4.911}, {'end': 15387.643, 'text': 'Sorry, of the 2000 census.', 'start': 15386.062, 'duration': 1.581}, {'end': 15394.148, 'text': 'My alternate hypothesis will say that the data does not meet the distribution of the 2000 census.', 'start': 15388.204, 'duration': 5.944}, {'end': 15399.853, 'text': 'So I hope everybody is able to understand the null hypothesis and the alternate hypothesis.', 'start': 15394.208, 'duration': 5.645}, {'end': 15402.955, 'text': 'Then the second step is my alpha value.', 'start': 15400.173, 'duration': 2.782}, {'end': 15404.937, 'text': 'My alpha is 0.05.', 'start': 15403.095, 'duration': 1.842}, {'end': 15407.799, 'text': 'That basically means 95% confidence interval.', 'start': 15404.937, 'duration': 2.862}, {'end': 15414.564, 'text': 'Now, the third step in this is that whenever we do a chi-square test, we also need to know the degree of freedom.', 'start': 15408.339, 'duration': 6.225}, {'end': 15420.175, 'text': 'So how do we calculate degree of freedom? These are the steps, guys.', 'start': 15415.325, 'duration': 4.85}, {'end': 15421.796, 'text': 'And it will always be like this.', 'start': 15420.395, 'duration': 1.401}, {'end': 15423.116, 'text': 'N minus 1.', 'start': 15422.416, 'duration': 0.7}, {'end': 15429.758, 'text': 'What is N over here? N is nothing but this: 1, 2, and 3.', 'start': 15423.116, 'duration': 6.642}, {'end': 15432.118, 'text': 'This is where the number of categories comes into the picture.', 'start': 15429.758, 'duration': 2.36}, {'end': 15433.859, 'text': 'Categories are coming into picture.', 'start': 15432.158, 'duration': 1.701}, {'end': 15435.239, 'text': '1, 2, and 3.', 'start': 15433.879, 'duration': 1.36}, {'end': 15437.84, 'text': 'So 3 minus 1 is basically 2.', 'start': 15435.239, 'duration': 2.601}, {'end': 15440.56, 'text': 'Age is now categorical, right?
Absolutely, perfectly fine.', 'start': 15437.84, 'duration': 2.72}, {'end': 15442.101, 'text': 'You know your degree of freedom.', 'start': 15440.9, 'duration': 1.201}, {'end': 15444.441, 'text': 'Your degree of freedom is 2.', 'start': 15442.141, 'duration': 2.3}, {'end': 15447.362, 'text': 'And your alpha value is 0.05.', 'start': 15444.441, 'duration': 2.921}, {'end': 15455.826, 'text': 'All you have to do is that go and check in the chi-square table, okay, to find out your decision boundary.', 'start': 15447.362, 'duration': 8.464}, {'end': 15458.367, 'text': 'Is this a one-tailed test or two-tailed test??', 'start': 15456.206, 'duration': 2.161}, {'end': 15461.228, 'text': 'The data may be less than your distribution.', 'start': 15458.747, 'duration': 2.481}, {'end': 15462.689, 'text': 'it may be more right?', 'start': 15461.228, 'duration': 1.461}, {'end': 15468.872, 'text': 'So here is this a two-tailed test, because alpha is 0.05..', 'start': 15463.47, 'duration': 5.402}, {'end': 15472.774, 'text': 'Guys, we have to pick 3 as n because there are three age categories.', 'start': 15468.872, 'duration': 3.902}, {'end': 15475.273, 'text': 'So this will become a two-tailed test.', 'start': 15473.732, 'duration': 1.541}, {'end': 15478.815, 'text': 'Now in two-tailed test, all I have to do is that open a chi-square table.', 'start': 15475.593, 'duration': 3.222}, {'end': 15479.395, 'text': "Let's see.", 'start': 15478.875, 'duration': 0.52}, {'end': 15481.276, 'text': 'Now this is my chi-square table.', 'start': 15479.956, 'duration': 1.32}, {'end': 15484.318, 'text': 'Hope so I get the answer quickly.', 'start': 15482.697, 'duration': 1.621}, {'end': 15488.18, 'text': 'So Df is 2.', 'start': 15485.339, 'duration': 2.841}, {'end': 15492.963, 'text': 'To look upon an area on the left, subtract it from the 1.', 'start': 15488.18, 'duration': 4.783}, {'end': 15497.706, 'text': '0.05 See, 0.05 is here and degree of freedom is here.', 'start': 15492.963, 'duration': 
4.743}, {'end': 15498.986, 'text': 'So this becomes 5.991.', 'start': 15497.846, 'duration': 1.14}, {'end': 15509.349, 'text': 'So over here, we usually denote chi-square by x-square.', 'start': 15498.986, 'duration': 10.363}, {'end': 15512.131, 'text': 'And chi-square is basically denoted by x-square.', 'start': 15509.849, 'duration': 2.282}, {'end': 15522.097, 'text': 'And my decision boundary is that if chi-square is greater than 5.99, I have to reject H0.', 'start': 15513.111, 'duration': 8.986}, {'end': 15525.379, 'text': "Now let's go ahead and compute the chi-square test.", 'start': 15522.618, 'duration': 2.761}, {'end': 15528.461, 'text': 'As usual, very simple definition.', 'start': 15526.2, 'duration': 2.261}, {'end': 15539.206, 'text': 'So the fifth step is to calculate the test statistic, which is called the chi-square statistic.', 'start': 15529.502, 'duration': 9.704}, {'end': 15547.828, 'text': 'this is nothing but x square is equal to summation of f0 minus fe, whole square divided by fe.', 'start': 15539.206, 'duration': 8.622}, {'end': 15552.37, 'text': 'again, notation can be used in all different ways, but let me talk about what is f0.', 'start': 15547.828, 'duration': 4.542}, {'end': 15555.81, 'text': 'f0 basically means observed.', 'start': 15552.37, 'duration': 3.44}, {'end': 15557.431, 'text': 'okay, observed.', 'start': 15555.81, 'duration': 1.621}, {'end': 15559.371, 'text': 'fe basically means expected.', 'start': 15557.431, 'duration': 1.94}, {'end': 15561.832, 'text': 'so i am going to do the summation of all these three values.', 'start': 15559.371, 'duration': 2.461}, {'end': 15569.216, 'text': 'So here I will first of all write 121 as my first observed value.', 'start': 15562.432, 'duration': 6.784}, {'end': 15570.796, 'text': 'See 121, 100.', 'start': 15569.276, 'duration': 1.52}, {'end': 15577.74, 'text': 'So 121 minus 100 whole square divided by 100.', 'start': 15570.797, 'duration': 6.943}], 'summary': 'Conducting
chi-square test to determine distribution adherence with 95% confidence interval and 2 degrees of freedom.', 'duration': 201.615, 'max_score': 15376.125, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs15376125.jpg'}, {'end': 15432.118, 'src': 'embed', 'start': 15404.937, 'weight': 3, 'content': [{'end': 15407.799, 'text': 'That basically means 95% confidence interval.', 'start': 15404.937, 'duration': 2.862}, {'end': 15414.564, 'text': 'Now, the third step in this is that whenever we do a chi-square test, we also need to know the degree of freedom.', 'start': 15408.339, 'duration': 6.225}, {'end': 15420.175, 'text': 'So how do we calculate degree of freedom? This is the steps, guys.', 'start': 15415.325, 'duration': 4.85}, {'end': 15421.796, 'text': 'And always this will be like this only.', 'start': 15420.395, 'duration': 1.401}, {'end': 15423.116, 'text': 'N minus 1.', 'start': 15422.416, 'duration': 0.7}, {'end': 15429.758, 'text': 'What is N over here? 
N is nothing but this is 1, 2, and 3.', 'start': 15423.116, 'duration': 6.642}, {'end': 15432.118, 'text': 'This is where number of categories are coming into picture.', 'start': 15429.758, 'duration': 2.36}], 'summary': 'Chi-square test requires 95% confidence interval and degree of freedom, calculated as n-1 where n represents the number of categories.', 'duration': 27.181, 'max_score': 15404.937, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs15404937.jpg'}, {'end': 15539.206, 'src': 'embed', 'start': 15475.593, 'weight': 7, 'content': [{'end': 15478.815, 'text': 'Now in two-tailed test, all I have to do is that open a chi-square table.', 'start': 15475.593, 'duration': 3.222}, {'end': 15479.395, 'text': "Let's see.", 'start': 15478.875, 'duration': 0.52}, {'end': 15481.276, 'text': 'Now this is my chi-square table.', 'start': 15479.956, 'duration': 1.32}, {'end': 15484.318, 'text': 'Hope so I get the answer quickly.', 'start': 15482.697, 'duration': 1.621}, {'end': 15488.18, 'text': 'So Df is 2.', 'start': 15485.339, 'duration': 2.841}, {'end': 15492.963, 'text': 'To look upon an area on the left, subtract it from the 1.', 'start': 15488.18, 'duration': 4.783}, {'end': 15497.706, 'text': '0.05 See, 0.05 is here and degree of freedom is here.', 'start': 15492.963, 'duration': 4.743}, {'end': 15498.986, 'text': 'So this becomes 5.991.', 'start': 15497.846, 'duration': 1.14}, {'end': 15509.349, 'text': 'So over here, your we usually mention chi-square by x-square.', 'start': 15498.986, 'duration': 10.363}, {'end': 15512.131, 'text': 'And chi-square is basically denoted by x-square.', 'start': 15509.849, 'duration': 2.282}, {'end': 15522.097, 'text': 'And my decision boundary is that if chi-square is greater than 5.99, I have to reject H0.', 'start': 15513.111, 'duration': 8.986}, {'end': 15525.379, 'text': "Now let's go ahead and compute the chi-square test.", 'start': 15522.618, 'duration': 2.761}, {'end': 
15528.461, 'text': 'As usual, very simple definition.', 'duration': 2.261}, {'end': 15539.206, 'text': 'So the fifth step is to calculate the test statistic, which is called the chi-square statistic.', 'start': 15529.502, 'duration': 9.704}], 'summary': 'In a two-tailed test, using a chi-square table, the critical value is 5.991, and if the chi-square value is greater than this, the null hypothesis is rejected.', 'duration': 63.613, 'max_score': 15475.593, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs15475593.jpg'}, {'end': 15635.671, 'src': 'embed', 'start': 15609.47, 'weight': 1, 'content': [{'end': 15615.795, 'text': 'that basically means my x square is 232.494, which is obviously greater than 5.99.', 'start': 15609.47, 'duration': 6.325}, {'end': 15616.916, 'text': 'so what we have to do?', 'start': 15615.795, 'duration': 1.121}, {'end': 15625.965, 'text': 'we have to reject the null hypothesis and which is absolutely true because the population distribution has changed.', 'start': 15616.916, 'duration': 9.049}, {'end': 15628.827, 'text': 'So 232.494 is greater than 5.99.', 'start': 15627.166, 'duration': 1.661}, {'end': 15630.548, 'text': 'So, we are rejecting the null hypothesis.', 'start': 15628.827, 'duration': 1.721}, {'end': 15632.969, 'text': 'Okay, it is 232.494.', 'start': 15630.788, 'duration': 2.181}, {'end': 15635.671, 'text': 'Okay, let me write 232.494.', 'start': 15632.969, 'duration': 2.702}], 'summary': 'X square value is 232.494, rejecting null hypothesis as population distribution changed.', 'duration': 26.201, 'max_score': 15609.47, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs15609470.jpg'}, {'end': 15703.202, 'src': 'embed', 'start': 15676.701, 'weight': 5, 'content': [{'end': 15681.646, 'text': 'Now I am going to show you the code in Python to determine if the new drug causes a significant effect or not.',
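The chi-square goodness-of-fit calculation walked through above (observed 2010 counts of 121, 288 and 91 against the 2000-census proportions of 20%, 30% and 50% of the 500 sampled people) can be sketched in plain Python. This is an illustrative reimplementation of the hand calculation, not code shown in the video:

```python
# Chi-square goodness-of-fit for the census example from the transcript.
# Expected counts come from the 2000-census proportions applied to the
# 500-person 2010 sample; observed counts are the 2010 sample itself.
observed = [121, 288, 91]          # <18, 18-35, >35 years
proportions = [0.20, 0.30, 0.50]   # 2000 census distribution
n = sum(observed)                  # 500 sampled individuals

expected = [p * n for p in proportions]  # 100, 150, 250

# x^2 = sum((f0 - fe)^2 / fe), with df = number of categories - 1
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

critical_value = 5.991  # chi-square table, df = 2, alpha = 0.05

print(f"chi-square = {chi_square:.3f}, df = {df}")
if chi_square > critical_value:
    print("Reject H0: the population distribution has changed")
else:
    print("Fail to reject H0")
```

`scipy.stats.chisquare(observed, expected)` would return the same statistic together with an exact p-value, but the manual sum above mirrors the blackboard computation (4.41 + 126.96 + 101.124 = 232.494).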
'start': 15676.701, 'duration': 4.945}, {'end': 15683.907, 'text': 'So I am just going to execute this.', 'start': 15682.426, 'duration': 1.481}, {'end': 15687.831, 'text': "and let's say that I have these 20 records for the z test.", 'start': 15684.688, 'duration': 3.143}, {'end': 15692.574, 'text': 'we use this library, which is called statsmodels.stats.weightstats.', 'start': 15687.831, 'duration': 4.743}, {'end': 15694.015, 'text': 'from it we import ztest.', 'start': 15692.574, 'duration': 1.441}, {'end': 15700.06, 'text': 'so these are my 20 patients and I have recorded the IQ after the medication is applied.', 'start': 15694.015, 'duration': 6.045}, {'end': 15703.202, 'text': 'now, in order to apply the z test, there is no need to do that much calculation.', 'start': 15700.06, 'duration': 3.142}], 'summary': "Python code uses z-test to analyze drug effect on 20 patients' IQ.", 'duration': 26.501, 'max_score': 15676.701, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs15676701.jpg'}, {'end': 15796.279, 'src': 'embed', 'start': 15762.64, 'weight': 9, 'content': [{'end': 15766.404, 'text': 'This p-value can be used along with the significance value.', 'start': 15762.64, 'duration': 3.764}, {'end': 15768.866, 'text': 'And suppose right now the p-value is 0.001.', 'start': 15766.924, 'duration': 1.942}, {'end': 15769.306, 'text': "Let's say that.", 'start': 15768.866, 'duration': 0.44}, {'end': 15778.072, 'text': 'Suppose this p-value is less than the significance level.', 'start': 15769.306, 'duration': 8.766}, {'end': 15783.214, 'text': "Now in this particular case, let's consider that I am going to take a significance level of 0.05.", 'start': 15778.092, 'duration': 5.122}, {'end': 15788.776, 'text': 'So if this is less than this, then obviously we reject the null hypothesis.', 'start': 15783.214, 'duration': 5.562}, {'end': 15796.279, 'text': 'This is just saying that based on this p-value, it is
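The one-sample z-test that the video runs through statsmodels' `ztest` can also be sketched by hand. The 20 IQ values below are hypothetical placeholders (the video's actual records are not reproduced here), and the population mean of 100 and standard deviation of 15 are the conventional IQ parameters, assumed known for a z-test:

```python
import math

# One-sample, two-tailed z-test sketch with hypothetical post-medication
# IQ scores. H0: the mean IQ is still 100; sigma = 15 is assumed known.
iq_after_drug = [110, 105, 98, 102, 99, 104, 115, 95, 100, 120,
                 94, 104, 108, 113, 110, 96, 102, 99, 104, 114]
mu0, sigma = 100, 15
n = len(iq_after_drug)
x_bar = sum(iq_after_drug) / n

# Test statistic: z = (x_bar - mu0) / (sigma / sqrt(n))
z = (x_bar - mu0) / (sigma / math.sqrt(n))

# Two-tailed p-value from the standard normal CDF, Phi(z) via erf
phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
p_value = 2 * (1 - phi)

alpha = 0.05
print(f"z = {z:.3f}, p = {p_value:.4f}")
if p_value < alpha:
    print("Reject H0")
else:
    print("Fail to reject H0")
```

With statsmodels installed, `ztest(iq_after_drug, value=100)` gives an equivalent (z, p) pair, except that it estimates the standard deviation from the sample rather than using a known population sigma.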
basically falling in this region.', 'start': 15789.476, 'duration': 6.803}], 'summary': 'The p-value of 0.001 is less than the significance level of 0.05, indicating rejection of the null hypothesis.', 'duration': 33.639, 'max_score': 15762.64, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs15762640.jpg'}], 'start': 14735.029, 'title': 'Chi-square and z-test analysis', 'summary': 'Discusses the application of chi-square test in non-parametric analysis, population age distribution comparison, testing process, and the application of z-test and chi-square analysis in python, highlighting the significance in problem-solving and hypothesis testing.', 'chapters': [{'end': 14864.059, 'start': 14735.029, 'title': 'Chi-square test and its application', 'summary': 'Discusses the application of chi-square test, emphasizing its use as a non-parametric test on categorical or ordinal data, and its importance in problem-solving in interviews.', 'duration': 129.03, 'highlights': ['Chi-square test is a non-parametric test used on categorical or ordinal data, emphasizing its significance in problem-solving interviews.', 'The F test (ANOVA test) is discussed, with a focus on its complexity in calculation.', 'The chapter also covers Z-test, T-test, and highlights the practical implementation of statistical tests such as Z-test, T-test, and chi-square test.']}, {'end': 15355.911, 'start': 14864.119, 'title': 'Population age distribution analysis', 'summary': 'Presents a comparison of the population distribution of age groups in a small town in the 2000 indian census and a sample of 500 individuals in 2010, indicating a noticeable difference, leading to the exploration of a non-parametric test to assess if the population distribution has changed over the last 10 years.', 'duration': 491.792, 'highlights': ['The 2000 Indian census revealed age distribution percentages as follows: <18 years - 20%, 18-35 years - 30%, >35 years - 
50%. Key points: percentages of age groups in the small town as per the 2000 Indian census.', 'The 2010 sample of 500 individuals showed the following distribution: <18 years - 121 people, 18-35 years - 288 people, >35 years - 91 people. Key points: distribution of age groups in the sample of 500 individuals in 2010.', 'The comparison between the expected and observed distributions indicated a significant difference, suggesting a potential change in the population distribution. Key points: observation of a substantial deviation between the expected and observed distributions, implying a potential shift in the population distribution.', 'The analysis involves the utilization of a non-parametric test due to the nature of the data being proportions, leading to the exploration of hypotheses regarding the potential change in population distribution. Key points: explanation of the choice of a non-parametric test and the exploration of hypotheses regarding population distribution change.']}, {'end': 15609.47, 'start': 15355.931, 'title': 'Chi-square testing process', 'summary': 'Covers the process of conducting a chi-square test to determine if data meets a specific distribution, including defining null and alternate hypotheses, calculating degree of freedom, and computing the test statistics, with an alpha value of 0.05 and a decision boundary of 5.991.', 'duration': 253.539, 'highlights': ['The process involves defining null and alternate hypotheses and calculating degree of freedom based on the number of categories, with a specific example using 3 categories resulting in a degree of freedom of 2. The process begins with defining the null hypothesis as data meeting the distribution of 2000 census, and the alternate hypothesis as the data not meeting the distribution. 
It then proceeds to calculate the degree of freedom using the formula N-1, where N represents the number of categories, resulting in a degree of freedom of 2.', 'Alpha value of 0.05 is used, signifying a 95% confidence interval for the chi-square test. The significance level, or alpha, is set at 0.05, indicating a 95% confidence interval for the chi-square test, influencing the determination of the decision boundary.', 'The process includes computing the test statistics using the chi-square test formula, involving observed and expected values, resulting in a test statistic of 232.94. The computation of the test statistics involves applying the chi-square test formula, which includes the observed and expected values, resulting in a test statistic of 232.94.']}, {'end': 16056, 'start': 15609.47, 'title': 'Z test and chi square analysis', 'summary': 'Discusses the application of z-test and chi-square analysis in python, demonstrating their usage in hypothesis testing and population proposition, with examples of calculating z-test values and p-values, evaluating the significance level, and understanding covariance in a dataset.', 'duration': 446.53, 'highlights': ['The chapter explains the process of rejecting the null hypothesis based on a chi-square value of 232.94, which is greater than 5.99, indicating a change in the population distribution. The chi-square value of 232.94 is compared to the critical value of 5.99, leading to the rejection of the null hypothesis, signifying a significant change in the population distribution.', "Demonstration of using Python to perform Z-test to determine the significance of a new drug's effect on IQ levels, including calculating Z-test values, P-values, and evaluating the significance level. 
The transcript includes a Python example demonstrating the application of Z-test in determining the significance of a new drug's effect on IQ levels, involving the calculation of Z-test values, P-values, and the evaluation of the significance level.", 'Explanation of the relationship between P-value and significance level, with examples of comparing P-values to the significance level to either accept or reject the null hypothesis. The chapter provides an explanation of the relationship between P-value and significance level, offering examples of comparing P-values to the significance level to determine whether to accept or reject the null hypothesis.', 'Discussion on the concept of covariance, illustrated with examples of weight and height data, demonstrating the relationship between two variables in a dataset. The chapter delves into the concept of covariance, using weight and height data as examples to demonstrate the relationship between two variables in a dataset.']}], 'duration': 1320.971, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs14735029.jpg', 'highlights': ['Chi-square test is a non-parametric test used on categorical or ordinal data, emphasizing its significance in problem-solving interviews.', 'The chapter explains the process of rejecting the null hypothesis based on a chi-square value of 232.94, which is greater than 5.99, indicating a change in the population distribution.', 'The analysis involves the utilization of a non-parametric test due to the nature of the data being proportions, leading to the exploration of hypotheses regarding the potential change in population distribution.', 'The process involves defining null and alternate hypotheses and calculating degree of freedom based on the number of categories, with a specific example using 3 categories resulting in a degree of freedom of 2.', 'The 2000 Indian census revealed age distribution percentages as follows: <18 years - 20%, 18-35 
years - 30%, >35 years - 50%.', "Demonstration of using Python to perform Z-test to determine the significance of a new drug's effect on IQ levels, including calculating Z-test values, P-values, and evaluating the significance level.", 'The F test (ANOVA test) is discussed, with a focus on its complexity in calculation.', 'The process includes computing the test statistics using the chi-square test formula, involving observed and expected values, resulting in a test statistic of 232.94.', 'Alpha value of 0.05 is used, signifying a 95% confidence interval for the chi-square test.', 'Explanation of the relationship between P-value and significance level, with examples of comparing P-values to the significance level to either accept or reject the null hypothesis.']}, {'end': 16844.376, 'segs': [{'end': 16160.506, 'src': 'embed', 'start': 16100.453, 'weight': 0, 'content': [{'end': 16103.035, 'text': 'If I am studying for 4 hours, I am playing for 3 hours.', 'start': 16100.453, 'duration': 2.582}, {'end': 16112.16, 'text': 'In this particular case, What is the relationship? You can see that when x is increasing, y is decreasing or where x is decreasing, y is increasing.', 'start': 16103.096, 'duration': 9.064}, {'end': 16115.661, 'text': 'So this relationship is basically used over here.', 'start': 16112.2, 'duration': 3.461}, {'end': 16119.463, 'text': 'So here you can see these two conditions, right? These two conditions.', 'start': 16116.442, 'duration': 3.021}, {'end': 16123.224, 'text': 'Now, this is what you can observe.', 'start': 16120.463, 'duration': 2.761}, {'end': 16135.538, 'text': 'But the main thing is that how do I quantify? How can I quantify or show some relationship? 
quantify relationship through numbers between x and y.', 'start': 16123.604, 'duration': 11.934}, {'end': 16139.279, 'text': 'Now in that particular case, I can use a formula which is called as covariance.', 'start': 16135.538, 'duration': 3.741}, {'end': 16160.506, 'text': 'Now covariance is basically given by Cov which is nothing but summation of i is equal to 1 to n, x of i minus x bar, y of i minus y bar, divided by n.', 'start': 16141.079, 'duration': 19.427}], 'summary': 'Studying for 4 hours relates to playing for 3 hours. covariance quantifies the relationship between x and y.', 'duration': 60.053, 'max_score': 16100.453, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs16100453.jpg'}, {'end': 16348.161, 'src': 'embed', 'start': 16322.854, 'weight': 5, 'content': [{'end': 16329.741, 'text': 'If my data set is like this, then what will be my value of covariance? Covariance will be 0 because there is no relationship.', 'start': 16322.854, 'duration': 6.887}, {'end': 16331.543, 'text': 'Covariance will be basically 0.', 'start': 16329.802, 'duration': 1.741}, {'end': 16333.645, 'text': "Now let's understand one basic..", 'start': 16331.543, 'duration': 2.102}, {'end': 16337.189, 'text': 'disadvantage of covariance.', 'start': 16334.346, 'duration': 2.843}, {'end': 16338.15, 'text': 'now, covariance.', 'start': 16337.189, 'duration': 0.961}, {'end': 16341.594, 'text': 'over here you will definitely be able to see positive or negative.', 'start': 16338.15, 'duration': 3.444}, {'end': 16344.857, 'text': 'you will be able to find out the positive or negative correlation.', 'start': 16341.594, 'duration': 3.263}, {'end': 16348.161, 'text': 'but with respect to the disadvantage, there is no fixed value.', 'start': 16344.857, 'duration': 3.304}], 'summary': 'Covariance value is 0, indicating no relationship, with potential positive or negative correlation, but no fixed value.', 'duration': 25.307, 'max_score': 
16322.854, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs16322854.jpg'}, {'end': 16558.152, 'src': 'heatmap', 'start': 16359.469, 'weight': 0.84, 'content': [{'end': 16366.813, 'text': 'you will definitely be able to see the direction, whether it is positive or negative, but this magnitude is not limited.', 'start': 16359.469, 'duration': 7.344}, {'end': 16371.316, 'text': 'so if we have two distribution, how much positive, how much negative that part?', 'start': 16366.813, 'duration': 4.503}, {'end': 16376.938, 'text': 'If, probably if you have two distributions, one is plus 100, the other one is plus 1000,', 'start': 16372.134, 'duration': 4.804}, {'end': 16380.22, 'text': 'you will not be able to identify because it is just a magnitude value.', 'start': 16376.938, 'duration': 3.282}, {'end': 16381.761, 'text': 'It is just a magnitude value.', 'start': 16380.48, 'duration': 1.281}, {'end': 16386.465, 'text': 'Now that is the reason we really need to restrict these values between some range.', 'start': 16382.222, 'duration': 4.243}, {'end': 16393.09, 'text': 'So for that specific region, we use another one which is called as Pearson correlation.', 'start': 16387.225, 'duration': 5.865}, {'end': 16403.125, 'text': 'A Pearson correlation coefficient, what it does is that it basically restricts all your value between Minus 1 to plus 1.', 'start': 16393.27, 'duration': 9.855}, {'end': 16409.387, 'text': 'The more towards plus 1 or minus 1, more positively it is correlated.', 'start': 16403.125, 'duration': 6.262}, {'end': 16413.97, 'text': 'Sorry, the more towards plus 1, more positively it is correlated.', 'start': 16409.668, 'duration': 4.302}, {'end': 16418.491, 'text': 'The more towards minus 1, more negatively it is correlated.', 'start': 16414.11, 'duration': 4.381}, {'end': 16420.233, 'text': 'You should be able to see that.', 'start': 16418.973, 'duration': 1.26}, {'end': 16423.494, 'text': 'Okay, then 
what is the difference between covariance with respect to the formula?', 'start': 16420.332, 'duration': 3.162}, {'end': 16427.035, 'text': 'Now for the Pearson correlation.', 'start': 16424.172, 'duration': 2.863}, {'end': 16430.157, 'text': 'you can basically use something like this x comma y.', 'start': 16427.035, 'duration': 3.122}, {'end': 16439.284, 'text': 'It is nothing but, it is very simple, covariance of x comma y divided by standard deviation of x and standard deviation of y.', 'start': 16430.157, 'duration': 9.127}, {'end': 16443.787, 'text': 'Because of this multiplication, all your values will be between minus 1 to plus 1.', 'start': 16439.284, 'duration': 4.503}, {'end': 16448.08, 'text': 'So here you will be able to see that it is always between minus 1 to plus 1.', 'start': 16443.787, 'duration': 4.293}, {'end': 16450.682, 'text': 'Now, let me show you some examples in Wikipedia.', 'start': 16448.08, 'duration': 2.602}, {'end': 16455.704, 'text': 'So if you go and search for Pearson correlation coefficient here, you will be able to see this.', 'start': 16451.042, 'duration': 4.662}, {'end': 16459.866, 'text': 'okay?. Now tell me this particular diagram.', 'start': 16455.704, 'duration': 4.162}, {'end': 16461.582, 'text': 'Here you can see.', 'start': 16460.683, 'duration': 0.899}, {'end': 16464.223, 'text': 'all the points are in one straight line.', 'start': 16461.582, 'duration': 2.641}, {'end': 16466.844, 'text': 'So, when you draw this particular line, your correlation obviously.', 'start': 16464.243, 'duration': 2.601}, {'end': 16472.165, 'text': 'in this particular case, when x is decreasing, y is increasing.', 'start': 16466.844, 'duration': 5.321}, {'end': 16474.966, 'text': 'right?. 
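The covariance and Pearson-correlation formulas above can be worked through on a tiny hypothetical hours-studied vs hours-played data set (illustrative numbers in the spirit of the transcript's example, not values from the video). Note that the transcript's covariance formula divides by n, i.e. the population covariance, which is what `np.cov(x, y, bias=True)` computes:

```python
import math

# Covariance and Pearson correlation by hand, matching the formulas
# cov(x, y) = sum((x_i - x_bar)(y_i - y_bar)) / n
# r = cov(x, y) / (std(x) * std(y)),  always between -1 and +1
x = [1, 2, 3, 4, 5]   # hours studied (hypothetical)
y = [7, 6, 5, 4, 3]   # hours played: decreases as x increases

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / n

std_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / n)
std_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / n)
r = cov_xy / (std_x * std_y)

# The points lie exactly on a negatively sloped line, so r is -1,
# while the raw covariance (-2 here) carries no bounded scale.
print(f"cov = {cov_xy}, r = {r}")
```

This also illustrates the disadvantage mentioned above: the covariance of -2 could be made arbitrarily large by rescaling the units, while Pearson's r stays pinned in [-1, +1]; `numpy.corrcoef(x, y)` would return the same r.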
In this particular case, if x is decreasing, y is increasing.', 'start': 16472.165, 'duration': 2.801}, {'end': 16477.087, 'text': 'If x is increasing, y is decreasing.', 'start': 16475.506, 'duration': 1.581}, {'end': 16478.526, 'text': 'This is the relation that is found.', 'start': 16477.127, 'duration': 1.399}, {'end': 16484.207, 'text': 'So it is negatively correlated and if it falls all in the straight line, it is minus 1.', 'start': 16478.607, 'duration': 5.6}, {'end': 16488.91, 'text': 'Then, here you will be able to see that, over here you have some of the data points distributed in this.', 'start': 16484.207, 'duration': 4.703}, {'end': 16491.071, 'text': 'Here also you can actually see negative correlation.', 'start': 16489.009, 'duration': 2.062}, {'end': 16493.732, 'text': 'But not all are in the straight line.', 'start': 16491.591, 'duration': 2.141}, {'end': 16498.254, 'text': 'So your value, your correlation will be ranging between minus 1 to 0.', 'start': 16493.851, 'duration': 4.403}, {'end': 16503.116, 'text': 'Similarly, in this particular case, here you can see that when x is increasing, y is also increasing.', 'start': 16498.254, 'duration': 4.862}, {'end': 16505.537, 'text': 'So here, we will have a positive correlation.', 'start': 16503.196, 'duration': 2.341}, {'end': 16509.558, 'text': 'Since it does not follow in the straight line, it is between 0 to 1.', 'start': 16505.577, 'duration': 3.981}, {'end': 16512.48, 'text': 'If it falls in the straight line, then it is plus 1.', 'start': 16509.558, 'duration': 2.922}, {'end': 16515.541, 'text': 'So it captures the linear properties very well.', 'start': 16512.48, 'duration': 3.061}, {'end': 16518.422, 'text': 'Because everywhere you can see that there is a linear line.', 'start': 16516.142, 'duration': 2.28}, {'end': 16520.903, 'text': 'It captures it in an amazing way.', 'start': 16518.843, 'duration': 2.06}, {'end': 16526.026, 'text': 'That is the most advantageous thing with respect to 
Pearson correlation.', 'start': 16521.344, 'duration': 4.682}, {'end': 16529.81, 'text': 'Now in this particular case, here you can see that the correlation is 0.', 'start': 16526.807, 'duration': 3.003}, {'end': 16533.412, 'text': 'Why? Because we cannot identify when x is increasing, y is also increasing.', 'start': 16529.81, 'duration': 3.602}, {'end': 16536.575, 'text': 'The data is completely distributed here and there.', 'start': 16534.373, 'duration': 2.202}, {'end': 16544.461, 'text': 'Now some more examples, here you can see this is 1, this is 0.8, 0.4, 0, minus 0.4, minus 0.8 and minus 1.', 'start': 16537.115, 'duration': 7.346}, {'end': 16547.421, 'text': 'And similarly, these all are 1, 1, 1, 1.', 'start': 16544.461, 'duration': 2.96}, {'end': 16550.466, 'text': 'This is 0, minus 1, minus 1, minus 1.', 'start': 16547.424, 'duration': 3.042}, {'end': 16552.488, 'text': 'And similarly, here you can see some more 0s.', 'start': 16550.466, 'duration': 2.022}, {'end': 16553.989, 'text': 'You can also see some more 0s.', 'start': 16552.628, 'duration': 1.361}, {'end': 16558.152, 'text': 'This, you cannot definitely identify what exactly is this.', 'start': 16554.049, 'duration': 4.103}], 'summary': 'Pearson correlation restricts values between -1 to +1, capturing linear properties effectively.', 'duration': 198.683, 'max_score': 16359.469, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs16359469.jpg'}, {'end': 16413.97, 'src': 'embed', 'start': 16387.225, 'weight': 2, 'content': [{'end': 16393.09, 'text': 'So for that specific region, we use another one which is called as Pearson correlation.', 'start': 16387.225, 'duration': 5.865}, {'end': 16403.125, 'text': 'A Pearson correlation coefficient, what it does is that it basically restricts all your value between Minus 1 to plus 1.', 'start': 16393.27, 'duration': 9.855}, {'end': 16409.387, 'text': 'The more towards plus 1 or minus 1, more positively it 
is correlated.', 'start': 16403.125, 'duration': 6.262}, {'end': 16413.97, 'text': 'Sorry, the more towards plus 1, more positively it is correlated.', 'start': 16409.668, 'duration': 4.302}], 'summary': 'Pearson correlation restricts values between -1 and +1, indicating positive correlation.', 'duration': 26.745, 'max_score': 16387.225, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs16387224.jpg'}, {'end': 16852.803, 'src': 'embed', 'start': 16821.348, 'weight': 3, 'content': [{'end': 16824.509, 'text': 'So this value will be taken, this will be completely ignored.', 'start': 16821.348, 'duration': 3.161}, {'end': 16831.612, 'text': 'So this is what is basically the entire Spearman-Rank correlation and I hope you have understood.', 'start': 16824.99, 'duration': 6.622}, {'end': 16836.634, 'text': 'But understand, if someone asks you why do you use Spearman-Rank correlation coefficient,', 'start': 16832.033, 'duration': 4.601}, {'end': 16840.156, 'text': 'you should basically say that it captures the non-linear properties.', 'start': 16836.634, 'duration': 3.522}, {'end': 16844.376, 'text': 'It captures the non-linear properties.', 'start': 16842.495, 'duration': 1.881}, {'end': 16847.959, 'text': "Now let's go ahead and let's try to do this one example.", 'start': 16844.737, 'duration': 3.222}, {'end': 16852.803, 'text': "Let's go and see something like t-test and try to do it.", 'start': 16848.399, 'duration': 4.404}], 'summary': 'Spearman-rank captures non-linear properties, used for correlation coefficient.', 'duration': 31.455, 'max_score': 16821.348, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs16821348.jpg'}], 'start': 16056, 'title': 'Relationship between variables in data', 'summary': 'Discusses the relationship between variables in a dataset, showing the varying impact of one variable on another through examples like hours of study and 
play. it also explores the concepts of covariance and correlation, highlighting the quantification of relationships, limitations of covariance, and the advantages of pearson and spearman rank correlation coefficients.', 'chapters': [{'end': 16123.224, 'start': 16056, 'title': 'Relationship between variables in data', 'summary': 'Discusses the relationship between variables in a dataset, demonstrating that as one variable increases, the other may either increase or decrease, using examples of hours of study and play.', 'duration': 67.224, 'highlights': ['The relationship between variables is demonstrated through examples of hours of study and play, showing that as one variable increases, the other may either increase or decrease, illustrating the concept of correlation.', 'The example of studying for 2 hours and playing for 6 hours, studying for 3 hours and playing for 4 hours, and studying for 4 hours and playing for 3 hours, illustrates the relationship between the variables and the concept of correlation.']}, {'end': 16844.376, 'start': 16123.604, 'title': 'Covariance and correlation in data analysis', 'summary': 'Discusses the concepts of covariance and correlation in data analysis, emphasizing the quantification of the relationship between variables, the limitations of covariance, and the advantages of pearson and spearman rank correlation coefficients.', 'duration': 720.772, 'highlights': ['Covariance Formula and Interpretation The formula for covariance is discussed, indicating the quantification of the relationship between variables x and y through positive, negative, or zero values, reflecting the direction and strength of the relationship.', 'Limitations of Covariance The disadvantages of covariance are highlighted, particularly its lack of limitation in magnitude, making it challenging to interpret the strength of the relationship between variables.', 'Advantages of Pearson Correlation The benefits of using Pearson correlation coefficient are explained, 
emphasizing its restriction of values between -1 and +1, effectively capturing the linear properties and providing a clear interpretation of the relationship between variables.', 'Spearman Rank Correlation for Non-linear Properties The significance of Spearman rank correlation is elucidated, emphasizing its ability to capture non-linear properties, unlike Pearson correlation, through the assignment of ranks to variables and the subsequent calculation of correlation.']}], 'duration': 788.376, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs16056000.jpg', 'highlights': ['The example of studying for 2 hours and playing for 6 hours, studying for 3 hours and playing for 4 hours, and studying for 4 hours and playing for 3 hours, illustrates the relationship between the variables and the concept of correlation.', 'The relationship between variables is demonstrated through examples of hours of study and play, showing that as one variable increases, the other may either increase or decrease, illustrating the concept of correlation.', 'Advantages of Pearson Correlation The benefits of using Pearson correlation coefficient are explained, emphasizing its restriction of values between -1 and +1, effectively capturing the linear properties and providing a clear interpretation of the relationship between variables.', 'Spearman Rank Correlation for Non-linear Properties The significance of Spearman rank correlation is elucidated, emphasizing its ability to capture non-linear properties, unlike Pearson correlation, through the assignment of ranks to variables and the subsequent calculation of correlation.', 'Covariance Formula and Interpretation The formula for covariance is discussed, indicating the quantification of the relationship between variables x and y through positive, negative, or zero values, reflecting the direction and strength of the relationship.', 'Limitations of Covariance The disadvantages of covariance are 
highlighted, particularly its lack of limitation in magnitude, making it challenging to interpret the strength of the relationship between variables.']}, {'end': 19710.571, 'segs': [{'end': 17025.318, 'src': 'embed', 'start': 16997.375, 'weight': 0, 'content': [{'end': 17001.579, 'text': "So here you can see that I'm getting the p-value as 0.76.", 'start': 16997.375, 'duration': 4.204}, {'end': 17008.246, 'text': "If you don't believe me, just go and compute the np.mean of age underscore sample.", 'start': 17001.579, 'duration': 6.667}, {'end': 17012.991, 'text': "I'm getting 31.5, right? Which is little bit away from here.", 'start': 17008.706, 'duration': 4.285}, {'end': 17014.492, 'text': 'Now it is up to you.', 'start': 17013.591, 'duration': 0.901}, {'end': 17017.535, 'text': 'I got the p-value as 0.76.', 'start': 17014.993, 'duration': 2.542}, {'end': 17021.817, 'text': 'Now if I say my alpha value, my alpha value is 0.05.', 'start': 17017.535, 'duration': 4.282}, {'end': 17025.318, 'text': 'In this particular case, my p-value is greater than the alpha value.', 'start': 17021.817, 'duration': 3.501}], 'summary': 'P-value is 0.76, alpha is 0.05, indicating statistical insignificance.', 'duration': 27.943, 'max_score': 16997.375, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs16997375.jpg'}, {'end': 17461.044, 'src': 'embed', 'start': 17429.324, 'weight': 1, 'content': [{'end': 17430.345, 'text': 'Again, I am going to repeat it.', 'start': 17429.324, 'duration': 1.021}, {'end': 17440.814, 'text': 'See, if p value is less than or equal to 0.05, in this particular case, we reject the null hypothesis.', 'start': 17430.405, 'duration': 10.409}, {'end': 17442.975, 'text': 'The reason why we do this?', 'start': 17441.514, 'duration': 1.461}, {'end': 17447.859, 'text': 'because p, p is basically defining the probability part right?', 'start': 17442.975, 'duration': 4.884}, {'end': 17454.483, 'text': 'Now, in 
this particular case, they are just saying that It is less than 5% probability that the null is correct.', 'start': 17448.14, 'duration': 6.343}, {'end': 17461.044, 'text': 'It basically says that 5% probability the null hypothesis is correct.', 'start': 17454.543, 'duration': 6.501}], 'summary': 'P value less than 0.05 leads to rejecting null hypothesis, indicating less than 5% probability of null being correct.', 'duration': 31.72, 'max_score': 17429.324, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs17429324.jpg'}, {'end': 17801.504, 'src': 'embed', 'start': 17772.186, 'weight': 2, 'content': [{'end': 17777.61, 'text': 'so this is what it is given, and our confidence interval is basically 95 percentage.', 'start': 17772.186, 'duration': 5.424}, {'end': 17779.231, 'text': 'so over here, you know what is my.', 'start': 17777.61, 'duration': 1.621}, {'end': 17780.887, 'text': 'What is your mean?', 'start': 17780.026, 'duration': 0.861}, {'end': 17783.409, 'text': 'My mean is 168 points.', 'start': 17781.187, 'duration': 2.222}, {'end': 17787.432, 'text': 'The standard deviation is 3.9.', 'start': 17784.27, 'duration': 3.162}, {'end': 17794.839, 'text': 'The x bar is nothing but 169.5 and my n sample is greater than 36.', 'start': 17787.432, 'duration': 7.407}, {'end': 17798.722, 'text': 'And obviously my n sample is given, my population standard deviation is given.', 'start': 17794.839, 'duration': 3.883}, {'end': 17800.804, 'text': 'So I am going to basically use z test.', 'start': 17798.802, 'duration': 2.002}, {'end': 17801.504, 'text': 'Very good.', 'start': 17801.104, 'duration': 0.4}], 'summary': 'Confidence interval is 95%, mean is 168 points, x bar is 169.5, and z test will be used.', 'duration': 29.318, 'max_score': 17772.186, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs17772186.jpg'}, {'end': 18699.758, 'src': 'embed', 'start': 
18671.741, 'weight': 3, 'content': [{'end': 18674.782, 'text': 'So it is 1 multiplied by 6 divided by 1.5.', 'start': 18671.741, 'duration': 3.041}, {'end': 18675.703, 'text': 'Go and calculate it.', 'start': 18674.782, 'duration': 0.921}, {'end': 18678.405, 'text': 'The Z score is 1.2.', 'start': 18676.323, 'duration': 2.082}, {'end': 18679.625, 'text': 'You know the decision boundary.', 'start': 18678.405, 'duration': 1.22}, {'end': 18685.288, 'text': 'What is the decision boundary? Plus 1.96 plus 1.96, right? Now you are getting 1.2.', 'start': 18679.665, 'duration': 5.623}, {'end': 18695.455, 'text': 'If you are getting 1.2, then obviously 1.2 is less than 1.96, should we reject or accept the null hypothesis??', 'start': 18685.288, 'duration': 10.167}, {'end': 18697.196, 'text': '4, are we getting 4?', 'start': 18695.475, 'duration': 1.721}, {'end': 18699.197, 'text': 'Oh sorry, it is 4..', 'start': 18697.196, 'duration': 2.001}, {'end': 18699.758, 'text': 'Excuse me, sorry.', 'start': 18699.197, 'duration': 0.561}], 'summary': 'Calculation resulted in a z score of 1.2, indicating a decision boundary below 1.96, leading to the acceptance of the null hypothesis.', 'duration': 28.017, 'max_score': 18671.741, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs18671741.jpg'}, {'end': 18834.952, 'src': 'embed', 'start': 18804.146, 'weight': 4, 'content': [{'end': 18807.047, 'text': "so let's go ahead with log normal distribution.", 'start': 18804.146, 'duration': 2.901}, {'end': 18813.818, 'text': 'okay, guys, so Log normal distribution, usually log normal distribution, will have this kind of shape.', 'start': 18807.047, 'duration': 6.771}, {'end': 18818.321, 'text': 'Obviously, we have seen a lot of examples like wealth distribution.', 'start': 18814.138, 'duration': 4.183}, {'end': 18820.222, 'text': 'These all things are actually there.', 'start': 18818.921, 'duration': 1.301}, {'end': 18822.904, 'text': 'So 
this was the example of log normal distribution.', 'start': 18820.342, 'duration': 2.562}, {'end': 18834.952, 'text': "Now suppose I say that if y is a random variable that belongs to a log, normal distribution with mean as with some mean, let's say that this is there.", 'start': 18822.944, 'duration': 12.008}], 'summary': 'Log normal distribution has a characteristic shape, seen in wealth distribution examples.', 'duration': 30.806, 'max_score': 18804.146, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs18804146.jpg'}], 'start': 16844.737, 'title': 'Hypothesis testing and distribution', 'summary': "Explores hypothesis testing with t-test, p-value, significance level, z-test for mean with 95% confidence, z-score calculation, and various distributions including log normal, bernoulli's, and pareto, with practical examples and calculations based on sample data.", 'chapters': [{'end': 17364.705, 'start': 16844.737, 'title': 'Hypothesis testing with t-test', 'summary': 'Explores hypothesis testing using t-test to compare sample means with population mean, with examples demonstrating p-values and acceptance/rejection based on alpha values.', 'duration': 519.968, 'highlights': ['The chapter demonstrates computing the mean of ages and taking a sample to verify if it is nearer to the population mean using t-test, with p-values such as 0.76, 0.918, 0.48, 0.27, and 0.015, and discusses acceptance/rejection based on alpha values.', 'It illustrates using t-test to compare the mean of class A ages with the population mean, showcasing p-values such as 10^-13 and 53, and discusses acceptance/rejection based on alpha values.', 'The chapter briefly touches on correlation using the iris dataset, showcasing a diagram displaying the positive correlation between sepal length and petal length.']}, {'end': 17772.186, 'start': 17364.705, 'title': 'Understanding p-value and significance', 'summary': 'Explains the concept of p-value 
and significance level, detailing the relationship between them and their application in hypothesis testing, with an emphasis on the significance of 0.05 and practical examples. the discussion also covers various distributions and an upcoming topic on f-test (anova).', 'duration': 407.481, 'highlights': ['The chapter emphasizes the importance of the significance level of 0.05, explaining that if the p-value is less than or equal to 0.05, the null hypothesis is rejected, indicating less than 5% probability that it is true. The significance level of 0.05 is crucial in hypothesis testing, as a p-value less than or equal to 0.05 implies the rejection of the null hypothesis, indicating less than 5% probability that it is true.', 'The discussion addresses the reversal of acceptance and rejection conditions, correcting the statement that if the p-value is less than the significance level, the null hypothesis should be accepted. The speaker acknowledges and corrects the confusion regarding the acceptance and rejection conditions, clarifying that if the p-value is less than the significance level, the null hypothesis should be rejected, contrary to the previously stated code-wise explanation.', "The upcoming topics include discussions on various distributions such as Bernoulli's, binomial, Pareto's, and log normal distribution, providing a comprehensive overview of statistical concepts. 
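The decision rule described in this segment — reject the null hypothesis when the p-value is less than or equal to the 0.05 significance level — can be sketched with a one-sample t-test, as in the lecture's age example. The ages below are randomly generated stand-ins, not the lecture's data, so the p-value will differ from the 0.76 quoted in the video.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical sample of ages drawn around an assumed population mean of 30.
population_mean = 30.0
age_sample = rng.normal(loc=30.0, scale=5.0, size=25)

# One-sample t-test: H0 says the sample comes from a population with mean 30.
t_stat, p_value = stats.ttest_1samp(age_sample, popmean=population_mean)

alpha = 0.05  # significance level
if p_value <= alpha:
    decision = "reject H0"          # < 5% chance of data this extreme under H0
else:
    decision = "fail to reject H0"  # insufficient evidence against H0
print(p_value, decision)
```

Since the sample here is drawn from a population whose true mean really is 30, the test will usually (about 95% of the time) fail to reject the null, which mirrors the lecture's first example.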
The chapter outlines upcoming discussions on various distributions including Bernoulli's, binomial, Pareto's, and log normal distribution, promising a comprehensive overview of statistical concepts."]}, {'end': 18295.258, 'start': 17772.186, 'title': 'Z-test for mean with 95% confidence', 'summary': 'Discusses the process of conducting a z-test for a mean with a 95% confidence interval, using a sample mean of 169.5, a population mean of 168, a standard deviation of 3.9, and a sample size greater than 36, resulting in a calculated z-value of 2.307 and a p-value of 0.0088.', 'duration': 523.072, 'highlights': ['Conducting a Z-test for a mean with a 95% confidence interval The chapter explains the process of conducting a Z-test to determine the significance of the difference between a sample mean of 169.5 and a population mean of 168, with a confidence level of 95%.', 'Calculation of z-value and p-value The process involves calculating a z-value of 2.307, which exceeds the critical value of 1.96, leading to the rejection of the null hypothesis; further calculation results in a p-value of 0.0088, which is less than the significance value of 0.05, reinforcing the rejection of the null hypothesis.', 'Explanation of p-value significance The significance of the p-value is emphasized, indicating that if the p-value is less than or equal to the significance value, the null hypothesis is rejected; conversely, if the p-value is greater than the significance value, the null hypothesis is not rejected.']}, {'end': 18804.146, 'start': 18295.298, 'title': 'Z-score calculation and hypothesis testing', 'summary': 'Discusses the calculation of z-scores, hypothesis testing using z-scores, and p-value calculation. 
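The z-test arithmetic in this segment (population mean 168, population standard deviation 3.9, sample mean 169.5) can be checked directly. The transcript only says the sample size is "greater than 36", so n = 36 is assumed here; with that assumption the z-value matches the quoted 2.307.

```python
import math
from scipy.stats import norm

# Worked z-test from the lecture; n = 36 is an assumption (the transcript
# only states n > 36).
mu, sigma, x_bar, n = 168.0, 3.9, 169.5, 36

# z = (sample mean - population mean) / (sigma / sqrt(n))
z = (x_bar - mu) / (sigma / math.sqrt(n))
print(round(z, 3))       # ≈ 2.308, beyond the 1.96 critical value at 95%

# One-sided p-value: probability of a z-score at least this large under H0.
p_value = norm.sf(z)
print(p_value < 0.05)    # True → reject the null hypothesis
```

Both routes agree: the z-value exceeds the 1.96 decision boundary, and equivalently the p-value falls below the 0.05 significance level.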
it highlights the process of z-score calculation, hypothesis testing, and p-value calculation, showcasing the rejection of null hypothesis based on z-scores and p-values.', 'duration': 508.848, 'highlights': ['The chapter discusses the process of z-score calculation to determine the rejection of null hypothesis. Z-score calculation, determination of rejection of null hypothesis', 'The transcript explains the concept of hypothesis testing using z-scores and the decision-making process based on z-score values. Hypothesis testing, decision-making based on z-score values', 'It outlines the process of p-value calculation, showcasing its significance in hypothesis testing and the determination of null hypothesis rejection. P-value calculation, significance in hypothesis testing']}, {'end': 19710.571, 'start': 18804.146, 'title': "Log normal distribution, bernoulli's distribution, and pareto distribution", 'summary': "Discusses log normal distribution, with examples like wealth distribution, and the relationship between log normal and power law distribution. it also covers bernoulli's distribution, explaining its two outcomes and the probability of head and tail. additionally, it explores pareto distribution, highlighting the 80-20 rule and its relationship with log normal distribution. 
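The non-Gaussian distributions this segment walks through — log-normal, Bernoulli, and Pareto — can all be sampled with NumPy. The parameters below are arbitrary choices for illustration, not values taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Log-normal: log(Y) is normally distributed; right-skewed, like wealth data.
lognormal = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)

# Bernoulli: a single trial with two outcomes (head = 1, tail = 0), sampled
# here as a binomial with n = 1.
bernoulli = rng.binomial(n=1, p=0.5, size=10_000)

# Pareto: heavy-tailed, "80-20"-style distribution (shape chosen arbitrarily).
pareto = rng.pareto(a=3.0, size=10_000)

# Sanity check: the log of log-normal samples should look Gaussian,
# with mean near 0 and standard deviation near 1.
print(np.log(lognormal).mean(), np.log(lognormal).std())
print(bernoulli.mean())  # near p = 0.5
```

Taking the log of the log-normal samples and seeing a roughly standard normal shape is exactly the relationship between the two distributions that the lecture describes.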
the chapter ends with an assignment on poisson distribution.", 'duration': 906.425, 'highlights': ['Log Normal Distribution Log Normal Distribution is discussed, with examples like wealth distribution and the relationship between log normal and power law distribution.', "Bernoulli's Distribution Explanation of Bernoulli's Distribution, including its two outcomes and the probability of head and tail.", 'Pareto Distribution Discussion of Pareto Distribution, highlighting the 80-20 rule and its relationship with log normal distribution.', 'Central Limit Theorem Explanation of the Central Limit Theorem, emphasizing that sample means, with a sample size greater than or equal to 30, will follow a normal distribution.', 'Assignment on Poisson Distribution An assignment is given to research Poisson Distribution, a non-Gaussian distribution.']}], 'duration': 2865.834, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/LZzq1zSL1bs/pics/LZzq1zSL1bs16844737.jpg', 'highlights': ['The chapter demonstrates computing the mean of ages and taking a sample to verify if it is nearer to the population mean using t-test, with p-values such as 0.76, 0.918, 0.48, 0.27, and 0.015, and discusses acceptance/rejection based on alpha values.', 'The chapter emphasizes the importance of the significance level of 0.05, explaining that if the p-value is less than or equal to 0.05, the null hypothesis is rejected, indicating less than 5% probability that it is true.', 'Conducting a Z-test for a mean with a 95% confidence interval The chapter explains the process of conducting a Z-test to determine the significance of the difference between a sample mean of 169.5 and a population mean of 168, with a confidence level of 95%.', 'The chapter discusses the process of z-score calculation to determine the rejection of null hypothesis. 
Z-score calculation, determination of rejection of null hypothesis', 'Log Normal Distribution Log Normal Distribution is discussed, with examples like wealth distribution and the relationship between log normal and power law distribution.']}], 'highlights': ['The chapter covers statistics basics and advanced topics for data science, including measures of central tendency and dispersion, hypothesis testing, data visualization techniques, probability, permutation, chi-square and z-test analysis, relationship between variables, and various statistical analysis with practical applications demonstrated using python programming.', 'The chapter covers the basics and advanced topics related to data science roles like data scientist and data analyst, including the differences between descriptive and inferential statistics.', "Covering various statistical concepts such as histograms, PDF, CDF, probability permutations, and distributions like gaussian, log normal, binomial, bernoulli's, Pareto, and standard normal distributions.", 'Exploring inferential statistics including Z test, T test, ANOVA test, and chi-square test, with demonstrations using Python programming language.', 'The chapter introduces the concept of variables and differentiates between quantitative and qualitative variables, providing examples such as age, weight, and height for quantitative variables and gender for qualitative variables.', 'The chapter covers measures of central tendency, dispersion, z score, and standard normal distribution for data science.', 'Understanding the 68, 95, 99.7% rule for data distribution within 1, 2, and 3 standard deviations.', 'The chapter discusses data normalization techniques like division by 255 and practical application of Z score using a cricket example.', 'Probability is defined as a measure of the likelihood of an event, calculated as the number of ways an event can occur divided by the number of possible outcomes.', 'The chapter covers upcoming topics on 
probability, permutation, combination, confidence intervals, and hypothesis testing.', 'The chapter emphasizes the importance of making correct decisions in hypothesis testing, focusing on type 1 and type 2 errors using real-world scenarios.', 'Chi-square test is a non-parametric test used on categorical or ordinal data, emphasizing its significance in problem-solving interviews.', 'The example of studying for 2 hours and playing for 6 hours, studying for 3 hours and playing for 4 hours, and studying for 4 hours and playing for 3 hours, illustrates the relationship between the variables and the concept of correlation.']}
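The contrast drawn earlier between Pearson and Spearman rank correlation — Pearson captures linear relationships, Spearman captures non-linear (strictly, monotonic) ones by correlating ranks instead of raw values — can be seen on a simple example. The cubic relationship below is invented for illustration.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# A perfectly monotonic but non-linear relationship: y = x**3.
x = np.arange(1.0, 21.0)
y = x ** 3

pearson_r, _ = pearsonr(x, y)    # works on the raw values
spearman_r, _ = spearmanr(x, y)  # works on the ranks of the values

print(round(pearson_r, 3))  # below 1: Pearson only credits the linear part
print(spearman_r)           # exactly 1.0: the ranks of x and y agree perfectly
```

Because ranking discards everything except the ordering, any strictly increasing relationship scores a Spearman coefficient of exactly +1, which is why the lecture recommends it when the relationship is not a straight line.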