title
Power Analysis, Clearly Explained!!!

description
If you're doing an experiment, a Power Analysis is a must. It ensures reproducibility by helping you avoid p-hacking and being fooled by false positives. NOTE: This StatQuest assumes that you are already familiar with the concept of Statistical Power, Population Parameters vs Estimated Parameters and p-hacking. If not, check out the 'Quests: Power: https://youtu.be/Rsc5znwR5FA Population Parameters: https://youtu.be/vikkiwjQqfU p-hacking: https://youtu.be/HDCOUXE3HMM And if you'd like to learn more about how the concepts shown here apply to more than just the normal distribution, check out the StatQuest on the Central Limit Theorem: https://youtu.be/YAlJCEDH2uY For a complete index of all the StatQuest videos, check out: https://statquest.org/video-index/ 0:00 Awesome song and introduction 0:42 Why we do a power analysis 2:40 Power analysis defined 3:14 Two factors that affect Power 4:06 How sample size affects Power 11:48 How to do a power analysis 15:10 Review of concepts #statquest #statistics

detail
{'title': 'Power Analysis, Clearly Explained!!!', 'heatmap': [{'end': 213.955, 'start': 201.641, 'weight': 0.703}, {'end': 796.174, 'start': 742.747, 'weight': 0.753}, {'end': 918.597, 'start': 870.923, 'weight': 0.703}], 'summary': 'Explains the importance of power analysis in statistical testing and sample size determination, using examples of drug effectiveness comparison and estimating population mean. it emphasizes the impact of sample size on estimating means and the relationship between sample size, confidence, and correctly rejecting the null hypothesis.', 'chapters': [{'end': 262.423, 'segs': [{'end': 33.862, 'src': 'embed', 'start': 6.112, 'weight': 1, 'content': [{'end': 9.353, 'text': "Hello, I'm Josh Starmer and welcome to StatQuest.", 'start': 6.112, 'duration': 3.241}, {'end': 14.935, 'text': "Today we're going to talk about power analysis and it's going to be clearly explained.", 'start': 9.893, 'duration': 5.042}, {'end': 20.978, 'text': 'This StatQuest assumes that you are already familiar with what power means.', 'start': 16.196, 'duration': 4.782}, {'end': 23.199, 'text': 'If not, check out the quest.', 'start': 21.538, 'duration': 1.661}, {'end': 31.58, 'text': 'It would also be helpful if you understood the difference between population parameters and estimated population parameters.', 'start': 24.417, 'duration': 7.163}, {'end': 33.862, 'text': 'If not, check out the quest.', 'start': 32.22, 'duration': 1.642}], 'summary': 'Josh starmer explains power analysis in statquest.', 'duration': 27.75, 'max_score': 6.112, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/VX_M3tIyiYk/pics/VX_M3tIyiYk6112.jpg'}, {'end': 92.736, 'src': 'embed', 'start': 61.717, 'weight': 2, 'content': [{'end': 69.259, 'text': 'Just looking at the data makes us think that drug A might be better since those people tended to recover from the virus more quickly.', 'start': 61.717, 'duration': 7.542}, {'end': 80.632, 'text': 'So we calculate the 
means for both drugs and do a statistical test to compare the means and get a p-value equal to 0.06.', 'start': 70.799, 'duration': 9.833}, {'end': 81.852, 'text': 'Wah, wah.', 'start': 80.632, 'duration': 1.22}, {'end': 92.736, 'text': 'Because the p-value is greater than 0.05, the threshold that we are using to define a statistically significant difference.', 'start': 83.393, 'duration': 9.343}], 'summary': 'Drug A may be better because people taking it recovered from the virus more quickly, but the p-value of 0.06 does not meet the 0.05 threshold for statistical significance.', 'duration': 31.019, 'max_score': 61.717, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/VX_M3tIyiYk/pics/VX_M3tIyiYk61717.jpg'}, {'end': 262.423, 'src': 'heatmap', 'start': 201.641, 'weight': 0, 'content': [{'end': 207.568, 'text': 'One, how much overlap there is between the two distributions we want to identify with our study.', 'start': 201.641, 'duration': 5.927}, {'end': 213.955, 'text': 'Two, the sample size, the number of measurements we collect from each group.', 'start': 209.129, 'duration': 4.826}, {'end': 226.456, 'text': 'For example, if I want to have power equal to 0.8, meaning I want to have at least an 80% chance of correctly rejecting the null hypothesis,', 'start': 215.287, 'duration': 11.169}, {'end': 234.762, 'text': 'then if there is very little overlap, a small sample size will give me power equal to 0.8.', 'start': 226.456, 'duration': 8.306}, {'end': 246.653, 'text': 'However, the more overlap there is between the two distributions, the larger the sample size needs to be in order to have power equal to 0.8.', 'start': 234.762, 'duration': 11.891}, {'end': 251.376, 'text': 'To understand the relationship between overlap and sample size,', 'start': 246.653, 'duration': 4.723}, {'end': 258.281, 'text': "the first thing we need to realize is that when we do a statistical test, we usually don't compare the individual measurements.", 'start': 251.376, 'duration': 6.905}, {'end': 262.423, 
'text': 'Instead, we compare summaries of the data.', 'start': 259.702, 'duration': 2.721}], 'summary': 'Overlap affects sample size needed for 80% power in statistical tests.', 'duration': 105.806, 'max_score': 201.641, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/VX_M3tIyiYk/pics/VX_M3tIyiYk201641.jpg'}], 'start': 6.112, 'title': 'The importance of power analysis in statistical testing and sample size determination', 'summary': 'Emphasizes the significance of power analysis in statistical testing to avoid p-hacking, using the example of drug effectiveness comparison, and in determining sample size for achieving desired statistical power and correctly rejecting the null hypothesis.', 'chapters': [{'end': 155.136, 'start': 6.112, 'title': 'Understanding power analysis', 'summary': 'Discusses the importance of power analysis in statistical testing to avoid p-hacking, using the example of comparing the effectiveness of two drugs in treating a virus, and emphasizes the need to resist the temptation of p-hacking when faced with inconclusive results.', 'duration': 149.024, 'highlights': ['Power analysis is crucial in statistical testing to avoid p-hacking and make informed decisions based on significance levels such as p-value 0.05. The chapter emphasizes the importance of power analysis in making informed decisions in statistical testing, particularly in situations where the p-value is close to the significance level of 0.05, to avoid the temptation of p-hacking.', 'Example of comparing the effectiveness of two drugs in treating a virus illustrates the need for statistical rigor and the potential for inconclusive results. 
The example of comparing the effectiveness of two drugs in treating a virus demonstrates the potential for inconclusive results in statistical testing and the importance of resisting the temptation of p-hacking when faced with borderline p-values.', 'Emphasizes the importance of understanding power and the difference between population parameters and estimated population parameters for effective power analysis. The chapter stresses the significance of understanding power and the distinction between population parameters and estimated population parameters to conduct effective power analysis in statistical testing.']}, {'end': 262.423, 'start': 156.617, 'title': 'Power analysis for sample size determination', 'summary': 'Discusses the importance of power analysis in determining the sample size for experiments, emphasizing the relationship between overlap in distributions and sample size for achieving a desired level of statistical power, with a focus on ensuring a high probability of correctly rejecting the null hypothesis.', 'duration': 105.806, 'highlights': ['Power analysis determines the sample size to ensure a high probability of correctly rejecting the null hypothesis, with power affected by overlap between distributions and sample size.', 'Aiming for a power of 0.8 indicates a desire for at least an 80% chance of correctly rejecting the null hypothesis, with the required sample size influenced by the overlap between distributions.', 'The larger the overlap between distributions, the larger the sample size needed to achieve a power of 0.8, highlighting the importance of considering the relationship between overlap and sample size.']}], 'duration': 256.311, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/VX_M3tIyiYk/pics/VX_M3tIyiYk6112.jpg', 'highlights': ['Power analysis is crucial in statistical testing to avoid p-hacking and make informed decisions based on significance levels such as p-value 0.05.', 'Emphasizes the importance 
of understanding power and the difference between population parameters and estimated population parameters for effective power analysis.', 'Example of comparing the effectiveness of two drugs in treating a virus illustrates the need for statistical rigor and the potential for inconclusive results.', 'Power analysis determines the sample size to ensure a high probability of correctly rejecting the null hypothesis, with power affected by overlap between distributions and sample size.', 'Aiming for a power of 0.8 indicates a desire for at least an 80% chance of correctly rejecting the null hypothesis, with the required sample size influenced by the overlap between distributions.', 'The larger the overlap between distributions, the larger the sample size needed to achieve a power of 0.8, highlighting the importance of considering the relationship between overlap and sample size.']}, {'end': 663.625, 'segs': [{'end': 429.561, 'src': 'embed', 'start': 404.648, 'weight': 3, 'content': [{'end': 412.632, 'text': "In summary, when we only use one measurement to estimate the population mean, the probability that we'll get something far from", 'start': 404.648, 'duration': 7.984}, {'end': 416.354, 'text': 'it is too high for us to be confident that we have a good estimate.', 'start': 412.632, 'duration': 3.722}, {'end': 421.797, 'text': "Now let's do the same thing for drug B.", 'start': 418.095, 'duration': 3.702}, {'end': 426.379, 'text': 'Collect one measurement and use that to estimate the population mean.', 'start': 421.797, 'duration': 4.582}, {'end': 429.561, 'text': "Now let's do that a bunch of times.", 'start': 427.52, 'duration': 2.041}], 'summary': 'Using one measurement to estimate the population mean results in high uncertainty.', 'duration': 24.913, 'max_score': 404.648, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/VX_M3tIyiYk/pics/VX_M3tIyiYk404648.jpg'}, {'end': 590.221, 'src': 'embed', 'start': 542.765, 'weight': 0, 'content': 
[{'end': 551.153, 'text': 'there is a 75% probability that the next measurement will be in this range and pull the estimated mean back to the population mean.', 'start': 542.765, 'duration': 8.388}, {'end': 557.999, 'text': 'In summary, when we use more than one measurement to estimate the population mean,', 'start': 553.014, 'duration': 4.985}, {'end': 565.33, 'text': 'extreme measurements have less effect on how far the estimated mean is from the population mean.', 'start': 559.088, 'duration': 6.242}, {'end': 576.893, 'text': 'And, as a result, the estimated means are closer to the population mean compared to the means we estimated with a single observation.', 'start': 567.19, 'duration': 9.703}, {'end': 590.221, 'text': 'This suggests that we should have more confidence that averages estimated with two observations will be closer to the population mean than averages estimated with one observation.', 'start': 578.715, 'duration': 11.506}], 'summary': 'Using more than one measurement reduces effect of extreme values, leading to estimated means closer to population mean.', 'duration': 47.456, 'max_score': 542.765, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/VX_M3tIyiYk/pics/VX_M3tIyiYk542765.jpg'}, {'end': 673.484, 'src': 'embed', 'start': 645.034, 'weight': 2, 'content': [{'end': 655.6, 'text': 'the closer they are to the population mean and the more confidence we can have that an individual estimated mean will be close to the population mean.', 'start': 645.034, 'duration': 10.566}, {'end': 663.625, 'text': 'In this example, the estimated means are so close to the population means that they no longer overlap.', 'start': 657.241, 'duration': 6.384}, {'end': 673.484, 'text': 'and that suggests there is a high probability that we will correctly reject the null hypothesis that both samples were taken from the same distribution.', 'start': 664.861, 'duration': 8.623}], 'summary': 'Estimated means close to population means, high 
probability of correctly rejecting null hypothesis.', 'duration': 28.45, 'max_score': 645.034, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/VX_M3tIyiYk/pics/VX_M3tIyiYk645034.jpg'}], 'start': 262.704, 'title': 'Estimating population mean', 'summary': 'Demonstrates how sample size and multiple measurements impact estimating the population mean, with variation in estimated means affecting confidence and a 75% probability of measurements pulling the estimated mean back to the population mean.', 'chapters': [{'end': 481.502, 'start': 262.704, 'title': 'Effect of sample size on estimating population mean', 'summary': 'Demonstrates the impact of using one measurement to estimate the population mean for drug a and drug b, showing how the variation in estimated means makes it difficult to be confident in the estimate, leading to a relatively large p-value and failure to reject the null hypothesis.', 'duration': 218.798, 'highlights': ['Using one measurement to estimate the population mean leads to a high probability of obtaining estimates far from the actual mean, making it difficult to have confidence in the estimate, resulting in a large p-value and failure to reject the null hypothesis.', 'The estimated means for both drugs A and B show a significant variation, with 50% of the estimates being far from the population mean, affecting the confidence in the estimates and leading to a large p-value.', 'The demonstration illustrates the challenges of estimating the population mean with one measurement, highlighting the significant variation in estimated means and the resulting lack of confidence in the estimates, leading to a large p-value and failure to reject the null hypothesis.']}, {'end': 663.625, 'start': 483.663, 'title': 'Effect of multiple measurements on population mean', 'summary': 'Discusses the impact of using multiple measurements on estimating the population mean, showing that as the number of measurements increases, the 
estimated means become closer to the population mean, with a 75% probability of the next measurement pulling the estimated mean back to the population mean.', 'duration': 179.962, 'highlights': ['Using more than one measurement to estimate the population mean results in estimated means being closer to the population mean, with a 75% probability of the next measurement pulling the estimated mean back to the population mean, compared to a single observation.', 'As the number of measurements increases, the estimated means become closer to the population mean, leading to more confidence that an individual estimated mean will be close to the population mean.', 'When using 10 measurements to estimate the population mean, the estimated means are so close to the population means that they no longer overlap.', 'The impact of using multiple measurements on estimating the population mean is demonstrated through the example of drug B, showing that using two measurements provides more confidence that the estimated means are closer to the population mean compared to a single observation.']}], 'duration': 400.921, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/VX_M3tIyiYk/pics/VX_M3tIyiYk262704.jpg', 'highlights': ['With more than one measurement, there is a 75% probability that the next measurement will pull the estimated mean back toward the population mean.', 'As the number of measurements increases, the estimated means become closer to the population mean, leading to more confidence in the estimates.', 'When using 10 measurements to estimate the population mean, the estimated means are so close to the population means that they no longer overlap.', 'Using one measurement to estimate the population mean leads to a high probability of obtaining estimates far from the actual mean, resulting in a large p-value and failure to reject the null hypothesis.', 'The estimated means for both drugs A and B show a significant variation, with 50% of the estimates 
being far from the population mean, affecting the confidence in the estimates and leading to a large p-value.']}, {'end': 1003.893, 'segs': [{'end': 719.976, 'src': 'embed', 'start': 664.861, 'weight': 0, 'content': [{'end': 673.484, 'text': 'and that suggests there is a high probability that we will correctly reject the null hypothesis that both samples were taken from the same distribution.', 'start': 664.861, 'duration': 8.623}, {'end': 682.288, 'text': 'In other words, even when the distributions overlap, if the sample size is large, we can have high power.', 'start': 674.865, 'duration': 7.423}, {'end': 689.811, 'text': 'Bam! Note although we used normal distributions in this example,', 'start': 683.488, 'duration': 6.323}, {'end': 696.508, 'text': 'The central limit theorem tells us that these results apply to any underlying distribution.', 'start': 691.206, 'duration': 5.302}, {'end': 704.291, 'text': 'Shameless self-promotion! For more details about the central limit theorem, check out the Quest.', 'start': 698.088, 'duration': 6.203}, {'end': 706.631, 'text': 'The link is in the description below.', 'start': 704.731, 'duration': 1.9}, {'end': 713.214, 'text': "Now, at long last, let's talk about how to actually do a power analysis.", 'start': 708.352, 'duration': 4.862}, {'end': 719.976, 'text': 'First, remember that a power analysis will tell us what sample size we need to have power.', 'start': 714.434, 'duration': 5.542}], 'summary': 'Large sample sizes can lead to high power in rejecting null hypothesis, applicable to any distribution.', 'duration': 55.115, 'max_score': 664.861, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/VX_M3tIyiYk/pics/VX_M3tIyiYk664861.jpg'}, {'end': 796.174, 'src': 'heatmap', 'start': 742.747, 'weight': 0.753, 'content': [{'end': 748.389, 'text': 'The second thing we need to do is determine the threshold for significance, often called alpha.', 'start': 742.747, 'duration': 5.642}, {'end': 757.817, 
'text': "We can use any value between 0 and 1, but a very common threshold is 0.05, so we'll use that.", 'start': 750.011, 'duration': 7.806}, {'end': 763.901, 'text': 'Lastly, we need to estimate the overlap between the two distributions.', 'start': 759.578, 'duration': 4.323}, {'end': 772.448, 'text': 'Overlap is affected by both the distance between the population means and the standard deviations.', 'start': 765.062, 'duration': 7.386}, {'end': 784.152, 'text': 'A common way to combine the distance between the means and the standard deviations into a single metric is to calculate an effect size,', 'start': 774.85, 'duration': 9.302}, {'end': 787.052, 'text': 'which is also called d.', 'start': 784.152, 'duration': 2.9}, {'end': 796.174, 'text': 'In the numerator, we have the estimated difference in the means, and in the denominator, we have the pooled estimated standard deviations.', 'start': 787.052, 'duration': 9.122}], 'summary': 'Determining significance threshold (alpha) at 0.05 and estimating overlap using effect size (d)', 'duration': 53.427, 'max_score': 742.747, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/VX_M3tIyiYk/pics/VX_M3tIyiYk742747.jpg'}, {'end': 946.683, 'src': 'heatmap', 'start': 870.923, 'weight': 1, 'content': [{'end': 875.505, 'text': 'So the effect size is 1.5.', 'start': 870.923, 'duration': 4.582}, {'end': 886.141, 'text': 'Once we know the effect size and the amount of power we want and the threshold for significance, We Google statistics power calculator.', 'start': 875.505, 'duration': 10.636}, {'end': 890.643, 'text': 'Pretty much every statistics department in the world has one online.', 'start': 886.761, 'duration': 3.882}, {'end': 893.464, 'text': 'Then we plug in the numbers.', 'start': 892.043, 'duration': 1.421}, {'end': 897.205, 'text': 'I got sample size equals nine.', 'start': 895.044, 'duration': 2.161}, {'end': 906.589, 'text': 'This means that if I get nine measurements per group, I will 
have an 80% chance that I will correctly reject the null hypothesis.', 'start': 898.426, 'duration': 8.163}, {'end': 918.597, 'text': 'Double bam! In summary, when two distributions overlap, we need a relatively large sample size to have a lot of power.', 'start': 907.809, 'duration': 10.788}, {'end': 922.219, 'text': 'When the sample size is small,', 'start': 919.857, 'duration': 2.362}, {'end': 936.448, 'text': 'we have low confidence that the estimated means are close to the population means and that lack of confidence is reflected in a low probability that we will correctly reject the null hypothesis.', 'start': 922.219, 'duration': 14.229}, {'end': 946.683, 'text': 'In contrast, when we increase the sample size, we have more confidence that the estimated means are close to the population means,', 'start': 937.689, 'duration': 8.994}], 'summary': 'Effect size of 1.5 yields 80% power with sample size of 9 per group.', 'duration': 59.922, 'max_score': 870.923, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/VX_M3tIyiYk/pics/VX_M3tIyiYk870923.jpg'}], 'start': 664.861, 'title': 'Power analysis and sample size', 'summary': 'Discusses power analysis, sample size determination, and the impact of overlap between distributions, emphasizing the relationship between sample size, confidence in estimated means, and probability of correctly rejecting the null hypothesis.', 'chapters': [{'end': 719.976, 'start': 664.861, 'title': 'Power analysis and central limit theorem', 'summary': 'Discusses how a large sample size gives a high probability of correctly rejecting the null hypothesis, notes that the central limit theorem extends these results to any underlying distribution, and introduces power analysis as the way to determine the sample size needed for adequate power.', 'duration': 55.115, 'highlights': ['A large sample size gives a high probability of correctly rejecting the null hypothesis even 
when the distributions overlap, and the central limit theorem tells us that this result applies to any underlying distribution.', 'A power analysis determines the necessary sample size for achieving adequate power.']}, {'end': 1003.893, 'start': 721.638, 'title': 'Power analysis and sample size', 'summary': 'Explains the process of power analysis and sample size determination, emphasizing the impact of overlap between distributions and the relationship between sample size, confidence in estimated means, and probability of correctly rejecting the null hypothesis.', 'duration': 282.255, 'highlights': ['When two distributions overlap, we need a relatively large sample size to have a lot of power. The overlap between distributions requires a large sample size for increased power.', 'Increasing the sample size provides more confidence that the estimated means are close to the population means, reducing the overlap between the distributions of estimated means and increasing the probability of correctly rejecting the null hypothesis. Larger sample size increases confidence in estimated means, reduces overlap, and increases the probability of rejecting the null hypothesis.', 'A sample size of nine per group provides an 80% chance of correctly rejecting the null hypothesis, demonstrating the impact of sample size on statistical power. Sample size of nine yields 80% chance of correctly rejecting the null hypothesis.']}], 'duration': 339.032, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/VX_M3tIyiYk/pics/VX_M3tIyiYk664861.jpg', 'highlights': ['A large sample size gives a high probability of correctly rejecting the null hypothesis even when the distributions overlap, and the central limit theorem tells us that this result applies to any underlying distribution.', 'Increasing the sample size provides more confidence that the estimated means are close to the population means, reducing the overlap between the distributions of estimated means and increasing the probability of correctly rejecting the null hypothesis. 
Larger sample size increases confidence in estimated means, reduces overlap, and increases the probability of rejecting the null hypothesis.', 'A power analysis determines the necessary sample size for achieving adequate power.', 'When two distributions overlap, we need a relatively large sample size to have a lot of power. The overlap between distributions requires a large sample size for increased power.', 'A sample size of nine provides an 80% chance of correctly rejecting the null hypothesis, demonstrating the impact of sample size on statistical power. Sample size of nine yields 80% chance of correctly rejecting the null hypothesis.']}], 'highlights': ['Power analysis is crucial in statistical testing to avoid p-hacking and make informed decisions based on significance levels such as p-value 0.05.', 'Aiming for a power of 0.8 indicates a desire for at least an 80% chance of correctly rejecting the null hypothesis, with the required sample size influenced by the overlap between distributions.', 'The larger the overlap between distributions, the larger the sample size needed to achieve a power of 0.8, highlighting the importance of considering the relationship between overlap and sample size.', 'With more than one measurement, there is a 75% probability that the next measurement will pull the estimated mean back toward the population mean.', 'Increasing the sample size provides more confidence that the estimated means are close to the population means, reducing the overlap between distributions and increasing the probability of correctly rejecting the null hypothesis. Larger sample size increases confidence in estimated means, reduces overlap, and increases the probability of rejecting the null hypothesis.', 'A sample size of nine provides an 80% chance of correctly rejecting the null hypothesis, demonstrating the impact of sample size on statistical power. Sample size of nine yields 80% chance of correctly rejecting the null hypothesis.']}
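
The transcript's point that estimated means land closer to the population mean as the sample size grows can be checked with a small simulation. What follows is a minimal sketch and is not taken from the video: the function name `spread_of_estimated_means`, the standard normal population, the seed, and the number of trials are all illustrative assumptions.

```python
import random
import statistics


def spread_of_estimated_means(sample_size, trials=5000, mu=0.0, sigma=1.0, seed=0):
    """Return the standard deviation of many estimated means.

    Each estimated mean is the average of `sample_size` draws from a
    normal population with mean `mu` and standard deviation `sigma`
    (hypothetical values chosen only for illustration).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    estimated_means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(estimated_means)


if __name__ == "__main__":
    # Sample sizes 1, 2 and 10 mirror the ones discussed in the transcript.
    for n in (1, 2, 10):
        print(f"n = {n:2d}  spread of estimated means = {spread_of_estimated_means(n):.3f}")
```

The printed spread should shrink as n grows, roughly in proportion to one over the square root of n, which is why two overlapping populations can still produce estimated means that rarely overlap.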
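
The power analysis steps in the transcript (pick a power, pick alpha, estimate the effect size d, then look up a sample size) can also be sketched in code. This is a rough illustration under stated assumptions, not the calculator used in the video: `effect_size_d` pools the two standard deviations as the square root of their average variance, which is one common choice, and `approx_n_per_group` uses a textbook normal approximation for a two-sided, two-group comparison. Both function names and the example means and standard deviations are hypothetical.

```python
import math
from statistics import NormalDist


def effect_size_d(mean_a, mean_b, sd_a, sd_b):
    """Estimated difference in means divided by a pooled standard deviation.

    Assumption: the pooled value is sqrt((sd_a^2 + sd_b^2) / 2).
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return abs(mean_a - mean_b) / pooled_sd


def approx_n_per_group(d, alpha=0.05, power=0.8):
    """Normal-approximation sample size per group for a two-sided test.

    n is approximately 2 * ((z_(1 - alpha/2) + z_power) / d)^2, rounded up.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


if __name__ == "__main__":
    # Made-up inputs chosen so that d comes out at 1.5, the effect size in the transcript.
    d = effect_size_d(mean_a=0.0, mean_b=3.0, sd_a=2.0, sd_b=2.0)
    print("effect size d:", d)
    print("approximate n per group:", approx_n_per_group(d))
```

Because the normal approximation ignores the extra uncertainty from estimating the standard deviation, it tends to return a slightly smaller number than calculators based on the t-distribution, so it need not reproduce the nine measurements per group quoted in the transcript.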