title

How to calculate p-values

description

In this StatQuest we learn how to calculate p-values using both discrete data (like coin tosses) and continuous data (like height measurements). At the end, we explain the differences between one-sided and two-sided p-values and why you should avoid one-sided p-values if possible.
NOTE: This StatQuest assumes that you are already familiar with what p-values are and how to interpret them. If not, check out the quest:
p-values: What they are and how to interpret them.
For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/
0:00 Awesome song and introduction
0:58 p-value for getting two heads
6:39 p-value defined as the sum of three parts
9:30 p-value for getting four heads and one tails
12:31 p-values for continuous data, like how tall people are
14:31 A borderline p-value
16:59 A significant p-value
17:47 An insignificant p-value
20:12 One-sided vs two-sided p-values
24:20 Summary of concepts
#statquest #pvalue
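
The two calculations covered in this StatQuest can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the video. The function names `coin_p_value` and `height_p_value` are made up for this sketch. The coin function adds up the probability of the observed outcome plus every outcome that is equally rare or rarer. The height function assumes the heights follow a normal distribution with a mean of 155.7 cm and a standard deviation of about 6.9 cm. That standard deviation is not stated in the video; it is a guess backed out of the statement that 95% of the measurements fall between 142 and 169 cm.

```python
from fractions import Fraction
from math import comb
from statistics import NormalDist


def coin_p_value(heads: int, flips: int) -> Fraction:
    """Two-sided p-value for seeing `heads` heads in `flips` flips of a fair coin."""
    total = 2 ** flips  # every sequence of flips is equally likely
    # Probability of each possible number of heads, ignoring the order of the flips.
    probs = [Fraction(comb(flips, k), total) for k in range(flips + 1)]
    observed = probs[heads]
    # Sum of three parts: the observation itself, equally rare outcomes,
    # and rarer outcomes.
    return sum((p for p in probs if p <= observed), Fraction(0))


def height_p_value(height_cm: float, mean: float, sd: float) -> float:
    """Two-sided p-value for one measurement under an ASSUMED normal distribution."""
    dist = NormalDist(mu=mean, sigma=sd)
    lower_tail = dist.cdf(height_cm)
    # The curve is symmetric, so doubling the smaller tail adds the area that is
    # equally far from the mean, or further, on the other side.
    return 2 * min(lower_tail, 1 - lower_tail)


if __name__ == "__main__":
    print(coin_p_value(2, 2))  # 1/2, two heads in two flips
    print(coin_p_value(4, 5))  # 3/8, four heads and one tails

    ASSUMED_MEAN = 155.7  # average height quoted in the video
    ASSUMED_SD = 6.9      # assumption, not from the video
    print(round(height_p_value(142, ASSUMED_MEAN, ASSUMED_SD), 3))
```

The coin results match the values in the video (0.5 and 0.375). With the assumed standard deviation, the result for 142 cm should land close to the borderline 0.05 described in the video, though the exact number depends on that assumption.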

detail

{'title': 'How to calculate p-values', 'heatmap': [{'end': 954.526, 'start': 864.537, 'weight': 0.961}, {'end': 1080.879, 'start': 1061.621, 'weight': 0.737}, {'end': 1112.11, 'start': 1088.324, 'weight': 0.716}], 'summary': 'Explains how to calculate two-sided p-values, cautioning against one-sided p-values, demonstrates hypothesis testing using coin flip examples with a p-value result of 0.5, and discusses probability distributions and statistical analysis of heights with specific p-values such as 0.016 and 0.05, emphasizing the significance and relevance of p-values in detecting unusual events.', 'chapters': [{'end': 57.568, 'segs': [{'end': 57.568, 'src': 'embed', 'start': 0.149, 'weight': 0, 'content': [{'end': 5.431, 'text': "Calculating p-values is kind of fun, and not just when you're done.", 'start': 0.149, 'duration': 5.282}, {'end': 12.014, 'text': "StatQuest Hello, I'm Josh Starmer and welcome to StatQuest.", 'start': 5.891, 'duration': 6.123}, {'end': 15.676, 'text': "Today we're going to talk about how to calculate p-values.", 'start': 12.494, 'duration': 3.182}, {'end': 23.539, 'text': 'this StatQuest assumes that you are already familiar with what p-values are and how to interpret them.', 'start': 16.896, 'duration': 6.643}, {'end': 26.12, 'text': 'If not, check out the Quest.', 'start': 24.259, 'duration': 1.861}, {'end': 36.58, 'text': 'Also note, before we get started, I want to mention that there are two types of p-values, one-sided and two-sided.', 'start': 27.589, 'duration': 8.991}, {'end': 43.449, 'text': 'Two-sided p-values are the most common and this quest focuses on calculating them.', 'start': 38.002, 'duration': 5.447}, {'end': 51.944, 'text': 'In contrast, one-sided p-values are rarely used and, to be honest, potentially dangerous.', 'start': 45.04, 'duration': 6.904}, {'end': 57.568, 'text': "I won't mention them again until the very end when I give an example of why they should be avoided.", 'start': 52.605, 'duration': 4.963}], 
'summary': 'Statquest explains how to calculate two-sided p-values, emphasizing their importance over one-sided p-values.', 'duration': 57.419, 'max_score': 0.149, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E149.jpg'}], 'start': 0.149, 'title': 'Calculating p-values', 'summary': 'Explains how to calculate two-sided p-values, highlighting their relevance while cautioning against the use of one-sided p-values.', 'chapters': [{'end': 57.568, 'start': 0.149, 'title': 'Calculating p-values', 'summary': 'Explains the calculation of p-values, focusing on two-sided p-values, as one-sided p-values are rarely used and potentially dangerous.', 'duration': 57.419, 'highlights': ['The chapter focuses on calculating two-sided p-values, which are the most common type of p-values.', 'One-sided p-values are rarely used and potentially dangerous, and the chapter provides an example of why they should be avoided.', 'The StatQuest assumes familiarity with p-values and their interpretation.']}], 'duration': 57.419, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E149.jpg', 'highlights': ['The chapter focuses on calculating two-sided p-values, which are the most common type of p-values.', 'One-sided p-values are rarely used and potentially dangerous, and the chapter provides an example of why they should be avoided.', 'The StatQuest assumes familiarity with p-values and their interpretation.']}, {'end': 543.33, 'segs': [{'end': 206.928, 'src': 'embed', 'start': 85.49, 'weight': 0, 'content': [{'end': 95.818, 'text': 'However, in statistics lingo, the hypothesis is, even though I got two heads in a row, my coin is no different from a normal coin.', 'start': 85.49, 'duration': 10.328}, {'end': 108.162, 'text': 'Note, although we want to know if our coin is special, the statistics lingo version says the opposite, that our coin is the same as a normal coin.', 'start': 97.256, 
'duration': 10.906}, {'end': 115.726, 'text': 'Statisticians call this the null hypothesis, and a small p-value will tell us to reject it.', 'start': 109.282, 'duration': 6.444}, {'end': 122.009, 'text': 'And if we reject this null hypothesis, we will know that our coin is special.', 'start': 117.206, 'duration': 4.803}, {'end': 127.102, 'text': "So let's test this hypothesis by calculating a p-value.", 'start': 123.499, 'duration': 3.603}, {'end': 136.45, 'text': "P-values are determined by adding up probabilities, so let's start by figuring out the probability of getting two heads in a row.", 'start': 128.163, 'duration': 8.287}, {'end': 143.616, 'text': "When we flip a normal, everyday coin, there's a 50% chance we'll get heads, and a 50% chance we'll get tails.", 'start': 138.071, 'duration': 5.545}, {'end': 157.323, 'text': 'Now, if we got heads on the first flip and flip the coin a second time, then, just like before,', 'start': 148.86, 'duration': 8.463}, {'end': 162.165, 'text': "there's a 50% chance we'll get heads and a 50% chance we'll get tails.", 'start': 157.323, 'duration': 4.842}, {'end': 172.368, 'text': 'Likewise, if we got tails on the first flip and flip the coin again, then, just like before,', 'start': 163.725, 'duration': 8.643}, {'end': 174.229, 'text': "there's a 50% chance we'll get heads and a 50% chance we'll get tails.", 'start': 172.368, 'duration': 1.861}, {'end': 183.941, 'text': 'Ultimately, these are the four possible outcomes after flipping a coin two times.', 'start': 178.599, 'duration': 5.342}, {'end': 192.663, 'text': 'Because each outcome is equally probable, we can calculate the probability of getting two heads with the following formula.', 'start': 185.321, 'duration': 7.342}, {'end': 198.925, 'text': 'The number of times we got two heads divided by the total number of outcomes.', 'start': 193.964, 'duration': 4.961}, {'end': 203.727, 'text': 'In this case, we only got two heads one time.', 'start': 200.366, 'duration': 
3.361}, {'end': 206.928, 'text': 'So we put a 1 in the numerator.', 'start': 204.967, 'duration': 1.961}], 'summary': 'Using statistics, we can test if a coin is special by calculating a p-value to reject the null hypothesis.', 'duration': 121.438, 'max_score': 85.49, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E85490.jpg'}, {'end': 288.4, 'src': 'embed', 'start': 248.64, 'weight': 4, 'content': [{'end': 251.021, 'text': "In this case, the order doesn't matter,", 'start': 248.64, 'duration': 2.381}, {'end': 258.827, 'text': "because getting a heads on the first flip doesn't change the probabilities of getting heads or tails on the second flip.", 'start': 251.021, 'duration': 7.806}, {'end': 268.387, 'text': "Likewise, getting tails on the first flip doesn't change the probabilities of getting heads or tails on the second flip.", 'start': 260.221, 'duration': 8.166}, {'end': 276.112, 'text': 'Because order does not affect the probabilities of getting heads and tails, we treat these outcomes as the same.', 'start': 269.808, 'duration': 6.304}, {'end': 288.4, 'text': "Now let's move the outcomes over to the left and list the probability of each outcome and calculate the p-value for getting two heads.", 'start': 277.513, 'duration': 10.887}], 'summary': "Order doesn't affect probabilities of getting heads or tails, treating outcomes as the same.", 'duration': 39.76, 'max_score': 248.64, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E248640.jpg'}, {'end': 409.302, 'src': 'embed', 'start': 344.403, 'weight': 5, 'content': [{'end': 350.205, 'text': 'and the p-value for getting two heads equals 0.5.', 'start': 344.403, 'duration': 5.802}, {'end': 355.666, 'text': 'Now remember, the reason we calculated the p-value was to test this hypothesis.', 'start': 350.205, 'duration': 5.461}, {'end': 362.268, 'text': 'Even though I got two heads in a row, my coin is 
no different from a normal coin.', 'start': 356.826, 'duration': 5.442}, {'end': 371.519, 'text': 'Typically, we only reject a hypothesis if the p-value is less than 0.05.', 'start': 363.868, 'duration': 7.651}, {'end': 378.561, 'text': 'and since 0.5 is greater than 0.05, we fail to reject the hypothesis.', 'start': 371.519, 'duration': 7.042}, {'end': 386.964, 'text': 'In other words, the data, getting two heads in a row, failed to convince us that our coin is special.', 'start': 379.942, 'duration': 7.022}, {'end': 399.38, 'text': 'Note, the probability of getting two heads, 0.25, is different from the p-value for getting two heads, 0.5.', 'start': 388.625, 'duration': 10.755}, {'end': 403.101, 'text': 'This is because the p-value is the sum of three parts.', 'start': 399.38, 'duration': 3.721}, {'end': 409.302, 'text': 'The first part is the probability random chance would result in the observation.', 'start': 404.441, 'duration': 4.861}], 'summary': 'P-value of getting two heads is 0.5, failing to reject the hypothesis.', 'duration': 64.899, 'max_score': 344.403, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E344403.jpg'}, {'end': 514.352, 'src': 'embed', 'start': 485.989, 'weight': 7, 'content': [{'end': 492.434, 'text': 'Note, even though these flowers are different colors, just knowing that they are equally rare would be a bummer.', 'start': 485.989, 'duration': 6.445}, {'end': 500.76, 'text': 'Because a lot of equally rare things would make something less special, we add part 2 to the p-value.', 'start': 493.975, 'duration': 6.785}, {'end': 505.824, 'text': 'And we add rarer things to the p-value for a similar reason.', 'start': 502.261, 'duration': 3.563}, {'end': 514.352, 'text': 'Going back to our flower example, imagine telling your loved one, this is the rarest flower of this species.', 'start': 507.128, 'duration': 7.224}], 'summary': 'Adding part 2 to the p-value for equally rare things to 
maintain uniqueness.', 'duration': 28.363, 'max_score': 485.989, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E485989.jpg'}], 'start': 59.029, 'title': 'Hypothesis testing and probability in statistics', 'summary': 'Covers hypothesis testing using the example of coin flips, discussing the probability of different outcomes, and calculating the p-value, with a specific result of 0.5, which is too large to be statistically significant.', 'chapters': [{'end': 143.616, 'start': 59.029, 'title': 'Hypothesis testing in statistics', 'summary': 'Explains the concept of hypothesis testing in statistics, using the example of flipping a coin and obtaining two heads in a row to demonstrate the null hypothesis and the role of p-values in determining statistical significance.', 'duration': 84.587, 'highlights': ['The null hypothesis in statistics states that even though we obtain two heads in a row, the coin is not different from a normal coin.', 'A small p-value indicates the rejection of the null hypothesis, suggesting that the coin is indeed special.', 'The probability of obtaining two heads in a row with a normal coin is 50% * 50% = 25%.']}, {'end': 276.112, 'start': 148.86, 'title': 'Probability of coin flips', 'summary': 'Discusses the probability of getting different outcomes when flipping a coin twice, such as the combined 50% chance of getting two heads or two tails and the 0.5 probability of getting one head and one tail, with a specific example showing the probability of getting two heads is 0.25.', 'duration': 127.252, 'highlights': ['Each outcome is equally probable, enabling the calculation of the probability of getting two heads with the formula: the number of times two heads occurred divided by the total outcomes, resulting in a 0.25 probability.', "The combined 50% chance of getting two heads or two tails, along with the 0.5 probability of getting one head and one tail, are explained through the concept that the order of the flips 
doesn't affect the probabilities of getting heads and tails."]}, {'end': 543.33, 'start': 277.513, 'title': 'Calculating p-value and hypothesis testing', 'summary': 'Discusses the calculation of p-value for getting two heads, with a resulting value of 0.5, failing to reject the null hypothesis that the coin is no different from a normal coin, and explains why equally rare and rarer outcomes are added to the p-value.', 'duration': 265.817, 'highlights': ['The p-value for getting two heads equals 0.5, leading to the failure to reject the null hypothesis that the coin is no different from a normal coin.', 'The probability of getting two heads, 0.25, is different from the p-value for getting two heads, 0.5, as the p-value is the sum of three parts: the probability that random chance would produce the observation, the probability of observing something else that is equally rare, and the probability of observing something rarer or more extreme.', 'Adding equally rare outcomes to the p-value matters because it diminishes the perceived specialness of the observed event, analogous to presenting equally rare flowers, which lowers their perceived uniqueness.', 'Similarly, adding rarer outcomes to the p-value matters because it decreases the perceived specialness of the observed event, akin to presenting flowers that are rarer, which reduces their perceived uniqueness.']}], 'duration': 484.301, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E59029.jpg', 'highlights': ['A small p-value indicates the rejection of the null hypothesis, suggesting that the coin is indeed special.', 'The null hypothesis in statistics states that even though we obtain two heads in a row, the coin is not different from a normal coin.', 'The probability of obtaining two heads in a row with a normal coin is 50% * 50% = 25%.', 'Each outcome is equally probable, enabling the calculation of the probability of getting two heads with the formula: the number of times two heads occurred divided by the total outcomes, resulting in a 0.25 probability.', "The combined 50% chance of 
getting two heads or two tails, along with the 0.5 probability of getting one head and one tail, are explained through the concept that the order of the flips doesn't affect the probabilities of getting heads and tails.", 'The p-value for getting two heads equals 0.5, leading to the failure to reject the null hypothesis that the coin is no different from a normal coin.', 'The probability of getting two heads, 0.25, is different from the p-value for getting two heads, 0.5, as the p-value is the sum of three parts: the probability that random chance would produce the observation, the probability of observing something else that is equally rare, and the probability of observing something rarer or more extreme.', 'Adding equally rare outcomes to the p-value matters because it diminishes the perceived specialness of the observed event, analogous to presenting equally rare flowers, which lowers their perceived uniqueness.', 'Similarly, adding rarer outcomes to the p-value matters because it decreases the perceived specialness of the observed event, akin to presenting flowers that are rarer, which reduces their perceived uniqueness.']}, {'end': 750.822, 'segs': [{'end': 572.87, 'src': 'embed', 'start': 544.871, 'weight': 0, 'content': [{'end': 551.115, 'text': 'Thus, because rarer things make something less special, we add part three to the p-value.', 'start': 544.871, 'duration': 6.244}, {'end': 559.4, 'text': 'Okay, now that we know that getting two heads in a row is not very special or statistically significant,', 'start': 552.856, 'duration': 6.544}, {'end': 563.588, 'text': 'what about getting four heads and one tails?', 'start': 560.447, 'duration': 3.141}, {'end': 567.128, 'text': 'Would that suggest that our coin is special?', 'start': 564.728, 'duration': 2.4}, {'end': 572.87, 'text': 'In other words, we can calculate a p-value to test this hypothesis.', 'start': 568.309, 'duration': 4.561}], 'summary': 'Testing whether the coin is special by calculating a p-value for getting four heads and one tails.', 'duration': 27.999, 'max_score': 544.871, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E544871.jpg'}, {'end': 659.04, 'src': 'embed', 'start': 631.073, 'weight': 4, 'content': [{'end': 640.755, 'text': 'Likewise, there are ten ways that we can flip a coin and get three heads and two tails, and ten ways to get two heads and three tails.', 'start': 631.073, 'duration': 9.682}, {'end': 652.296, 'text': 'and five ways to get one heads and four tails, and lastly, one way to flip a coin five times and get five tails.', 'start': 642.409, 'duration': 9.887}, {'end': 659.04, 'text': 'All in all, when we flip a coin five times, there are 32 possible outcomes.', 'start': 653.937, 'duration': 5.103}], 'summary': 'When flipping a coin five times, there are 32 possible outcomes with varying numbers of heads and tails.', 'duration': 27.967, 'max_score': 631.073, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E631073.jpg'}, {'end': 757.849, 'src': 'embed', 'start': 727.059, 'weight': 2, 'content': [{'end': 731.563, 'text': 'So, in this case, we will fail to reject the null hypothesis.', 'start': 727.059, 'duration': 4.504}, {'end': 740.332, 'text': 'In other words, the data, getting four heads and one tails, did not convince us that our coin was special.', 'start': 733.265, 'duration': 7.067}, {'end': 750.822, 'text': "With coin tosses, it's pretty easy to calculate probabilities and p-values because it's pretty easy to list all of the possible outcomes.", 'start': 742.213, 'duration': 8.609}, {'end': 757.849, 'text': 'But what if we wanted to calculate probabilities and p-values for how tall or short people are?', 'start': 752.143, 'duration': 5.706}], 'summary': 'Failing to reject null hypothesis with 4 heads and 1 tail data; challenges in calculating probabilities for human traits.', 'duration': 30.79, 'max_score': 727.059, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E727059.jpg'}], 'start': 544.871, 'title': 'Coin toss probabilities', 'summary': 'Discusses how to calculate the p-value for testing the hypothesis of a special coin with the example of four heads and one tail, evaluating 32 outcomes and concluding that the null hypothesis is not rejected.', 'chapters': [{'end': 601.886, 'start': 544.871, 'title': 'Calculating p-value for coin toss', 'summary': 'Discusses how to calculate the p-value to test the hypothesis of a coin being special, using the example of getting four heads and one tail to determine statistical significance.', 'duration': 57.015, 'highlights': ['The p-value is used to determine statistical significance, and in the case of getting four heads and one tail, it can indicate if the coin is special or not.', 'Understanding that a small p-value and rejection of the null hypothesis would indicate that the coin is special, providing a method to test for uniqueness.', 'The chapter emphasizes the significance of the p-value in determining the uniqueness of the coin, showcasing the importance of statistical testing in coin analysis.']}, {'end': 750.822, 'start': 603.266, 'title': 'Coin toss probabilities', 'summary': 'Discusses the 32 possible outcomes of flipping a coin five times, calculates the p-value for getting four heads and one tails, and concludes that the null hypothesis is not rejected.', 'duration': 147.556, 'highlights': ['There are 32 possible outcomes when flipping a coin five times. The chapter explains that there are 32 possible outcomes when flipping a coin five times, including various combinations of heads and tails.', 'The p-value for getting four heads and one tails is 0.375. The p-value for getting four heads and one tails is calculated as 0.375, indicating that the data does not convince that the coin is special.', 'The null hypothesis is not rejected. 
The chapter concludes that the null hypothesis is not rejected, indicating that the data of getting four heads and one tails did not convince that the coin was special.']}], 'duration': 205.951, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E544871.jpg', 'highlights': ['Understanding that a small p-value and rejection of the null hypothesis would indicate that the coin is special, providing a method to test for uniqueness.', 'The chapter emphasizes the significance of the p-value in determining the uniqueness of the coin, showcasing the importance of statistical testing in coin analysis.', 'The p-value for getting four heads and one tails is 0.375, indicating that the data does not convince that the coin is special.', 'The null hypothesis is not rejected, indicating that the data of getting four heads and one tails did not convince that the coin was special.', 'There are 32 possible outcomes when flipping a coin five times, including various combinations of heads and tails.']}, {'end': 1017.829, 'segs': [{'end': 788.1, 'src': 'embed', 'start': 752.143, 'weight': 1, 'content': [{'end': 757.849, 'text': 'But what if we wanted to calculate probabilities and p-values for how tall or short people are?', 'start': 752.143, 'duration': 5.706}, {'end': 764.055, 'text': 'In theory, we could try to list every single possible value for height.', 'start': 759.37, 'duration': 4.685}, {'end': 773.067, 'text': 'However, in practice, when we calculate probabilities and p-values for something continuous like height,', 'start': 765.543, 'duration': 7.524}, {'end': 776.309, 'text': 'we usually use something called a statistical distribution.', 'start': 773.067, 'duration': 3.242}, {'end': 788.1, 'text': 'Here we have a distribution of height measurements from Brazilian women between 15 and 49 years old, taken in 1996.', 'start': 777.87, 'duration': 10.23}], 'summary': 'Calculating probabilities and p-values for height using 
statistical distribution, based on measurements from brazilian women.', 'duration': 35.957, 'max_score': 752.143, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E752143.jpg'}, {'end': 842.972, 'src': 'embed', 'start': 814.597, 'weight': 0, 'content': [{'end': 823.745, 'text': 'In other words, there is a 95% probability that each time we measure a Brazilian woman, their height will be between 142 and 169 centimeters.', 'start': 814.597, 'duration': 9.148}, {'end': 834.608, 'text': '2.5% of the total area under the curve is greater than 169.', 'start': 823.765, 'duration': 10.843}, {'end': 842.972, 'text': 'And that means there is a 2.5% probability that each time we measure a Brazilian woman, their height will be greater than 169 centimeters.', 'start': 834.608, 'duration': 8.364}], 'summary': "95% probability for brazilian women's height between 142-169cm, 2.5% chance of height exceeding 169cm.", 'duration': 28.375, 'max_score': 814.597, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E814597.jpg'}, {'end': 954.526, 'src': 'heatmap', 'start': 864.537, 'weight': 0.961, 'content': [{'end': 870.638, 'text': 'To calculate p-values with a distribution, you add up the percentages of area under the curve.', 'start': 864.537, 'duration': 6.101}, {'end': 877.16, 'text': 'For example, imagine we measured someone who is 142 centimeters tall.', 'start': 871.939, 'duration': 5.221}, {'end': 885.762, 'text': 'If we measured someone who is 142 centimeters tall, we might wonder if it came from this distribution of heights,', 'start': 878.62, 'duration': 7.142}, {'end': 891.19, 'text': 'which has an average value of 155.7..', 'start': 885.762, 'duration': 5.428}, {'end': 893.912, 'text': 'or if it came from another distribution of heights.', 'start': 891.19, 'duration': 2.722}, {'end': 900.736, 'text': 'For example, this green distribution has an average value of 142.', 
'start': 894.412, 'duration': 6.324}, {'end': 911.882, 'text': 'So the question is is this measurement 142 centimeters so far away from the mean of the blue distribution that we can reject the idea that it came from it?', 'start': 900.736, 'duration': 11.146}, {'end': 920.879, 'text': 'If so, then that would suggest that another distribution like this green one might do a better job explaining the data.', 'start': 913.234, 'duration': 7.645}, {'end': 924.681, 'text': 'The p-value for the hypothesis.', 'start': 922.26, 'duration': 2.421}, {'end': 931.406, 'text': 'this measurement comes from the blue distribution starts with the 2.5% of the area for people less than or equal to 142 centimeters.', 'start': 924.681, 'duration': 6.725}, {'end': 944.86, 'text': 'When we are working with a distribution, we are interested in adding more extreme values to the p-value rather than rarer values.', 'start': 936.916, 'duration': 7.944}, {'end': 954.526, 'text': 'In this case, all heights further than 142 cm from the mean are considered more extreme than what we observed.', 'start': 946.121, 'duration': 8.405}], 'summary': 'To calculate p-values, we compare measurements to distribution averages, using 2.5% as threshold for rejecting the hypothesis.', 'duration': 89.989, 'max_score': 864.537, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E864537.jpg'}, {'end': 954.526, 'src': 'embed', 'start': 924.681, 'weight': 2, 'content': [{'end': 931.406, 'text': 'this measurement comes from the blue distribution starts with the 2.5% of the area for people less than or equal to 142 centimeters.', 'start': 924.681, 'duration': 6.725}, {'end': 944.86, 'text': 'When we are working with a distribution, we are interested in adding more extreme values to the p-value rather than rarer values.', 'start': 936.916, 'duration': 7.944}, {'end': 954.526, 'text': 'In this case, all heights further than 142 cm from the mean are considered more 
extreme than what we observed.', 'start': 946.121, 'duration': 8.405}], 'summary': 'Blue distribution shows 2.5% below 142cm; focuses on extreme p-values.', 'duration': 29.845, 'max_score': 924.681, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E924681.jpg'}, {'end': 1017.829, 'src': 'embed', 'start': 991.912, 'weight': 3, 'content': [{'end': 1003.519, 'text': 'And since the cutoff for significance is usually 0.05, we would say, hmm, maybe it could come from this distribution, maybe not.', 'start': 991.912, 'duration': 11.607}, {'end': 1007.602, 'text': "It's hard to tell since the p-value is right on the borderline.", 'start': 1004.04, 'duration': 3.562}, {'end': 1014.086, 'text': 'So maybe they come from this distribution, or maybe they come from this distribution.', 'start': 1009.043, 'duration': 5.043}, {'end': 1016.207, 'text': 'The data are inconclusive.', 'start': 1014.506, 'duration': 1.701}, {'end': 1017.829, 'text': 'Wah, wah.', 'start': 1016.848, 'duration': 0.981}], 'summary': 'Data analysis is inconclusive with p-value at 0.05 significance cutoff.', 'duration': 25.917, 'max_score': 991.912, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E991912.jpg'}], 'start': 752.143, 'title': "Brazilian women's height distribution", 'summary': 'Discusses the probability distribution of heights for brazilian women aged 15-49, indicating a 95% probability of heights being between 142 and 169 centimeters and examines hypothesis testing with a p-value of 0.05.', 'chapters': [{'end': 900.736, 'start': 752.143, 'title': "Probability distribution of brazilian women's height", 'summary': 'Discusses the use of statistical distribution to calculate probabilities and p-values for the height of brazilian women between 15 and 49 years old, with the red area under the curve indicating specific probabilities, such as a 95% probability of heights being between 142 and 
169 centimeters.', 'duration': 148.593, 'highlights': ["The red area under the curve indicates a 95% probability that Brazilian women's height will be between 142 and 169 centimeters, with 2.5% of the total area indicating heights greater than 169 and another 2.5% indicating heights less than 142.", 'The use of statistical distribution allows for the calculation of p-values, by adding up the percentages of area under the curve, indicating the probability of a measured height coming from a specific distribution.']}, {'end': 1017.829, 'start': 900.736, 'title': 'Hypothesis testing and p-value calculation', 'summary': 'Examines the measurement of 142 centimeters in relation to the blue distribution, calculating a p-value of 0.05 and concluding that the data are inconclusive.', 'duration': 117.093, 'highlights': ['By calculating the p-value for a measurement of 142 centimeters, it is determined to be 0.05, indicating inconclusive results.', 'The process involves considering extreme values and calculating the area under the distribution curve, leading to the determination of the p-value.', 'The significance cutoff of 0.05 is mentioned, signifying the borderline nature of the p-value and the inconclusive nature of the data.']}], 'duration': 265.686, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E752143.jpg', 'highlights': ["The red area under the curve indicates a 95% probability that Brazilian women's height will be between 142 and 169 centimeters, with 2.5% of the total area indicating heights greater than 169 and another 2.5% indicating heights less than 142.", 'The use of statistical distribution allows for the calculation of p-values, by adding up the percentages of area under the curve, indicating the probability of a measured height coming from a specific distribution.', 'The process involves considering extreme values and calculating the area under the distribution curve, leading to the determination of 
the p-value.', 'By calculating the p-value for a measurement of 142 centimeters, it is determined to be 0.05, indicating inconclusive results.', 'The significance cutoff of 0.05 is mentioned, signifying the borderline nature of the p-value and the inconclusive nature of the data.']}, {'end': 1208.358, 'segs': [{'end': 1117.13, 'src': 'heatmap', 'start': 1088.324, 'weight': 0, 'content': [{'end': 1094.027, 'text': 'so far away from the mean of the blue distribution that we can reject the idea that it came from it?', 'start': 1088.324, 'duration': 5.703}, {'end': 1101.993, 'text': 'If the p-value is small, then that suggests that some other distribution would do a better job explaining the data.', 'start': 1095.508, 'duration': 6.485}, {'end': 1112.11, 'text': 'Note, the probability of someone being between 155.4 and 156 centimeters is only 0.04.', 'start': 1103.348, 'duration': 8.762}, {'end': 1117.13, 'text': 'The red area is pretty small, barely a line.', 'start': 1112.11, 'duration': 5.02}], 'summary': 'P-value suggests better fit from another distribution. 
The probability of someone being 155.4-156cm is only 0.04.', 'duration': 55.509, 'max_score': 1088.324, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E1088324.jpg'}, {'end': 1179.335, 'src': 'embed', 'start': 1148.594, 'weight': 3, 'content': [{'end': 1154.117, 'text': 'And because 48% of the area under the curve is for heights less than 155.4, we add 0.48 to the p-value.', 'start': 1148.594, 'duration': 5.523}, {'end': 1165.465, 'text': 'On the right side, all of the heights greater than 156 are further from the mean.', 'start': 1159.901, 'duration': 5.564}, {'end': 1168.567, 'text': 'Thus, they are all more extreme.', 'start': 1166.045, 'duration': 2.522}, {'end': 1179.335, 'text': 'And because 48% of the area under the curve is for heights greater than 156, we add 0.48 to the p-value.', 'start': 1170.248, 'duration': 9.087}], 'summary': '48% of the area under the curve is for heights less than 155.4, and 48% is for heights greater than 156, so each side adds 0.48 to the p-value.', 'duration': 30.741, 'max_score': 1148.594, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E1148594.jpg'}], 'start': 1019.326, 'title': 'Statistical analysis of heights and calculating p-value for distribution', 'summary': 'Presents a statistical analysis of height measurements, with a measurement of 141cm yielding a p-value of 0.016, and a measurement between 155.4 and 156cm centered around the average height. 
It also explains the process of calculating the p-value for a distribution using the example of measuring heights, concluding that the data does not suggest another distribution would do a better job.', 'chapters': [{'end': 1088.324, 'start': 1019.326, 'title': 'Statistical analysis of heights', 'summary': 'Presents a statistical analysis of height measurements, showing that a measurement of 141cm yields a p-value of 0.016, leading to the rejection of the hypothesis that it is normal, while a measurement between 155.4 and 156cm is centered around the average height.', 'duration': 68.998, 'highlights': ["A measurement of 141cm yields a p-value of 0.016, allowing for the rejection of the hypothesis that it is normal, indicating that it's special to measure someone that short.", 'A measurement between 155.4 and 156cm is close to the average height, raising the question of whether it is far enough from the mean to reject the blue distribution.']}, {'end': 1208.358, 'start': 1088.324, 'title': 'Calculating p-value for distribution', 'summary': 'Explains the process of calculating the p-value for a distribution, where a small p-value suggests a better fit for another distribution. 
It illustrates the calculation using the example of measuring heights and concludes that the data do not suggest another distribution would explain them better.', 'duration': 120.034, 'highlights': ['A small p-value would suggest that another distribution does a better job explaining the data; here, the probability of someone being between 155.4 and 156 centimeters is 0.04.', 'The probability of someone being between 155.4 and 156 centimeters is only 0.04, indicating the unlikeliness of such measurements in the given distribution.', '48% of the area under the curve is for heights less than 155.4, adding 0.48 to the p-value, indicating the extent of extremeness on the left side of the distribution.', 'The p-value equals 1, suggesting that given this distribution of heights, it is not unusual to measure someone close to the average, despite the small probability, and does not suggest another distribution would do a better job explaining the data.']}], 'duration': 189.032, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E1019326.jpg', 'highlights': ['A measurement of 141cm yields a p-value of 0.016, indicating its uniqueness in the distribution.', 'The probability of someone being between 155.4 and 156cm is only 0.04, indicating the unlikeliness of such measurements.', 'A small p-value suggests that another distribution would do a better job explaining the data.', '48% of the area under the curve is for heights less than 155.4, adding 0.48 to the p-value, indicating the extent of extremeness on the left side of the distribution.']}, {'end': 1514.641, 'segs': [{'end': 1236.9, 'src': 'embed', 'start': 1209.619, 'weight': 1, 'content': [{'end': 1215.883, 'text': "Bam! 
So far, we've only talked about two-sided p-values.", 'start': 1209.619, 'duration': 6.264}, {'end': 1222.972, 'text': "Now I'll give you an example of a one-sided p-value and tell you why it has the potential to be dangerous.", 'start': 1216.969, 'duration': 6.003}, {'end': 1228.535, 'text': 'Imagine we measured how long it took a bunch of people to recover from an illness.', 'start': 1224.173, 'duration': 4.362}, {'end': 1236.9, 'text': 'Now imagine we created a new drug, Superdrug, and wanted to see if it helped people recover in fewer days.', 'start': 1229.696, 'duration': 7.204}], 'summary': 'Introduction to one-sided p-value and its potential dangers in assessing the effectiveness of a new drug.', 'duration': 27.281, 'max_score': 1209.619, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E1209619.jpg'}, {'end': 1291.338, 'src': 'embed', 'start': 1265.603, 'weight': 0, 'content': [{'end': 1278.05, 'text': 'And since 0.03 is less than 0.05, the two-sided p-value tells us that, given this distribution of recovery times, Superdrug did something unusual.', 'start': 1265.603, 'duration': 12.447}, {'end': 1284.074, 'text': 'And that suggests that some other distribution does a better job explaining the data.', 'start': 1279.351, 'duration': 4.723}, {'end': 1291.338, 'text': 'For a one-sided p-value, the first thing we do is decide which direction we want to see change in.', 'start': 1285.535, 'duration': 5.803}], 'summary': "Superdrug's recovery times are statistically unusual with p-value of 0.03.", 'duration': 25.735, 'max_score': 1265.603, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E1265603.jpg'}, {'end': 1389.817, 'src': 'embed', 'start': 1359.092, 'weight': 3, 'content': [{'end': 1368.314, 'text': 'Just like before, the two-sided p-value would be the sum of this area under the curve, 0.016, plus this area under the curve, 0.016.', 'start': 
1359.092, 'duration': 9.222}, {'end': 1377.173, 'text': 'And the total is 0.03.', 'start': 1368.314, 'duration': 8.859}, {'end': 1386.056, 'text': 'In other words, regardless of whether Superdrug is super and makes things better or if it is not so super and makes things worse,', 'start': 1377.173, 'duration': 8.883}, {'end': 1389.817, 'text': 'a two-sided p-value will detect something unusual happened.', 'start': 1386.056, 'duration': 3.761}], 'summary': 'The two-sided p-value is 0.03, indicating that something unusual happened.', 'duration': 30.725, 'max_score': 1359.092, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E1359092.jpg'}], 'start': 1209.619, 'title': 'P-values and their significance', 'summary': 'Covers the concept of one-sided and two-sided p-values, highlighting their significance in detecting unusual events. It explains the caution in using one-sided p-values and mentions the three components of a p-value, with a call to action for support.', 'chapters': [{'end': 1317.442, 'start': 1209.619, 'title': 'One-sided p-value example', 'summary': 'Explains the one-sided p-value using an example of measuring the effectiveness of a new drug, Superdrug, where a two-sided p-value of 0.03 suggests that the drug did something unusual, while a one-sided p-value only tests for change in a chosen direction, here shorter recovery times.', 'duration': 107.823, 'highlights': ['A two-sided p-value of 0.03 suggests that Superdrug did something unusual, indicating that some other distribution does a better job explaining the data.', 'The concept of one-sided p-value is demonstrated by considering the direction in which change is desired, in this case, to see if recovery times are shorter with Superdrug.', 'The example illustrates the potential danger of one-sided p-values and the importance of considering the specific direction of change when interpreting the results.']}, {'end': 1514.641, 'start': 1318.92, 
'title': 'Understanding p-values and their significance', 'summary': 'Explains the concept of one-sided and two-sided p-values, their significance in detecting unusual events, and the caution in using one-sided p-values, with a mention of the three components of a p-value and a call to action for support.', 'duration': 195.721, 'highlights': ['The chapter explains the concept of one-sided and two-sided p-values, their significance in detecting unusual events, and the caution in using one-sided p-values, with a mention of the three components of a p-value and a call to action for support.', 'A two-sided p-value would be the sum of the area under the curve in both directions, 0.016 and 0.016, totaling 0.03, detecting unusual events regardless of the direction of change.', 'A one-sided p-value would only use the area in the direction of change, resulting in an area of 0.98 when looking for shorter recovery times, failing to detect an unusual event in this case.', 'The explanation emphasizes the caution in using one-sided p-values, recommending their avoidance or usage only by experts due to their tricky nature and potential impact in detecting unusual events.', 'The chapter concludes with a call to action for support through subscriptions, contributions to the Patreon campaign, and the purchase of original songs, t-shirts, or hoodies.']}], 'duration': 305.022, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/JQc3yx0-Q9E/pics/JQc3yx0-Q9E1209619.jpg', 'highlights': ['A two-sided p-value of 0.03 suggests that Superdrug did something unusual, indicating that some other distribution does a better job explaining the data.', 'The chapter explains the concept of one-sided and two-sided p-values, their significance in detecting unusual events, and the caution in using one-sided p-values, with a mention of the three components of a p-value and a call to action for support.', 'The example illustrates the potential danger of one-sided p-values and the 
importance of considering the specific direction of change when interpreting the results.', 'A two-sided p-value would be the sum of the area under the curve in both directions, 0.016 and 0.016, totaling 0.03, detecting unusual events regardless of the direction of change.', 'The explanation emphasizes the caution in using one-sided p-values, recommending their avoidance or usage only by experts due to their tricky nature and potential impact in detecting unusual events.']}], 'highlights': ['The chapter focuses on calculating two-sided p-values, which are the most common type of p-values.', 'A small p-value indicates the rejection of the null hypothesis, suggesting that the coin is indeed special.', "The red area under the curve indicates a 95% probability that Brazilian women's height will be between 142 and 169 centimeters, with 2.5% of the total area indicating heights greater than 169 and another 2.5% indicating heights less than 142.", 'A measurement of 141cm yields a p-value of 0.016, indicating its uniqueness in the distribution.', 'A two-sided p-value of 0.03 suggests that Superdrug did something unusual, indicating that some other distribution does a better job explaining the data.']}