title
Hypothesis test comparing population proportions | Probability and Statistics | Khan Academy

description
Courses on Khan Academy are always 100% free. Start practicing—and saving your progress—now: https://www.khanacademy.org/math/statistics-probability/significance-tests-confidence-intervals-two-samples/comparing-two-proportions/v/hypothesis-test-comparing-population-proportions Hypothesis Test Comparing Population Proportions Watch the next lesson: https://www.khanacademy.org/math/probability/statistics-inferential/chi-square/v/chi-square-distribution-introduction?utm_source=YT&utm_medium=Desc&utm_campaign=ProbabilityandStatistics Missed the previous lesson? https://www.khanacademy.org/math/probability/statistics-inferential/hypothesis-testing-two-samples/v/comparing-population-proportions-2?utm_source=YT&utm_medium=Desc&utm_campaign=ProbabilityandStatistics Probability and statistics on Khan Academy: We dare you to go through a day in which you never consider or use probability. Did you check the weather forecast? Busted! Did you decide to go through the drive through lane vs walk in? Busted again! We are constantly creating hypotheses, making predictions, testing, and analyzing. Our lives are full of probabilities! Statistics is related to probability because much of the data we use when determining probable outcomes comes from our understanding of statistics. In these tutorials, we will cover a range of topics, some which include: independent events, dependent probability, combinatorics, hypothesis testing, descriptive statistics, random variables, probability distributions, regression, and inferential statistics. So buckle up and hop on for a wild ride. We bet you're going to be challenged AND love it! About Khan Academy: Khan Academy offers practice exercises, instructional videos, and a personalized learning dashboard that empower learners to study at their own pace in and outside of the classroom. We tackle math, science, computer programming, history, art history, economics, and more. 

detail
{'title': 'Hypothesis test comparing population proportions | Probability and Statistics | Khan Academy', 'heatmap': [{'end': 946.714, 'start': 875.288, 'weight': 0.783}], 'summary': 'Walks through a two-tailed hypothesis test, at a 5% significance level, of whether the proportions of men and women likely to vote for a candidate differ. The observed difference in sample proportions of 0.051 gives a z-score of about 2.35, beyond the critical value of 1.96, so the null hypothesis of no difference is rejected.', 'chapters': [{'end': 242.406, 'segs': [{'end': 30.982, 'src': 'embed', 'start': 0.583, 'weight': 0, 'content': [{'end': 1.885, 'text': 'In the last couple of videos,', 'start': 0.583, 'duration': 1.302}, {'end': 9.553, 'text': 'we were trying to figure out whether there was a meaningful difference between the proportion of men likely to vote for a candidate and the proportion of women.', 'start': 1.885, 'duration': 7.668}, {'end': 14.827, 'text': 'And in the last video we actually estimated that using a 95% confidence interval,', 'start': 9.963, 'duration': 4.864}, {'end': 21.673, 'text': 'a 95% confidence interval for the difference in the proportion of men and the difference in the proportion of women.', 'start': 14.827, 'duration': 6.846}, {'end': 28.619, 'text': 'What I want to do in this video is just to ask the question more directly or just do a straight up hypothesis test to see is there a difference?', 'start': 21.994, 'duration': 6.625}, {'end': 30.982, 'text': "So we're going to make our null hypothesis.", 'start': 29.18, 'duration': 1.802}], 'summary': 'Analyzing the difference in voting preferences by gender, conducting hypothesis test.', 'duration': 30.399, 'max_score': 0.583, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/dvSa_tx04hw/pics/dvSa_tx04hw583.jpg'}, {'end': 128.615, 'src': 'embed', 'start': 101.323, 'weight': 1, 'content': [{'end': 109.184, 'text': "We're going to do the hypothesis test with a significance level of 5%.", 'start': 101.323, 'duration': 7.861}, {'end': 114.745, 
'text': "And all that means, and we've done this multiple times, is we are going to assume the null hypothesis.", 'start': 109.184, 'duration': 5.561}, {'end': 120.57, 'text': 'We are going to assume the null hypothesis.', 'start': 115.606, 'duration': 4.964}, {'end': 123.812, 'text': 'And then, assuming the null hypothesis is true,', 'start': 120.89, 'duration': 2.922}, {'end': 128.615, 'text': "we're going to then figure out the probability of getting our actual difference of our sample proportions.", 'start': 123.812, 'duration': 4.803}], 'summary': 'Conducting hypothesis test at 5% significance level to determine probability of actual difference in sample proportions.', 'duration': 27.292, 'max_score': 101.323, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/dvSa_tx04hw/pics/dvSa_tx04hw101323.jpg'}, {'end': 242.406, 'src': 'embed', 'start': 218.516, 'weight': 2, 'content': [{'end': 226.198, 'text': 'so this is our sample proportion of men who are going to vote for, at least in our poll, said they would vote for the candidate.', 'start': 218.516, 'duration': 7.682}, {'end': 229.299, 'text': 'This is the proportion of women who said they would vote for the candidate.', 'start': 226.578, 'duration': 2.721}, {'end': 232.12, 'text': 'The difference between the two was 0.051.', 'start': 229.679, 'duration': 2.441}, {'end': 242.406, 'text': "So what we can do is figure out what's the probability? 
Assuming that the true proportions are equal,", 'start': 232.12, 'duration': 10.286}], 'summary': 'The sample proportions of men and women who said they would vote for the candidate differ by 0.051.', 'duration': 23.89, 'max_score': 218.516, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/dvSa_tx04hw/pics/dvSa_tx04hw218516.jpg'}], 'start': 0.583, 'title': 'Gender voting proportions analysis', 'summary': 'Sets up a hypothesis test of whether the proportions of men and women likely to vote for a candidate differ, following the 95% confidence interval estimated in the previous video. The test uses a 5% significance level and asks how probable the observed difference of 0.051 would be if the true proportions were equal.', 'chapters': [{'end': 101.103, 'start': 0.583, 'title': 'Hypothesis testing gender voting proportions', 'summary': 'The chapter explores the hypothesis testing of the difference in proportions of men and women likely to vote for a candidate, estimating a 95% confidence interval and setting null and alternative hypotheses for the difference.', 'duration': 100.52, 'highlights': ['The chapter estimates a 95% confidence interval for the difference in the proportion of men and women likely to vote for a candidate.', 'Null hypothesis states no difference between the proportion of men and women likely to vote for the candidate.', 'Alternative hypothesis suggests a difference in the proportions of men and women likely to vote for the candidate.']}, {'end': 192.931, 'start': 101.323, 'title': 'Hypothesis testing at 5% significance level', 'summary': 'The chapter discusses conducting a hypothesis test at a 5% significance level, determining the probability of obtaining the actual difference between sample proportions, and rejecting the null hypothesis if the probability is less than 5%.', 'duration': 91.608, 'highlights': ['The process involves assuming the null hypothesis, calculating the probability of the difference between sample proportions, and rejecting the null hypothesis if the probability is less than 5%.', 'The significance level for the hypothesis test is set at 5%.', 
'The chapter emphasizes the importance of understanding the sampling distribution of the statistic when assuming the null hypothesis.']}, {'end': 242.406, 'start': 192.931, 'title': 'Gender proportion analysis', 'summary': 'The chapter discusses the observed difference of 0.051 between the sample proportions of men and women who said they would vote for the candidate, and asks how probable such a difference would be if the true proportions were equal.', 'duration': 49.475, 'highlights': ['The difference between the sample proportions of men and women who said they would vote for the candidate was 0.051.', 'The chapter discusses analyzing the difference in proportions of men and women who indicated voting for a candidate.', 'The chapter asks how probable the observed difference would be if the true proportions were equal.']}], 'duration': 241.823, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/dvSa_tx04hw/pics/dvSa_tx04hw583.jpg', 'highlights': ['The chapter estimates a 95% confidence interval for the difference in the proportion of men and women likely to vote for a candidate.', 'The process involves assuming the null hypothesis, calculating the probability of the difference between sample proportions, and rejecting the null hypothesis if the probability is less than 5%.', 'The difference between the sample proportions of men and women who said they would vote for the candidate was 0.051.']}, {'end': 643.559, 'segs': [{'end': 293.716, 'src': 'embed', 'start': 242.406, 'weight': 0, 'content': [{'end': 251.691, 'text': "that the mean of the sampling distribution of this statistic is actually 0, what's the probability that we get a difference of 0.051?", 'start': 242.406, 'duration': 9.285}, {'end': 257.795, 'text': "So what's the likelihood that we get something that extreme?", 'start': 251.691, 'duration': 6.104}, {'end': 267.335, 'text': "And what we're going to do here is just figure out a z-score for this, essentially figure out how many standard deviations away from the mean this is.", 'start': 258.567, 
'duration': 8.768}, {'end': 268.636, 'text': 'That would be our z-score.', 'start': 267.415, 'duration': 1.221}, {'end': 277.705, 'text': 'And then figure out is the likelihood of getting a standard deviation or that extreme of a result or that many standard deviations away from the mean?', 'start': 268.997, 'duration': 8.708}, {'end': 279.747, 'text': 'is that likelihood more or less than 5%?', 'start': 277.705, 'duration': 2.042}, {'end': 284.411, 'text': "If it is less than 5%, we're going to reject the null hypothesis.", 'start': 279.747, 'duration': 4.664}, {'end': 288.029, 'text': "So let's first of all figure out our z-score.", 'start': 284.906, 'duration': 3.123}, {'end': 293.716, 'text': "So we're assuming the null hypothesis, p1 is equal to p2.", 'start': 288.63, 'duration': 5.086}], 'summary': 'Calculating z-score and likelihood of extreme difference to test null hypothesis', 'duration': 51.31, 'max_score': 242.406, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/dvSa_tx04hw/pics/dvSa_tx04hw242406.jpg'}, {'end': 413.038, 'src': 'embed', 'start': 379.953, 'weight': 3, 'content': [{'end': 390.481, 'text': 'We know that the standard deviation, We know that this guy right over here, the standard deviation of our sampling, distribution of this statistic,', 'start': 379.953, 'duration': 10.528}, {'end': 394.662, 'text': 'of the sample mean of p1 minus the sample proportion or sample mean of p2,', 'start': 390.481, 'duration': 4.181}, {'end': 411.557, 'text': 'is equal to the square root of p1 times 1 minus p1 over 1,000 plus p2 times 1 minus p2 over 1,000..', 'start': 394.662, 'duration': 16.895}, {'end': 413.038, 'text': "We've seen this in several videos.", 'start': 411.557, 'duration': 1.481}], 'summary': 'Standard deviation of sample mean difference is calculated using a formula involving p1 and p2 over 1,000.', 'duration': 33.085, 'max_score': 379.953, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/dvSa_tx04hw/pics/dvSa_tx04hw379953.jpg'}, {'end': 610.828, 'src': 'embed', 'start': 580.757, 'weight': 2, 'content': [{'end': 584.799, 'text': 'gives us 0.6165.', 'start': 580.757, 'duration': 4.042}, {'end': 596.684, 'text': 'And this is our best estimate of this consistent population proportion.', 'start': 584.799, 'duration': 11.885}, {'end': 600.725, 'text': 'that is true of both men and women, because we are assuming that they are no different.', 'start': 596.684, 'duration': 4.041}, {'end': 603.787, 'text': 'So we can substitute this value in for p.', 'start': 601.066, 'duration': 2.721}, {'end': 610.828, 'text': 'to estimate the standard deviation of the sampling distribution of this statistic right over here,', 'start': 604.767, 'duration': 6.061}], 'summary': 'Best estimate for population proportion is 0.6165.', 'duration': 30.071, 'max_score': 580.757, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/dvSa_tx04hw/pics/dvSa_tx04hw580757.jpg'}], 'start': 242.406, 'title': 'Statistics and hypothesis testing', 'summary': 'Covers the use of z-scores in hypothesis testing, with a specific example of a difference of 0.051 and the comparison to the mean of 0. 
It also discusses the standard deviation of the sampling distribution of the proportion difference between two groups, resulting in a value of 0.6165 for the consistent population proportion among men and women.', 'chapters': [{'end': 319.526, 'start': 242.406, 'title': 'Hypothesis testing and z-scores', 'summary': 'The chapter explains the process of using z-scores to determine the likelihood of obtaining an extreme result and deciding whether to reject the null hypothesis based on a calculated z-score, with a specific example of a difference of 0.051 and the comparison to the mean of 0.', 'duration': 77.12, 'highlights': ['The process involves calculating a z-score to determine how many standard deviations away from the mean a result is, and then comparing the likelihood of obtaining that extreme result to a 5% threshold for rejecting the null hypothesis.', 'The specific example of a difference of 0.051 between two groups is used to illustrate the calculation of the z-score and its comparison to the assumed mean of 0.']}, {'end': 643.559, 'start': 319.526, 'title': 'Standard deviation of proportions', 'summary': 'The chapter discusses the standard deviation of the sampling distribution of the proportion difference between two groups, with a focus on estimating the true population proportion based on the assumption of no difference between the groups, resulting in a value of 0.6165 for the consistent population proportion among men and women.', 'duration': 324.033, 'highlights': ['The consistent population proportion among men and women is estimated to be 0.6165 based on the assumption of no difference between the groups.', 'The standard deviation of the sampling distribution of the proportion difference is calculated using the estimated population proportion.', 
'The formula for the standard deviation involves the square root of 2 times the estimated population proportion times 1 minus the estimated population proportion, divided by 1,000.']}], 'duration': 401.153, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/dvSa_tx04hw/pics/dvSa_tx04hw242406.jpg', 'highlights': ['The process involves calculating a z-score to determine how many standard deviations away from the mean a result is, and then comparing the likelihood of obtaining that extreme result to a 5% threshold for rejecting the null hypothesis.', 'The specific example of a difference of 0.051 between two groups is used to illustrate the calculation of the z-score and its comparison to the assumed mean of 0.', 'The consistent population proportion among men and women is estimated to be 0.6165 based on the assumption of no difference between the groups.', 'The standard deviation of the sampling distribution of the proportion difference is calculated using the estimated population proportion.', 'The formula for the standard deviation involves the square root of 2 times the estimated population proportion times 1 minus the estimated population proportion, divided by 1,000.']}, {'end': 971.808, 'segs': [{'end': 713.041, 'src': 'embed', 'start': 643.559, 'weight': 3, 'content': [{'end': 646.601, 'text': "We're taking the square root of the whole thing.", 'start': 643.559, 'duration': 3.042}, {'end': 661.084, 'text': 'And so we get a standard deviation of 0.0217.', 'start': 646.661, 'duration': 14.423}, {'end': 663.05, 'text': 'Let me write this over here.', 'start': 661.084, 'duration': 1.966}, {'end': 664.895, 'text': 'So this 
thing right over here.', 'start': 663.11, 'duration': 1.785}, {'end': 672.02, 'text': 'This right over here is 0.0217.', 'start': 670.259, 'duration': 1.761}, {'end': 677.703, 'text': 'So, if we want to figure out our z-score,', 'start': 672.02, 'duration': 5.683}, {'end': 686.888, 'text': 'if we want to figure out how many standard deviations the actual sample that we got of this statistic right over here,', 'start': 677.703, 'duration': 9.185}, {'end': 699.195, 'text': "if we want to figure out how many standard deviations that is away from our assumed mean that the assumed mean is that there's no difference then we just divide 0.051 by this standard deviation right over here.", 'start': 686.888, 'duration': 12.307}, {'end': 701.151, 'text': "So let's do that.", 'start': 700.39, 'duration': 0.761}, {'end': 707.136, 'text': 'So we have 0.051 divided by this standard deviation.', 'start': 701.731, 'duration': 5.405}, {'end': 708.317, 'text': 'That was our answer up here.', 'start': 707.196, 'duration': 1.121}, {'end': 710.699, 'text': "So I'll just do divided by our answer.", 'start': 708.337, 'duration': 2.362}, {'end': 713.041, 'text': 'And we are 2.35 standard deviations away.', 'start': 711.36, 'duration': 1.681}], 'summary': 'With a standard deviation of 0.0217, the observed difference of 0.051 is 2.35 standard deviations away from the assumed mean of 0.', 'duration': 69.482, 'max_score': 643.559, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/dvSa_tx04hw/pics/dvSa_tx04hw643559.jpg'}, {'end': 821.88, 'src': 'embed', 'start': 786.538, 'weight': 2, 'content': [{'end': 792.16, 'text': 'We want to have a significance level of 5%, which means the entire area of our rejection,', 'start': 786.538, 'duration': 5.622}, {'end': 802.899, 'text': 'The entire area in which we would reject the null hypothesis, is 5%.', 'start': 797.776, 'duration': 5.123}, {'end': 804.219, 'text': 'This is a two-tailed test.', 'start': 802.899, 'duration': 1.32}, {'end': 811.983, 'text': 'An extreme 
event either far above the mean or far below the mean will allow us to reject the hypothesis.', 'start': 804.799, 'duration': 7.184}, {'end': 814.264, 'text': 'So we care about area over here.', 'start': 812.443, 'duration': 1.821}, {'end': 815.925, 'text': 'And over here, we would put 2.5%.', 'start': 814.744, 'duration': 1.181}, {'end': 821.88, 'text': 'And over here, we would have 2.5%.', 'start': 815.925, 'duration': 5.955}], 'summary': 'Conducting a two-tailed test with a significance level of 5% to reject the null hypothesis based on extreme events above or below the mean.', 'duration': 35.342, 'max_score': 786.538, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/dvSa_tx04hw/pics/dvSa_tx04hw786538.jpg'}, {'end': 946.714, 'src': 'heatmap', 'start': 870.407, 'weight': 0, 'content': [{'end': 871.587, 'text': 'I even wrote it over there.', 'start': 870.407, 'duration': 1.18}, {'end': 872.947, 'text': 'So this critical z-value is 1.96.', 'start': 871.607, 'duration': 1.34}, {'end': 875.268, 'text': 'So what that tells you is there is a 5% chance.', 'start': 872.947, 'duration': 2.321}, {'end': 914.455, 'text': 'So this tells us that there is a 5% chance of sampling a z statistic greater than 1.96, assuming the null hypothesis is correct.', 'start': 875.288, 'duration': 39.167}, {'end': 920.26, 'text': 'Now, we just figured out that we just sampled a z-statistic of 2.34, assuming the null hypothesis is correct.', 'start': 914.915, 'duration': 5.345}, {'end': 935.014, 'text': 'So the probability of sampling this, given the null hypothesis is correct, is going to be less than 5%.', 'start': 920.44, 'duration': 14.574}, {'end': 938.722, 'text': 'It is more extreme than this critical z-value.', 'start': 935.014, 'duration': 3.708}, {'end': 940.285, 'text': "It's going to be out here someplace.", 'start': 938.742, 'duration': 1.543}, {'end': 943.792, 'text': 'And because of that, we can reject the null hypothesis.', 'start': 940.706, 
'duration': 3.086}, {'end': 946.714, 'text': 'And sorry for jumping around so much in this video.', 'start': 944.673, 'duration': 2.041}], 'summary': 'The sampled z-statistic of 2.34 is more extreme than the critical z-value of 1.96, so the chance of such a result under the null hypothesis is less than 5% and the null hypothesis is rejected.', 'duration': 49.853, 'max_score': 870.407, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/dvSa_tx04hw/pics/dvSa_tx04hw870407.jpg'}], 'start': 643.559, 'title': 'Z-score calculation and hypothesis testing', 'summary': "Covers the calculation of z-scores with a standard deviation of 0.0217, determining the sample's distance from the mean, and explains hypothesis testing, demonstrating the process of calculating z-scores, determining critical z-values, and using them to reject the null hypothesis with a 5% significance level, ultimately inferring a statistically significant difference between the proportion of men and women voters.", 'chapters': [{'end': 713.041, 'start': 643.559, 'title': 'Z-score calculation', 'summary': 'The chapter discusses the calculation of the z-score, with a standard deviation of 0.0217, and determines that the sample is 2.35 standard deviations away from the assumed mean.', 'duration': 69.482, 'highlights': ['Calculation of z-score with a standard deviation of 0.0217', 'Determination that the sample is 2.35 standard deviations away from the assumed mean']}, {'end': 971.808, 'start': 713.061, 'title': 'Hypothesis testing and z-scores', 'summary': 'The chapter explains the concept of z-scores and hypothesis testing, demonstrating the process of calculating z-scores, determining critical z-values, and using them to reject the null hypothesis with a 5% significance level, ultimately inferring a statistically significant difference between the proportion of men and women voters.', 'duration': 258.747, 'highlights': ['The critical z-value is 1.96: in this two-tailed test there is a 5% chance of sampling a z-statistic more extreme than 1.96 in either direction (2.5% in each tail), assuming the null hypothesis is correct. 
This provides the basis for rejecting the null hypothesis.', 'The z-score of 2.34 is more extreme than the critical z-value, so the probability of sampling it given the null hypothesis is less than 5%, which justifies rejecting the null hypothesis and suggests a statistically significant difference between the proportions of men and women voters.', 'The 5% significance level is the threshold used to reject the null hypothesis, indicating a statistically significant difference between the proportions of men and women voters.']}], 'duration': 328.249, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/dvSa_tx04hw/pics/dvSa_tx04hw643559.jpg', 'highlights': ['The z-score of 2.34 implies that the probability of sampling this given the null hypothesis is less than 5%, allowing the rejection of the null hypothesis.', 'The critical z-value is 1.96, indicating a 5% chance of sampling a z-statistic more extreme than 1.96 in either direction, assuming the null hypothesis is correct.', 'The significance level of 5% is used to reject the null hypothesis, demonstrating a statistically significant difference between the proportions of men and women voters.', 'Determination that the sample is 2.35 standard deviations away from the assumed mean.', 'Calculation of z-score with a standard deviation of 0.0217']}], 'highlights': ['The z-score of 2.34 implies that the probability of sampling this given the null hypothesis is less than 5%, allowing the rejection of the null hypothesis.', 'The significance level of 5% is used to reject the null hypothesis, demonstrating a statistically significant difference between 
the proportions of men and women voters.', 'The process involves calculating a z-score to determine how many standard deviations away from the mean a result is, and then comparing the likelihood of obtaining that extreme result to a 5% threshold for rejecting the null hypothesis.', 'The chapter estimates a 95% confidence interval for the difference in the proportion of men and women likely to vote for a candidate.', 'The difference between the sample proportions of men and women who said they would vote for the candidate was 0.051.', 'The specific example of a difference of 0.051 between two groups is used to illustrate the calculation of the z-score and its comparison to the assumed mean of 0.', 'The critical z-value is 1.96, indicating a 5% chance of sampling a z-statistic more extreme than 1.96 in either direction, assuming the null hypothesis is correct.', 'The consistent population proportion among men and women is estimated to be 0.6165 based on the assumption of no difference between the groups.', 'The process involves assuming the null hypothesis, calculating the probability of the difference between sample proportions, and rejecting the null hypothesis if the probability is less than 5%.', 'Determination that the sample is 2.35 standard deviations away from the assumed mean.']}
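
The test described in the transcript can be sketched in a few lines of Python. This is a minimal illustration, not code from the video. The function name `two_proportion_z_test` is hypothetical, and the counts 642 of 1,000 men and 591 of 1,000 women are an assumption: the record above states only the sample size of 1,000 per group, the difference of 0.051, and the pooled proportion of 0.6165, and these counts were chosen because they reproduce those figures.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alpha=0.05):
    """Two-tailed z-test of H0: p1 == p2, using a pooled proportion."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Under H0 both groups share one proportion, so pool the two samples.
    pooled = (x1 + x2) / (n1 + n2)
    # Standard deviation of the sampling distribution of p1_hat - p2_hat.
    std_dev = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    # The assumed mean of the difference is 0 under H0.
    z = (p1_hat - p2_hat) / std_dev
    std_normal = NormalDist()
    # Two-tailed test: alpha / 2 of the area goes in each tail.
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return {
        "difference": p1_hat - p2_hat,
        "pooled": pooled,
        "std_dev": std_dev,
        "z": z,
        "z_critical": z_critical,
        "p_value": p_value,
        "reject_null": abs(z) > z_critical,
    }


# Assumed counts (not stated in the record above), consistent with its figures.
result = two_proportion_z_test(642, 1000, 591, 1000)
for name, value in result.items():
    print(f"{name}: {value}")
```

With these assumed counts the z-score comes out near 2.35 against a critical value near 1.96, so the sketch reaches the same conclusion as the transcript and rejects the null hypothesis at the 5% level.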