title
Covariance and the regression line | Regression | Probability and Statistics | Khan Academy

description
Covariance, Variance and the Slope of the Regression Line. Lesson page: https://www.khanacademy.org/math/statistics-probability/describing-relationships-quantitative-data/more-on-regression/v/covariance-and-the-regression-line

detail
{'title': 'Covariance and the regression line | Regression | Probability and Statistics | Khan Academy', 'heatmap': [{'end': 45.901, 'start': 25.729, 'weight': 0.766}], 'summary': 'Covers the concept of covariance, its calculation, and significance in understanding relationships between variables and its relevance in regression analysis. it also discusses the calculation of expected value and covariance for random variables x and y, emphasizing their properties and providing visual examples to illustrate the concepts. furthermore, it explains the interconnection between covariance, regression, and regression line, providing a deeper understanding of their calculations and interpretations.', 'chapters': [{'end': 219.635, 'segs': [{'end': 45.901, 'src': 'heatmap', 'start': 0.646, 'weight': 1, 'content': [{'end': 9.016, 'text': 'What I want to do in this video is introduce you to the idea of the covariance between two random variables.', 'start': 0.646, 'duration': 8.37}, {'end': 16.545, 'text': "And it's defined as the expected value of the distance or, I guess,", 'start': 9.076, 'duration': 7.469}, {'end': 23.248, 'text': 'the product of the distances of each random variable from their mean or from their expected value.', 'start': 16.545, 'duration': 6.703}, {'end': 24.388, 'text': 'So let me just write that down.', 'start': 23.268, 'duration': 1.12}, {'end': 25.609, 'text': "So I'll have x first.", 'start': 24.548, 'duration': 1.061}, {'end': 27.73, 'text': "I'll do this in another color.", 'start': 25.729, 'duration': 2.001}, {'end': 34.154, 'text': "So it's the expected value of random variable x minus the expected value of x.", 'start': 28.211, 'duration': 5.943}, {'end': 45.901, 'text': 'You could view this as the population mean of x times, and then this is random variable y so times, the distance from y to its expected value,', 'start': 34.154, 'duration': 11.747}], 'summary': 'The video introduces the concept of covariance as the expected value of the 
product of the distances of two random variables from their means.', 'duration': 27.084, 'max_score': 0.646, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/ualmyZiPs9w/pics/ualmyZiPs9w646.jpg'}, {'end': 202.986, 'src': 'embed', 'start': 158.444, 'weight': 0, 'content': [{'end': 165.307, 'text': "And if we kept doing this, let's say for the entire population this happened, then it would make sense that they have a negative covariance.", 'start': 158.444, 'duration': 6.863}, {'end': 167.704, 'text': 'When one goes up, the other one goes down.', 'start': 165.663, 'duration': 2.041}, {'end': 169.685, 'text': 'When one goes down, the other one goes up.', 'start': 167.864, 'duration': 1.821}, {'end': 172.827, 'text': 'If they both go up together, they would have a positive covariance.', 'start': 170.105, 'duration': 2.722}, {'end': 179.01, 'text': 'Or if they both go down together, and the degree to which they do it together would tell you kind of the magnitude of the covariance.', 'start': 172.847, 'duration': 6.163}, {'end': 183.032, 'text': 'Hopefully that gives you a little bit of intuition about what the covariance is trying to tell us.', 'start': 179.05, 'duration': 3.982}, {'end': 193.541, 'text': "But the more important thing that I want to do in this video is to connect this definition of covariance to everything we've been doing with least squares regression.", 'start': 183.612, 'duration': 9.929}, {'end': 202.986, 'text': "And really it's just kind of a fun math thing to do to show you all of these connections and where the definition of covariance really becomes useful.", 'start': 193.561, 'duration': 9.425}], 'summary': 'Covariance measures how two variables move together: positive when they rise or fall together, negative when one rises as the other falls.', 'duration': 44.542, 'max_score': 158.444, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/ualmyZiPs9w/pics/ualmyZiPs9w158444.jpg'}], 'start': 0.646, 'title': 'Covariance and its significance', 'summary': 'Introduces the concept of covariance, emphasizing its measure of variation between two random variables and its connection to least squared regression. it provides an example to illustrate its calculation and interpretation, highlighting its significance in understanding relationships between variables and its relevance in regression analysis.', 'chapters': [{'end': 158.123, 'start': 0.646, 'title': 'Introduction to covariance', 'summary': 'Introduces the concept of covariance as a measure of the variation between two random variables, emphasizing how they vary together and providing an example to illustrate its calculation and interpretation.', 'duration': 157.477, 'highlights': ['Covariance is defined as the expected value of the product of the distances of each random variable from their mean, expressing how much the variables vary together.', 'The calculation of covariance involves multiplying the distance of each random variable from its expected value and taking the expected value of the resulting products, providing a measure of their joint variability.', 'An example is given to demonstrate the calculation and interpretation of covariance, emphasizing how the relationship between two random variables can be assessed using this measure.']}, {'end': 219.635, 'start': 158.444, 'title': 'Covariance and regression connections', 'summary': 'Explains the concept of covariance, its connection to least squared regression, and its significance in understanding relationships between variables, with a focus on how it impacts the entire population and its relevance in regression analysis.', 'duration': 61.191, 'highlights': ['The covariance measures the relationship between two variables, with a negative covariance indicating an inverse relationship, and a positive covariance indicating a direct 
relationship.', 'The degree to which the variables move together reflects the magnitude of the covariance, providing insights into their joint behavior.', 'The definition of covariance is connected to least squared regression, showcasing its significance in understanding relationships between variables and its relevance in regression analysis.']}], 'duration': 218.989, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/ualmyZiPs9w/pics/ualmyZiPs9w646.jpg', 'highlights': ['Covariance measures the relationship between two variables, with a negative covariance indicating an inverse relationship, and a positive covariance indicating a direct relationship.', 'The calculation of covariance involves multiplying the distance of each random variable from its expected value and taking the expected value of the resulting products, providing a measure of their joint variability.', 'The degree to which the variables move together reflects the magnitude of the covariance, providing insights into their joint behavior.', 'Covariance is defined as the expected value of the product of the distances of each random variable from their mean, expressing how much the variables vary together.', 'The definition of covariance is connected to least squared regression, showcasing its significance in understanding relationships between variables and its relevance in regression analysis.', 'An example is given to demonstrate the calculation and interpretation of covariance, emphasizing how the relationship between two random variables can be assessed using this measure.']}, {'end': 653.072, 'segs': [{'end': 253.858, 'src': 'embed', 'start': 220.215, 'weight': 0, 'content': [{'end': 231.399, 'text': "So this is going to be the same thing as the expected value of, and I'm just going to multiply these two binomials in here.", 'start': 220.215, 'duration': 11.184}, {'end': 236.801, 'text': 'So the expected value of our random variable x times our random variable y.', 
'start': 231.779, 'duration': 5.022}, {'end': 242.01, 'text': "Minus, well I'll just do the x first.", 'start': 240.149, 'duration': 1.861}, {'end': 246.093, 'text': 'So plus x times the negative expected value of y.', 'start': 242.13, 'duration': 3.963}, {'end': 253.858, 'text': "So I'll just say minus x times the expected value of y.", 'start': 246.093, 'duration': 7.765}], 'summary': 'Multiplying two binomials to find expected value', 'duration': 33.643, 'max_score': 220.215, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/ualmyZiPs9w/pics/ualmyZiPs9w220215.jpg'}, {'end': 332.663, 'src': 'embed', 'start': 279.578, 'weight': 1, 'content': [{'end': 285.101, 'text': "And so you're just going to have plus the expected value of x times the expected value of y.", 'start': 279.578, 'duration': 5.523}, {'end': 289.244, 'text': "And of course, it's the expected value of this entire thing.", 'start': 285.101, 'duration': 4.143}, {'end': 301.051, 'text': "Now, let's see if we can rewrite this.", 'start': 299.81, 'duration': 1.241}, {'end': 307.814, 'text': 'Well, the expected value of the sum of a bunch of random variables, or the sum and difference of a bunch of random variables,', 'start': 301.091, 'duration': 6.723}, {'end': 310.236, 'text': 'is just the sum or difference of their expected value.', 'start': 307.814, 'duration': 2.422}, {'end': 311.716, 'text': 'So this is going to be the same thing.', 'start': 310.276, 'duration': 1.44}, {'end': 317.32, 'text': 'And remember, expected value, in a lot of contexts, you can view it as just the arithmetic mean.', 'start': 312.477, 'duration': 4.843}, {'end': 322.822, 'text': 'Or in a continuous distribution, you could view it as a probability weighted sum or probability weighted integral.', 'start': 317.78, 'duration': 5.042}, {'end': 326.004, 'text': "Either way, we've seen it before, I think.", 'start': 322.903, 'duration': 3.101}, {'end': 327.325, 'text': "So let's rewrite this.", 'start': 
326.444, 'duration': 0.881}, {'end': 329.186, 'text': 'So this is equal to the expected value', 'start': 327.345, 'duration': 1.841}, {'end': 332.663, 'text': 'of the random variables x and y.', 'start': 329.921, 'duration': 2.742}], 'summary': 'The expected value of a sum or difference of random variables is the sum or difference of their expected values.', 'duration': 53.085, 'max_score': 279.578, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/ualmyZiPs9w/pics/ualmyZiPs9w279578.jpg'}, {'end': 418.627, 'src': 'embed', 'start': 389.631, 'weight': 3, 'content': [{'end': 393.093, 'text': 'Because the expected value of an expected value is the same thing as the expected value.', 'start': 389.631, 'duration': 3.462}, {'end': 395.134, 'text': 'Actually, let me write this over here just to remind ourselves.', 'start': 393.113, 'duration': 2.021}, {'end': 399.797, 'text': 'The expected value of the expected value of x is just going to be the expected value of x.', 'start': 395.594, 'duration': 4.203}, {'end': 407.598, 'text': 'Think of it this way.', 'start': 406.957, 'duration': 0.641}, {'end': 410.78, 'text': 'You could view this as the population mean for the random variable.', 'start': 408.198, 'duration': 2.582}, {'end': 414.423, 'text': "So that's just going to be a known, it's out there, it's in the universe.", 'start': 411.221, 'duration': 3.202}, {'end': 418.627, 'text': 'So the expected value of that is just going to be itself.', 'start': 414.463, 'duration': 4.164}], 'summary': 'The expected value of a random variable is the population mean, which is a known constant.', 'duration': 28.996, 'max_score': 389.631, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/ualmyZiPs9w/pics/ualmyZiPs9w389631.jpg'}, {'end': 639.462, 'src': 'embed', 'start': 597.215, 'weight': 4, 'content': [{'end': 599.736, 'text': 'We have that the covariance of these two random variables,', 'start': 597.215, 'duration': 2.521}, {'end': 602.517, 'text': 'x and y, is equal to the expected value 
of.', 'start': 599.736, 'duration': 2.781}, {'end': 611.28, 'text': "I'll switch back to my colors just because this is the final result: the expected value of the product of xy,", 'start': 602.517, 'duration': 8.763}, {'end': 618.261, 'text': 'minus the expected value of y times the expected value of x.', 'start': 611.28, 'duration': 6.981}, {'end': 639.462, 'text': 'You can calculate these expected values if you know everything about the probability,', 'start': 634.918, 'duration': 4.544}], 'summary': 'The covariance of x and y equals the expected value of xy minus the expected value of x times the expected value of y.', 'duration': 42.247, 'max_score': 597.215, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/ualmyZiPs9w/pics/ualmyZiPs9w597215.jpg'}], 'start': 220.215, 'title': 'Calculating expected value and covariance', 'summary': 'Covers the calculation of expected value for random variables x and y, demonstrating the process of multiplication, addition, and cancellation of terms. 
It also discusses the calculation of covariance using the expected values of random variables x and y, emphasizing the property that the expected value of an expected value is the same as the expected value, and provides a visual example to illustrate the concept of population mean.', 'chapters': [{'end': 332.663, 'start': 220.215, 'title': 'Expected value calculation', 'summary': 'The chapter expands the covariance definition for random variables x and y by multiplying the two binomials and then applying linearity of expectation to the resulting terms.', 'duration': 112.448, 'highlights': ['The covariance definition is expanded by multiplying the two binomials, applying the distributive property twice, and cancelling terms, which leaves the expected value of xy minus the expected value of x times the expected value of y.', 'The chapter explains that the expected value of the sum or difference of random variables is equivalent to the sum or difference of their expected values, and provides insights into the interpretation of expected value as an arithmetic mean or a probability-weighted sum in different contexts.']}, {'end': 653.072, 'start': 332.663, 'title': 'Covariance calculation', 'summary': 'Explains the calculation of covariance using the expected values of random variables x and y, and emphasizes the property that the expected value of an expected value is the same as the expected value. 
It also touches upon the concept of population mean and provides a visual example to make the explanation more relatable.', 'duration': 320.409, 'highlights': ['The chapter emphasizes the property that the expected value of an expected value is the same as the expected value, reinforcing this point with a relatable example of population mean and expected value.', 'It explains the calculation of covariance using the expected values of random variables x and y and provides a visual example to simplify the concept for better understanding.', 'It mentions the method to estimate the expected values of random variables when only a sample is available, highlighting the practical aspect of the concept.']}], 'duration': 432.857, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/ualmyZiPs9w/pics/ualmyZiPs9w220215.jpg', 'highlights': ['The covariance definition is expanded by multiplying the two binomials, applying the distributive property twice, and cancelling terms, which leaves the expected value of xy minus the expected value of x times the expected value of y.', 'The chapter explains that the expected value of the sum or difference of random variables is equivalent to the sum or difference of their expected values, and provides insights into the interpretation of expected value as an arithmetic mean or a probability-weighted sum in different contexts.', 'It mentions the method to estimate the expected values of random variables when only a sample is available, highlighting the practical aspect of the concept.', 'The chapter emphasizes the property that the expected value of an expected value is the same as the expected value, reinforcing this point with a relatable example of population mean and expected value.', 'It explains the calculation of covariance using the expected values of random variables x and y and provides a visual example to simplify the concept for better understanding.']}, {'end': 906.277, 'segs': 
[{'end': 731.466, 'src': 'embed', 'start': 679.235, 'weight': 0, 'content': [{'end': 684.477, 'text': 'You take each of your xy associations, take the product, and then take the mean of all of them.', 'start': 679.235, 'duration': 5.242}, {'end': 686.117, 'text': "So that's going to be the mean of the product of x and y.", 'start': 684.597, 'duration': 1.52}, {'end': 693.87, 'text': 'And then this thing right over here, the expected value of y, that can be approximated by the sample mean of y.', 'start': 686.117, 'duration': 7.753}, {'end': 698.531, 'text': 'And the expected value of x can be approximated by the sample mean of x.', 'start': 693.87, 'duration': 4.661}, {'end': 705.354, 'text': 'So what can the covariance of two random variables be approximated by?', 'start': 698.531, 'duration': 6.823}, {'end': 711.056, 'text': "Well, this right here is the mean of their product from your sample,", 'start': 705.734, 'duration': 5.322}, {'end': 724.582, 'text': "minus the mean of your sample y's times the mean of your sample x's.", 'start': 718.098, 'duration': 6.484}, {'end': 726.643, 'text': 'And this should start looking familiar.', 'start': 724.942, 'duration': 1.701}, {'end': 731.466, 'text': 'This should look a little bit familiar because what is this? This was the numerator.', 'start': 726.944, 'duration': 4.522}], 'summary': 'The covariance can be approximated by the sample mean of the products xy minus the sample mean of y times the sample mean of x.', 'duration': 52.231, 'max_score': 679.235, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/ualmyZiPs9w/pics/ualmyZiPs9w679235.jpg'}, {'end': 906.277, 'src': 'embed', 'start': 833.384, 'weight': 1, 'content': [{'end': 834.885, 'text': "Right? That's what the mean of x squared is.", 'start': 833.384, 'duration': 1.501}, {'end': 843.091, 'text': "Well, what's this? 
Well, you could view this as the covariance of x with x.", 'start': 834.905, 'duration': 8.186}, {'end': 845.213, 'text': "But we've actually already seen this.", 'start': 843.091, 'duration': 2.122}, {'end': 849.856, 'text': "And I've actually shown you many, many videos ago when we first learned about it what this is.", 'start': 845.233, 'duration': 4.623}, {'end': 858.258, 'text': 'The covariance of a random variable with itself is really just the variance of that random variable.', 'start': 849.896, 'duration': 8.362}, {'end': 860.019, 'text': 'And you could verify it for yourself.', 'start': 858.598, 'duration': 1.421}, {'end': 868.885, 'text': 'If you change this y to an x, this becomes x minus the expected value of x times x minus the expected value of x.', 'start': 860.559, 'duration': 8.326}, {'end': 872.767, 'text': "Or that's the expected value of (x minus the expected value of x) squared.", 'start': 868.885, 'duration': 3.882}, {'end': 874.328, 'text': "That's your definition of variance.", 'start': 872.847, 'duration': 1.481}, {'end': 880.059, 'text': 'Another way of thinking about the slope of our regression line:', 'start': 876.315, 'duration': 3.744}, {'end': 892.413, 'text': 'it can be literally viewed as the covariance of our two random variables over the variance of x.', 'start': 880.059, 'duration': 12.354}, {'end': 896.056, 'text': 'So you can kind of view x as the independent random variable.', 'start': 892.413, 'duration': 3.643}, {'end': 898.399, 'text': 'That right there is the slope', 'start': 896.537, 'duration': 1.862}, {'end': 899.86, 'text': 'of our regression line.', 'start': 898.897, 'duration': 0.963}, {'end': 906.277, 'text': 'Anyway, I thought that was interesting and I wanted to make connections between things you see in different parts of statistics and show you that they really are connected.', 'start': 899.9, 'duration': 6.377}], 'summary': 'The slope of the regression line is the covariance of x and y divided by the variance of x.', 
'duration': 72.893, 'max_score': 833.384, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/ualmyZiPs9w/pics/ualmyZiPs9w833384.jpg'}], 'start': 653.793, 'title': 'Covariance, regression, and regression line', 'summary': 'Covers the approximation of expected values, covariance of random variables, and the relationship to regression, emphasizing the use of sample means and products to estimate these values. additionally, it explains the interconnection between covariance, variance, and the slope of the regression line, providing a deeper understanding of their calculations and interpretations.', 'chapters': [{'end': 806.113, 'start': 653.793, 'title': 'Covariance and regression', 'summary': 'Discusses the approximation of expected values, covariance of random variables, and the relationship to regression, emphasizing the use of sample means and products to estimate these values.', 'duration': 152.32, 'highlights': ["The covariance of two random variables can be approximated by the mean of their product from the sample, minus the mean of the sample y's times the mean of the sample x's, similar to the numerator in the slope formula of the regression line.", 'The expected value of x times y can be approximated by the sample mean of the products of x and y, and the expected value of y and x can be approximated by their respective sample means.']}, {'end': 906.277, 'start': 807.833, 'title': 'Covariance and regression line', 'summary': 'Explains the relationship between covariance, variance, and the slope of the regression line, highlighting how they are interconnected in statistics and providing a deeper understanding of their calculations and interpretations.', 'duration': 98.444, 'highlights': ['The slope of the regression line can be viewed as the covariance of two random variables over the variance of x, providing a deeper understanding of the relationship between these statistical measures.', 'The covariance of a random variable with 
itself is equivalent to the variance of that random variable, highlighting the interconnected nature of covariance and variance in statistical calculations.', 'The chapter emphasizes the connections between different parts of statistics, showing how covariance, variance, and regression line slope are fundamentally linked, providing a holistic understanding of statistical concepts.']}], 'duration': 252.484, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/ualmyZiPs9w/pics/ualmyZiPs9w653793.jpg', 'highlights': ["The covariance of two random variables can be approximated by the mean of their product from the sample, minus the mean of the sample y's times the mean of the sample x's, similar to the numerator in the slope formula of the regression line.", 'The slope of the regression line can be viewed as the covariance of two random variables over the variance of x, providing a deeper understanding of the relationship between these statistical measures.', 'The expected value of x times y can be approximated by the sample mean of the products of x and y, and the expected value of y and x can be approximated by their respective sample means.', 'The covariance of a random variable with itself is equivalent to the variance of that random variable, highlighting the interconnected nature of covariance and variance in statistical calculations.', 'The chapter emphasizes the connections between different parts of statistics, showing how covariance, variance, and regression line slope are fundamentally linked, providing a holistic understanding of statistical concepts.']}], 'highlights': ['The degree to which the variables move together reflects the magnitude of the covariance, providing insights into their joint behavior.', 'The definition of covariance is connected to least squared regression, showcasing its significance in understanding relationships between variables and its relevance in regression analysis.', 'The slope of the regression line can 
be viewed as the covariance of two random variables over the variance of x, providing a deeper understanding of the relationship between these statistical measures.', 'The calculation of covariance involves multiplying the distance of each random variable from its expected value and taking the expected value of the resulting products, providing a measure of their joint variability.', 'The covariance of a random variable with itself is equivalent to the variance of that random variable, highlighting the interconnected nature of covariance and variance in statistical calculations.', 'The covariance definition is expanded by multiplying the two binomials, applying the distributive property twice, and cancelling terms, which leaves the expected value of xy minus the expected value of x times the expected value of y.']}
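
The relationships described in the transcript (covariance as the mean of the products minus the product of the means, variance as the covariance of x with itself, and the regression slope as covariance over variance) can be checked numerically. The following is a minimal sketch, not code from the video. The function names and the three sample points are illustrative assumptions, and every mean divides by n (the population-style form), which cancels in the slope ratio.

```python
from statistics import fmean


def covariance(xs, ys):
    """Shortcut form: mean of the products minus the product of the means."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    mean_xy = fmean(x * y for x, y in zip(xs, ys))
    return mean_xy - fmean(xs) * fmean(ys)


def covariance_from_definition(xs, ys):
    """Definition form: mean of (x - mean_x) * (y - mean_y)."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    return fmean((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def regression_slope(xs, ys):
    """Least squares slope: Cov(x, y) / Var(x), with Var(x) = Cov(x, x)."""
    return covariance(xs, ys) / covariance(xs, xs)


# Hypothetical sample points chosen only for illustration.
xs = [1.0, 2.0, 4.0]
ys = [2.0, 1.0, 3.0]

print("shortcut covariance:  ", covariance(xs, ys))
print("definition covariance:", covariance_from_definition(xs, ys))
print("variance of x:        ", covariance(xs, xs))
print("regression slope:     ", regression_slope(xs, ys))
```

For these three points the two covariance forms agree up to floating-point rounding, and the slope works out to 3/7. A sample (n - 1) divisor would change the covariance and variance values but not the slope, because the divisor appears in both numerator and denominator.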