title
Lecture 21: Covariance and Correlation | Statistics 110
description
We introduce covariance and correlation, and show how to obtain the variance of a sum, including the variance of a Hypergeometric random variable.
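The lecture's two central identities, Cov(X, Y) = E[(X - E[X])(Y - E[Y])] and Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), can be sanity-checked numerically. Below is a minimal NumPy sketch, not part of the lecture materials; the simulated joint distribution (Y built from X plus independent noise) and all parameter choices are invented for the demo.

```python
# Illustrative sketch (assumed setup, not from the lecture): check the
# definition Cov(X, Y) = E[(X - EX)(Y - EY)] and the identity
# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) by simulation.
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)   # deliberately correlated with x

# Sample version of the lecture's definition of covariance.
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
print(cov_xy)                      # about 0.5 for this construction

# Both sides of the variance-of-a-sum identity agree (exactly, in
# sample moments) once the same covariance estimate is plugged in.
print(np.var(x + y))
print(np.var(x) + np.var(y) + 2 * cov_xy)
```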
detail
{'title': 'Lecture 21: Covariance and Correlation | Statistics 110', 'heatmap': [{'end': 984.346, 'start': 916.632, 'weight': 0.93}, {'end': 1898.428, 'start': 1868.447, 'weight': 0.708}, {'end': 1989.506, 'start': 1927.604, 'weight': 1}, {'end': 2503.793, 'start': 2445.658, 'weight': 0.716}, {'end': 2640.28, 'start': 2580.744, 'weight': 0.87}, {'end': 2847.541, 'start': 2784.549, 'weight': 0.916}, {'end': 2942.742, 'start': 2896.559, 'weight': 0.749}], 'summary': 'The lecture covers covariance and correlation, explaining their definitions, properties, and relevance in analyzing joint distributions, variance, and dependence between variables, while also discussing their implications on the variance and the impact of standardized variables on variance and covariance calculations.', 'chapters': [{'end': 157.491, 'segs': [{'end': 77.193, 'src': 'embed', 'start': 20.572, 'weight': 0, 'content': [{'end': 23.814, 'text': 'So on the one hand, covariance is what we need to deal with variance of the sums.', 'start': 20.572, 'duration': 3.242}, {'end': 29.958, 'text': "On the other hand, it's what we need when we wanna study two random variables together instead of one, right?", 'start': 24.194, 'duration': 5.764}, {'end': 31.72, 'text': "So it's like variance, except two of them.", 'start': 29.978, 'duration': 1.742}, {'end': 33.781, 'text': "so that's why it's called covariance.", 'start': 31.72, 'duration': 2.061}, {'end': 42.888, 'text': "And so let's define it, do some properties, do some examples, okay? So first, start with the definition.", 'start': 34.622, 'duration': 8.266}, {'end': 48.156, 'text': "It's analogous to how we define variance, except now we have an X and a Y.", 'start': 44.215, 'duration': 3.941}, {'end': 50.217, 'text': "right?. Cuz, we're looking at joint distributions, okay?", 'start': 48.156, 'duration': 2.061}, {'end': 52.257, 'text': 'So we have X, we have Y.', 'start': 50.237, 'duration': 2.02}, {'end': 53.498, 'text': 'we want their covariance.', 'start': 52.257, 'duration': 1.241}, {'end': 62.68, 'text': 'And we define it like this, covariance of X and Y, X and Y are any two random variables on the same space.', 'start': 54.378, 'duration': 8.302}, {'end': 71.643, 'text': 'Covariance XY equals expected value of X minus its mean.', 'start': 63.501, 'duration': 8.142}, {'end': 77.193, 'text': 'times y minus its mean.', 'start': 75.652, 'duration': 1.541}], 'summary': 'Covariance measures the joint variability of two random variables.', 'duration': 56.621, 'max_score': 20.572, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU20572.jpg'}], 'start': 0.129, 'title': 'Understanding covariance', 'summary': 'Introduces covariance, defining it as the expected value of the product of the differences of x and y from their means, and highlighting its relevance in analyzing joint distributions and the variation of two variables together.', 'chapters': [{'end': 157.491, 'start': 0.129, 'title': 'Understanding covariance', 'summary': 'Introduces covariance as a way to understand the variance of sums and study the relationship between two random variables, defining it as the expected value of the product of the differences of x and y from their means, and highlighting its relevance in analyzing joint distributions and the variation of two variables together.', 'duration': 157.362, 'highlights': ['Covariance is defined as the expected value of the product of the differences of X and Y from their means, denoted as Cov(XY) = E[(X - 
E[X])(Y - E[Y])].', 'It is essential for understanding the variance of sums and studying the relationship between two random variables within joint distributions.', 'The intuitive understanding of covariance involves observing how X and Y vary together and the implications of their positive or negative relationships.']}], 'duration': 157.362, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU129.jpg', 'highlights': ['Covariance is defined as the expected value of the product of the differences of X and Y from their means, denoted as Cov(XY) = E[(X - E[X])(Y - E[Y])].', 'It is essential for understanding the variance of sums and studying the relationship between two random variables within joint distributions.', 'The intuitive understanding of covariance involves observing how X and Y vary together and the implications of their positive or negative relationships.']}, {'end': 680.835, 'segs': [{'end': 200.501, 'src': 'embed', 'start': 178.086, 'weight': 0, 'content': [{'end': 189.354, 'text': "So if X, being above its mean, tends to imply that Y is above its mean and being below, then we'd say that they're positively correlated, right?", 'start': 178.086, 'duration': 11.268}, {'end': 190.795, 'text': 'And vice versa.', 'start': 190.235, 'duration': 0.56}, {'end': 195.478, 'text': "if it's negatively correlated, if X is above its mean, it doesn't imply that Y is below its mean,", 'start': 190.795, 'duration': 4.683}, {'end': 198.06, 'text': 'but it has more of a tendency that Y will be below its mean.', 'start': 195.478, 'duration': 2.582}, {'end': 200.501, 'text': "then we would say they're negatively correlated, okay?", 'start': 198.06, 'duration': 2.441}], 'summary': 'X above mean implies y above mean, positively correlated; vice versa for negatively correlated.', 'duration': 22.415, 'max_score': 178.086, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU178086.jpg'}, {'end': 250.744, 'src': 'embed', 'start': 220.541, 'weight': 2, 'content': [{'end': 226.606, 'text': 'But just like you know how for variance we had- Two different ways to write it.', 'start': 220.541, 'duration': 6.065}, {'end': 229.108, 'text': 'we define variance as notice.', 'start': 226.606, 'duration': 2.502}, {'end': 232.991, 'text': 'the way we define variance was expected value of X minus its mean squared.', 'start': 229.108, 'duration': 3.883}, {'end': 237.134, 'text': 'So if we let X equal Y, that is just the variance.', 'start': 233.831, 'duration': 3.303}, {'end': 243.058, 'text': "So we just proved a theorem already, so I'll just call this properties.", 'start': 237.894, 'duration': 5.164}, {'end': 250.744, 'text': 'The first property to keep in mind is that covariance of X with itself is the variance.', 'start': 243.639, 'duration': 7.105}], 'summary': 'Variance defined as expected value of x minus its mean squared; covariance of x with itself is the variance.', 'duration': 30.203, 'max_score': 220.541, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU220541.jpg'}, {'end': 500.73, 'src': 'embed', 'start': 474.062, 'weight': 4, 'content': [{'end': 486.626, 'text': "Okay, now what if we multiplied by a constant instead of just having a constant there? 
So if we have, let's say, the covariance of Cx with y.", 'start': 486.626, 'duration': 12.564}, {'end': 490.427, 'text': "Let's just use this one.", 'start': 486.626, 'duration': 3.801}, {'end': 495.668, 'text': 'To compute this, all we have to do is replace x by C times x.', 'start': 491.827, 'duration': 3.841}, {'end': 498.969, 'text': 'C comes out, C comes out, so C just comes out of the whole thing.', 'start': 495.668, 'duration': 3.301}, {'end': 500.73, 'text': 'So constants come out.', 'start': 499.41, 'duration': 1.32}], 'summary': 'Multiplying by a constant in covariance computation simplifies the process and the constant comes out of the whole thing.', 'duration': 26.668, 'max_score': 474.062, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU474062.jpg'}, {'end': 652.073, 'src': 'embed', 'start': 620.507, 'weight': 3, 'content': [{'end': 622.951, 'text': 'Bilinearity is just a fancy term that means.', 'start': 620.507, 'duration': 2.444}, {'end': 632.975, 'text': "If you imagine treating one coordinate as just kind of fixed and you're working with the other coordinate, it looks like linearity, right?", 'start': 625.547, 'duration': 7.428}, {'end': 636.899, 'text': 'So, like here, notice the y just stayed as y.', 'start': 633.595, 'duration': 3.304}, {'end': 641.504, 'text': 'And what happened to the cx? Well, I took out the constant just with linearity.', 'start': 636.899, 'duration': 4.605}, {'end': 652.073, 'text': 'And what happened here? X just stayed X throughout, but if you just look at the Y plus Z part, we split it out into the Y and the Z.', 'start': 642.065, 'duration': 10.008}], 'summary': 'Bilinearity means treating one coordinate as fixed, showing linearity, and splitting out coordinates, as demonstrated with y and cx in the given example.', 'duration': 31.566, 'max_score': 620.507, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU620507.jpg'}], 'start': 158.345, 'title': 'Correlation and covariance', 'summary': 'Covers correlation concepts, explaining positive and negative correlation, while also delving into covariance, its properties, and bilinearity.', 'chapters': [{'end': 200.501, 'start': 158.345, 'title': 'Correlation analysis', 'summary': "Explains the concept of correlation: if X being above its mean tends to imply that Y is above its mean (and likewise for below), they are positively correlated; if X being above its mean makes Y more likely to be below its mean, they are negatively correlated.", 'duration': 42.156, 'highlights': ['The concept of positive correlation is explained, where when x is above its mean, it tends to imply that y is also above its mean, resulting in positive times positive.', "The concept of negative correlation is explained, where if x being above its mean doesn't imply that y is below its mean, but has more of a tendency that y will be below its mean, then they are negatively correlated."]}, {'end': 680.835, 'start': 201.837, 'title': 'Understanding covariance and correlation', 'summary': 'Covers the definition of covariance, properties of covariance, and bilinearity, including key points such as the definition of covariance in terms of expected values, properties of covariance, and the concept of bilinearity.', 'duration': 478.998, 'highlights': ['The chapter defines covariance in terms of expected values and discusses its properties, such as 
covariance of X with itself being the variance, symmetric property of covariance, and alternative way to write covariance.', 'The concept of bilinearity is explained, highlighting its usefulness in avoiding complex calculations and its analogy to linearity when treating one coordinate as fixed while working with the other coordinate.', 'The chapter also covers the covariance of a constant with X, covariance of Cx with Y, and covariance of X with Y plus Z, emphasizing their immediate computation and straightforward nature.']}], 'duration': 522.49, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU158345.jpg', 'highlights': ['The concept of positive correlation is explained, where x above its mean implies y is also above its mean.', 'The concept of negative correlation is explained, where x above its mean implies y is more likely below its mean.', 'The chapter defines covariance and discusses its properties, including covariance of X with itself being the variance.', 'The concept of bilinearity is explained, highlighting its usefulness in avoiding complex calculations.', 'The chapter covers the covariance of a constant with X, emphasizing their immediate computation.']}, {'end': 1254.079, 'segs': [{'end': 709.803, 'src': 'embed', 'start': 681.176, 'weight': 1, 'content': [{'end': 687.981, 'text': "Just like linearity is incredibly useful, bilinearity is incredibly useful when we're working with covariances.", 'start': 681.176, 'duration': 6.805}, {'end': 699.359, 'text': 'An easy kind of way to remember this is it kind of looks like this distributive property.', 'start': 693.576, 'duration': 5.783}, {'end': 704.481, 'text': "Like here, there's just a distributive property, X times Y plus Z is XY plus XZ.", 'start': 699.379, 'duration': 5.102}, {'end': 709.803, 'text': "It kind of looks like that, except it's not literally multiplication, it's covariances.", 'start': 704.961, 'duration': 4.842}], 'summary': 'Bilinearity in covariances is incredibly useful, resembling distributive property.', 'duration': 28.627, 'max_score': 681.176, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU681176.jpg'}, {'end': 793.553, 'src': 'embed', 'start': 766.281, 'weight': 0, 'content': [{'end': 772.142, 'text': "And more generally than that, let's just write what happens if we have a covariance of one sum with another sum?", 'start': 766.281, 'duration': 5.861}, {'end': 774.943, 'text': "Rather, I don't wanna write out nine terms.", 'start': 772.663, 'duration': 2.28}, {'end': 777.304, 'text': "Let's just write the general thing once and for all.", 'start': 775.443, 'duration': 1.861}, {'end': 779.924, 'text': 'So we have a covariance of one sum of terms.', 'start': 777.624, 'duration': 2.3}, {'end': 783.145, 'text': "Let's say we have the sum over i of Ai.", 'start': 780.344, 'duration': 2.801}, {'end': 793.553, 'text': "where ai's are constants, so this is a linear combination of random variables.", 'start': 788.47, 'duration': 5.083}], 'summary': 'Examining the covariance of a sum of terms expressed as a linear combination of random variables.', 'duration': 27.272, 'max_score': 766.281, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU766281.jpg'}, {'end': 924.837, 'src': 'embed', 'start': 890.626, 'weight': 3, 'content': [{'end': 893.327, 'text': 'right?, The variance of a sum, that is.', 'start': 890.626, 'duration': 2.701}, {'end': 899.648, 
'text': 'Okay, so one of the main reasons we want covariance is so that we can deal with sums.', 'start': 893.347, 'duration': 6.301}, {'end': 903.389, 'text': "So let's just work out the variance of a sum.", 'start': 900.488, 'duration': 2.901}, {'end': 913.33, 'text': "Let's say we have the variance of X1 plus X2 to start with, but then we could generalize that to a sum of any number of terms,", 'start': 905.785, 'duration': 7.545}, {'end': 916.192, 'text': 'just by using this one repeatedly, okay?', 'start': 913.33, 'duration': 2.862}, {'end': 924.837, 'text': "Well, we already know how to do this, because by property one, that's the covariance of X1 plus X2 with itself.", 'start': 916.632, 'duration': 8.205}], 'summary': 'Covariance helps to deal with variance of a sum by generalizing for any number of terms.', 'duration': 34.211, 'max_score': 890.626, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU890626.jpg'}, {'end': 984.346, 'src': 'heatmap', 'start': 916.632, 'weight': 0.93, 'content': [{'end': 924.837, 'text': "Well, we already know how to do this, because by property one, that's the covariance of X1 plus X2 with itself.", 'start': 916.632, 'duration': 8.205}, {'end': 936.279, 'text': "But by property five, or whichever property six, What's the covariance of X1 plus X2 with itself? Well, we just have those four terms.", 'start': 926.238, 'duration': 10.041}, {'end': 940.082, 'text': "We have the covariance of X1 with itself, but that's just the variance.", 'start': 936.859, 'duration': 3.223}, {'end': 945.127, 'text': "And we have the covariance of X2 with itself, that's just the variance of X2.", 'start': 940.823, 'duration': 4.304}, {'end': 948.61, 'text': 'And then we have two cross terms.', 'start': 946.228, 'duration': 2.382}, {'end': 952.633, 'text': 'We have the covariance of X1 and X2, and we have the covariance of X2 and X1.', 'start': 949.09, 'duration': 3.543}, {'end': 961.102, 'text': "But by the symmetry property, those are the same thing, so it's simpler to just write it as 2 times the covariance of X1 and X2.", 'start': 953.578, 'duration': 7.524}, {'end': 973.769, 'text': 'In particular, this says that if the covariance is 0, then the variance of the sum is the sum of the variances.', 'start': 965.484, 'duration': 8.285}, {'end': 976.07, 'text': "And that's an if and only if statement.", 'start': 974.229, 'duration': 1.841}, {'end': 984.346, 'text': "So one case where that's true is if they're independent, we showed before that if they're independent, then the covariance is 0.", 'start': 976.721, 'duration': 7.625}], 'summary': 'Covariance properties simplify calculation of sum variance and reveal independence implications.', 'duration': 67.714, 'max_score': 916.632, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU916632.jpg'}, {'end': 976.07, 'src': 'embed', 'start': 946.228, 'weight': 7, 'content': [{'end': 948.61, 'text': 'And then we have two cross terms.', 'start': 946.228, 'duration': 2.382}, {'end': 952.633, 'text': 'We have the covariance of X1 and X2, and we have the covariance of X2 and X1.', 'start': 949.09, 'duration': 3.543}, {'end': 961.102, 'text': "But by the symmetry property, those are the same thing, so it's simpler to just write it as 2 times the covariance of X1 and X2.", 'start': 953.578, 'duration': 7.524}, {'end': 973.769, 'text': 'In particular, this says that if the covariance is 0, then the variance of the sum is the sum of 
the variances.', 'start': 965.484, 'duration': 8.285}, {'end': 976.07, 'text': "And that's an if and only if statement.", 'start': 974.229, 'duration': 1.841}], 'summary': 'Covariance simplification shows variance of sum equals sum of variances if covariance is 0.', 'duration': 29.842, 'max_score': 946.228, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU946228.jpg'}, {'end': 1073.732, 'src': 'embed', 'start': 1045.046, 'weight': 4, 'content': [{'end': 1055.35, 'text': "I think it's easiest if we write it as 2 times the sum over i less than j covariance xi xj.", 'start': 1045.046, 'duration': 10.304}, {'end': 1058.712, 'text': "It's easy to forget the 2 here.", 'start': 1055.991, 'duration': 2.721}, {'end': 1064.825, 'text': 'I could have also written it as i not equal to j, in which case I would not put the 2.', 'start': 1059.881, 'duration': 4.944}, {'end': 1073.732, 'text': "It's simply the question of are you gonna list covariance of x1, x2 separately from covariance of x2 and x1, or group them together?", 'start': 1064.825, 'duration': 8.907}], 'summary': 'Formula for covariance calculation discussed, addressing potential variations and considerations.', 'duration': 28.686, 'max_score': 1045.046, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU1045046.jpg'}, {'end': 1139.687, 'src': 'embed', 'start': 1095.548, 'weight': 5, 'content': [{'end': 1103.17, 'text': 'First I wanna just make sure that the connection with independence is clear, and we also need to define correlation.', 'start': 1095.548, 'duration': 7.622}, {'end': 1117.112, 'text': 'So theorem says that if X and Y are independent, Then they are uncorrelated.', 'start': 1104.25, 'duration': 12.862}, {'end': 1123.032, 'text': 'The definition of uncorrelated is just that the covariance is 0.', 'start': 1119.948, 'duration': 3.084}, {'end': 1124.374, 'text': "That's just the definition.", 'start': 1123.032, 'duration': 1.342}, {'end': 1135.506, 'text': 'IE covariance equals 0.', 'start': 1130.04, 'duration': 5.466}, {'end': 1139.687, 'text': "And we actually proved this last time when we just didn't have the terminology yet.", 'start': 1135.506, 'duration': 4.181}], 'summary': 'Independence implies uncorrelated; covariance equals 0.', 'duration': 44.139, 'max_score': 1095.548, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU1095548.jpg'}, {'end': 1210.534, 'src': 'embed', 'start': 1174.659, 'weight': 6, 'content': [{'end': 1179.604, 'text': "If the covariance is 0, and that's all we know, they may or may not be independent.", 'start': 1174.659, 'duration': 4.945}, {'end': 1185.209, 'text': "So just to give a simple counter example showing why this doesn't imply this.", 'start': 1181.325, 'duration': 3.884}, {'end': 1193.528, 'text': "Let's just consider an example with normal random variables.", 'start': 1189.346, 'duration': 4.182}, {'end': 1204.332, 'text': "So let's let Z be standard normal and we'll let X equals Z slightly redundant notation.", 'start': 1194.328, 'duration': 10.004}, {'end': 1210.534, 'text': "but I'm just in the habit of using Z for standard normals and Y equals Z squared.", 'start': 1204.332, 'duration': 6.202}], 'summary': 'Covariance of 0 does not imply independence, illustrated with a normal random variable example.', 'duration': 35.875, 'max_score': 1174.659, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU1174659.jpg'}], 'start': 681.176, 'title': 'Covariance, variance, and sum', 'summary': 'Discusses the bilinearity of covariances, the relationship between covariance and variance, and the variance of a sum, emphasizing the implications of covariance and independence on the variance.', 'chapters': [{'end': 864.982, 'start': 681.176, 'title': 'Covariance of sums', 'summary': 'Discusses the bilinearity of covariances, demonstrating how to calculate the covariance of a sum of terms using the distributive property and the application of property 5, resulting in a simplified expression with a sum over all possible pairs.', 'duration': 183.806, 'highlights': ['The chapter demonstrates the application of property 5, applied repeatedly, to calculate the covariance of a sum of terms, simplifying the expression into a sum over all possible pairs and reducing the complexity of the calculation.', 'The chapter explains the concept of bilinearity of covariances by likening it to the distributive property and the multiplication of polynomials, making it easier to understand and remember.', 'The chapter provides a generalized expression for the covariance of one sum with another sum, simplifying a seemingly complicated calculation into a structured form that is more manageable and understandable.']}, {'end': 991.852, 'start': 864.982, 'title': 'Covariance and variance relationship', 'summary': 'Discusses the relationship between covariance and variance, how covariance is useful in computing variance of a sum, and the implications of covariance being 0 on the variance of the sum.', 'duration': 126.87, 'highlights': ['The two cross terms, the covariance of X1 with X2 and the covariance of X2 with X1, are equal by symmetry, so they combine into 2 times the covariance of X1 and X2, which simplifies the computation of the variance of their sum.', 'The variance of the sum is the sum of the variances if and only if the covariance of X1 and X2 is 0; independence is one case where the covariance is 0.', 'The chapter emphasizes the usefulness of covariance in computing the variance of a sum and its implications, providing insights into the relationship between covariance and variance.']}, {'end': 1254.079, 'start': 992.332, 'title': 'Variance of sum and covariance', 'summary': 'Discusses the relationship between the variance of a sum, covariance, and independence, emphasizing that the variance of the sum is the sum of the variances and covariances, and the distinction between uncorrelated and independent random variables.', 'duration': 261.747, 'highlights': ['The variance of the sum is the sum of the individual variances plus the covariance terms, 2 times the sum over i less than j of the covariance of Xi and Xj, where the factor of 2 accounts for counting each unordered pair once.', 'The connection between independence and uncorrelated random variables is emphasized, with the theorem stating that if X and Y are independent, then they are uncorrelated, where uncorrelated is defined as having a covariance of 0.', 'The distinction between uncorrelated and independent random variables is clarified by cautioning against the common mistake of assuming independence solely based on a covariance of 0, illustrated through a counterexample involving normal random variables X and Y.']}], 'duration': 572.903, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU681176.jpg', 'highlights': ['The chapter simplifies the expression for the covariance of a sum of terms into a sum over all possible pairs, reducing the complexity of the calculation.', 'The chapter illustrates the concept of bilinearity of covariances by likening it to the distributive property and the multiplication of polynomials, making it easier to understand and remember.', 'The chapter presents a generalized expression for the covariance of one sum with another sum, simplifying the complex calculation into a structured form, making it more manageable and understandable.', 'The chapter emphasizes the usefulness of covariance in computing the variance of a sum and its implications, providing insights into the relationship between covariance and variance.', 'The formula 2 times the sum over i less than j covariance xi xj demonstrates that the variance of the sum is the sum of the variances and covariances, providing a mathematical expression for understanding the relationship.', 'The chapter emphasizes the connection between independence and uncorrelated random variables, highlighting the theorem that declares independent random variables to be uncorrelated, defined by having a covariance of 0, reinforcing the fundamental relationship between these concepts.', 'The chapter highlights the distinction between uncorrelated and independent random variables by cautioning against assuming independence solely based on a covariance of 0, exemplifying this with a counterexample involving normal random variables X and Y, effectively emphasizing the importance of understanding the nuances between these concepts.', 'The variance of the sum is the sum of the variances if and only if the covariance of X1 and X2 is 0; independence is one case where the covariance is 0.']}, {'end': 1775.441, 'segs': [{'end': 1288.311, 'src': 'embed', 'start': 1254.079, 'weight': 0, 'content': [{'end': 1255.18, 'text': "So they're uncorrelated.", 'start': 1254.079, 'duration': 1.101}, {'end': 1258.622, 'text': "But they're clearly not independent.", 'start': 1256.781, 'duration': 1.841}, 
{'end': 1264.307, 'text': 'In fact, they are very non-independent.', 'start': 1260.644, 'duration': 3.663}, {'end': 1266.069, 'text': 'I should say very dependent.', 'start': 1264.667, 'duration': 1.402}, {'end': 1268.711, 'text': 'Avoid too many double negatives.', 'start': 1267.19, 'duration': 1.521}, {'end': 1278.028, 'text': "So they're very dependent, in fact, Y is a function of X.", 'start': 1270.205, 'duration': 7.823}, {'end': 1280.028, 'text': "So that they're extremely dependent.", 'start': 1278.028, 'duration': 2}, {'end': 1283.59, 'text': 'If you know X, you know Y, complete information.', 'start': 1280.108, 'duration': 3.482}, {'end': 1288.311, 'text': 'So Y is actually a function of X.', 'start': 1285.49, 'duration': 2.821}], 'summary': 'X and Y are extremely dependent, with complete information about one providing complete knowledge about the other.', 'duration': 34.232, 'max_score': 1254.079, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU1254079.jpg'}, {'end': 1339.902, 'src': 'embed', 'start': 1310.907, 'weight': 2, 'content': [{'end': 1315.711, 'text': "If we know Z squared, then we can take the square root and we'll get the absolute value.", 'start': 1310.907, 'duration': 4.804}, {'end': 1317.532, 'text': 'So we know it up to a plus or minus.', 'start': 1315.791, 'duration': 1.741}, {'end': 1324.939, 'text': "So that also shows it's dependent, which we didn't need to do, but it's just nice to think.", 'start': 1319.074, 'duration': 5.865}, {'end': 1326.6, 'text': 'If we know this, okay, we know this.', 'start': 1325.059, 'duration': 1.541}, {'end': 1329.823, 'text': 'If we know this, then what do we know? Well, we know it up to a sign.', 'start': 1326.66, 'duration': 3.163}, {'end': 1339.902, 'text': 'So I would just say Y also determines X, at least it determines it up to a sign.', 'start': 1333.46, 'duration': 6.442}], 'summary': 'Y determines X, up to a sign, based on Z squared.', 'duration': 28.995, 'max_score': 1310.907, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU1310907.jpg'}, {'end': 1404.064, 'src': 'embed', 'start': 1361.189, 'weight': 3, 'content': [{'end': 1370.371, 'text': "But why doesn't the definition capture this? Part of the intuition of correlation is that it's kind of a measure of linear association.", 'start': 1361.189, 'duration': 9.182}, {'end': 1376.632, 'text': 'And those of you who have taken Stat 100 or 104 see a lot of things like that, where you actually have a data set.', 'start': 1370.451, 'duration': 6.181}, {'end': 1381.253, 'text': "And if it kind of looks like it's sloping upwards generally, you have this cloud of points.", 'start': 1376.692, 'duration': 4.561}, {'end': 1384.874, 'text': 'And does it kind of go upwards or downwards, that kind of thing.', 'start': 1381.313, 'duration': 3.561}, {'end': 1389.356, 'text': "It's measuring linear trends in some sense.", 'start': 1385.474, 'duration': 3.882}, {'end': 1397.521, 'text': "There's a theorem that we're not gonna prove that says that if every function of X is uncorrelated with every function of Y, then they're independent.", 'start': 1389.416, 'duration': 8.105}, {'end': 1404.064, 'text': 'But just having the linear things be uncorrelated is not enough, as this example shows.', 'start': 1397.801, 'duration': 6.263}], 'summary': "Correlation measures linear association, but linear uncorrelation doesn't imply independence.", 'duration': 42.875, 'max_score': 1361.189, 
'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU1361189.jpg'}, {'end': 1493.935, 'src': 'embed', 'start': 1466.134, 'weight': 4, 'content': [{'end': 1474.603, 'text': "Usually it's defined this way as the covariance, and then we divide by the product of the standard deviations.", 'start': 1466.134, 'duration': 8.469}, {'end': 1480.409, 'text': 'Remember, standard deviation is just the square root of variance.', 'start': 1477.867, 'duration': 2.542}, {'end': 1484.734, 'text': 'So take the covariance, divide by the square root of the product of the variances.', 'start': 1480.449, 'duration': 4.285}, {'end': 1488.973, 'text': "That's the usual definition.", 'start': 1487.872, 'duration': 1.101}, {'end': 1493.935, 'text': "I actually would prefer to define it a different way, and I'll show you why these are equivalent.", 'start': 1488.993, 'duration': 4.942}], 'summary': 'Correlation is usually defined as covariance divided by the product of standard deviations.', 'duration': 27.801, 'max_score': 1466.134, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU1466134.jpg'}, {'end': 1761.342, 'src': 'embed', 'start': 1735.956, 'weight': 5, 'content': [{'end': 1741.54, 'text': 'For those of you who have seen the Cauchy-Schwarz inequality in linear algebra or elsewhere,', 'start': 1735.956, 'duration': 5.584}, {'end': 1745.503, 'text': 'Cauchy-Schwarz is one of the most important inequalities in all of mathematics.', 'start': 1741.54, 'duration': 3.963}, {'end': 1752.932, 'text': "And if you rewrite this statement in a linear algebra setting, you can show that it's essentially Cauchy-Schwarz.", 'start': 1745.543, 'duration': 7.389}, {'end': 1758.419, 'text': "If you haven't seen Cauchy-Schwarz yet, we'll come back to it later in the semester and you don't need to worry about it right now.", 'start': 1753.332, 'duration': 5.087}, {'end': 1761.342, 'text': 'But for those of you who have, I want to make the connection right now.', 'start': 1758.479, 'duration': 2.863}], 'summary': 'The Cauchy-Schwarz inequality is important in mathematics and linear algebra.', 'duration': 25.386, 'max_score': 1735.956, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU1735956.jpg'}], 'start': 1254.079, 'title': 'Understanding correlation and dependence between variables', 'summary': 'Discusses the strong functional relationship between variables x and y, highlighting that knowing x provides complete information about y and vice versa, alongside the definition and interpretation of correlation as a measure of linear association, emphasizing its range, interpretability, and connection to the Cauchy-Schwarz inequality.', 'chapters': [{'end': 1329.823, 'start': 1254.079, 'title': 'Relationship between x and y', 'summary': 'Discusses the very strong dependence between variables x and y, where knowing x provides complete information about y, and vice versa, demonstrating a clear functional relationship between the two, as well as the dependency of z on its magnitude.', 'duration': 75.744, 'highlights': ['The strong dependency between variables X and Y is highlighted, with complete information about Y being provided by knowing X, and vice versa.', 'The functional relationship between Y and X is emphasized, indicating that Y is a function of X, demonstrating a strong dependency.', 'The dependency of Z on its magnitude is discussed, showing that knowing Z squared provides information up to a plus or minus, highlighting its dependency.']}, {'end': 1775.441, 'start': 1333.46, 'title': 'Understanding correlation and its mathematical interpretation', 'summary': 'The chapter discusses the definition of correlation, its interpretation as a measure of linear association, the relation between correlation and independence, and the advantages of standardization in correlation calculation. It also highlights the range of correlation, its interpretability, and its connection to the Cauchy-Schwarz inequality.', 'duration': 441.981, 'highlights': ['The definition of correlation as the covariance divided by the product of the standard deviations is discussed, along with an alternative definition using standardized variables, showcasing the mathematical interpretation and advantages of the correlation calculation.', 'The relationship between correlation and independence is briefly mentioned, emphasizing the necessity for functions of X and Y to be uncorrelated for independence, with a highlighted example illustrating the insufficiency of only linear uncorrelation for independence.', 'The interpretability of correlation is highlighted, emphasizing its dimensionless nature, making it independent of the units of measurement and its range being between -1 and 1, providing a quantifiable measure of linear association.', 'The connection between correlation and the Cauchy-Schwarz inequality is briefly discussed, providing an insight into the mathematical foundation of correlation and its broader significance in mathematics.']}], 'duration': 521.362, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU1254079.jpg', 'highlights': ['Knowing X provides complete information about Y and vice versa, demonstrating a strong dependency.', 'The functional relationship between Y and X is emphasized, indicating a strong dependency.', 'The dependency of Z on its magnitude is discussed, showing that knowing Z squared provides information up to a plus or minus, highlighting its dependency.', 'The interpretability of correlation is highlighted, emphasizing its dimensionless nature and range between -1 and 1, providing a quantifiable measure of linear association.', 'The definition of correlation as the covariance divided by the product of the standard deviations is discussed, showcasing the mathematical interpretation and advantages of the correlation calculation.', 'The connection between correlation and the Cauchy-Schwarz inequality is briefly discussed, providing an insight into the mathematical foundation of correlation and its broader significance in mathematics.', 'The relationship between correlation and independence is briefly mentioned, emphasizing the necessity for functions of X and Y to be uncorrelated for independence, with a highlighted example illustrating the insufficiency of only linear uncorrelation for independence.']}, {'end': 2200.658, 'segs': [{'end': 1836.196, 'src': 'embed', 'start': 1808.103, 'weight': 0, 'content': [{'end': 1812.946, 'text': "So we may as well just assume from the start that they've been standardized.", 'start': 1808.103, 'duration': 4.843}, {'end': 1816.629, 'text': 'Standardized meaning that they have mean 0, variance 1.', 'start': 1813.247, 'duration': 3.382}, {'end': 1822.793, 'text': "Cuz if they weren't standardized, well, I could just make up some new notation like X tilde, Y tilde for the standardized ones.", 'start': 1816.629, 'duration': 6.164}, {'end': 1828.156, 'text': "But this says 
that the correlation will be the same anyway, so we may as well assume that they're already standardized.", 'start': 1823.093, 'duration': 5.063}, {'end': 1832.973, 'text': "All right, so now let's just compute the variance.", 'start': 1828.769, 'duration': 4.204}, {'end': 1836.196, 'text': 'This is actually good practice with that property seven there.', 'start': 1833.073, 'duration': 3.123}], 'summary': 'Assuming standardized variables, computing variance, practicing property seven.', 'duration': 28.093, 'max_score': 1808.103, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU1808103.jpg'}, {'end': 1914.519, 'src': 'heatmap', 'start': 1868.447, 'weight': 1, 'content': [{'end': 1870.889, 'text': "All right, so that's the variance.", 'start': 1868.447, 'duration': 2.442}, {'end': 1875.052, 'text': 'But I assume they were standardized, so this is 1 plus 1.', 'start': 1871.85, 'duration': 3.202}, {'end': 1880.396, 'text': "And if they're standardized already, then the covariance is the correlation, because they're standardized.", 'start': 1875.052, 'duration': 5.344}, {'end': 1882.257, 'text': "So that's just 1 plus 1 plus 2 rho.", 'start': 1880.436, 'duration': 1.821}, {'end': 1892.925, 'text': "So that's really just 2 plus 2 rho, right? On the other hand, we could look at the variance of the difference.", 'start': 1882.658, 'duration': 10.267}, {'end': 1898.428, 'text': "Again, that's good practice with variances of sums and differences.", 'start': 1894.005, 'duration': 4.423}, {'end': 1904.933, 'text': "A common mistake is to say that's the variance of X minus the variance of Y, which we talked about that fact before,", 'start': 1899.209, 'duration': 5.724}, {'end': 1907.254, 'text': 'when we were talking about sums and differences of normals.', 'start': 1904.933, 'duration': 2.321}, {'end': 1908.735, 'text': "Variances can't be negative.", 'start': 1907.354, 'duration': 1.381}, {'end': 1914.519, 'text': 'So think of this not as X minus Y, think of this as X plus minus Y.', 'start': 1909.516, 'duration': 5.003}], 'summary': 'The transcript discusses variance, covariance, and common mistakes in statistical practice.', 'duration': 46.072, 'max_score': 1868.447, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU1868447.jpg'}, {'end': 2023.467, 'src': 'heatmap', 'start': 1927.604, 'weight': 3, 'content': [{'end': 1932.726, 'text': 'So we have a minus on the covariance part, but not on these variance terms.', 'start': 1927.604, 'duration': 5.122}, {'end': 1936.748, 'text': "So that's just 2 minus 2 rho.", 'start': 1935.007, 'duration': 1.741}, {'end': 1946.311, 'text': "Okay, while we're running out of space on this board, that's actually the end of the proof, because variance is non-negative.", 'start': 1936.768, 'duration': 9.543}, {'end': 1956.786, 'text': 'So these two inequalities say that rho is between minus 1 and 1.', 'start': 1949.012, 'duration': 7.774}, {'end': 1964.51, 'text': 'All right, so that shows a correlation is always between minus 1 and 1.', 'start': 1956.786, 'duration': 7.724}, {'end': 1973.034, 'text': "And so in general, it's easier to work with covariances than correlations, but correlations are more intuitive and standardized,", 'start': 1964.51, 'duration': 8.524}, {'end': 1975.535, 'text': "and everything is between minus 1 and 1.", 'start': 1973.034, 'duration': 2.501}, {'end': 1989.506, 'text': 'Okay, so I wanted to, for the rest of the time,', 'start': 
1975.535, 'duration': 3.783}, {'end': 1989.506, 'text': 'do some examples with this thing and also with computing covariances for certain problems we might be interested in.', 'start': 1979.318, 'duration': 10.188}, {'end': 1993.969, 'text': "So let's talk about the multinomial, cuz we were talking about that last time.", 'start': 1990.807, 'duration': 3.162}, {'end': 2001.277, 'text': 'And now we actually have the tools to deal with the covariances within a multinomial, okay?', 'start': 1994.91, 'duration': 6.367}, {'end': 2006.644, 'text': "So this is just an example, but it's an important example cuz multinomials come up a lot.", 'start': 2001.638, 'duration': 5.006}, {'end': 2015.514, 'text': 'So we wanna compute covariances if we have a multinomial, okay? So covariances in a multinomial.', 'start': 2007.144, 'duration': 8.37}, {'end': 2023.467, 'text': "That is, this multinomial is this vector, right? It's how many people are in category one, how many people are in category two, and so on.", 'start': 2017.523, 'duration': 5.944}], 'summary': 'Correlation is always between -1 and 1, covariances for multinomials.', 'duration': 50.433, 'max_score': 1927.604, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU1927604.jpg'}, {'end': 2109.011, 'src': 'embed', 'start': 2069.38, 'weight': 4, 'content': [{'end': 2075.583, 'text': 'that there are n objects or people and the probabilities are given by some vector p.', 'start': 2069.38, 'duration': 6.203}, {'end': 2080.487, 'text': 'That just gives the probabilities for each category okay?', 'start': 2075.583, 'duration': 4.904}, {'end': 2094.197, 'text': 'And we wanna find the covariance of xi with xj for all i and j right?', 'start': 2080.547, 'duration': 13.65}, {'end': 2098.621, 'text': "So first of all, let's consider the case.", 'start': 2096.178, 'duration': 2.443}, {'end': 2102.565, 'text': 'i equals j.', 'start': 2098.621, 'duration': 3.944}, {'end': 2109.011, 'text': "Then we just have the covariance of xi with itself, and we know that that's just the variance of xi.", 'start': 2102.565, 'duration': 6.446}], 'summary': 'Finding covariance of xi with xj for given probabilities.', 'duration': 39.631, 'max_score': 2069.38, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU2069380.jpg'}], 'start': 1775.501, 'title': 'Correlation, variance, and covariance in multinomial', 'summary': 'Delves into the concept of correlation and variance, emphasizing that correlation falls within -1 and 1, and demonstrates the impact of standardized variables on variance and covariance calculations. 
it also examines the computation of covariances within a multinomial, illustrating the process with examples and providing intuitive explanations.', 'chapters': [{'end': 1973.034, 'start': 1775.501, 'title': 'Correlation and variance', 'summary': 'Explores the concept of correlation and variance, showing that correlation is always between -1 and 1 and how standardized variables affect the variance and covariance calculations.', 'duration': 197.533, 'highlights': ['The chapter emphasizes the standardization of variables, demonstrating that standardized variables have a variance of 1 and a covariance equal to the correlation, which ensures that the correlation is always between -1 and 1.', 'The discussion highlights the computation of variance for the sum and difference of standardized variables, illustrating that the variance for the sum is 2 + 2ρ and for the difference is 2 - 2ρ.', 'The explanation distinguishes between the computation of variance for the sum and difference of standardized variables, emphasizing the non-negativity of variance and concluding that the correlation is always between -1 and 1.']}, {'end': 2200.658, 'start': 1973.034, 'title': 'Covariance in multinomial example', 'summary': 'Discusses the computation of covariances within a multinomial, providing an example of finding the covariance of xi with xj for all i and j, as well as the intuition behind it.', 'duration': 227.624, 'highlights': ['The multinomial example is an important one as multinomials come up frequently in statistical analysis.', 'The chapter provides a method to compute covariances within a multinomial, offering a practical approach to dealing with such problems.', 'The discussion involves finding the covariance of xi with xj for all i and j, which is a crucial aspect of understanding the relationships within a multinomial.', 'The intuition behind the computation of covariances within a multinomial is emphasized, encouraging critical thinking and understanding of the results.']}], 'duration': 425.157, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU1775501.jpg', 'highlights': ['The chapter emphasizes the standardization of variables, ensuring that the correlation is always between -1 and 1.', 'The discussion highlights the computation of variance for the sum and difference of standardized variables.', 'The explanation distinguishes between the computation of variance for the sum and difference of standardized variables, emphasizing the non-negativity of variance.', 'The chapter provides a method to compute covariances within a multinomial, offering a practical approach to dealing with such problems.', 'The discussion involves finding the covariance of xi with xj for all i and j, which is a crucial aspect of understanding the relationships within a multinomial.', 'The intuition behind the computation of covariances within a multinomial is emphasized, encouraging critical thinking and understanding of the results.']}, {'end': 2965.588, 'segs': [{'end': 2246.606, 'src': 'embed', 'start': 2200.978, 'weight': 0, 'content': [{'end': 2208.422, 'text': "As you just said, if you knew that there are more people in the first category, like there's tons of people in the first category,", 'start': 2200.978, 'duration': 7.444}, {'end': 2211.464, 'text': "there's fewer people left over who could be in the second category.", 'start': 2208.422, 'duration': 3.042}, {'end': 2217.348, 'text': "So it's like these categories are kind of competing for membership, 
right? You have a fixed number of people.", 'start': 2211.544, 'duration': 5.804}, {'end': 2221.751, 'text': 'Not like the chicken and egg problem where we had a Poisson number of eggs, okay?', 'start': 2217.728, 'duration': 4.023}, {'end': 2228.575, 'text': "Fixed number of eggs competing for different categories, more in one than you'd expect, less in the other, right?", 'start': 2222.131, 'duration': 6.444}, {'end': 2230.156, 'text': 'So they should be negatively correlated.', 'start': 2228.735, 'duration': 1.421}, {'end': 2235.179, 'text': "All right, so now how do we do this? Well, there's a bunch of ways, as I said.", 'start': 2230.696, 'duration': 4.483}, {'end': 2246.606, 'text': 'But one way that I especially like is, To relate this back to stuff we did last time, we talked about the lumping property of a multinomial.', 'start': 2235.999, 'duration': 10.607}], 'summary': 'More people in first category means fewer in second, leading to negative correlation.', 'duration': 45.628, 'max_score': 2200.978, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU2200978.jpg'}, {'end': 2311.141, 'src': 'embed', 'start': 2280.832, 'weight': 2, 'content': [{'end': 2286.835, 'text': "So we're trying to solve for C, okay? So we have the variance of the sum equals the sum of the variances.", 'start': 2280.832, 'duration': 6.003}, {'end': 2292.298, 'text': 'Now the variance of x1 is np1, 1- p1.', 'start': 2287.515, 'duration': 4.783}, {'end': 2298.041, 'text': 'And the variance of x2 is np2, 1- p2.', 'start': 2293.218, 'duration': 4.823}, {'end': 2304.525, 'text': "And then it's plus twice the covariance, but I just named the covariance C, just to have a simple name for it, so it's plus 2C.", 'start': 2298.361, 'duration': 6.164}, {'end': 2311.141, 'text': "So the only thing we wanna solve for this, the only thing left that we haven't gotten is this.", 'start': 2306.138, 'duration': 5.003}], 'summary': 'Solving for c in variance equation with np and p values.', 'duration': 30.309, 'max_score': 2280.832, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU2280832.jpg'}, {'end': 2503.793, 'src': 'heatmap', 'start': 2445.658, 'weight': 0.716, 'content': [{'end': 2468.113, 'text': "So let X be binomial NP, and we write it as, as we've done many times, X equals X1 plus blah, blah, blah, plus Xn, where the Xi's are IID Bernoulli P.", 'start': 2445.658, 'duration': 22.455}, {'end': 2477.192, 'text': "Now each Xi, Let's do a quick little indicator random variable review.", 'start': 2468.113, 'duration': 9.079}, {'end': 2482.234, 'text': "We can think of these xj's, they're Bernoulli's, but they're also indicator random variables.", 'start': 2477.832, 'duration': 4.402}, {'end': 2484.515, 'text': "It's an indicator of success on the jth trial.", 'start': 2482.294, 'duration': 2.221}, {'end': 2489.836, 'text': "So let's just state this in general.", 'start': 2485.855, 'duration': 3.981}, {'end': 2503.793, 'text': "Let's let capital I and capital J, let's say let I sub A be indicator random variable for event A, just in general.", 'start': 2492.708, 'duration': 11.085}], 'summary': 'Reviewing the concept of indicator random variables in a binomial distribution.', 'duration': 58.135, 'max_score': 2445.658, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU2445658.jpg'}, {'end': 2700.704, 'src': 'heatmap', 'start': 2580.744, 'weight': 3, 'content': 
[{'end': 2583.105, 'text': "So that's immediately true, very useful fact.", 'start': 2580.744, 'duration': 2.361}, {'end': 2598.249, 'text': "Okay, now coming back to this binomial, if we want the variance of Xj, That's just E of Xj squared minus (E of Xj) squared.", 'start': 2584.145, 'duration': 14.104}, {'end': 2607.159, 'text': "But xj squared is xj, so that's just e of xj, and we know e of xj is p for a Bernoulli p.", 'start': 2600.831, 'duration': 6.328}, {'end': 2613.626, 'text': "This one is p squared, so that's just p(1 - p), okay? So it's extremely easy to get the variance of a Bernoulli.", 'start': 2607.159, 'duration': 6.467}, {'end': 2624.652, 'text': "And if we define this as q, then we're just saying p times q, okay? So Bernoulli p, you get p times q, very easy.", 'start': 2615.808, 'duration': 8.844}, {'end': 2628.014, 'text': 'So now we want the variance of the binomial.', 'start': 2626.353, 'duration': 1.661}, {'end': 2633.397, 'text': "Well, it's just npq, done.", 'start': 2630.735, 'duration': 2.662}, {'end': 2636.278, 'text': "Because you're adding up n of them.", 'start': 2635.117, 'duration': 1.161}, {'end': 2638.399, 'text': "They're independent for the binomial.", 'start': 2636.398, 'duration': 2.001}, {'end': 2640.28, 'text': 'We have independent Bernoulli trials.', 'start': 2638.419, 'duration': 1.861}, {'end': 2650.257, 'text': "So just to write out a little bit more, covariance of Xi and Xj equals 0 for i not equal j, because they're independent.", 'start': 2642.072, 'duration': 8.185}, {'end': 2652.299, 'text': "They're not only uncorrelated, they're independent.", 'start': 2650.277, 'duration': 2.022}, {'end': 2654.46, 'text': "So we don't have any covariance terms.", 'start': 2652.819, 'duration': 1.641}, {'end': 2658.143, 'text': 'so we just add up the variances n times.', 'start': 2654.46, 'duration': 3.683}, {'end': 2660.305, 'text': 'This is npq, all right?', 'start': 2658.143, 'duration': 2.162}, {'end': 2663.507, 'text': 'So now you can do the variance of a binomial in your head.', 'start': 2660.345, 'duration': 3.162}, {'end': 2670.432, 'text': "You don't need to memorize this, it's just n times the variance of one of these Bernoullis, okay? 
So that's easy.", 'start': 2663.687, 'duration': 6.745}, {'end': 2677.719, 'text': "Let's talk about a more complicated one though, hypergeometric.", 'start': 2673.997, 'duration': 3.722}, {'end': 2684.864, 'text': 'So let X be hypergeometric.', 'start': 2681.982, 'duration': 2.882}, {'end': 2700.704, 'text': 'with parameters w, b, n, which we interpret as saying we have a jar that has w white balls, b black balls.', 'start': 2687.475, 'duration': 13.229}], 'summary': 'Variance of bernoulli is p(1-p), variance of binomial is npq, and covariance of xij equals 0 for i not equal j.', 'duration': 65.587, 'max_score': 2580.744, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU2580744.jpg'}, {'end': 2847.541, 'src': 'heatmap', 'start': 2784.549, 'weight': 0.916, 'content': [{'end': 2788.61, 'text': "This goes back to a homework problem that I'll talk a little bit more about.", 'start': 2784.549, 'duration': 4.061}, {'end': 2792.332, 'text': 'Variance of x is n times the variance of x1.', 'start': 2789.071, 'duration': 3.261}, {'end': 2798.034, 'text': "Because you can think of x, let's say we're looking at x times x7.", 'start': 2793.532, 'duration': 4.502}, {'end': 2804.076, 'text': "That's saying the seventh ball, like on the homework problem where you pick two balls.", 'start': 2798.354, 'duration': 5.722}, {'end': 2811.098, 'text': "And a lot of students were struggling somewhat with the fact that to consider the second ball, don't you need to have considered the first ball,", 'start': 2804.136, 'duration': 6.962}, {'end': 2815.959, 'text': "okay?. But when we're just looking at x7, that depends on the seventh ball.", 'start': 2811.098, 'duration': 4.861}, {'end': 2818.26, 'text': "we're imagining it before we've done anything, okay?", 'start': 2815.959, 'duration': 2.301}, {'end': 2823.862, 'text': 'Now the seventh ball is equally likely to be any of the balls, right?', 'start': 2818.58, 'duration': 5.282}, {'end': 2828.684, 'text': "Because there isn't, like some balls like to be chosen, seventh and other ones don't, right?", 'start': 2824.522, 'duration': 4.162}, {'end': 2830.044, 'text': "It's completely symmetrical.", 'start': 2828.704, 'duration': 1.34}, {'end': 2832.645, 'text': 'So this is just n times the variance of x1.', 'start': 2830.424, 'duration': 2.221}, {'end': 2841.148, 'text': 'Similarly for all the covariance terms, two times, and then there are n choose two of these covariance terms.', 'start': 2833.205, 'duration': 7.943}, {'end': 2847.541, 'text': 'But we may as well just consider the covariance of X1 and X2.', 'start': 2842.315, 'duration': 5.226}], 'summary': 'Variance of x is n times the variance of x1, and covariance terms are n choose two.', 'duration': 62.992, 'max_score': 2784.549, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU2784549.jpg'}, {'end': 2942.742, 'src': 'heatmap', 'start': 2896.559, 'weight': 0.749, 'content': [{'end': 2910.814, 'text': "Well, that's E minus E E let's do the second term first, that's easy.", 'start': 2896.559, 'duration': 14.255}, {'end': 2913.175, 'text': "That's just the fundamental bridge.", 'start': 2910.834, 'duration': 2.341}, {'end': 2921.818, 'text': 'The probability of the first ball is white times the probability of the second one is white, but both of those are w over w plus b.', 'start': 2913.195, 'duration': 8.623}, {'end': 2926.72, 'text': "Okay, now for this term, E let's use the fact here.", 'start': 
2921.818, 'duration': 4.902}, {'end': 2931.075, 'text': 'that the product of two indicator random variables is the indicator of the intersection.', 'start': 2927.093, 'duration': 3.982}, {'end': 2936.219, 'text': "So this term here, it's the expected value of an indicator, fundamental bridge.", 'start': 2931.756, 'duration': 4.463}, {'end': 2942.742, 'text': "That's the probability that the first two balls are both white, where the first ball has probability w over w plus b.", 'start': 2936.279, 'duration': 6.463}], 'summary': 'Discussing probability calculations and indicator random variables in a lecture.', 'duration': 46.183, 'max_score': 2896.559, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU2896559.jpg'}], 'start': 2200.978, 'title': 'Category competition and correlation', 'summary': 'Explains category competition, where categories compete for membership among a fixed number of people, leading to a negatively correlated distribution. It also relates this to the lumping property of a multinomial. Additionally, it discusses the variance of the sum, covariance, and provides examples of binomial and hypergeometric distributions, focusing on the derivation and application of relevant formulas.', 'chapters': [{'end': 2246.606, 'start': 2200.978, 'title': 'Category competition and correlation', 'summary': 'Describes how categories with a fixed number of people compete for membership, leading to a negatively correlated distribution, and relates it to the lumping property of a multinomial.', 'duration': 45.628, 'highlights': ['The categories compete for membership based on the number of people in each category, leading to a negatively correlated distribution.', 'Relates the concept to the lumping property of a multinomial, connecting it to previous discussions.']}, {'end': 2965.588, 'start': 2247.206, 'title': 'Variance of sum and covariance', 'summary': 'Discusses the variance of the sum, covariance, and examples such as binomial and hypergeometric distributions, emphasizing the derivation and application of the relevant formulas.', 'duration': 718.382, 'highlights': ['The variance of the sum equals the sum of the variances plus twice the covariance, with the variance of x1 being np1(1-p1), the variance of x2 being np2(1-p2), and the covariance of x1 and x2 equaling -np1p2.', 'The variance of the Binomial(n, p) distribution is npq, derived from the sum of the individual variances, with the covariance terms being 0 for i not equal j due to independence.', "The hypergeometric distribution's variance involves symmetrical simplifications using n times the variance of x1 and considering the covariance of X1 and X2, emphasizing the application of algebra in simplifying the formula."]}], 'duration': 764.61, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/IujCYxtpszU/pics/IujCYxtpszU2200978.jpg', 'highlights': ['The categories compete for membership based on the number of people in each category, leading to a negatively correlated distribution.', 'Relates the concept to the lumping property of a multinomial, connecting it to previous discussions.', 'The variance of the sum equals the sum of the variances plus twice the covariance, with the variance of x1 being np1(1-p1), the variance of x2 being np2(1-p2), and the covariance of x1 and x2 equaling -np1p2.', 'The variance of the Binomial(n, p) distribution is npq, derived from the sum of the individual variances, with the covariance terms being 0 for i not equal j due to independence.', "The hypergeometric distribution's variance involves symmetrical simplifications using n times the variance of x1 and considering the covariance of X1 and X2, emphasizing the application of algebra in simplifying the formula."]}], 'highlights': ['The variance of the sum equals the sum of the variances plus twice the covariance, with the variance of x1 being np1(1-p1), the variance of x2 being np2(1-p2), and the covariance of x1 and x2 equaling -np1p2.', 'The variance of the Binomial(n, p) distribution is npq, derived from the sum of the individual variances, with the covariance terms being 0 for i not equal j due to independence.', "The hypergeometric distribution's variance involves symmetrical simplifications using n times the variance of x1 and considering the covariance of X1 and X2, emphasizing the application of algebra in simplifying the formula.", 'The chapter simplifies the expression for the covariance of a sum of terms into a sum over all possible pairs, reducing the complexity of the calculation.', 'The chapter presents a generalized expression for the covariance of one sum with another sum, simplifying the complex calculation into a structured form, making it more manageable and understandable.', 'The formula 2 times the sum over i less than j covariance xi xj demonstrates that the variance of the sum is the sum of the variances and covariances, providing a mathematical expression for understanding the relationship.', 'The lecture covers covariance and correlation, explaining their definitions, properties, and relevance in analyzing joint distributions, variance, and dependence between variables, while also discussing their implications on the variance and the impact of standardized variables on variance and covariance calculations.', 'The concept of positive correlation is explained, where x above its mean implies y is also above its mean.', 'The concept of negative correlation is explained, where x above its mean implies y is more likely below its mean.', 'The interpretability of correlation is highlighted, emphasizing its dimensionless nature and range between -1 and 1, providing a quantifiable measure of linear association.', 'The definition of correlation as the covariance divided by the product of the standard deviations is discussed, showcasing the mathematical interpretation and advantages of the correlation calculation.']}
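For the multinomial result highlighted above, Cov(Xi, Xj) = -n pi pj for i not equal to j, here is a hedged simulation sketch; the parameters n = 10 and p = (0.2, 0.3, 0.5) are arbitrary choices for the demo, not values from the lecture.

```python
# Illustrative sketch: for (X1, ..., Xk) ~ Multinomial(n, p), the lecture
# derives Cov(Xi, Xj) = -n * p_i * p_j for i != j (negative, because the
# categories compete for a fixed number of people).
import numpy as np

rng = np.random.default_rng(1)
n, p = 10, np.array([0.2, 0.3, 0.5])
draws = rng.multinomial(n, p, size=10**6)   # each row is one (X1, X2, X3)

print(np.cov(draws[:, 0], draws[:, 1])[0, 1])   # empirical Cov(X1, X2)
print(-n * p[0] * p[1])                         # exact: -10 * 0.2 * 0.3 = -0.6
```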
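The description also promises the variance of a Hypergeometric random variable; the indicator-plus-symmetry argument sketched in the last chapter leads to the standard closed form Var(X) = n p q (N - n)/(N - 1), with N = w + b and p = w/N. A hedged simulation check follows; the jar sizes w = 6, b = 4 and the n = 5 draws are invented for the demo, and the closed form is the standard textbook result rather than a formula quoted verbatim in this summary.

```python
# Illustrative sketch: X ~ Hypergeometric(w, b, n) counts the white balls
# among n draws without replacement from a jar of w white and b black balls.
import numpy as np

rng = np.random.default_rng(2)
w, b, n = 6, 4, 5
N, p = w + b, w / (w + b)
x = rng.hypergeometric(w, b, n, size=10**6)   # number of white balls drawn

print(x.var())                                # simulated variance
print(n * p * (1 - p) * (N - n) / (N - 1))    # closed form, about 0.667 here
```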