title
K-Means Clustering - The Math of Intelligence (Week 3)

description
Let's detect the intruder trying to break into our security system using a very popular ML technique called K-Means Clustering! This is an example of learning from data that has no labels (unsupervised) and we'll use some concepts that we've already learned about like computing the Euclidean distance and a loss function to do this. Code for this video: https://github.com/llSourcell/k_means_clustering Please Subscribe! And like. And comment. That's what keeps me going. More learning resources: http://www.kdnuggets.com/2016/12/datascience-introduction-k-means-clustering-tutorial.html http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_ml/py_kmeans/py_kmeans_understanding/py_kmeans_understanding.html http://people.revoledu.com/kardi/tutorial/kMean/ https://home.deib.polimi.it/matteucc/Clustering/tutorial_html/kmeans.html http://mnemstudio.org/clustering-k-means-example-1.htm https://www.dezyre.com/data-science-in-r-programming-tutorial/k-means-clustering-techniques-tutorial http://scikit-learn.org/stable/tutorial/statistical_inference/unsupervised_learning.html Join us in the Wizards Slack channel: http://wizards.herokuapp.com/ And please support me on Patreon: https://www.patreon.com/user?u=3191693 Follow me: Twitter: https://twitter.com/sirajraval Facebook: https://www.facebook.com/sirajology Instagram: https://www.instagram.com/sirajraval/ Sign up for my newsletter for exciting updates in the field of AI: https://goo.gl/FZzJ5w Hit the Join button above to sign up to become a member of my channel for access to exclusive content! Join my AI community: http://chatgptschool.io/ Sign up for my AI Sports betting Bot, WagerGPT! (500 spots available): https://www.wagergpt.co

detail
{'title': 'K-Means Clustering - The Math of Intelligence (Week 3)', 'heatmap': [{'end': 504.111, 'start': 462.005, 'weight': 0.742}], 'summary': 'Explains k-means clustering for detecting intruders, compares supervised and unsupervised learning, finds optimal k value, implements k-means algorithm, and covers clustering process, highlighting the relevance and applications of k-means algorithm in various domains.', 'chapters': [{'end': 175.769, 'segs': [{'end': 115.792, 'src': 'embed', 'start': 79.147, 'weight': 0, 'content': [{'end': 82.148, 'text': 'This is one of the most popular techniques in machine learning.', 'start': 79.147, 'duration': 3.001}, {'end': 87.631, 'text': 'You see it all the time in Kaggle contests, in the machine learning subreddit, everywhere.', 'start': 82.188, 'duration': 5.443}, {'end': 91.633, 'text': "It's a very popular algorithm, and it's very easy, more or less.", 'start': 87.751, 'duration': 3.882}, {'end': 94.995, 'text': "I mean, more than other things that I've been talking about, so that's a good thing.", 'start': 91.693, 'duration': 3.302}, {'end': 97.757, 'text': "But let's talk about what we've learned so far.", 'start': 95.695, 'duration': 2.062}, {'end': 104.743, 'text': "What we've learned is that machine learning is all about optimizing for an objective, right? We are trying to optimize for an objective.", 'start': 97.837, 'duration': 6.906}, {'end': 106.745, 'text': "That's the goal of machine learning.", 'start': 104.783, 'duration': 1.962}, {'end': 110.608, 'text': "And we've learned about first order and second order optimization.", 'start': 107.205, 'duration': 3.403}, {'end': 111.829, 'text': "What's first order?", 'start': 110.988, 'duration': 0.841}, {'end': 114.671, 'text': 'Gradient descent and its variants, right?', 'start': 112.349, 'duration': 2.322}, {'end': 115.792, 'text': 'Where we are trying to?', 'start': 114.912, 'duration': 0.88}], 'summary': 'Machine learning optimization techniques: popular algorithm, easy to use; focus on objective optimization.', 'duration': 36.645, 'max_score': 79.147, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk79147.jpg'}, {'end': 188.697, 'src': 'embed', 'start': 156.338, 'weight': 1, 'content': [{'end': 161.76, 'text': "It's usually the predicted label minus or, sorry, the actual label minus the predicted label.", 'start': 156.338, 'duration': 5.422}, {'end': 162.781, 'text': "That's the error.", 'start': 162.08, 'duration': 0.701}, {'end': 166.402, 'text': 'Use it to compute the partial derivatives with respect to each weight value.', 'start': 163.161, 'duration': 3.241}, {'end': 173.445, 'text': "But if you don't have the label, how are you supposed to compute the error?
And that's where unsupervised learning comes into play.', 'start': 166.882, 'duration': 6.563}, {'end': 175.769, 'text': 'Specifically, k-means clustering.', 'start': 174.145, 'duration': 1.624}, {'end': 178.675, 'text': "So I've got this diagram here to show the differences here.", 'start': 176.13, 'duration': 2.545}, {'end': 188.697, 'text': 'So there are two outcomes that we could possibly want, right? Either a discrete outcome, that is, some contained outcome, like red or blue,', 'start': 179.912, 'duration': 8.785}], 'summary': 'Error computation and unsupervised learning with k-means clustering for discrete outcomes.', 'duration': 32.359, 'max_score': 156.338, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk156338.jpg'}], 'start': 0.069, 'title': 'Detecting intruders with k-means clustering', 'summary': 'Focuses on using k-means clustering to detect intruders in a security system and explains the concept of unsupervised learning in optimizing for an objective without labels.', 'chapters': [{'end': 175.769, 'start': 0.069, 'title': 'Detecting intruders with k-means clustering', 'summary': 'Discusses detecting intruders in a security system using k-means clustering, a popular machine learning technique, and explains the concept of unsupervised learning in optimizing for an objective without labels.', 'duration': 175.7, 'highlights': ['K-means clustering is a popular technique in machine learning, widely used in Kaggle contests and the machine learning community. K-means clustering is widely popular and commonly used in machine learning competitions and communities.', 'Unsupervised learning, specifically k-means clustering, is used when there are no labels available to compute the error in optimization for an objective. Unsupervised learning, such as k-means clustering, is essential when labels are unavailable to compute the error in optimization.', 'The chapter explains the concept of first order and second order optimization, covering gradient descent and its variants for minimizing error. The chapter delves into first order and second order optimization, emphasizing gradient descent and its variants for error minimization.']}], 'duration': 175.7, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk69.jpg', 'highlights': ['K-means clustering is widely popular and commonly used in machine learning competitions and communities.', 'Unsupervised learning, such as k-means clustering, is essential when labels are unavailable to compute the error in optimization.', 'The chapter delves into first order and second order optimization, emphasizing gradient descent and its variants for error minimization.']}, {'end': 645.894, 'segs': [{'end': 258.357, 'src': 'embed', 'start': 216.632, 'weight': 1, 'content': [{'end': 224.321, 'text': 'or between two point two and two point two five, and it could just go to infinity in that direction of that numerical interval.', 'start': 216.632, 'duration': 7.689}, {'end': 230.684, 'text': 'right?
So, with supervised learning, we learned how to predict a continuous outcome using linear regression.', 'start': 224.321, 'duration': 6.363}, {'end': 231.804, 'text': 'That was our first video.', 'start': 230.724, 'duration': 1.08}, {'end': 236.246, 'text': 'And then the next thing we learned was how to predict a discrete outcome.', 'start': 232.265, 'duration': 3.981}, {'end': 238.127, 'text': 'And we used logistic regression for that.', 'start': 236.326, 'duration': 1.801}, {'end': 239.667, 'text': "And that's when we have labels.", 'start': 238.507, 'duration': 1.16}, {'end': 245.65, 'text': "But if we don't have labels, then we use clustering to predict a discrete outcome.", 'start': 240.028, 'duration': 5.622}, {'end': 246.53, 'text': "That's what we're going to do.", 'start': 245.69, 'duration': 0.84}, {'end': 248.631, 'text': "We're going to predict a discrete outcome.", 'start': 246.55, 'duration': 2.081}, {'end': 258.357, 'text': "by defining these classes for people, these discrete classes, and then we're going to find the anomaly, that is, the intruder.", 'start': 250.294, 'duration': 8.063}], 'summary': 'Learned linear regression for continuous outcome, then logistic regression for discrete outcome. Also discussed clustering and anomaly detection.', 'duration': 41.725, 'max_score': 216.632, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk216632.jpg'}, {'end': 306.949, 'src': 'embed', 'start': 277.512, 'weight': 0, 'content': [{'end': 282.799, 'text': "I mean, you've got the label, so of course it's more accurate, right? It's like having training wheels on your bike.", 'start': 277.512, 'duration': 5.287}, {'end': 288.88, 'text': "But you have to have a human who labels this data or it's just labeled itself somehow.", 'start': 284.358, 'duration': 4.522}, {'end': 294.003, 'text': 'But unsupervised learning is more convenient because most data is unlabeled, right?', 'start': 289.361, 'duration': 4.642}, {'end': 298.765, 'text': "You don't just have this neatly labeled data like oh, this is this or this is.", 'start': 294.023, 'duration': 4.742}, {'end': 299.986, 'text': 'No, data is messy.', 'start': 298.765, 'duration': 1.221}, {'end': 301.006, 'text': 'The world is messy.', 'start': 300.026, 'duration': 0.98}, {'end': 301.847, 'text': 'Life is messy.', 'start': 301.066, 'duration': 0.781}, {'end': 306.949, 'text': "So that's what we want, ideally, to run our algorithms unsupervised.", 'start': 302.327, 'duration': 4.622}], 'summary': 'Unsupervised learning is ideal for messy, unlabeled data.', 'duration': 29.437, 'max_score': 277.512, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk277512.jpg'}, {'end': 424.805, 'src': 'embed', 'start': 399.635, 'weight': 3, 'content': [{'end': 405.48, 'text': "okay?
They're called centroids because eventually they're going to be the center of each cluster that we learn.", 'start': 399.635, 'duration': 5.845}, {'end': 407.822, 'text': 'So these centroid points,', 'start': 405.861, 'duration': 1.961}, {'end': 410.505, 'text': 'there are K of them and we just plot them randomly, okay?', 'start': 407.822, 'duration': 2.683}, {'end': 411.947, 'text': 'So we define k,', 'start': 410.885, 'duration': 1.062}, {'end': 416.493, 'text': 'we have our set of data points and then our set of centroids, k of them, and we just plot them randomly, okay?', 'start': 411.947, 'duration': 4.546}, {'end': 417.495, 'text': 'Now what?', 'start': 417.054, 'duration': 0.441}, {'end': 424.805, 'text': "Now, here are the steps we do, and we repeat them until convergence, which is what we predefined beforehand with a threshold value.", 'start': 418.035, 'duration': 6.77}], 'summary': 'Learning k-means clustering with k centroid points, repeating steps until convergence', 'duration': 25.17, 'max_score': 399.635, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk399635.jpg'}, {'end': 504.111, 'src': 'heatmap', 'start': 462.005, 'weight': 0.742, 'content': [{'end': 470.548, 'text': "So we're gonna assign that data point to the cluster J, where J is for the centroid that is closest to that data point.", 'start': 462.005, 'duration': 8.543}, {'end': 471.949, 'text': 'Okay, so then what happens is,', 'start': 470.548, 'duration': 1.401}, {'end': 475.886, 'text': "We've got a set of clusters now and we do this for every single data point.", 'start': 473.203, 'duration': 2.683}, {'end': 484.678, 'text': 'So every single data point will belong to a cluster and that cluster will be defined as the centroid point that is closest to that data point, okay?', 'start': 476.327, 'duration': 8.351}, {'end': 487.882, 'text': "So that's the initial cluster that's gonna be defined.", 'start': 485.018, 'duration': 2.864}, {'end': 493.687, 'text': "Then, for each of those clusters J, we're going to take all of those data points in that cluster.", 'start': 488.342, 'duration': 5.345}, {'end': 496.688, 'text': "We're going to add them all up and then divide by the number of them.", 'start': 493.707, 'duration': 2.981}, {'end': 499.649, 'text': "And what is this called? It's called the mean, right, or the average.", 'start': 496.748, 'duration': 2.901}, {'end': 504.111, 'text': "So now you're getting to see where this name comes from, right? K-means, right?
It all makes sense.", 'start': 500.289, 'duration': 3.822}], 'summary': 'K-means algorithm assigns data points to clusters based on centroids, finding mean for each cluster.', 'duration': 42.106, 'max_score': 462.005, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk462005.jpg'}, {'end': 562.18, 'src': 'embed', 'start': 539.202, 'weight': 4, 'content': [{'end': 547.529, 'text': "So we go back to, for each point x and for those new centroids, find the distance for all the closest data points, and it's gonna be a new cluster.", 'start': 539.202, 'duration': 8.327}, {'end': 548.77, 'text': "And so that's what you're seeing here.", 'start': 547.609, 'duration': 1.161}, {'end': 552.534, 'text': "It's gonna be a new cluster, and then we just keep repeating that process.", 'start': 549.111, 'duration': 3.423}, {'end': 557.477, 'text': "until none of the cluster assignments change, and then we're good.", 'start': 553.214, 'duration': 4.263}, {'end': 562.18, 'text': "So, right, so that's kind of how that works.", 'start': 559.118, 'duration': 3.062}], 'summary': 'Iteratively find closest data points to new centroids to create new clusters until cluster assignments stabilize.', 'duration': 22.978, 'max_score': 539.202, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk539202.jpg'}, {'end': 629.55, 'src': 'embed', 'start': 605.041, 'weight': 5, 'content': [{'end': 611.063, 'text': "what, I guess those are on my mind. Then we know that K should be three, because we have three countries that we're targeting.", 'start': 605.041, 'duration': 6.022}, {'end': 620.427, 'text': "But if we don't know how many classes we want, these are just unknown unknowns, then we will have to decide what the best K value is.", 'start': 611.063, 'duration': 9.364}, {'end': 623.388, 'text': 'And that can be a guess-and-check method.', 'start': 620.427, 'duration': 2.961}, {'end': 626.789, 'text': "but there's actually a smarter way to do that, and it's called the elbow method.", 'start': 623.388, 'duration': 3.401}, {'end': 629.55, 'text': 'So the elbow method is a very popular method.', 'start': 626.949, 'duration': 2.601}], 'summary': 'Determining k value for targeting three countries using the elbow method.', 'duration': 24.509, 'max_score': 605.041, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk605041.jpg'}], 'start': 176.13, 'title': 'Supervised vs unsupervised learning and k-means clustering algorithm', 'summary': 'Discusses the differences between supervised and unsupervised learning, emphasizing the use of linear regression for continuous outcomes and logistic regression for discrete outcomes. It also explains the k-means clustering algorithm, detailing the iterative process involving centroids, data point assignment, and the elbow method for determining the best k value.', 'chapters': [{'end': 322.463, 'start': 176.13, 'title': 'Supervised vs unsupervised learning', 'summary': 'Discusses the differences between supervised and unsupervised learning, highlighting the use of linear regression for continuous outcomes and logistic regression for discrete outcomes, as well as the advantages and drawbacks of both approaches.', 'duration': 146.333, 'highlights': ['Supervised learning is more accurate as it uses labeled data, while unsupervised learning is more convenient due to the prevalence of unlabeled data.
Supervised learning is more accurate due to the use of labeled data, resembling training wheels for a bike, while unsupervised learning is more convenient as most data is unlabeled, reducing human effort.', 'Linear regression is used for predicting continuous outcomes, while logistic regression is employed for predicting discrete outcomes. Linear regression is utilized for predicting continuous outcomes, while logistic regression is employed for predicting discrete outcomes.', 'Clustering is used for predicting discrete outcomes when labels are not available, and dimensionality reduction is performed for continuous outcomes. Clustering is utilized for predicting discrete outcomes in the absence of labels, while dimensionality reduction is performed for continuous outcomes.']}, {'end': 645.894, 'start': 322.483, 'title': 'K-means clustering algorithm', 'summary': 'Explains the k-means clustering algorithm, which involves randomly placing centroids, assigning data points to the nearest centroid, recalculating centroids, and iterating until cluster assignments no longer change, with the method to determine the best k value being the elbow method.', 'duration': 323.411, 'highlights': ['K-means clustering involves randomly placing centroids and assigning data points to the nearest centroid. The algorithm starts by randomly placing K centroids and then assigns each data point to the nearest centroid, which forms the initial clusters.', 'Iterating until cluster assignments no longer change. The process iterates until none of the cluster assignments change, signifying that the algorithm has found the final cluster formations.', 'Determining the best K value using the elbow method. The chapter introduces the elbow method as a smart approach to determine the best K value, by analyzing a graph resembling an elbow to find the optimal number of clusters.']}], 'duration': 469.764, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk176130.jpg', 'highlights': ['Supervised learning is more accurate as it uses labeled data, while unsupervised learning is more convenient due to the prevalence of unlabeled data.', 'Linear regression is used for predicting continuous outcomes, while logistic regression is employed for predicting discrete outcomes.', 'Clustering is used for predicting discrete outcomes when labels are not available, and dimensionality reduction is performed for continuous outcomes.', 'K-means clustering involves randomly placing centroids and assigning data points to the nearest centroid.', 'Iterating until cluster assignments no longer change.', 'Determining the best K value using the elbow method.']}, {'end': 1009.011, 'segs': [{'end': 758.599, 'src': 'embed', 'start': 727.661, 'weight': 0, 'content': [{'end': 730.302, 'text': 'so it would be six in the case of this graph.', 'start': 727.661, 'duration': 2.641}, {'end': 732.723, 'text': 'and that is our optimal K value.', 'start': 730.302, 'duration': 2.421}, {'end': 739.188, 'text': "And because after that there's very diminishing returns, as you can see, we want to find the minimal error value.", 'start': 732.723, 'duration': 6.465}, {'end': 745.452, 'text': "And we found that for this K value of six, the error is not at its smallest,", 'start': 739.188, 'duration': 6.264}, {'end': 751.555, 'text': "but it's at the point where everything after that is just diminishing returns, and we could say 10 or 12 or 14.", 'start': 745.452, 'duration': 6.103}, {'end': 754.917, 'text': 'But then,
for the sake of computational efficiency, we could just say six.', 'start': 751.555, 'duration': 3.362}, {'end': 758.599, 'text': "so we don't have to run that many iterations. So we'll just say six.", 'start': 754.917, 'duration': 3.682}], 'summary': 'The optimal k value for minimal error is 6, for computational efficiency.', 'duration': 30.938, 'max_score': 727.661, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk727661.jpg'}, {'end': 869.883, 'src': 'embed', 'start': 837.033, 'weight': 1, 'content': [{'end': 840.936, 'text': "And if you look at the definition of variance, which I'll define more in detail later,", 'start': 837.033, 'duration': 3.903}, {'end': 847.4, 'text': 'it is identical to the sum of the squared Euclidean distances from the center, so in Euclidean space.', 'start': 840.936, 'duration': 6.464}, {'end': 851.303, 'text': 'And because it is identical to the sum of Euclidean distances,', 'start': 847.8, 'duration': 3.503}, {'end': 856.486, 'text': 'we use the Euclidean distance as our distance metric, as opposed to something else like the Manhattan distance.', 'start': 851.303, 'duration': 5.183}, {'end': 860.31, 'text': 'Lastly, or two more, actually.', 'start': 858.828, 'duration': 1.482}, {'end': 862.193, 'text': 'When should you use this?', 'start': 860.31, 'duration': 1.883}, {'end': 869.883, 'text': 'If your data is numeric, and that means your features are numeric, right, you have numbers for your features. If you have a categorical feature,', 'start': 862.193, 'duration': 7.69}], 'summary': 'Variance defined as sum of squared Euclidean distances, used for numeric data.', 'duration': 32.85, 'max_score': 837.033, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk837033.jpg'}, {'end': 935.619, 'src': 'embed', 'start': 901.707, 'weight': 2, 'content': [{'end': 904.307, 'text': 'you know, quick and dirty clustering algorithm.', 'start': 901.707, 'duration': 2.6}, {'end': 909.81, 'text': "it's great for that, okay? And it really shines when you have multivariate data.", 'start': 904.307, 'duration': 5.503}, {'end': 911.752, 'text': 'So that is more than one dimension.', 'start': 910.41, 'duration': 1.342}, {'end': 915.317, 'text': "Okay, so lastly, two other examples I've got here.", 'start': 912.153, 'duration': 3.164}, {'end': 919.162, 'text': 'One for fraud detection and then one for MNIST without labels.', 'start': 915.437, 'duration': 3.725}, {'end': 923.647, 'text': "I know, what? That is possible? MNIST without labels? Yes, it's possible.", 'start': 919.262, 'duration': 4.385}, {'end': 924.709, 'text': 'Anything is possible.', 'start': 923.807, 'duration': 0.902}, {'end': 927.331, 'text': 'So anything is really possible here.', 'start': 925.149, 'duration': 2.182}, {'end': 935.619, 'text': "So for credit card fraud detection and for finding these labels, for these MNIST images, where let's just say we don't know the labels.", 'start': 927.371, 'duration': 8.248}], 'summary': 'Clustering algorithm excels with multivariate data.
Used for fraud detection and MNIST without labels.', 'duration': 33.912, 'max_score': 901.707, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk901707.jpg'}], 'start': 645.894, 'title': 'Finding optimal k value for k-means clustering', 'summary': "Discusses the process of finding the optimal k value for k-means clustering, performing k-means for k values between 1 and 10, computing the sum of squared errors for each iteration, and selecting the k value at the 'elbow' of the graph, with six being the optimal k value. It also introduces k-means clustering, explains the computation of Euclidean distance for data points, its relevance in minimizing within-cluster variance and its application to multi-dimensional datasets, and highlights the suitability of the k-means algorithm for numeric data, simplicity, speed, and its applications in fraud detection and clustering MNIST images without labels.", 'chapters': [{'end': 751.555, 'start': 645.894, 'title': 'Finding optimal k value for k-means clustering', 'summary': "Explains the process of finding the optimal k value for k-means clustering, by performing k-means for k values between 1 and 10, computing the sum of squared errors for each iteration, and selecting the k value at the 'elbow' of the graph, with six being the optimal k value.", 'duration': 105.661, 'highlights': ['Performing K-means for k values between 1 and 10 to find the optimal cluster count.', "Computing the sum of squared errors for each iteration of K-means to create an 'elbow-like' graph.", "Selecting the K value at the 'elbow' of the graph, with six being the optimal K value."]}, {'end': 860.31, 'start': 751.555, 'title': 'K-means clustering & Euclidean distance', 'summary': 'Introduces the concept of k-means clustering and explains the computation of Euclidean distance for data points, highlighting its relevance in minimizing within-cluster variance and its application to multi-dimensional datasets.', 'duration': 108.755, 'highlights': ['The Euclidean distance formula is explained, involving the calculation of the square root of the sum of squared differences between data points in multi-dimensional datasets.', 'K-means clustering minimizes within-cluster variance, which is identical to the sum of squared Euclidean distances from the center in Euclidean space.', 'The rationale for using Euclidean distance as the distance metric in K-means clustering is highlighted, emphasizing its relevance in minimizing within-cluster variance.']}, {'end': 1009.011, 'start': 860.31, 'title': 'K-means clustering and its applications', 'summary': 'Introduces the k-means algorithm, highlighting its suitability for numeric data, simplicity, and speed, along with applications in fraud detection and clustering MNIST images without labels.', 'duration': 148.701, 'highlights': ['K-means is suitable for numeric data and categorical features cannot be mapped in Euclidean space, making it ideal for clustering numeric data (e.g., packets sent per second and packet size) for anomaly detection such as DDoS attacks.', 'The algorithm is simple and fast, making it a quick and effective clustering approach, particularly advantageous for multivariate data, and shines in cases like credit card fraud detection and clustering MNIST images without labels.', 'It is advantageous for fraud detection and clustering MNIST images without labels, showcasing its versatility and potential in real-world applications.']}], 'duration': 363.117, 'thumbnail':
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk645894.jpg', 'highlights': ["Selecting the K value at the 'elbow' of the graph, with six being the optimal K value.", 'K-means clustering minimizes within-cluster variance, which is identical to the sum of squared Euclidean distances from the center in Euclidean space.', 'The algorithm is simple and fast, making it a quick and effective clustering approach, particularly advantageous for multivariate data, and shines in cases like credit card fraud detection and clustering MNIST images without labels.']}, {'end': 1282.261, 'segs': [{'end': 1052.241, 'src': 'embed', 'start': 1009.551, 'weight': 1, 'content': [{'end': 1013.433, 'text': "Okay, and those are our data points, and we have a few of these data points.", 'start': 1009.551, 'duration': 3.882}, {'end': 1015.053, 'text': "Okay, so that's it.", 'start': 1013.433, 'duration': 1.62}, {'end': 1019.958, 'text': "Just two features for our data, and we'll talk about, like I said, dimensionality reduction later on.", 'start': 1015.053, 'duration': 4.905}, {'end': 1024.08, 'text': 'If we have a million features, how do we reduce it to two or three so that we can visualize it?', 'start': 1019.958, 'duration': 4.122}, {'end': 1027.743, 'text': "So that's our data set that we want to load.", 'start': 1024.742, 'duration': 3.001}, {'end': 1029.465, 'text': "Okay, so then here's what we've got here.", 'start': 1027.743, 'duration': 1.722}, {'end': 1030.807, 'text': 'This is the Euclidean distance.', 'start': 1029.486, 'duration': 1.321}, {'end': 1033.429, 'text': 'Now, this is the formula that I was talking about.', 'start': 1030.807, 'duration': 2.622}, {'end': 1037.973, 'text': 'So, given two data points, P and Q, take each of the values.', 'start': 1033.429, 'duration': 4.544}, {'end': 1041.454, 'text': 'So if it has, you know, Q, it could be X, it could be Y, Z.', 'start': 1037.973, 'duration': 3.481}, {'end': 1050.06, 'text': 'Take each of the values, subtract them to get the difference, square it, and then add them all together, and then square root that whole equation.', 'start': 1041.454, 'duration': 8.606}, {'end': 1052.241, 'text': "and so that's what the sigma notation denotes.", 'start': 1050.06, 'duration': 2.181}], 'summary': 'The data set has few data points, with a focus on dimensionality reduction for visualization.
The Euclidean distance formula calculates differences between data points.', 'duration': 42.69, 'max_score': 1009.551, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk1009551.jpg'}, {'end': 1145.868, 'src': 'embed', 'start': 1121.618, 'weight': 0, 'content': [{'end': 1127.541, 'text': "So what we're gonna first do is store the past centroids in this history of centroids list.", 'start': 1121.618, 'duration': 5.923}, {'end': 1130.022, 'text': 'Now this is not really a part of the algorithm.', 'start': 1127.861, 'duration': 2.161}, {'end': 1136.786, 'text': 'This is just for us, so we can then graph how the centroids move over time later on so that we can visualize it, okay?', 'start': 1130.042, 'duration': 6.744}, {'end': 1140.727, 'text': "So then we're going to say OK, the distance metric is going to be Euclidean.", 'start': 1137.626, 'duration': 3.101}, {'end': 1144.308, 'text': 'So we define it right here as distmethod, that variable.', 'start': 1140.747, 'duration': 3.561}, {'end': 1145.868, 'text': 'And then we set the data set.', 'start': 1144.808, 'duration': 1.06}], 'summary': 'Storing past centroids for visualization and using Euclidean distance metric.', 'duration': 24.25, 'max_score': 1121.618, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk1121618.jpg'}, {'end': 1250.11, 'src': 'embed', 'start': 1226.393, 'weight': 4, 'content': [{'end': 1233.178, 'text': "So we're going to take those centroids that we just defined randomly and set them to our history centroids list.", 'start': 1226.393, 'duration': 6.785}, {'end': 1235.64, 'text': 'So then we can just keep a copy of it over time.', 'start': 1233.458, 'duration': 2.182}, {'end': 1243.126, 'text': "And we'll keep adding our centroids that are calculated to this history centroids list so we can graph it later for our own visualization.", 'start': 1235.68, 'duration': 7.446}, {'end': 1247.148, 'text': 'Okay, and then we have our prototypes old list.', 'start': 1243.786, 'duration': 3.362}, {'end': 1249.549, 'text': "It's just gonna be initialized as a bunch of zeros.", 'start': 1247.408, 'duration': 2.141}, {'end': 1250.11, 'text': "It's empty.", 'start': 1249.589, 'duration': 0.521}], 'summary': 'Centroids are defined randomly and added to history centroids list for visualization.', 'duration': 23.717, 'max_score': 1226.393, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk1226393.jpg'}, {'end': 1292.848, 'src': 'embed', 'start': 1267.738, 'weight': 3, 'content': [{'end': 1273.559, 'text': 'The histories, or the history centroids, is just for us to see how it changes over time, okay?', 'start': 1267.738, 'duration': 5.821}, {'end': 1280.44, 'text': 'And then we have one more list, and that is the belongs to list to store the clusters over time, the clusters themselves,', 'start': 1274.039, 'duration': 6.401}, {'end': 1282.261, 'text': 'all the data points contained in a cluster.', 'start': 1280.44, 'duration': 1.821}, {'end': 1289.966, 'text': 'Okay, and then we have our distance method, which is going to take the current prototypes and then the prototypes old and the distance between them,', 'start': 1283.081, 'duration': 6.885}, {'end': 1292.848, 'text': "and it's going to store that in the norm variable.", 'start': 1289.966, 'duration': 2.882}], 'summary': 'Developing a model to track changes in centroids and clusters over time.', 'duration': 25.11, 'max_score':
1267.738, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk1267738.jpg'}], 'start': 1009.551, 'title': 'K-means algorithm implementation', 'summary': 'Covers the implementation of the k-means algorithm with a focus on storing past centroids, defining distance metric, initializing k centroids, and maintaining history of centroids and clusters, providing practical insights on algorithm implementation.', 'chapters': [{'end': 1120.978, 'start': 1009.551, 'title': 'K-means algorithm and euclidean distance', 'summary': 'Covers the k-means algorithm with a k-value of 2, epsilon threshold value, and the computation of euclidean distance for dimensionality reduction, with a focus on the formula and sigma notation for calculating the distance.', 'duration': 111.427, 'highlights': ['The chapter covers the k-means algorithm with a k-value of 2, epsilon threshold value, and the computation of Euclidean distance for dimensionality reduction, with a focus on the formula and sigma notation for calculating the distance.', 'Explanation of the Euclidean distance formula for calculating the distance between two data points by taking the difference of each value, squaring them, summing the squared values and finding the square root of the sum.', 'Definition of hyperparameters for the k-means algorithm including k-value for the number of clusters, epsilon value as the threshold for minimum error, and the selection of Euclidean distance for computation.', 'Introduction of dimensionality reduction concept for visualizing data by reducing a million features to two or three using Euclidean distance.', 'Description of the sigma notation representing the sum of squared errors for a set of values, denoting the process of finding the difference between feature values in a data point squared and then finding the square root of the sum of squared errors.']}, {'end': 1282.261, 'start': 1121.618, 'title': 'K-means algorithm implementation', 'summary': 'Outlines the implementation of the k-means algorithm, including storing past centroids, defining distance metric, setting the dataset, initializing k centroids, and maintaining history of centroids and clusters over time.', 'duration': 160.643, 'highlights': ['The chapter discusses the process of storing past centroids in a history of centroids list for later visualization.', 'It explains the definition of the distance metric as Euclidean and the loading of the dataset with two dimensions: amount of packets sent and the size of each packet.', 'The implementation involves defining k centroids, randomly plotting them on the graph, and setting the number of clusters to be found randomly.', 'It details the initialization of the prototypes old list to keep track of centroids every iteration and the use of history centroids to visualize changes over time.']}], 'duration': 272.71, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk1009551.jpg', 'highlights': ['Covers the implementation of the k-means algorithm with a focus on storing past centroids, defining distance metric, initializing k centroids, and maintaining history of centroids and clusters.', 'Explanation of the Euclidean distance formula for calculating the distance between two data points by taking the difference of each value, squaring them, summing the squared values and finding the square root of the sum.', 'Introduction of dimensionality reduction concept for visualizing data by reducing a million features 
to two or three using Euclidean distance.', 'The chapter discusses the process of storing past centroids in a history of centroids list for later visualization.', 'It details the initialization of the prototypes old list to keep track of centroids every iteration and the use of history centroids to visualize changes over time.']}, {'end': 1520.465, 'segs': [{'end': 1310.817, 'src': 'embed', 'start': 1283.081, 'weight': 2, 'content': [{'end': 1289.966, 'text': 'Okay, and then we have our distance method, which is going to take the current prototypes and then the prototypes old and the distance between them,', 'start': 1283.081, 'duration': 6.885}, {'end': 1292.848, 'text': "and it's going to store that in the norm variable.", 'start': 1289.966, 'duration': 2.882}, {'end': 1295.95, 'text': 'Okay, and then the number of iterations, which will start off as zero.', 'start': 1293.148, 'duration': 2.802}, {'end': 1302.255, 'text': "Okay, so now let's go ahead and write out this algorithm, shall we?", 'start': 1297.131, 'duration': 5.124}, {'end': 1310.817, 'text': "So we're gonna say okay, so, while the norm, the distance, is greater than epsilon, where epsilon is zero in our case, we're gonna say well,", 'start': 1303.191, 'duration': 7.626}], 'summary': 'Distance method calculates norm variable, iterations start at zero.', 'duration': 27.736, 'max_score': 1283.081, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk1283081.jpg'}, {'end': 1439.797, 'src': 'embed', 'start': 1410.438, 'weight': 0, 'content': [{'end': 1413.26, 'text': "We're going to say we want to compute the distance between.", 'start': 1410.438, 'duration': 2.822}, {'end': 1413.88, 'text': 'What do we have to do?', 'start': 1413.26, 'duration': 0.62}, {'end': 1420.805, 'text': 'We have to compute the distance between each data point and each centroid.', 'start': 1414.18, 'duration': 6.625}, {'end': 1424.987, 'text': 'So for every data point, we want to compute the distance between it and every other centroid.', 'start': 1420.825, 'duration': 4.162}, {'end': 1432.332, 'text': 'And we want to find the minimum distance centroid, that closest centroid to it, so we can put it into that specific cluster.', 'start': 1425.708, 'duration': 6.624}, {'end': 1439.797, 'text': "So it's like, compute the distance between x, where x is a data point, and each centroid.", 'start': 1432.932, 'duration': 6.865}], 'summary': 'Compute distance between data points and centroids to assign to clusters.', 'duration': 29.359, 'max_score': 1410.438, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk1410438.jpg'}], 'start': 1283.081, 'title': 'K-means and clustering algorithms', 'summary': 'Covers the k-means training algorithm, involving distance computation, iteration, and distance vector, as well as the k-means clustering algorithm, including distance computation, data point assignment, and use of Euclidean distance.', 'chapters': [{'end': 1366.555, 'start': 1283.081, 'title': 'K-means training algorithm', 'summary': 'Outlines the k-means training algorithm, which involves computing the distance between current and old prototypes, iterating until distance is less than epsilon, and defining a distance vector of size k for each instance in the data set.', 'duration': 83.474, 'highlights': ['The algorithm involves computing the distance between current prototypes and the prototypes old, storing it in the norm variable, and iterating
until the distance is less than epsilon. Computing distance between current and old prototypes, iterating until distance is less than epsilon', 'The number of iterations starts at zero, and the algorithm iterates through every instance in the data set to define a distance vector of size k. Number of iterations starting at zero, iterating through every instance in the data set', 'The distance vector is initialized as a set of zero values of size k. Initializing distance vector as a set of zero values of size k']}, {'end': 1520.465, 'start': 1366.896, 'title': 'K-means clustering algorithm', 'summary': 'Explains the process of computing distances between data points and centroids, assigning data points to clusters based on the minimum distance, and using the euclidean distance method in the k-means clustering algorithm.', 'duration': 153.569, 'highlights': ['The chapter explains the process of computing distances between data points and centroids, assigning data points to clusters based on the minimum distance, and using the Euclidean distance method.', 'The algorithm involves nested for loops to compute the distance between each data point and every centroid, and then finding the closest centroid to assign the data point to a specific cluster.', 'It describes the step of finding the minimum distance for each data point and assigning it to a cluster based on the index instance, using the Euclidean distance defined in its own function.']}], 'duration': 237.384, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk1283081.jpg', 'highlights': ['The algorithm involves nested for loops to compute the distance between each data point and every centroid, and then finding the closest centroid to assign the data point to a specific cluster.', 'The chapter explains the process of computing distances between data points and centroids, assigning data points to clusters based on the minimum distance, and using the Euclidean distance method.', 'The algorithm involves computing the distance between current prototypes and the prototypes old, storing it in the norm variable, and iterating until the distance is less than epsilon.']}, {'end': 1855.03, 'segs': [{'end': 1630.49, 'src': 'embed', 'start': 1603.28, 'weight': 3, 'content': [{'end': 1608.942, 'text': 'For k clusters, compute the mean of the data points in that cluster using np.mean.', 'start': 1603.28, 'duration': 5.662}, {'end': 1610.422, 'text': 'So that is the average.', 'start': 1609.482, 'duration': 0.94}, {'end': 1617.664, 'text': 'We add them all up and then divide by the number of them for all of those instances and store that mean in prototype.', 'start': 1610.462, 'duration': 7.202}, {'end': 1620.465, 'text': "And then that's going to be our new centroid.", 'start': 1618.004, 'duration': 2.461}, {'end': 1624.386, 'text': "So we'll add our new centroid to our temporary prototypes list.", 'start': 1620.765, 'duration': 3.621}, {'end': 1630.49, 'text': 'And we did that so we could then take that temporary prototype and assign it to our main prototype list.', 'start': 1624.986, 'duration': 5.504}], 'summary': 'Compute mean of data points for k clusters using np.mean, store in prototype as new centroid.', 'duration': 27.21, 'max_score': 1603.28, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk1603280.jpg'}, {'end': 1672.472, 'src': 'embed', 'start': 1645.601, 'weight': 2, 'content': [{'end': 1649.184, 'text': 'just for us to visualize how 
the centroids move over time later on.', 'start': 1645.601, 'duration': 3.583}, {'end': 1650.625, 'text': "And you'll see exactly what I mean by that.", 'start': 1649.224, 'duration': 1.401}, {'end': 1657.587, 'text': 'At the end of this we return our calculated centroids, and that is at the end of our algorithm, when we have reached convergence,', 'start': 1651.045, 'duration': 6.542}, {'end': 1663.329, 'text': "and then our history of centroids, all those centroids that we've calculated over time and then belongs to,", 'start': 1657.587, 'duration': 5.742}, {'end': 1672.472, 'text': 'which is all the clusters where we cluster all those data points and then the index is going to be the cluster that it belongs to, that K cluster.', 'start': 1663.329, 'duration': 9.143}], 'summary': 'Visualize centroid movement over time, return calculated centroids and cluster data points.', 'duration': 26.871, 'max_score': 1645.601, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk1645601.jpg'}, {'end': 1791.397, 'src': 'embed', 'start': 1759.471, 'weight': 1, 'content': [{'end': 1764.553, 'text': "We've written a method to plot out our results and now we can execute these methods.", 'start': 1759.471, 'duration': 5.082}, {'end': 1769.775, 'text': "so we're going to load our data set, train the model on that data, where k equals two.", 'start': 1764.553, 'duration': 5.222}, {'end': 1778.499, 'text': "so we want two clusters, and then it's going to give us back our computed centroids, the history of all the centroids that were computed over time,", 'start': 1769.775, 'duration': 8.724}, {'end': 1784.002, 'text': 'and then our list of data points defined by the cluster that each belongs to.', 'start': 1778.499, 'duration': 5.503}, {'end': 1791.397, 'text': "And so we could take those three values and then use them as parameters for our plotting function, and that's going to plot our result.", 'start': 1785.213, 'duration': 6.184}], 'summary': 'Method plots results, executing with k=2 clusters, providing centroids and data points for plotting.', 'duration': 31.926, 'max_score': 1759.471, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk1759471.jpg'}, {'end': 1830.446, 'src': 'embed', 'start': 1805.726, 'weight': 0, 'content': [{'end': 1814.954, 'text': 'But what happened was the algorithm learned that the right clusters were red and green for these respective clusters.', 'start': 1805.726, 'duration': 9.228}, {'end': 1818.557, 'text': 'And the blue points are the center points.', 'start': 1815.674, 'duration': 2.883}, {'end': 1820.038, 'text': 'And so those are the centroids.', 'start': 1818.917, 'duration': 1.121}, {'end': 1826.243, 'text': "Okay, and so over time what happens is, look, so here's what I mean by over time.", 'start': 1821.519, 'duration': 4.724}, {'end': 1830.446, 'text': 'Over time what happens is it learns the ideal center points for the centroids.', 'start': 1826.603, 'duration': 3.843}], 'summary': 'Algorithm learned ideal center points for centroids over time.', 'duration': 24.72, 'max_score': 1805.726, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk1805726.jpg'}], 'start': 1520.465, 'title': 'K-means clustering process', 'summary': 'Explains the process of k-means clustering, involving the computation of centroids for each cluster, visualization of centroid movement over time, and the return of calculated centroids and 
history of centroids.', 'chapters': [{'end': 1663.329, 'start': 1520.465, 'title': 'K-means clustering process', 'summary': 'Explains the process of k-means clustering, involving the computation of centroids for each cluster, and the visualization of centroid movement over time, concluding with the return of calculated centroids and the history of centroids.', 'duration': 142.864, 'highlights': ['We compute the mean of the data points in each cluster using np.mean to find the new centroids, and then store the new centroid in the main prototypes list.', 'Temporary prototype variable acts as a buffer for computing new centroids, which are then added to the main prototypes variable.', 'History centroids list is used to visualize the movement of centroids over time, aiding in understanding the convergence of the algorithm.']}, {'end': 1855.03, 'start': 1663.329, 'title': 'K-means algorithm summary', 'summary': 'Explains the k-means algorithm for clustering data points into two clusters using red and green colors for centroids, and plotting the history of centroids over time, resulting in the algorithm learning the ideal center points iteratively.', 'duration': 191.701, 'highlights': ['The algorithm clusters data points into two clusters using red and green colors for centroids, and plots the history of centroids over time, resulting in the algorithm learning the ideal center points iteratively.', 'Executing the K-means algorithm with k=2 yields the computed centroids, history of centroids computed over time, and the list of data points defined by their respective clusters.', 'The algorithm iteratively learns the ideal center points for the clusters over time, finding the most optimal center points and plotting them, resulting in the clusters being identified as red and green.']}], 'duration': 334.565, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/9991JlKnFmk/pics/9991JlKnFmk1520465.jpg', 'highlights': ['The algorithm iteratively learns the ideal center points for the clusters over time, finding the most optimal center points and plotting them, resulting in the clusters being identified as red and green.', 'Executing the K-means algorithm with k=2 yields the computed centroids, history of centroids computed over time, and the list of data points defined by their respective clusters.', 'History centroids list is used to visualize the movement of centroids over time, aiding in understanding the convergence of the algorithm.', 'We compute the mean of the data points in each cluster using np.mean to find the new centroids, and then store the new centroid in the main prototypes list.']}], 'highlights': ['K-means clustering is widely popular and commonly used in machine learning competitions and communities.', 'Unsupervised learning, such as k-means clustering, is essential when labels are unavailable to compute the error in optimization.', 'The algorithm is simple and fast, making it a quick and effective clustering approach, particularly advantageous for multivariate data, and shines in cases like credit card fraud detection and clustering MNIST images without labels.', 'Covers the implementation of the k-means algorithm with a focus on storing past centroids, defining distance metric, initializing k centroids, and maintaining history of centroids and clusters.', 'The algorithm involves nested for loops to compute the distance between each data point and every centroid, and then finding the closest centroid to assign the data point to a specific cluster.', 'The 
algorithm iteratively learns the ideal center points for the clusters over time, finding the most optimal center points and plotting them, resulting in the clusters being identified as red and green.']}
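To make the math in the transcript concrete: the video describes the Euclidean distance in words (take the difference of each coordinate, square it, sum the squares per the sigma notation, and square root the total). Below is a minimal Python sketch of that formula; the function name euclidean_distance and the use of NumPy are my assumptions for illustration, not the video's actual code, which lives in the GitHub repo linked in the description.

```python
import numpy as np

# A minimal sketch of the distance formula described in the transcript:
# for two points p and q, subtract coordinate-wise, square the differences,
# sum them up (the sigma notation), and take the square root of the total.
def euclidean_distance(p, q):
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sqrt(np.sum((p - q) ** 2))

print(euclidean_distance([1, 2], [4, 6]))  # 5.0
```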
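The training loop the transcript walks through (random centroids, nearest-centroid assignment, np.mean update, repeat until the centroids stop moving by more than epsilon) could look roughly like the sketch below, reusing the euclidean_distance helper above. The names prototypes, history_centroids, belongs_to, and norm follow the transcript; initializing centroids from sampled data points and capping the number of iterations are assumptions on my part, and the author's real implementation is in the linked repo.

```python
import numpy as np

def kmeans(data, k=2, epsilon=0.0, max_iters=100):
    # Start the k centroids at randomly chosen data points (one common choice;
    # the video describes plotting k random points).
    prototypes = data[np.random.choice(len(data), k, replace=False)].astype(float)
    history_centroids = [prototypes.copy()]      # to graph centroid movement over time
    belongs_to = np.zeros(len(data), dtype=int)  # cluster index for each data point

    for _ in range(max_iters):
        prototypes_old = prototypes.copy()
        # Assignment step: each point joins the cluster of its closest centroid.
        for i, x in enumerate(data):
            distances = [euclidean_distance(x, c) for c in prototypes]
            belongs_to[i] = int(np.argmin(distances))
        # Update step: each centroid becomes the mean (np.mean) of its cluster.
        for j in range(k):
            members = data[belongs_to == j]
            if len(members) > 0:
                prototypes[j] = np.mean(members, axis=0)
        history_centroids.append(prototypes.copy())
        # Convergence check: stop once the centroids moved no more than epsilon.
        norm = euclidean_distance(prototypes, prototypes_old)
        if norm <= epsilon:
            break
    return prototypes, history_centroids, belongs_to
```

Called the way the transcript describes: centroids, history_centroids, belongs_to = kmeans(dataset, k=2), after which history_centroids can be plotted to watch the center points settle into the red and green clusters.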
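For the elbow method discussed above, a sketch along these lines would reproduce the graph: run k-means for k values between 1 and 10, record the within-cluster sum of squared errors for each k, and read the best k off the bend of the curve (six in the video's example). The toy dataset and the matplotlib calls are assumptions for illustration; this reuses the kmeans and euclidean_distance sketches above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy 2-D data standing in for the video's (packets per second, packet size) set.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])

ks = range(1, 11)
errors = []
for k in ks:
    centroids, _, belongs_to = kmeans(data, k=k)
    # Within-cluster sum of squared Euclidean distances for this k.
    sse = sum(euclidean_distance(x, centroids[j]) ** 2
              for x, j in zip(data, belongs_to))
    errors.append(sse)

plt.plot(list(ks), errors, marker='o')
plt.xlabel('k (number of clusters)')
plt.ylabel('sum of squared errors')
plt.show()  # pick the k at the bend ("elbow") of the curve
```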
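Finally, the video's framing is that the intruder shows up as an anomaly once normal users have been clustered. The transcript stops at the clustering itself, so the cutoff rule below (flagging points unusually far from their assigned centroid, here mean plus three standard deviations) is only one plausible way to finish the job, not the video's method.

```python
# Re-fit with the chosen k (the video's intruder demo uses k=2).
centroids, _, belongs_to = kmeans(data, k=2)

# Hypothetical intruder check: score each point by its distance to its
# assigned centroid and flag the unusually distant ones.
scores = np.array([euclidean_distance(x, centroids[j])
                   for x, j in zip(data, belongs_to)])
threshold = scores.mean() + 3 * scores.std()  # assumed cutoff, not from the video
intruders = data[scores > threshold]
print(f"flagged {len(intruders)} possible intruder(s)")
```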