k-Nearest Neighbour

This lecture covers instance-based learning, feature selection, and the k-nearest neighbour algorithm in supervised learning, addressing the problem of feature dimensionality. It explores the training and prediction phases of the k-nearest neighbour algorithm for classification and regression problems, and examines the impact of the value of k on decision boundaries and model performance.

Module 3: Instance-Based Learning and Feature Selection

Good morning. Today we will start module 3. In this module we will talk about instance-based learning, and then we will talk about feature selection. We will show how feature dimensionality is a problem in instance-based learning, how having many features is a problem for many other learning algorithms as well, and we will look at methods for feature selection.

In the first part of today's lecture we will talk about instance-based learning, and specifically the k-nearest neighbour algorithm. In supervised learning you are given a set of training examples (x_i, y_i), which you can also write as (x_i, f(x_i)). Given these examples, you want to come up with an estimate of the function f. In the previous module we saw how to learn a linear function f by linear regression, or a decision tree that estimates f.

Instance-Based Learning

Today we will look at another setting called instance-based learning, which is also called lazy learning; the algorithm we will discuss is a lazy algorithm, and we will explain what that means. In instance-based learning, when we get the training examples we do not process them and learn a model. Instead, we just store the examples, and only when we need to classify an instance do we do something with them. Because we do not immediately learn a model, the algorithm is called lazy.
That is, the algorithm does not come up with a model a priori.

The One-Nearest-Neighbour and k-Nearest-Neighbour Algorithms

Given a test instance x_t, we find the stored training example (x_1, y_1) that is closest to x_t, and we predict y_1 as the output y_t. This is the basic one-nearest-neighbour algorithm. In a more generalized form of the algorithm, instead of finding the single example closest to the test example, we find the k training examples (x_1, y_1), (x_2, y_2), ..., (x_k, y_k) that are closest to x_t, where k may be 3, 4, 5, and so on. We find the k nearest training examples and predict the output y_t from them. What should we predict?
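The one-nearest-neighbour rule can be sketched in a few lines. This is a minimal illustration rather than code from the lecture: it assumes instances are numeric feature vectors, uses the plain Euclidean distance, and the function names are my own.

```python
import math

def euclidean(a, b):
    # Straight-line distance between two numeric feature vectors.
    return math.sqrt(sum((am - bm) ** 2 for am, bm in zip(a, b)))

def predict_1nn(train, x_t):
    # train is the stored list of (x, y) examples; no model is learned.
    # Find the single stored example closest to x_t and return its label.
    x_1, y_1 = min(train, key=lambda xy: euclidean(xy[0], x_t))
    return y_1
```

For example, with train = [([0.0], "a"), ([5.0], "b")], predicting for [1.0] returns "a", the label of the closer stored example.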
It depends on whether we are doing a classification problem or a regression problem. For a classification problem, we look at y_1, y_2, ..., y_k and predict the class that is the majority class among them. For a regression problem, we predict the average of the values y_1, y_2, ..., y_k.

So the algorithm has two phases: a training phase, in which the training examples are simply saved, and a prediction phase, in which the k training examples closest to the test instance are retrieved from the stored instances in memory and used to make the prediction.
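The two prediction rules just described — majority class for classification, average for regression — can be sketched as follows. This is an illustrative stand-alone sketch, assuming numeric feature vectors and Euclidean distance; the names are my own.

```python
import math
from collections import Counter
from heapq import nsmallest

def euclidean(a, b):
    return math.sqrt(sum((am - bm) ** 2 for am, bm in zip(a, b)))

def k_nearest(train, x_t, k):
    # The k stored (x, y) pairs closest to x_t.
    return nsmallest(k, train, key=lambda xy: euclidean(xy[0], x_t))

def predict_classification(train, x_t, k):
    # Predict the majority class among y_1, ..., y_k.
    votes = Counter(y for _, y in k_nearest(train, x_t, k))
    return votes.most_common(1)[0][0]

def predict_regression(train, x_t, k):
    # Predict the average of y_1, ..., y_k.
    ys = [y for _, y in k_nearest(train, x_t, k)]
    return sum(ys) / len(ys)
```

Using heapq.nsmallest to select the k closest examples avoids fully sorting the training set at prediction time.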
To measure how close two instances are, different distance functions or metrics can be used, such as the Euclidean distance or the cosine distance, depending on the type of data.

Decision Boundaries and the Voronoi Diagram

Under one nearest neighbour, the classification of a point in the space comes from the most nearby training point. Therefore, we can divide the space into regions: each region corresponds to those places in the space for which a particular training example is the nearest neighbour. This division of the space is called a Voronoi diagram. You can construct it as follows: take any two neighbouring points and find the perpendicular bisector of the line joining the two points; that will give you a separation surface. Drawn this way, the Voronoi diagram captures the decision boundary of one-nearest-neighbour classification.

Issues with the Basic Algorithm

This is the basic k-nearest-neighbour classification, and we will now discuss certain points related to this algorithm and how to improve it. The first issue is that when we use k nearest neighbours with k greater than 1, there is the possibility of giving different weights to the k neighbours; we can weight the examples, for instance by distance. The second issue is how to measure closeness, that is, what type of distance function to use. Today we use the Euclidean distance function as one metric, but other distance functions are possible. The third issue, which we will discuss later, is how to find the closest points quickly at run time. As we have seen, in such lazy algorithms we do not learn a model during training; at prediction time we get a point and must find the nearest instances. If we do not use a good data structure or a good method to store the examples, we have to go through all the training examples, compute the distances, and find the smallest of them, and if the training set is large this may take considerable time.
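The lecture names Euclidean and cosine as two possible metrics. Here is a sketch of both, assuming instances are dense numeric vectors; cosine distance is taken as one minus the cosine similarity, which is a common convention rather than something the lecture fixes.

```python
import math

def euclidean(a, b):
    # Square root of the sum of squared per-attribute differences.
    return math.sqrt(sum((am - bm) ** 2 for am, bm in zip(a, b)))

def cosine_distance(a, b):
    # 1 - cosine similarity (a common convention, assumed here).
    dot = sum(am * bm for am, bm in zip(a, b))
    na = math.sqrt(sum(am * am for am in a))
    nb = math.sqrt(sum(bm * bm for bm in b))
    return 1.0 - dot / (na * nb)
```

Cosine distance ignores vector magnitude: [1, 0] and [10, 0] point the same way, so their cosine distance is 0 even though their Euclidean distance is 9. This is why the appropriate metric depends on the type of data.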
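One way to realize the idea of weighting the k neighbours by distance is to weight each neighbour's vote by the inverse of its distance. This is a sketch under that assumption; the 1/(d + eps) weighting is a common choice, not the lecture's prescription.

```python
import math
from heapq import nsmallest

def euclidean(a, b):
    return math.sqrt(sum((am - bm) ** 2 for am, bm in zip(a, b)))

def predict_weighted(train, x_t, k, eps=1e-9):
    # Weight each of the k nearest neighbours by 1 / (distance + eps),
    # so closer examples count for more in the vote.
    neighbours = nsmallest(k, train, key=lambda xy: euclidean(xy[0], x_t))
    scores = {}
    for x, y in neighbours:
        scores[y] = scores.get(y, 0.0) + 1.0 / (euclidean(x, x_t) + eps)
    return max(scores, key=scores.get)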
Choosing k and Scaling Attributes

Under such circumstances there is a case for using k larger than 1, and we will see that when we use a larger value of k we get a better, smoother classifier.

The second thing we need to worry about is the Euclidean distance itself. We take the square root of the sum, over the attributes, of the squared differences: d(x, y) = sqrt( sum_m (x_m - y_m)^2 ). We use m as the index variable here, because k is already used for the number of nearest neighbours. Now, all attributes are not equally important, and the attributes may have different scales. The normal Euclidean distance treats all the attributes at the same scale, but there may be a case for giving different weights to the attributes. You can give equal weights to all attributes only under certain assumptions: the scales of the attributes must be similar. What do we mean by scale? Suppose height is an attribute. For equal weights to make sense, the ranges of the attributes should be similar and their variances should be similar; only under such conditions can you go for the simple Euclidean distance function. Also, because you take the plain square (x_m - y_m)^2 in every direction, you are assuming that the classes are spherical in shape.

There are several things one can do to overcome these problems. One is to use a larger k to smooth out the differences; another is to use a weighted Euclidean distance, that is, a weighted distance function.
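When attribute scales differ, one remedy consistent with the discussion above is to standardize each attribute to zero mean and unit standard deviation before computing distances. A minimal sketch, assuming X is a list of numeric rows; constant columns are left at weight 1 to avoid dividing by zero.

```python
import statistics

def standardize(X):
    # Rescale each attribute (column) to zero mean and unit standard
    # deviation, so every attribute contributes on a similar scale.
    cols = list(zip(*X))
    means = [statistics.mean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(xm - mu) / sd for xm, mu, sd in zip(x, means, stds)]
            for x in X]
```

For instance, standardize([[1.0, 100.0], [3.0, 300.0]]) maps both attributes onto the same scale even though their raw ranges differ by a factor of 100.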
How the Value of k Affects the Decision Boundary

Now, when you say you should use a larger k, you have to have some idea of how a large or a small k impacts the result. Before we look at that, let us look at some examples.

With a small value of k, the classifier captures fine structures of the problem space better: you can see that the decision boundary lines follow very fine structures of the problem space.

Under other circumstances you should use a large k. When you use a large k, the classifier you get is less sensitive to noise, particularly noise in the output class, that is, class noise. If the classes are discrete, you also get better probability estimates when you use large values of k.

If you plot the error against k, there is a region where the test error is smallest and a region where the cross-validation error is smallest. In this particular example, the error is smallest at around k equal to 5 or k equal to 7: for small k the error is high, for large k it increases again, and at some middle value of k you get the best performance on the test error.
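The observation that some middle value of k minimizes the validation error suggests choosing k empirically. Here is a sketch using leave-one-out cross-validation over a list of candidate values; the helper names and the candidate list are illustrative, not from the lecture.

```python
import math
from collections import Counter
from heapq import nsmallest

def euclidean(a, b):
    return math.sqrt(sum((am - bm) ** 2 for am, bm in zip(a, b)))

def knn_classify(train, x_t, k):
    votes = Counter(y for _, y in
                    nsmallest(k, train, key=lambda xy: euclidean(xy[0], x_t)))
    return votes.most_common(1)[0][0]

def loo_error(train, k):
    # Leave-one-out: classify each example using all the others.
    wrong = sum(knn_classify(train[:i] + train[i + 1:], x, k) != y
                for i, (x, y) in enumerate(train))
    return wrong / len(train)

def best_k(train, candidates):
    # Pick the candidate k with the smallest leave-one-out error.
    return min(candidates, key=lambda k: loo_error(train, k))
```

On real data the error curve typically dips at some intermediate k, matching the lecture's plot where the minimum falls near k = 5 or 7.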
Using a large k results in smoother classes: in one example with 15 nearest neighbours, all the points in a region were classified as red rather than blue. Larger training sets allow the use of larger values of k, while small training sets necessitate a small k.

Weighting and Normalizing the Attributes

For those attributes which are more important for the particular classification, we can use larger weights, and for less important attributes we can use smaller weights. These weights can also be decided based on the range of an attribute: if an attribute has a larger range we can use a smaller weight, and if it has a smaller range we can use a larger weight. Alternatively, we can scale the attributes so that they have a similar range, normalizing them so that they have the same mean and the same standard deviation. Based on that, we can then fix the weights depending on the importance of the attributes.
'And if an attribute does not matter, i.e. it is irrelevant, the attribute has 0 weight.', 'start': 1869.131, 'duration': 5.424}, {'end': 1879.339, 'text': 'So, this brings us to an important question.', 'start': 1876.737, 'duration': 2.602}], 'summary': 'Weights can be adjusted based on attribute importance and range, aiming for similar ranges and normalizing attributes.', 'duration': 52.098, 'max_score': 1827.241, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/PNglugooJUQ/pics/PNglugooJUQ1827241.jpg'}, {'end': 1938.661, 'src': 'embed', 'start': 1908.832, 'weight': 3, 'content': [{'end': 1911.474, 'text': 'neighbor or instance based learning algorithms greatly.', 'start': 1908.832, 'duration': 2.642}, {'end': 1924.744, 'text': 'So it is important for us to remove extra features, because if you have a very high dimensional space,', 'start': 1912.054, 'duration': 12.69}, {'end': 1931.655, 'text': 'two items which are similar may still differ in some unimportant attributes,', 'start': 1924.744, 'duration': 6.911}, {'end': 1938.661, 'text': 'and the differences in distance between different pairs of items will be almost similar.', 'start': 1931.655, 'duration': 7.006}], 'summary': 'Removing extra features is important for neighbor-based learning algorithms in high dimensional spaces.', 'duration': 29.829, 'max_score': 1908.832, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/PNglugooJUQ/pics/PNglugooJUQ1908832.jpg'}, {'end': 2006.864, 'src': 'embed', 'start': 1964.057, 'weight': 4, 'content': [{'end': 1968.979, 'text': 'which can be difficult because large k means more emphasis on nearer neighbors.', 'start': 1964.057, 'duration': 4.922}, {'end': 1976.624, 'text': 'So, in the weighted k n n the prediction is based on this weighted average.', 'start': 1969.46, 'duration': 7.164}, {'end': 1981.502, 'text': 'and this weight can be based on the distance.', 'start': 1978.461, 'duration': 3.041}, {'end': 1989.084, 'text': 
'The weight can be proportional to the inverse of the distance between the two examples.', 'start': 1982.122, 'duration': 6.962}, {'end': 2006.864, 'text': 'So, for a particular test example, suppose this is a test example and these are three nearest neighbors.', 'start': 1989.504, 'duration': 17.36}], 'summary': 'Weighted knn uses distance-based weights for prediction.', 'duration': 42.807, 'max_score': 1964.057, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/PNglugooJUQ/pics/PNglugooJUQ1964057.jpg'}, {'end': 2059.966, 'src': 'embed', 'start': 2036.175, 'weight': 5, 'content': [{'end': 2045.719, 'text': 'We can also, instead of using this distance in, you know the weight based on the, you know, inverse of the distance,', 'start': 2036.175, 'duration': 9.544}, {'end': 2050.161, 'text': 'we can use other types of functions for this weighting.', 'start': 2045.719, 'duration': 4.442}, {'end': 2053.643, 'text': 'So, one of the ideas is locally weighted averaging.', 'start': 2050.501, 'duration': 3.142}, {'end': 2059.966, 'text': 'What locally weighted averaging does is that suppose k is the number of training points.', 'start': 2054.003, 'duration': 5.963}], 'summary': 'Exploring alternative weighting functions, such as locally weighted averaging, for k training points.', 'duration': 23.791, 'max_score': 2036.175, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/PNglugooJUQ/pics/PNglugooJUQ2036175.jpg'}], 'start': 1761.849, 'title': 'Weighted distance functions and knn algorithms', 'summary': 'Discusses the use of weighted distance functions in k-nearest neighbor algorithm, allowing different weights for attributes and its impact on classification, as well as the importance of feature reduction in distance-weighted k-nearest neighbor (knn) algorithms, highlighting the tradeoff between small and large k, and the concept of locally weighted averaging.', 'chapters': [{'end': 1879.339, 'start': 1761.849, 'title': 'Weighted 
distance function in k-nearest neighbor', 'summary': 'Discusses the use of weighted distance function in k-nearest neighbor algorithm, allowing different weights for attributes and demonstrating its impact on classification.', 'duration': 117.49, 'highlights': ['Weighted distance function introduces weights for different attributes, allowing larger weights for more important attributes and smaller weights for less important ones. Introducing weighted Euclidean distance using different weights for different attributes, allowing for more impactful attributes to have larger weights and less impactful ones to have smaller weights.', 'The importance of attributes is determined through their impact on classification, with attributes considered irrelevant having a weight of 0. Attributes are assigned weights based on their importance for classification, with irrelevant attributes receiving a weight of 0.', 'Normalization of attributes can be performed to ensure they have similar range and mean, enabling better decision-making for assigning weights. Attributes can be normalized to have similar range and mean, aiding in the determination of appropriate weights for each attribute.']}, {'end': 2196.446, 'start': 1880.359, 'title': 'Distance weighted k nearest neighbor', 'summary': 'Discusses the importance of feature reduction in distance-weighted k-nearest neighbor (knn) algorithms, highlighting the tradeoff between small and large k, and the concept of locally weighted averaging.', 'duration': 316.087, 'highlights': ['Feature reduction is crucial for k-nearest neighbor algorithms due to the challenge of defining an appropriate similarity metric in high-dimensional spaces, impacting the ability to find good representative training examples for a given test example. 
Importance of feature reduction in k-nearest neighbor algorithms', 'In distance-weighted kNN, there is a tradeoff between small and large k, where large k emphasizes nearer neighbors, impacting the prediction based on weighted averages. Tradeoff between small and large k in distance-weighted kNN', 'Locally weighted averaging introduces the concept of using a large value of k, assigning different weights to training examples based on a weighting function that falls off rapidly with distance, such as a Gaussian function. Concept of locally weighted averaging and its application in kNN']}], 'duration': 434.597, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/PNglugooJUQ/pics/PNglugooJUQ1761849.jpg', 'highlights': ['Weighted distance function introduces weights for different attributes, allowing larger weights for more important attributes and smaller weights for less important ones.', 'The importance of attributes is determined through their impact on classification, with attributes considered irrelevant having a weight of 0.', 'Normalization of attributes can be performed to ensure they have similar range and mean, enabling better decision-making for assigning weights.', 'Feature reduction is crucial for k-nearest neighbor algorithms due to the challenge of defining an appropriate similarity metric in high-dimensional spaces.', 'In distance-weighted kNN, there is a tradeoff between small and large k, where large k emphasizes nearer neighbors, impacting the prediction based on weighted averages.', 'Locally weighted averaging introduces the concept of using a large value of k, assigning different weights to training examples based on a weighting function that falls off rapidly with distance, such as a Gaussian function.']}], 'highlights': ['The lecture discusses instance-based learning, k-nearest neighbor algorithm, and feature selection in supervised learning.', 'The chapter emphasizes the need to find methods for feature selection to 
address the problem of feature dimensionality.', 'The module aims to provide an understanding of learning linear regression, decision trees, and other methods to estimate the function f.', 'The chapter explains the basic concept of k-nearest neighbor classification, providing an overview of the approach.', 'The need for efficient data structures for quick nearest point retrieval. Efficient data structures are emphasized for quick retrieval of nearest points during prediction time in the K Nearest Neighbor algorithm, especially when dealing with a large training set.', 'The classification of a point in the space for one nearest neighbor is based on the nearest neighbor, dividing the space into regions using a Voronoi diagram.', 'The discussion introduces the use of variable m instead of k in the context of an index variable.', 'Smaller k values capture fine structures better in decision boundaries.', 'Weighted distance function introduces weights for different attributes, allowing larger weights for more important attributes and smaller weights for less important ones.', 'Feature reduction is crucial for k-nearest neighbor algorithms due to the challenge of defining an appropriate similarity metric in high-dimensional spaces.']}
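The lecture material above covers three mechanical ideas: plain k-NN majority voting, a weighted Euclidean distance with larger weights for more important attributes (0 for irrelevant ones), and inverse-distance vote weighting. A minimal Python sketch of those ideas follows; the function names, the small epsilon guard against division by zero, and the toy data are illustrative assumptions, not taken from the lecture.

```python
import math
from collections import defaultdict

def euclidean(a, b, weights=None):
    """Weighted Euclidean distance; every attribute gets weight 1 by default.

    Per the lecture, more important attributes can get larger weights and
    an irrelevant attribute can get weight 0."""
    if weights is None:
        weights = [1.0] * len(a)
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def knn_predict(train, query, k=5, distance_weighted=False, weights=None):
    """Classify `query` from labelled examples `train` = [(x, y), ...].

    With distance_weighted=True each of the k nearest neighbours votes
    with weight 1/d (inverse-distance weighting); otherwise each casts
    one equal vote."""
    neighbours = sorted(train, key=lambda xy: euclidean(xy[0], query, weights))[:k]
    votes = defaultdict(float)
    for x, y in neighbours:
        d = euclidean(x, query, weights)
        # epsilon guard so a zero-distance neighbour does not divide by zero
        votes[y] += 1.0 / (d + 1e-12) if distance_weighted else 1.0
    return max(votes, key=votes.get)

# Toy 1-D data (hypothetical): two well-separated classes.
train = [((0.0,), 'blue'), ((1.0,), 'blue'), ((2.0,), 'blue'),
         ((8.0,), 'red'), ((9.0,), 'red'), ((10.0,), 'red')]
print(knn_predict(train, (1.5,), k=3))  # 'blue' -- all 3 nearest are blue
print(knn_predict(train, (8.5,), k=3, distance_weighted=True))  # 'red'
```

As the lecture stresses, the vote rule is only part of the story: k is best chosen by cross-validation (the lecture's example found the smallest test error near k = 5 or 7), and attributes should be normalized to a common scale before any weights are assigned.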