title
Lecture 2 | Image Classification

description
Lecture 2 formalizes the problem of image classification. We discuss the inherent difficulties of image classification, and introduce data-driven approaches. We discuss two simple data-driven image classification algorithms: K-Nearest Neighbors and Linear Classifiers, and introduce the concepts of hyperparameters and cross-validation.

Keywords: Image classification, K-Nearest Neighbor, distance metrics, hyperparameters, cross-validation, linear classifiers

Slides: http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture2.pdf

--------------------------------------------------------------------------------------

Convolutional Neural Networks for Visual Recognition

Instructors:
Fei-Fei Li: http://vision.stanford.edu/feifeili/
Justin Johnson: http://cs.stanford.edu/people/jcjohns/
Serena Yeung: http://ai.stanford.edu/~syyeung/

Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This lecture collection is a deep dive into details of the deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. From this lecture collection, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision.

Website: http://cs231n.stanford.edu/
For additional learning opportunities please visit: http://online.stanford.edu/

detail
{'title': 'Lecture 2 | Image Classification', 'heatmap': [{'end': 1536.599, 'start': 1498.167, 'weight': 0.807}, {'end': 2466.247, 'start': 2428.655, 'weight': 0.805}, {'end': 2866.258, 'start': 2821.255, 'weight': 0.711}, {'end': 3111.665, 'start': 3035.598, 'weight': 0.941}, {'end': 3216.107, 'start': 3143.806, 'weight': 0.88}], 'summary': 'CS231n Lecture 2 delves into learning algorithms, covering the implementation of k-nearest neighbor, SVM, Softmax, and a two-layer neural network for image classification challenges. It emphasizes the shift to data-driven approaches for training, explores nearest neighbor and k-nearest neighbor algorithms, discusses hyperparameter selection, and highlights the challenges and limitations of k-nearest neighbor classifiers, as well as the importance of linear classification in deep learning.', 'chapters': [{'end': 122.9, 'segs': [{'end': 47.798, 'src': 'embed', 'start': 22.412, 'weight': 1, 'content': [{'end': 27.617, 'text': "And we'll start to see in much more depth exactly how some of these learning algorithms actually work in practice.", 'start': 22.412, 'duration': 5.205}, {'end': 32.862, 'text': 'So the first lecture of the class is probably the sort of the largest big picture vision,', 'start': 28.798, 'duration': 4.064}, {'end': 36.486, 'text': 'and the majority of the lectures in this class will be much more detail oriented,', 'start': 32.862, 'duration': 3.624}, {'end': 39.149, 'text': 'much more focused on the specific mechanics of these different algorithms.', 'start': 36.486, 'duration': 2.663}, {'end': 43.554, 'text': "So today we'll see our first learning algorithm and that'll be really exciting I think.", 'start': 40.571, 'duration': 2.983}, {'end': 47.798, 'text': 'But before we get to that, I wanted to talk about a couple administrative issues.', 'start': 44.255, 'duration': 3.543}], 'summary': 'The class will cover learning algorithms in depth, with the majority of lectures focusing on specific mechanics.', 'duration': 25.386, 'max_score': 22.412, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG022412.jpg'}, {'end': 122.9, 'src': 'embed', 'start': 77.026, 'weight': 0, 'content': [{'end': 81.791, 'text': "You'll probably get answers to your questions faster on Piazza because all the TAs know to check that.", 'start': 77.026, 'duration': 4.765}, {'end': 85.895, 'text': "And it's sort of easy for emails to get lost in the shuffle if you just send to the course list.", 'start': 82.472, 'duration': 3.423}, {'end': 92.259, 'text': "It's also come to my attention that some SCPD students are having a bit of a hard time signing up for Piazza.", 'start': 87.236, 'duration': 5.023}, {'end': 99.362, 'text': 'SCPD students are supposed to receive an @stanford.edu email address.', 'start': 93.619, 'duration': 5.743}, {'end': 104.004, 'text': 'So once you get that email address, then you can use the Stanford email to sign into Piazza.', 'start': 99.762, 'duration': 4.242}, {'end': 110.047, 'text': "Probably that doesn't affect those of you who are sitting in the room right now, but for those students listening on SCPD.", 'start': 105.105, 'duration': 4.942}, {'end': 116.556, 'text': 'The next administrative issue is about Assignment 1.', 'start': 113.073, 'duration': 3.483}, {'end': 120.399, 'text': 'Assignment 1 will be up later today, probably sometime this afternoon.', 'start': 116.556, 'duration': 3.843}, {'end': 122.9, 'text': "But I promise before I go to sleep tonight, it'll be
up.', 'start': 120.499, 'duration': 2.401}], 'summary': 'TAs recommend Piazza for faster responses. SCPD students face email and Piazza sign-up issues. Assignment 1 to be up today.', 'duration': 45.874, 'max_score': 77.026, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG077026.jpg'}], 'start': 4.875, 'title': 'CS231n Lecture 2: learning algorithms', 'summary': 'Discusses the transition to a more detailed exploration of learning algorithms, emphasizing the significance of Piazza for communication and the imminent release of Assignment 1.', 'chapters': [{'end': 122.9, 'start': 4.875, 'title': 'CS231n Lecture 2: learning algorithms', 'summary': 'Discusses the transition from the big picture view to a more detailed exploration of learning algorithms, highlighting the significance of Piazza for communication and the upcoming release of Assignment 1.', 'duration': 118.025, 'highlights': ['The majority of the lectures in this class will be much more detail oriented, much more focused on the specific mechanics of these different algorithms.', 'Piazza is emphasized as the main source of communication between the students and the course staff, with over 500 students signed up and SCPD students facing issues with sign-up due to email access.', 'Assignment 1 will be up later today, probably sometime this afternoon, with an assurance that it will be available before the end of the day.']}], 'duration': 118.025, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG04875.jpg', 'highlights': ['Assignment 1 will be up later today, probably sometime this afternoon, with an assurance that it will be available before the end of the day.', 'The majority of the lectures in this class will be much more detail oriented, much more focused on the specific mechanics of these different algorithms.', 'Piazza is emphasized as the main source of communication between the students and the course staff, with over 500 students signed up and SCPD students facing issues with sign-up due to email access.']}, {'end': 562.155, 'segs': [{'end': 174.651, 'src': 'embed', 'start': 141.931, 'weight': 2, 'content': [{'end': 144.572, 'text': 'But the content of the assignment will still be the same as last year.', 'start': 141.931, 'duration': 2.641}, {'end': 150.914, 'text': "So in this assignment, you'll be implementing your own k-nearest neighbor classifier, which we're gonna talk about in this lecture.", 'start': 145.692, 'duration': 5.222}, {'end': 158.437, 'text': "You'll also implement several different linear classifiers, including the SVM and Softmax, as well as a simple two-layer neural network.", 'start': 151.594, 'duration': 6.843}, {'end': 161.378, 'text': "And we'll cover all of this content over the next couple of lectures.", 'start': 159.017, 'duration': 2.361}, {'end': 166.689, 'text': 'So all of our assignments are using Python and NumPy.', 'start': 164.068, 'duration': 2.621}, {'end': 174.651, 'text': "If you aren't familiar with Python or NumPy, then we have written a tutorial that you can find on the course website to try and get you up to speed.", 'start': 167.369, 'duration': 7.282}], 'summary': 'Implement k-nearest neighbor, SVM, Softmax, and neural network classifiers using Python and NumPy.', 'duration': 32.72, 'max_score': 141.931, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG0141931.jpg'}, {'end': 205.887, 'src': 'embed', 'start': 177.791,
'weight': 3, 'content': [{'end': 184.353, 'text': 'NumPy lets you write these very efficient vectorized operations that let you do quite a lot of computation in just a couple lines of code.', 'start': 177.791, 'duration': 6.562}, {'end': 193.799, 'text': 'So this is super important for pretty much all aspects of numerical computing and machine learning and everything like that is efficiently implementing these vectorized operations.', 'start': 184.853, 'duration': 8.946}, {'end': 197.461, 'text': "And you'll get a lot of practice with this on the first assignment.", 'start': 194.66, 'duration': 2.801}, {'end': 205.887, 'text': "So, for those of you who don't have a lot of experience with MATLAB or NumPy or other types of vectorized tensor computation,", 'start': 197.982, 'duration': 7.905}], 'summary': 'Numpy enables efficient vectorized operations for numerical computing and machine learning, crucial for assignments and those new to matlab or numpy.', 'duration': 28.096, 'max_score': 177.791, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG0177791.jpg'}, {'end': 263.955, 'src': 'embed', 'start': 236.471, 'weight': 1, 'content': [{'end': 244.858, 'text': "But our intention is that you'll be able to just download some image and it'll be very seamless for you to work on the assignment on one of these instances on the cloud.", 'start': 236.471, 'duration': 8.387}, {'end': 249.402, 'text': 'And because Google has very generously supported this course,', 'start': 245.779, 'duration': 3.623}, {'end': 255.387, 'text': "we'll be able to distribute to each of you coupons that let you use Google Cloud Credits for free for the class.", 'start': 249.402, 'duration': 5.985}, {'end': 263.955, 'text': 'So you can feel free to use these for the assignments and also for the course projects when you want to start using GPUs and larger machines and whatnot.', 'start': 256.488, 'duration': 7.467}], 'summary': 'Google will provide free google cloud credits for class assignments and projects.', 'duration': 27.484, 'max_score': 236.471, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG0236471.jpg'}, {'end': 330.035, 'src': 'embed', 'start': 301.407, 'weight': 0, 'content': [{'end': 308.168, 'text': "And this is something that we'll really focus on throughout the course of the class, is exactly how do we work on this image classification task.", 'start': 301.407, 'duration': 6.761}, {'end': 315.35, 'text': "So a little bit more concretely, when you're doing image classification, your system receives some input image,", 'start': 308.789, 'duration': 6.561}, {'end': 322.372, 'text': 'which is this cute cat in this example, and the system is aware of some predetermined set of categories or labels.', 'start': 315.35, 'duration': 7.022}, {'end': 330.035, 'text': "So these might be like a dog or a cat, or a truck or a plane, and there's some fixed set of category labels,", 'start': 323.412, 'duration': 6.623}], 'summary': 'Focus on image classification task, with predetermined set of categories and labels.', 'duration': 28.628, 'max_score': 301.407, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG0301407.jpg'}], 'start': 124.201, 'title': 'Implementing machine learning algorithms and image classification challenges', 'summary': 'Covers the implementation of machine learning algorithms such as k-nearest neighbor, svm, softmax, and a two-layer neural 
network, and discusses the challenges of image classification, highlighting the complexities of the task for machines and the support for google cloud, including the provision of free google cloud credits for the class.', 'chapters': [{'end': 158.437, 'start': 124.201, 'title': 'Implementing machine learning algorithms', 'summary': 'Discusses implementing machine learning algorithms including k-nearest neighbor, svm, softmax, and a two-layer neural network, and mentions the changes from python 2.7 to python 3.', 'duration': 34.236, 'highlights': ['Implementing k-nearest neighbor, SVM, Softmax, and a two-layer neural network for the assignment.', 'Mention of upgrading to work with Python 3 rather than Python 2.7 and minor cosmetic changes.', 'The content of the assignment will be the same as last year.']}, {'end': 562.155, 'start': 159.017, 'title': 'Image classification challenges and google cloud support', 'summary': 'Discusses the challenges of image classification, emphasizing the complexities of the task for machines and the support for google cloud, including the provision of free google cloud credits for the class.', 'duration': 403.138, 'highlights': ['The chapter covers the challenges of image classification, including issues related to viewpoint, illumination, object deformation, occlusion, background clutter, and intraclass variation, making it a highly complex task for machines to perform. It explains the various challenges in image classification, such as viewpoint, illumination, object deformation, occlusion, background clutter, and intraclass variation, which make it a highly complex task for machines.', 'The support from Google Cloud is highlighted, with the provision of free Google Cloud Credits for the class, enabling students to utilize GPUs and larger machines for assignments and course projects. It emphasizes the support from Google Cloud, including the provision of free Google Cloud Credits for the class, allowing students to use GPUs and larger machines for assignments and course projects.', 'The importance of efficiently implementing vectorized operations using NumPy in Python for numerical computing and machine learning is stressed, with the assurance of practice through the first assignment. 
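
To make the vectorization point above concrete, here is a minimal NumPy sketch contrasting a Python loop with a single broadcast expression; the array shapes are illustrative assumptions (CIFAR-10-like 32x32x3 images flattened to 3072 values), not code from the lecture.

import numpy as np

# Illustrative, assumed shapes: 5,000 flattened 32x32x3 images (3072 values each).
Xtr = np.random.rand(5000, 3072)  # stand-in training set
x = np.random.rand(3072)          # stand-in test image

# Loop version: one Python iteration per training image (slow).
dists_loop = np.array([np.sum(np.abs(Xtr[i] - x)) for i in range(Xtr.shape[0])])

# Vectorized version: broadcasting computes every L1 distance in one expression.
dists_vec = np.sum(np.abs(Xtr - x), axis=1)

assert np.allclose(dists_loop, dists_vec)  # same result, far less Python overhead
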
It stresses the importance of efficiently implementing vectorized operations using NumPy in Python for numerical computing and machine learning, ensuring practice through the first assignment.']}], 'duration': 437.954, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG0124201.jpg', 'highlights': ['Covers the challenges of image classification, including viewpoint, illumination, object deformation, occlusion, background clutter, and intraclass variation.', 'Support from Google Cloud, including provision of free Google Cloud Credits for the class, enabling students to utilize GPUs and larger machines for assignments and course projects.', 'Implementing k-nearest neighbor, SVM, Softmax, and a two-layer neural network for the assignment.', 'Stresses the importance of efficiently implementing vectorized operations using NumPy in Python for numerical computing and machine learning.']}, {'end': 1026.724, 'segs': [{'end': 627.35, 'src': 'embed', 'start': 602.469, 'weight': 1, 'content': [{'end': 608.272, 'text': 'explicit algorithm that makes intuitive sense for how you might go about recognizing these objects.', 'start': 602.469, 'duration': 5.803}, {'end': 612.999, 'text': 'So this is again quite challenging if you think about like, if you knew nothing,', 'start': 608.832, 'duration': 4.167}, {'end': 618.067, 'text': 'if it was your first day programming and you had to sit down and write this function, I think most people would be in trouble.', 'start': 612.999, 'duration': 5.068}, {'end': 627.35, 'text': 'That being said, people have definitely made explicit attempts to try to write sort of hand-coded rules for recognizing different animals.', 'start': 619.946, 'duration': 7.404}], 'summary': 'Challenging to write hand-coded rules for recognizing objects, requires intuitive algorithm.', 'duration': 24.881, 'max_score': 602.469, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG0602469.jpg'}, {'end': 687.543, 'src': 'embed', 'start': 657.425, 'weight': 2, 'content': [{'end': 661.067, 'text': 'and then kind of write down this explicit set of rules for recognizing cats.', 'start': 657.425, 'duration': 3.642}, {'end': 664.449, 'text': 'But this turns out not to work very well.', 'start': 662.508, 'duration': 1.941}, {'end': 666.39, 'text': "One, it's super brittle.", 'start': 665.469, 'duration': 0.921}, {'end': 676.056, 'text': 'And two say if you want to start over for another object category and maybe not worry about cats but talk about trucks or dogs or fishes or something else,', 'start': 667.05, 'duration': 9.006}, {'end': 677.497, 'text': 'then you need to start all over again.', 'start': 676.056, 'duration': 1.441}, {'end': 680.278, 'text': 'So this is really not a very scalable approach.', 'start': 677.957, 'duration': 2.321}, {'end': 687.543, 'text': 'We want to come up with some algorithm or some method for these recognition tasks which scales much more, naturally,', 'start': 680.699, 'duration': 6.844}], 'summary': 'Explicit rules for recognizing cats are not scalable; need a more natural, scalable approach.', 'duration': 30.118, 'max_score': 657.425, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG0657425.jpg'}, {'end': 755.723, 'src': 'embed', 'start': 724.926, 'weight': 0, 'content': [{'end': 730.27, 'text': 'By the way, this actually takes quite a lot of effort to go out and actually collect these data sets, but people, 
luckily,', 'start': 724.926, 'duration': 5.344}, {'end': 733.893, 'text': "there's a lot of really good, high quality data sets out there already for you to use.", 'start': 730.27, 'duration': 3.623}, {'end': 741.797, 'text': 'Then, once we get this data set, we train this machine learning classifier that is gonna ingest all of the data,', 'start': 734.954, 'duration': 6.843}, {'end': 748.54, 'text': 'summarize it in some way and then spit out a model that summarizes the knowledge of how to recognize these different object categories.', 'start': 741.797, 'duration': 6.743}, {'end': 755.723, 'text': "Then finally we'll use this trained model and apply it on new images that will then be able to recognize cats and dogs and whatnot.", 'start': 749.34, 'duration': 6.383}], 'summary': 'Effort to collect data sets, train machine learning classifier, recognize object categories and apply on new images.', 'duration': 30.797, 'max_score': 724.926, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG0724926.jpg'}, {'end': 822.048, 'src': 'embed', 'start': 796.756, 'weight': 4, 'content': [{'end': 803.299, 'text': "And I think it's useful to sort of step through this process for a very simple classifier first before we get to these big complex ones.", 'start': 796.756, 'duration': 6.543}, {'end': 809.222, 'text': 'So probably the simplest classifier you can imagine is something we call nearest neighbor.', 'start': 804.039, 'duration': 5.183}, {'end': 811.803, 'text': 'The algorithm is pretty dumb, honestly.', 'start': 809.762, 'duration': 2.041}, {'end': 814.944, 'text': "So during the training step, we won't do anything.", 'start': 812.323, 'duration': 2.621}, {'end': 817.046, 'text': "We'll just memorize all of the training data.", 'start': 815.145, 'duration': 1.901}, {'end': 819.387, 'text': 'So this is very simple.', 'start': 817.766, 'duration': 1.621}, {'end': 822.048, 'text': 'And now, during the prediction step,', 'start': 819.947, 'duration': 2.101}], 'summary': 'Simplest classifier is nearest neighbor, memorizes all training data.', 'duration': 25.292, 'max_score': 796.756, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG0796756.jpg'}, {'end': 868.099, 'src': 'embed', 'start': 840.88, 'weight': 3, 'content': [{'end': 845.823, 'text': 'So, to be a little bit more concrete, you might imagine working on this dataset called CIFAR-10,', 'start': 840.88, 'duration': 4.943}, {'end': 851.988, 'text': "which is very commonly used in machine learning as kind of a small test case, and you'll be working with this dataset on your homework.", 'start': 845.823, 'duration': 6.165}, {'end': 859.514, 'text': 'So the CIFAR-10 dataset gives you 10 different classes, airplanes and automobiles and birds and cats and different things like that.', 'start': 852.488, 'duration': 7.026}, {'end': 868.099, 'text': 'And for each of those 10 categories it provides 10,000, sorry, it provides 50,000 training images,', 'start': 860.895, 'duration': 7.204}], 'summary': 'The CIFAR-10 dataset contains 10 classes and 50,000 training images in total.', 'duration': 27.219, 'max_score': 840.88, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG0840880.jpg'}, {'end': 1026.724, 'src': 'embed', 'start': 971.443, 'weight': 5, 'content': [{'end': 975.567, 'text': 'we actually have many different choices for exactly what that comparison function should look like.',
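
As a concrete illustration of the L1 (Manhattan) comparison the transcript introduces next, here is a minimal sketch on a tiny 4x4 image; the pixel values are made up for illustration and are not taken from the slide.

import numpy as np

# Made-up 4x4 grayscale pixel values: one test image, one training image.
test = np.array([[56, 32, 10, 18],
                 [90, 23, 128, 133],
                 [24, 26, 178, 200],
                 [2, 0, 255, 220]])
train = np.array([[10, 20, 24, 17],
                  [8, 10, 89, 100],
                  [12, 16, 178, 170],
                  [4, 32, 233, 112]])

# L1 distance: sum of absolute pixel-wise differences, one scalar per image pair.
d_l1 = np.abs(test - train).sum()
print(d_l1)
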
'start': 971.443, 'duration': 4.124}, {'end': 982.815, 'text': "So in the example in the previous slide, we've used what's called the L1 distance, also sometimes called the Manhattan distance.", 'start': 976.589, 'duration': 6.226}, {'end': 987.679, 'text': 'So this is a really, really sort of simple, easy idea for comparing images.', 'start': 983.416, 'duration': 4.263}, {'end': 993.263, 'text': "And that's that we're gonna take the, just compare individual pixels in these images.", 'start': 988.299, 'duration': 4.964}, {'end': 1000.107, 'text': 'So, supposing that our test image is maybe just a tiny four by four image of pixel values,', 'start': 993.843, 'duration': 6.264}], 'summary': 'Comparison function options include l1 distance for image pixel comparison.', 'duration': 28.664, 'max_score': 971.443, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG0971443.jpg'}], 'start': 564.563, 'title': 'Data-driven image classification', 'summary': 'Discusses challenges in traditional algorithms, limitations of hand-coded rules, and the shift towards a data-driven approach for training machine learning classifiers, enabling scalability and improved recognition of object categories. it also explores the data-driven approach in a nearest neighbor classifier applied to the cifar-10 dataset, containing 10 classes with 50,000 training images and 10,000 testing images, demonstrating the use of l1 distance for image comparison.', 'chapters': [{'end': 775.546, 'start': 564.563, 'title': 'Api for image classifier', 'summary': 'Discusses the challenges in writing an explicit algorithm for image recognition, the limitations of hand-coded rules, and the shift towards a data-driven approach for training machine learning classifiers, which allows for scalability and improved recognition of object categories.', 'duration': 210.983, 'highlights': ['Data-driven approach for training machine learning classifiers The approach involves collecting a large dataset of different object categories from the internet, training a machine learning classifier to summarize the knowledge of recognizing these categories, and using the trained model to make predictions on new images.', 'Challenges in writing explicit algorithm for image recognition The chapter highlights the difficulty in writing explicit algorithms for recognizing objects or images due to the absence of clear, intuitive algorithms compared to traditional algorithmic tasks such as sorting numbers or computing a convex hull.', 'Limitations of hand-coded rules for recognizing object categories The discussion emphasizes the drawbacks of creating explicit hand-coded rules for recognizing object categories, as it is brittle, not scalable, and requires starting over for new object categories.']}, {'end': 1026.724, 'start': 776.166, 'title': 'Data-driven approach in nearest neighbor classifier', 'summary': 'Explores the concept of a data-driven approach in the context of a simple nearest neighbor classifier applied to the cifar-10 dataset, containing 10 classes with 50,000 training images and 10,000 testing images, demonstrating the use of l1 distance for image comparison.', 'duration': 250.558, 'highlights': ['The CIFAR-10 dataset consists of 10 different classes with 50,000 training images and 10,000 testing images, providing a small test case for machine learning.', 'The nearest neighbor classifier, a simple algorithm, involves memorizing all the training data and predicting the label of the most similar image 
to a new image, demonstrating the concept of a data-driven approach.', 'The L1 distance, also known as the Manhattan distance, is used for comparing images in the nearest neighbor algorithm, providing a concrete way to measure the difference between two images.']}], 'duration': 462.161, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG0564563.jpg', 'highlights': ['Data-driven approach for training machine learning classifiers involves collecting a large dataset of different object categories, training a machine learning classifier, and using the trained model for predictions.', 'Challenges in writing explicit algorithm for image recognition due to the absence of clear, intuitive algorithms compared to traditional algorithmic tasks.', 'Limitations of hand-coded rules for recognizing object categories include brittleness, lack of scalability, and the need to start over for new object categories.', 'CIFAR-10 dataset consists of 10 classes with 50,000 training images and 10,000 testing images, providing a small test case for machine learning.', 'Nearest neighbor classifier involves memorizing all the training data and predicting the label of the most similar image to a new image, demonstrating the concept of a data-driven approach.', 'L1 distance, also known as the Manhattan distance, is used for comparing images in the nearest neighbor algorithm, providing a concrete way to measure the difference between two images.']}, {'end': 1694.631, 'segs': [{'end': 1073.853, 'src': 'embed', 'start': 1029.372, 'weight': 0, 'content': [{'end': 1037.478, 'text': "So here's some full Python code for implementing this nearest neighbor classifier, and you can see it's actually pretty short and pretty concise,", 'start': 1029.372, 'duration': 8.106}, {'end': 1041.34, 'text': "because we've made use of many of these vectorized operations offered by NumPy.", 'start': 1037.478, 'duration': 3.862}, {'end': 1049.608, 'text': 'So here we can see that this train function that we talked about earlier is again very simple in the case of nearest neighbor.', 'start': 1042.262, 'duration': 7.346}, {'end': 1051.369, 'text': 'You just memorize the training data.', 'start': 1049.828, 'duration': 1.541}, {'end': 1052.53, 'text': "There's not really much to do here.", 'start': 1051.429, 'duration': 1.101}, {'end': 1060.144, 'text': "And now, at test time, we're gonna take in our image and then go in and compare, using this L1 distance function,", 'start': 1054.261, 'duration': 5.883}, {'end': 1065.388, 'text': 'our test image to each of these training examples and find the most similar example in the training set.', 'start': 1060.144, 'duration': 5.244}, {'end': 1073.853, 'text': "And you can see that we're actually able to do this in just one or two lines of Python code by utilizing these vectorized operations in NumPy.", 'start': 1066.288, 'duration': 7.565}], 'summary': 'Python code implements nearest neighbor classifier using numpy for efficient vectorized operations.', 'duration': 44.481, 'max_score': 1029.372, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG01029372.jpg'}, {'end': 1350.558, 'src': 'embed', 'start': 1323.381, 'weight': 4, 'content': [{'end': 1326.324, 'text': 'And then, once we move to the k, equals five case,', 'start': 1323.381, 'duration': 2.943}, {'end': 1331.047, 'text': 'then these decision boundaries between the blue and red regions have become quite smooth and quite nice.', 'start': 
1326.324, 'duration': 4.723}, {'end': 1339.614, 'text': "So this is generally something so generally, when you're using nearest neighbor classifiers, you almost always want to use some value of k,", 'start': 1331.708, 'duration': 7.906}, {'end': 1345.179, 'text': 'which is larger than one, because this tends to smooth out your decision boundaries and lead to better results.', 'start': 1339.614, 'duration': 5.565}, {'end': 1350.558, 'text': 'So if we to kind of oh yeah, question?', 'start': 1347.856, 'duration': 2.702}], 'summary': 'Using k=5 in nearest neighbor classifiers creates smoother decision boundaries and leads to better results.', 'duration': 27.177, 'max_score': 1323.381, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG01323381.jpg'}, {'end': 1536.599, 'src': 'heatmap', 'start': 1498.167, 'weight': 0.807, 'content': [{'end': 1505.589, 'text': 'So one interesting thing to point out between these two metrics in particular is that the L1 distance depends on your choice of coordinate system.', 'start': 1498.167, 'duration': 7.422}, {'end': 1510.61, 'text': 'So if you were to rotate the coordinate frame, that would actually change the L1 distance between the points.', 'start': 1506.029, 'duration': 4.581}, {'end': 1515.311, 'text': "Whereas changing the coordinate frame in the L2 distance doesn't matter.", 'start': 1511.09, 'duration': 4.221}, {'end': 1517.412, 'text': "It's the same thing no matter what your coordinate frame is.", 'start': 1515.371, 'duration': 2.041}, {'end': 1525.835, 'text': 'So maybe if your input features, if the individual entries in your vector have some important meaning for your task,', 'start': 1518.132, 'duration': 7.703}, {'end': 1528.536, 'text': 'then maybe somehow L1 might be a more natural fit.', 'start': 1525.835, 'duration': 2.701}, {'end': 1535.039, 'text': "But if it's just a generic vector in some space and you don't know which of the different elements, you don't know what they actually mean,", 'start': 1529.236, 'duration': 5.803}, {'end': 1536.599, 'text': 'then maybe L2 is slightly more natural.', 'start': 1535.039, 'duration': 1.56}], 'summary': "L1 distance depends on coordinate system, l2 distance doesn't. 
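
The coordinate-system point above can be checked numerically; a minimal sketch (the vectors and the 30-degree rotation are arbitrary choices): rotating the coordinate frame changes the L1 distance between two points but leaves the L2 distance fixed.

import numpy as np

u = np.array([3.0, 4.0])
v = np.array([-1.0, 2.0])

theta = np.pi / 6  # rotate the coordinate frame by 30 degrees
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

def l1(a, b): return np.sum(np.abs(a - b))
def l2(a, b): return np.sqrt(np.sum((a - b) ** 2))

print(l1(u, v), l1(R @ u, R @ v))  # L1 changes under rotation
print(l2(u, v), l2(R @ u, R @ v))  # L2 is unchanged (up to float rounding)
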
l1 might be more natural for meaningful features, l2 for generic vectors.", 'duration': 38.432, 'max_score': 1498.167, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG01498167.jpg'}, {'end': 1572.005, 'src': 'embed', 'start': 1538.34, 'weight': 3, 'content': [{'end': 1542.725, 'text': 'And another point here is that we can actually, by using different distance metrics,', 'start': 1538.34, 'duration': 4.385}, {'end': 1548.792, 'text': 'we can actually generalize the k-nearest neighbor classifier to many, many different types of data, not just vectors, not just images.', 'start': 1542.725, 'duration': 6.067}, {'end': 1552.777, 'text': 'So for example, imagine you wanted to classify pieces of text.', 'start': 1549.333, 'duration': 3.444}, {'end': 1563.442, 'text': 'then the only thing you need to do to use k-nearest neighbors is to specify some distance function that can measure distances between maybe two paragraphs or two sentences,', 'start': 1553.257, 'duration': 10.185}, {'end': 1564.162, 'text': 'or something like that.', 'start': 1563.442, 'duration': 0.72}, {'end': 1572.005, 'text': 'So simply by specifying different distance metrics, we can actually apply this algorithm very generally to basically any type of data.', 'start': 1564.762, 'duration': 7.243}], 'summary': 'Using various distance metrics, k-nearest neighbor classifier can be applied to diverse data types.', 'duration': 33.665, 'max_score': 1538.34, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG01538340.jpg'}, {'end': 1631.13, 'src': 'embed', 'start': 1594.71, 'weight': 5, 'content': [{'end': 1598.534, 'text': 'and then on the right using the familiar L2 or Euclidean distance.', 'start': 1594.71, 'duration': 3.824}, {'end': 1604.22, 'text': 'And you can see that the shapes of these decision boundaries actually change quite a bit between the two metrics.', 'start': 1599.115, 'duration': 5.105}, {'end': 1609.824, 'text': "So when you're looking at L1, these decision boundaries tend to follow the coordinate axes.", 'start': 1604.981, 'duration': 4.843}, {'end': 1613.806, 'text': 'And this is again because the L1 actually depends on our choice of coordinate system.', 'start': 1610.244, 'duration': 3.562}, {'end': 1620.65, 'text': "Where the L2 sort of doesn't really care about the coordinate axis, it just puts the boundaries where they sort of should fall naturally.", 'start': 1614.286, 'duration': 6.364}, {'end': 1631.13, 'text': "So, actually, my confession is that each of these examples that I've shown you is actually from this interactive web demo that I built,", 'start': 1623.108, 'duration': 8.022}], 'summary': 'Comparison of decision boundaries using l1 and l2 distances, showing significant changes between the two metrics.', 'duration': 36.42, 'max_score': 1594.71, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG01594710.jpg'}], 'start': 1029.372, 'title': 'Nearest neighbor and k-nearest neighbor algorithms', 'summary': 'Demonstrates the implementation of a nearest neighbor classifier in python using numpy, emphasizing its simplicity and efficiency. 
it also discusses the k-nearest neighbor algorithm, highlighting its application in classification, the impact of k value on decision boundaries, and the choice of distance metrics.', 'chapters': [{'end': 1073.853, 'start': 1029.372, 'title': 'Nearest neighbor classifier in python', 'summary': 'Demonstrates the implementation of a nearest neighbor classifier in python using numpy for vectorized operations, highlighting the simplicity and efficiency of the code with just one or two lines for comparing test images to training examples.', 'duration': 44.481, 'highlights': ['The implementation of the nearest neighbor classifier in Python makes use of vectorized operations offered by NumPy, resulting in concise and efficient code.', 'The train function for nearest neighbor is simple, involving the memorization of training data with minimal additional steps.', 'At test time, the code compares test images to training examples using the L1 distance function, and successfully identifies the most similar example in the training set with just one or two lines of Python code.']}, {'end': 1694.631, 'start': 1074.833, 'title': 'Understanding k-nearest neighbor algorithm', 'summary': 'Discusses the k-nearest neighbor algorithm, highlighting its application in classification, the impact of k value on decision boundaries, and the choice of distance metrics, emphasizing the importance of selecting larger k values for smoother decision boundaries and generalizing the algorithm to different types of data.', 'duration': 619.798, 'highlights': ['The k-nearest neighbor algorithm is applied for classification, where k represents the number of nearest neighbors considered for prediction. It discusses the application of the k-nearest neighbor algorithm in classification tasks.', 'The decision boundaries are impacted by the value of k, as larger k values lead to smoother decision boundaries and better results. It emphasizes the impact of the k value on decision boundaries and the importance of selecting larger k values for better results.', 'Different distance metrics, such as L1 and L2, make different assumptions about the underlying geometry or topology of the space, influencing the shape of decision boundaries. It explains the influence of different distance metrics on decision boundaries and their implications on the underlying space geometry.', 'The k-nearest neighbor algorithm can be generalized to various data types by specifying different distance metrics, making it a versatile algorithm for classification tasks. It highlights the ability to generalize the k-nearest neighbor algorithm to different data types by specifying different distance metrics, making it a versatile classification algorithm.']}], 'duration': 665.259, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG01029372.jpg', 'highlights': ['The implementation of the nearest neighbor classifier in Python makes use of vectorized operations offered by NumPy, resulting in concise and efficient code.', 'The train function for nearest neighbor is simple, involving the memorization of training data with minimal additional steps.', 'At test time, the code compares test images to training examples using the L1 distance function, and successfully identifies the most similar example in the training set with just one or two lines of Python code.', 'The k-nearest neighbor algorithm is applied for classification, where k represents the number of nearest neighbors considered for prediction. 
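
For k greater than one, the usual rule (an assumption here, since the excerpt does not spell out the voting scheme) is a majority vote over the k nearest labels; a minimal sketch for a single test point:

import numpy as np
from collections import Counter

def knn_predict_one(Xtr, ytr, x, k=5, metric='l1'):
    # Distance from the test point to every training point.
    if metric == 'l1':
        dists = np.sum(np.abs(Xtr - x), axis=1)
    else:  # 'l2' (Euclidean)
        dists = np.sqrt(np.sum((Xtr - x) ** 2, axis=1))
    nearest = np.argsort(dists)[:k]  # indices of the k closest training points
    return Counter(ytr[nearest]).most_common(1)[0][0]  # majority label

Larger k smooths the decision boundary exactly as described above, at the cost of consulting more neighbors per prediction.
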
It discusses the application of the k-nearest neighbor algorithm in classification tasks.', 'The decision boundaries are impacted by the value of k, as larger k values lead to smoother decision boundaries and better results. It emphasizes the impact of the k value on decision boundaries and the importance of selecting larger k values for better results.', 'Different distance metrics, such as L1 and L2, make different assumptions about the underlying geometry or topology of the space, influencing the shape of decision boundaries. It explains the influence of different distance metrics on decision boundaries and their implications on the underlying space geometry.', 'The k-nearest neighbor algorithm can be generalized to various data types by specifying different distance metrics, making it a versatile algorithm for classification tasks. It highlights the ability to generalize the k-nearest neighbor algorithm to different data types by specifying different distance metrics, making it a versatile classification algorithm.']}, {'end': 2353.478, 'segs': [{'end': 1737.808, 'src': 'embed', 'start': 1694.691, 'weight': 0, 'content': [{'end': 1705.201, 'text': "It's actually pretty fun and kind of nice to build intuition about how the decision boundary changes as you change the k and change your distance metric and all those sorts of things.", 'start': 1694.691, 'duration': 10.51}, {'end': 1717.751, 'text': "Okay, so then the question is, once you're actually trying to use this algorithm in practice, there are several choices you need to make.", 'start': 1711.988, 'duration': 5.763}, {'end': 1720.313, 'text': 'We talked about choosing different values of k.', 'start': 1718.052, 'duration': 2.261}, {'end': 1727.497, 'text': 'we talked about choosing different distance metrics and the question becomes how do you actually make these choices for your problem and for your data?', 'start': 1720.313, 'duration': 7.184}, {'end': 1737.808, 'text': 'So these choices of things like k and the distance metric we call hyperparameters because they are not necessarily learned from the training data.', 'start': 1728.237, 'duration': 9.571}], 'summary': 'Understanding k-nearest neighbors algorithm and choosing hyperparameters for practical use.', 'duration': 43.117, 'max_score': 1694.691, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG01694691.jpg'}, {'end': 1781.727, 'src': 'embed', 'start': 1753.501, 'weight': 4, 'content': [{'end': 1761.225, 'text': 'and the simple thing that most people do is simply try different values of hyperparameters for your data and for your problem and figure out which one works best.', 'start': 1753.501, 'duration': 7.724}, {'end': 1774.564, 'text': "There's a question? 
So the question is where L1 distance might be preferable to using L2 distance.", 'start': 1761.806, 'duration': 12.758}, {'end': 1777.665, 'text': "I think it's mainly problem dependent.", 'start': 1775.404, 'duration': 2.261}, {'end': 1781.727, 'text': "It's sort of difficult to say in which cases you think one might be better than the other.", 'start': 1777.725, 'duration': 4.002}], 'summary': 'Experiment with hyperparameters to find best values; l1 vs l2 distance is problem-dependent.', 'duration': 28.226, 'max_score': 1753.501, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG01753501.jpg'}, {'end': 1921.218, 'src': 'embed', 'start': 1899.25, 'weight': 5, 'content': [{'end': 1914.141, 'text': "And now I'll try training my algorithm with different choices of hyperparameters on the training data and then I'll go and apply that trained classifier on the test data and now I will pick the set of hyperparameters that cause me to perform best on the test data.", 'start': 1899.25, 'duration': 14.891}, {'end': 1921.218, 'text': 'This seems like maybe a more reasonable strategy, but in fact, this is also a terrible idea and you should never do this.', 'start': 1915.463, 'duration': 5.755}], 'summary': 'Training algorithm with different hyperparameters and applying it to test data is not a good strategy.', 'duration': 21.968, 'max_score': 1899.25, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG01899250.jpg'}, {'end': 2038.209, 'src': 'embed', 'start': 2008.181, 'weight': 3, 'content': [{'end': 2012.364, 'text': "that's the number that actually is telling you how your algorithm is doing on unseen data.", 'start': 2008.181, 'duration': 4.183}, {'end': 2019.265, 'text': 'And this is actually really, really important that you keep a very strict separation between the validation data and the test data.', 'start': 2013.324, 'duration': 5.941}, {'end': 2026.427, 'text': "So for example, when we're working on research papers, we typically only touch the test set at the very last minute.", 'start': 2019.765, 'duration': 6.662}, {'end': 2032.688, 'text': "So when I'm writing papers, I tend to only touch the test set for my problem in maybe the week before the deadline or so,", 'start': 2026.567, 'duration': 6.121}, {'end': 2038.209, 'text': "to really ensure that we're not being dishonest here and we're not reporting a number which is unfair.", 'start': 2032.688, 'duration': 5.521}], 'summary': 'Keep strict separation between validation and test data to ensure fairness and accuracy in algorithm evaluation.', 'duration': 30.028, 'max_score': 2008.181, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG02008181.jpg'}, {'end': 2072.192, 'src': 'embed', 'start': 2047.361, 'weight': 6, 'content': [{'end': 2055.985, 'text': 'So another strategy for setting hyperparameters is called cross-validation, and this is used a little bit more commonly for small data sets,', 'start': 2047.361, 'duration': 8.624}, {'end': 2057.766, 'text': 'not used so much in deep learning.', 'start': 2055.985, 'duration': 1.781}, {'end': 2063.467, 'text': "So here the idea is that we're gonna take our test data or we're gonna take our data set as usual,", 'start': 2058.246, 'duration': 5.221}, {'end': 2067.909, 'text': 'hold out some test set to use at the very end and now for the rest of the data,', 'start': 2063.467, 'duration': 4.442}, {'end': 2072.192, 'text': 'rather 
than splitting it into a single training and validation partition.', 'start': 2067.909, 'duration': 4.283}], 'summary': 'Cross-validation is a strategy for small datasets, not commonly used in deep learning.', 'duration': 24.831, 'max_score': 2047.361, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG02047361.jpg'}], 'start': 1694.691, 'title': 'K-nearest neighbors algorithm and hyperparameter selection', 'summary': 'Explores the k-nearest neighbors algorithm, analyzing the impact of changing k and distance metrics. it also discusses the importance of selecting hyperparameters in machine learning, emphasizing the preference of l1 distance over l2, the flaws of using training data accuracy for hyperparameter selection, the significance of separating validation and test data, and the use of cross-validation for small datasets.', 'chapters': [{'end': 1753.501, 'start': 1694.691, 'title': 'K-nearest neighbors algorithm', 'summary': 'Discusses the k-nearest neighbors algorithm, exploring the impact of changing k and distance metrics, as well as the challenge of selecting hyperparameters for practical application.', 'duration': 58.81, 'highlights': ['The importance of selecting hyperparameters like k and distance metrics for the k-nearest neighbors algorithm is emphasized, as these choices significantly impact its performance and are not directly learned from training data.', 'Understanding the impact of changing k and distance metrics on the decision boundary is highlighted as a key aspect of building intuition about the k-nearest neighbors algorithm.', 'The need to make informed choices about hyperparameters such as k and distance metrics for practical application is emphasized, highlighting the problem-dependent nature of these decisions.']}, {'end': 2353.478, 'start': 1753.501, 'title': 'Choosing hyperparameters in machine learning', 'summary': 'Discusses the importance of choosing hyperparameters in machine learning, including the preference of l1 distance over l2, the flaws of choosing hyperparameters based on training data accuracy, the significance of a strict separation between validation and test data, and the use of cross-validation for small datasets.', 'duration': 599.977, 'highlights': ['The significance of a strict separation between validation and test data Emphasizes the importance of maintaining a strict separation between validation and test data to accurately assess algorithm performance on unseen data.', 'Preference of L1 distance over L2 Discusses the preference of L1 distance over L2 in cases where individual elements of the vector have meaning, such as in employee classification based on features like salary and years of employment.', 'Flaws of choosing hyperparameters based on training data accuracy Highlights the drawbacks of selecting hyperparameters based on training data accuracy, emphasizing the importance of evaluating algorithm performance on unseen data after training.', 'Use of cross-validation for small datasets Explains the use of cross-validation for small datasets as a strategy to gain higher confidence about the performance of hyperparameters.']}], 'duration': 658.787, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG01694691.jpg', 'highlights': ['The importance of selecting hyperparameters like k and distance metrics for the k-nearest neighbors algorithm is emphasized, as these choices significantly impact its performance and are not directly 
learned from training data.', 'Understanding the impact of changing k and distance metrics on the decision boundary is highlighted as a key aspect of building intuition about the k-nearest neighbors algorithm.', 'The need to make informed choices about hyperparameters such as k and distance metrics for practical application is emphasized, highlighting the problem-dependent nature of these decisions.', 'The significance of a strict separation between validation and test data Emphasizes the importance of maintaining a strict separation between validation and test data to accurately assess algorithm performance on unseen data.', 'Preference of L1 distance over L2 Discusses the preference of L1 distance over L2 in cases where individual elements of the vector have meaning, such as in employee classification based on features like salary and years of employment.', 'Flaws of choosing hyperparameters based on training data accuracy Highlights the drawbacks of selecting hyperparameters based on training data accuracy, emphasizing the importance of evaluating algorithm performance on unseen data after training.', 'Use of cross-validation for small datasets Explains the use of cross-validation for small datasets as a strategy to gain higher confidence about the performance of hyperparameters.']}, {'end': 2698.11, 'segs': [{'end': 2403.955, 'src': 'embed', 'start': 2375.679, 'weight': 4, 'content': [{'end': 2378.803, 'text': "because they're just with all of these problems that we've talked about.", 'start': 2375.679, 'duration': 3.124}, {'end': 2384.911, 'text': "So one problem is that it's very slow at test time, which is kind of the reverse of what we want, which we talked about earlier.", 'start': 2379.424, 'duration': 5.487}, {'end': 2395.825, 'text': 'Another problem is that these things like Euclidean distance or L1 distance are really not a very good way to measure distances between images.', 'start': 2385.532, 'duration': 10.293}, {'end': 2402.173, 'text': 'These sort of vectorial distance functions do not correspond very well to perceptual similarity between images.', 'start': 2395.965, 'duration': 6.208}, {'end': 2403.955, 'text': 'how you perceive differences between images?', 'start': 2402.173, 'duration': 1.782}], 'summary': 'Challenges include slow test time and ineffective distance measures for image similarity.', 'duration': 28.276, 'max_score': 2375.679, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG02375679.jpg'}, {'end': 2466.247, 'src': 'heatmap', 'start': 2428.655, 'weight': 0.805, 'content': [{'end': 2438.227, 'text': 'which is maybe not so good because it sort of gives you the sense that the L2 distance is really not doing a very good job at capturing these perceptual differences between images.', 'start': 2428.655, 'duration': 9.572}, {'end': 2447.161, 'text': 'Another sort of problem with the k-nearest neighbor classifier has to do with something we call the curse of dimensionality.', 'start': 2441.519, 'duration': 5.642}, {'end': 2452.222, 'text': 'So, if you recall back this viewpoint we had of the k-nearest neighbor classifier,', 'start': 2447.741, 'duration': 4.481}, {'end': 2457.704, 'text': "it's sort of dropping paint around each of the training data points and using that to sort of partition the space.", 'start': 2452.222, 'duration': 5.482}, {'end': 2462.066, 'text': 'So that means that if we expect the k-nearest neighbor classifier to work well,', 'start': 2458.344, 'duration': 3.722}, {'end': 2466.247, 
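
A minimal sketch of the k-fold cross-validation procedure described here, reusing the hypothetical knn_predict_one helper from the earlier sketch; the fold count and accuracy metric are standard choices, not taken from the lecture.

import numpy as np

def cross_validate_k(Xtr, ytr, k_choices, num_folds=5):
    # The test set is assumed to be held out elsewhere and never touched here.
    X_folds = np.array_split(Xtr, num_folds)
    y_folds = np.array_split(ytr, num_folds)
    mean_acc = {}
    for k in k_choices:
        accs = []
        for i in range(num_folds):
            X_val, y_val = X_folds[i], y_folds[i]                    # fold i validates
            X_train = np.concatenate(X_folds[:i] + X_folds[i + 1:])  # the rest trains
            y_train = np.concatenate(y_folds[:i] + y_folds[i + 1:])
            preds = np.array([knn_predict_one(X_train, y_train, x, k=k) for x in X_val])
            accs.append(np.mean(preds == y_val))
        mean_acc[k] = np.mean(accs)
    return max(mean_acc, key=mean_acc.get)  # k with the best average validation accuracy
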
'text': 'we kind of need our training examples to cover the space quite densely.', 'start': 2462.066, 'duration': 4.181}], 'summary': 'L2 distance not capturing perceptual differences; k-nearest neighbor suffers from curse of dimensionality.', 'duration': 37.592, 'max_score': 2428.655, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG02428655.jpg'}, {'end': 2506.428, 'src': 'embed', 'start': 2468.088, 'weight': 0, 'content': [{'end': 2476.57, 'text': 'otherwise our nearest neighbors could actually be quite far away and might not actually be very similar to our testing points.', 'start': 2468.088, 'duration': 8.482}, {'end': 2482.371, 'text': 'And the problem is that actually densely covering the space means that we need a number of training examples,', 'start': 2476.73, 'duration': 5.641}, {'end': 2484.952, 'text': 'which is exponential in the dimension of the problem.', 'start': 2482.371, 'duration': 2.581}, {'end': 2486.693, 'text': 'So this is very bad.', 'start': 2485.732, 'duration': 0.961}, {'end': 2490.113, 'text': "exponential growth is always bad and you're never gonna get enough.", 'start': 2486.693, 'duration': 3.42}, {'end': 2491.574, 'text': "basically, you're never gonna get enough.", 'start': 2490.113, 'duration': 1.461}, {'end': 2495.415, 'text': 'images to densely cover this space of pixels in this high dimensional space.', 'start': 2491.574, 'duration': 3.841}, {'end': 2500.26, 'text': "So that's maybe another thing to keep in mind when you're thinking about using k-nearest-neighbor.", 'start': 2496.476, 'duration': 3.784}, {'end': 2506.428, 'text': "So kind of the summary is that we're using k-nearest-neighbor to introduce this idea of image classification.", 'start': 2502.143, 'duration': 4.285}], 'summary': 'Densely covering space for k-nearest-neighbors requires exponential training examples, posing challenges for image classification.', 'duration': 38.34, 'max_score': 2468.088, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG02468088.jpg'}, {'end': 2625.685, 'src': 'embed', 'start': 2549.71, 'weight': 2, 'content': [{'end': 2554.797, 'text': 'the number of training examples that we need to densely cover the space grows exponentially with the dimension.', 'start': 2549.71, 'duration': 5.087}, {'end': 2561.851, 'text': 'So this is kind of giving you this sense that maybe in two dimensions we might have this kind of funny curved shape,', 'start': 2556.066, 'duration': 5.785}, {'end': 2567.035, 'text': 'or you might have sort of arbitrary manifolds of labels in different dimensional spaces.', 'start': 2561.851, 'duration': 5.184}, {'end': 2573.9, 'text': "And the only way, because the k-nearest neighbor algorithm doesn't really make any assumptions about these underlying manifolds.", 'start': 2567.475, 'duration': 6.425}, {'end': 2579.104, 'text': 'the only way it can perform properly is if it has quite a dense sample of training points to work with.', 'start': 2573.9, 'duration': 5.204}, {'end': 2585.834, 'text': 'So this is kind of the overview of k-nearest-neighbors,', 'start': 2582.651, 'duration': 3.183}, {'end': 2589.977, 'text': "and you'll get a chance to actually implement this and try it out on images in the first assignment.", 'start': 2585.834, 'duration': 4.143}, {'end': 2595.522, 'text': "So if there's any last minute questions about KNN, I'm going to move on to the next topic.", 'start': 2591.999, 'duration': 3.523}, {'end': 2612.242, 'text': 
'Question?. Sorry, say that again? Yeah, so the question is why do these images have the same L2 distance?', 'start': 2596.923, 'duration': 15.319}, {'end': 2616.743, 'text': 'And the answer is that I carefully constructed them to have the same L2 distance.', 'start': 2612.922, 'duration': 3.821}, {'end': 2625.685, 'text': "But it's just giving you the sense that the L2 distance is not a very good measure of similarity between images.", 'start': 2619.703, 'duration': 5.982}], 'summary': 'Training examples for k-nearest neighbor algorithm grow exponentially with dimension, requires dense sample of training points for proper performance.', 'duration': 75.975, 'max_score': 2549.71, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG02549710.jpg'}], 'start': 2353.478, 'title': 'K-nearest neighbor classifiers', 'summary': 'Discusses the challenges of k-nearest neighbor classifiers, including inefficiency in image classification, inadequacy of distance metrics in capturing perceptual differences, and the curse of dimensionality potentially requiring an exponential number of training examples. it also provides an overview of the exponential increase in the number of training examples required to densely cover the space as dimensions increase, emphasizing the impact on performance and the limitations of l2 distance in measuring similarity between images.', 'chapters': [{'end': 2531.509, 'start': 2353.478, 'title': 'Challenges of k-nearest neighbor classifiers', 'summary': 'Emphasizes the limitations of k-nearest neighbor classifiers, including its inefficiency in image classification, the inadequacy of distance metrics in capturing perceptual differences, and the curse of dimensionality when densely covering the space, potentially requiring an exponential number of training examples.', 'duration': 178.031, 'highlights': ['K-nearest neighbor classifiers are almost never used in practice for image classification due to various limitations. K-nearest neighbor classifiers are almost never used in practice for image classification due to inefficiency at test time, inadequacy of distance metrics in capturing perceptual differences, and the curse of dimensionality, which requires a potentially exponential number of training examples to cover the space.', 'Euclidean and L1 distance metrics are ineffective in measuring distances between images, failing to capture perceptual similarity. Euclidean and L1 distance metrics are ineffective in measuring distances between images, failing to capture perceptual similarity, as demonstrated by the equal L2 distances between original and distorted images.', 'The curse of dimensionality in k-nearest neighbor classifiers requires a potentially exponential number of training examples to densely cover the space, making it impractical in high-dimensional problems. 
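
The exponential growth in these highlights is easy to make concrete: holding density fixed at, say, 4 training points per axis (the 16-point two-dimensional example mentioned above), the required count is points_per_axis ** dimension.

# Training points needed for constant density: points_per_axis ** dimension.
points_per_axis = 4
for d in [1, 2, 3, 10]:
    print(d, points_per_axis ** d)  # 4, 16, 64, 1048576
# A 32x32x3 image lives in d = 3072 dimensions, so 4**3072 points would be
# needed at this density: astronomically more images than could ever exist.
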
'start': 2353.478, 'title': 'K-nearest neighbor classifiers', 'summary': 'Discusses the challenges of k-nearest neighbor classifiers, including inefficiency in image classification, inadequacy of distance metrics in capturing perceptual differences, and the curse of dimensionality potentially requiring an exponential number of training examples. It also provides an overview of the exponential increase in the number of training examples required to densely cover the space as dimensions increase, emphasizing the impact on performance and the limitations of L2 distance in measuring similarity between images.', 'chapters': [{'end': 2531.509, 'start': 2353.478, 'title': 'Challenges of k-nearest neighbor classifiers', 'summary': 'Emphasizes the limitations of k-nearest neighbor classifiers, including inefficiency in image classification, the inadequacy of distance metrics in capturing perceptual differences, and the curse of dimensionality when densely covering the space, potentially requiring an exponential number of training examples.', 'duration': 178.031, 'highlights': ['K-nearest neighbor classifiers are almost never used in practice for image classification due to inefficiency at test time, inadequacy of distance metrics in capturing perceptual differences, and the curse of dimensionality, which requires a potentially exponential number of training examples to cover the space.', 'Euclidean and L1 distance metrics are ineffective in measuring distances between images, failing to capture perceptual similarity, as demonstrated by the equal L2 distances between original and distorted images.', 'The curse of dimensionality in k-nearest neighbor classifiers requires a potentially exponential number of training examples to densely cover the space, making it impractical in high-dimensional problems.']}, {'end': 2698.11, 'start': 2532.049, 'title': 'K-nearest neighbors overview', 'summary': 'Discusses the exponential increase in the number of training examples required to densely cover the space as dimensions increase in the k-nearest neighbors algorithm, emphasizing the impact on performance and the limitations of L2 distance in measuring similarity between images.', 'duration': 166.061, 'highlights': ['The number of training examples needed to densely cover the space grows exponentially with the dimension, e.g., 16 training examples for two dimensions, emphasizing the impact of dimensionality on training requirements.', "The k-nearest neighbor algorithm relies on having a dense sample of training points to perform properly, highlighting the importance of sufficient training data for the algorithm's performance.", 'The limitations of L2 distance as a measure of similarity between images are demonstrated through carefully constructed examples, illustrating how the distance metric may not capture the full description of distance or difference between images.']}], 'duration': 344.632, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG02353478.jpg', 'highlights': ['K-nearest neighbor classifiers are almost never used in practice for image classification due to inefficiency at test time, inadequacy of distance metrics in capturing perceptual differences, and the curse of dimensionality, which requires a potentially exponential number of training examples to cover the space.', 'The curse of dimensionality in k-nearest neighbor classifiers requires a potentially exponential number of training examples to densely cover the space, making it impractical in high-dimensional problems.', 'The number of training examples needed to densely cover the space grows exponentially with the dimension, e.g., 16 training examples for two dimensions, emphasizing the impact of dimensionality on training requirements.', 'The limitations of L2 distance as a measure of similarity between images are demonstrated through carefully constructed examples, illustrating how the distance metric may not capture the full description of distance or difference between images.', 'Euclidean and L1 distance metrics are ineffective in measuring distances between images, failing to capture perceptual similarity, as demonstrated by the equal L2 distances between original and distorted images.', "The k-nearest neighbor algorithm relies on having a dense sample of training points to perform properly, highlighting the importance of sufficient training data for the algorithm's performance."]},
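Since the chapter points to the first assignment, here is a minimal k-nearest-neighbor sketch in the spirit of what is summarized above. The class name, method signatures, and the assumption of non-negative integer labels are illustrative, not the assignment's actual API:

```python
import numpy as np

class KNearestNeighbor:
    def train(self, X, y):
        # "Training" just memorizes the data; all the work is at test time.
        self.X_train, self.y_train = X, y

    def predict(self, X, k=1):
        # Fully vectorized L2 distances via the expansion
        # ||x - t||^2 = ||x||^2 - 2 x.t + ||t||^2
        d2 = (np.sum(X ** 2, axis=1)[:, None]
              - 2 * X @ self.X_train.T
              + np.sum(self.X_train ** 2, axis=1)[None, :])
        nearest = np.argsort(d2, axis=1)[:, :k]   # k closest train indices
        votes = self.y_train[nearest]             # their labels, row per test point
        # Majority vote over the k neighbors for each test point.
        return np.array([np.bincount(v).argmax() for v in votes])
```

The no-loop distance computation is the same vectorized-NumPy trick the highlights mention: one matrix multiply plus two broadcasted norms instead of a double loop over test and training points.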
{'end': 3567.494, 'segs': [{'end': 2759.648, 'src': 'embed', 'start': 2731.893, 'weight': 0, 'content': [{'end': 2735.337, 'text': "If you're really rushing for that deadline and you really gotta get this model out the door,", 'start': 2731.893, 'duration': 3.444}, {'end': 2740.442, 'text': "then if it takes a long time to retrain the model on the whole dataset, maybe you won't do it.", 'start': 2735.337, 'duration': 5.105}, {'end': 2747.844, 'text': 'But if you have a little bit more time and compute to spare and you want to squeeze out maybe that extra 1% of performance,', 'start': 2740.842, 'duration': 7.002}, {'end': 2749.125, 'text': 'then that is a trick you can use.', 'start': 2747.844, 'duration': 1.281}, {'end': 2759.648, 'text': 'So, moving on: we kind of saw that the k-nearest neighbor has a lot of the nice properties of machine learning algorithms,', 'start': 2753.226, 'duration': 6.422}], 'summary': 'Retraining the model for a 1% boost if time and compute allow', 'duration': 27.755, 'max_score': 2731.893, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG02731893.jpg'},
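As a concrete picture of that retraining trick, here is a minimal sketch. It assumes the KNearestNeighbor class sketched above and that train/validation/test arrays are already loaded; the pick_best_k helper is hypothetical, standing in for whatever validation-set search you run:

```python
import numpy as np

def pick_best_k(model, X_val, y_val, candidates=(1, 3, 5, 7, 10)):
    # Hypothetical helper: score each candidate k on the validation set.
    accs = {k: np.mean(model.predict(X_val, k=k) == y_val) for k in candidates}
    return max(accs, key=accs.get)

# Assumes X_train/y_train, X_val/y_val, X_test/y_test already exist.
knn = KNearestNeighbor()
knn.train(X_train, y_train)              # fit on training data only
best_k = pick_best_k(knn, X_val, y_val)  # choose k on validation data

# The optional final step: refit on train + validation with k frozen,
# hoping to squeeze out that extra sliver of performance.
knn.train(np.concatenate([X_train, X_val]),
          np.concatenate([y_train, y_val]))
test_acc = np.mean(knn.predict(X_test, k=best_k) == y_test)  # touch test once
```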
{'end': 2809.168, 'src': 'embed', 'start': 2787.229, 'weight': 1, 'content': [{'end': 2795.799, 'text': 'The idea is that you can have different kinds of components of neural networks, and you can stick these components together to build these large towers of convolutional networks.', 'start': 2787.229, 'duration': 8.57}, {'end': 2803.684, 'text': "And one of the most basic building blocks that we'll see in different types of deep learning applications is this linear classifier.", 'start': 2796.4, 'duration': 7.284}, {'end': 2809.168, 'text': "So I think it's actually really important to have a good understanding of what's happening with linear classification,", 'start': 2804.205, 'duration': 4.963}], 'summary': 'Neural networks use components to build convolutional networks, with linear classifiers being a fundamental building block.', 'duration': 21.939, 'max_score': 2787.229, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG02787229.jpg'}, {'end': 2866.258, 'src': 'heatmap', 'start': 2821.255, 'weight': 0.711, 'content': [{'end': 2826.519, 'text': 'So here the setup is that we want to input an image and then output a descriptive sentence describing the image.', 'start': 2821.255, 'duration': 5.264}, {'end': 2835.144, 'text': "And the way this kind of works is that we have one convolutional neural network that's looking at the image and a recurrent neural network that knows about language.", 'start': 2827.139, 'duration': 8.005}, {'end': 2838.407, 'text': 'And we can kind of just stick these two pieces together like Lego blocks,', 'start': 2835.545, 'duration': 2.862}, {'end': 2843.23, 'text': 'and train the whole thing together and end up with a pretty cool system that can do some non-trivial things.', 'start': 2838.407, 'duration': 4.823}, {'end': 2847.194, 'text': "And we'll work through the details of this model as we go forward in the class.", 'start': 2843.991, 'duration': 3.203}, {'end': 2851.298, 'text': 'but this just gives you the sense that these deep neural networks are kind of like Legos,', 'start': 2847.194, 'duration': 4.104}, {'end': 2855.943, 'text': 'and this linear classifier is kind of like the most basic building block of these giant networks.', 'start': 2851.298, 'duration': 4.645}, {'end': 2862.095, 'text': "But that's a little bit too exciting for lecture two, so we have to go back to CIFAR 10 for the moment.", 'start': 2857.972, 'duration': 4.123}, {'end': 2866.258, 'text': 'So recall that CIFAR 10 has these 50,000 training examples.', 'start': 2863.316, 'duration': 2.942}], 'summary': 'Combining convolutional and recurrent neural networks to describe images, with 50,000 training examples.', 'duration': 45.003, 'max_score': 2821.255, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG02821255.jpg'}, {'end': 3111.665, 'src': 'heatmap', 'start': 3035.598, 'weight': 0.941, 'content': [{'end': 3039.361, 'text': 'And now we want to end up with 10 class scores.', 'start': 3035.598, 'duration': 3.763}, {'end': 3045.206, 'text': 'We want to end up with 10 numbers for this image, giving us the scores for each of the 10 categories,', 'start': 3040.562, 'duration': 4.644}, {'end': 3048.849, 'text': 'which means that now our matrix W needs to be 10 by 3072.', 'start': 3045.206, 'duration': 3.643}, {'end': 3056.475, 'text': "So that once we multiply these two things out, then we'll end up with a single column vector, 10 by one, giving us our 10 class scores.", 'start': 3048.849, 'duration': 7.626}, {'end': 3062.795, 'text': "Also, as you'll typically see, we'll often add a bias term,", 'start': 3058.192, 'duration': 4.603}, {'end': 3069.838, 'text': 'which will be a constant vector of 10 elements that does not interact with the training data and instead just gives us some sort of data-independent', 'start': 3062.795, 'duration': 7.043}, {'end': 3072.6, 'text': 'preferences for some classes over another.', 'start': 3069.838, 'duration': 2.762}, {'end': 3078.083, 'text': 'So you might imagine that if your data set was unbalanced and had many more cats than dogs, for example,', 'start': 3073.12, 'duration': 4.963}, {'end': 3082.345, 'text': 'then the bias elements corresponding to cat would be higher than the other ones.', 'start': 3078.083, 'duration': 4.262}, {'end': 3088.513, 'text': 'So if you kind of think about pictorially what this function is doing,', 'start': 3084.471, 'duration': 4.042}, {'end': 3097.778, 'text': 'in this figure we have an example on the left of a simple two by two image, so it has four pixels total.', 'start': 3088.513, 'duration': 9.265}, {'end': 3105.022, 'text': 'So the way that the linear classifier works is that we take this two by two image and we stretch it out', 'start': 3098.298, 'duration': 6.724}, {'end': 3111.665, 'text': 'into a column vector with four elements. And now, in this example, we are just restricting to three classes: cat,', 'start': 3105.022, 'duration': 6.643}], 'summary': 'Linear classifier aims to produce 10 class scores by multiplying a 10 by 3072 matrix with the image data, with a bias term encoding data-independent preferences for some classes.', 'duration': 76.067, 'max_score': 3035.598, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG03035598.jpg'},
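The shapes described above are easy to sanity-check in code. Here is a minimal sketch of the forward pass f(x, W) = Wx + b, with random numbers standing in for a real image and learned parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(3072)                        # 32x32x3 image stretched out
W = 0.01 * rng.standard_normal((10, 3072))  # one row of weights per class
b = np.zeros(10)                            # per-class bias preferences

scores = W @ x + b      # f(x, W) = Wx + b
print(scores.shape)     # (10,): one score per CIFAR-10 category
print(scores.argmax())  # index of the highest-scoring class
```

With a trained W, the entry of `scores` for the correct class should come out larger than the others; here the weights are random, so the winner is arbitrary.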
{'end': 3216.107, 'src': 'heatmap', 'start': 3143.806, 'weight': 0.88, 'content': [{'end': 3151.192, 'text': 'So when you look at it this way, you can kind of understand linear classification as almost a template matching approach,', 'start': 3143.806, 'duration': 7.386}, {'end': 3157.617, 'text': 'where each of the rows in this matrix corresponds to some template of the image.', 'start': 3151.192, 'duration': 6.425}, {'end': 3164.082, 'text': 'And now we take the inner product, or dot product, between the row of the matrix and the column giving the pixels of the image;', 'start': 3157.617, 'duration': 6.465}, {'end': 3170.246, 'text': 'computing this dot product kind of gives us a similarity between this template for the class and the pixels of our image.', 'start': 3164.082, 'duration': 6.164}, {'end': 3177.23, 'text': 'And then this bias just again gives you this data-independent scaling offset to each of the classes.', 'start': 3171.347, 'duration': 5.883}, {'end': 3181.632, 'text': 'So now, from this template matching,', 'start': 3179.331, 'duration': 2.301}, {'end': 3185.774, 'text': 'if we think about linear classification from this viewpoint,', 'start': 3181.632, 'duration': 4.142}, {'end': 3193.238, 'text': 'we can actually take the rows of that weight matrix and unravel them back into images and actually visualize those templates as images.', 'start': 3185.774, 'duration': 7.464}, {'end': 3199.241, 'text': 'And this gives us some sense of what a linear classifier might actually be doing to try to understand our data.', 'start': 3193.698, 'duration': 5.543}, {'end': 3203.782, 'text': "So in this example, we've gone ahead and trained a linear classifier on our images.", 'start': 3199.841, 'duration': 3.941}, {'end': 3212.606, 'text': "And now on the bottom, we're visualizing what are those rows in that learned weight matrix corresponding to each of the 10 categories in CIFAR-10.", 'start': 3204.183, 'duration': 8.423}, {'end': 3216.107, 'text': "And in this way, we kind of get a sense for what's going on in these images.", 'start': 3213.246, 'duration': 2.861}], 'summary': 'Linear classification as template matching, visualizing learned templates for CIFAR-10 categories.', 'duration': 72.301, 'max_score': 3143.806, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG03143806.jpg'}],
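The unraveling step is just a reshape that undoes the original stretch into a column. A minimal sketch follows; W is random here, so the "templates" are noise, whereas with a trained W you would see the blurry per-class prototypes shown on the slide:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((10, 3072))  # stand-in for a learned weight matrix

# Reshape each row back to 32x32x3; this must mirror however the image
# was flattened in the first place (row-major here).
templates = W.reshape(10, 32, 32, 3)

# Rescale each template into [0, 255] so it can be viewed as an image.
lo = templates.min(axis=(1, 2, 3), keepdims=True)
hi = templates.max(axis=(1, 2, 3), keepdims=True)
images = (255 * (templates - lo) / (hi - lo)).astype(np.uint8)
print(images.shape)  # (10, 32, 32, 3): one viewable template per class
```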
'start': 2698.511, 'title': 'Retraining on entire dataset and linear classification in deep learning', 'summary': 'Discusses retraining on the entire dataset after finding the best hyperparameters, with a potential performance gain of 1%, and the importance of linear classification in building neural networks, its functional form, and potential issues, highlighting its role as a basic building block, the use of matrix-vector multiplication for template matching, and the struggle with multimodal data and parity problems.', 'chapters': [{'end': 2749.125, 'start': 2698.511, 'title': 'Retraining on entire dataset', 'summary': 'Discusses retraining on the entire dataset after finding the best hyperparameters, highlighting that it is a matter of time and computational resources, with a potential performance gain of 1%.', 'duration': 50.614, 'highlights': ['Retraining on the entire dataset after finding the best hyperparameters is a matter of time and computational resources, with a potential performance gain of 1%.', 'It is a matter of taste whether to retrain on the whole dataset, with time and computational resources being significant factors.', 'Retraining on the entire dataset is sometimes done in practice, especially when striving for an additional 1% performance gain.']}, {'end': 3567.494, 'start': 2753.226, 'title': 'Linear classification in deep learning', 'summary': 'Discusses the importance of linear classification in building neural networks, its functional form, and potential issues, highlighting its role as a basic building block, the use of matrix-vector multiplication for template matching, and the struggle with multimodal data and parity problems.', 'duration': 814.268, 'highlights': ['Linear classification as a basic building block for building neural networks: the chapter emphasizes the importance of linear classification as a basic building block that helps in building up to whole neural networks and whole convolutional networks.', 'Functional form of matrix-vector multiplication for template matching: the functional form of matrix-vector multiplication corresponds to the idea of template matching and learning a single template for each category in the data.', 'Struggle with multimodal data and parity problems: linear classifiers struggle with multimodal data and parity problems, making it difficult to draw a single linear boundary in such cases.']}], 'duration': 868.983, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/OoUX-nOEjG0/pics/OoUX-nOEjG02698511.jpg'}],
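The parity problem flagged in the last highlight is easy to make concrete. This small sketch uses synthetic XOR-labeled points and a brute-force search over random linear decision rules (not anything from the lecture) to show that no single line gets all four points right:

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])  # label = parity of the two coordinates (XOR)

rng = np.random.default_rng(0)
best = 0.0
for _ in range(10_000):  # try many random decision lines w.x + b = 0
    w, b = rng.standard_normal(2), rng.standard_normal()
    acc = np.mean(((X @ w + b) > 0).astype(int) == y)
    best = max(best, acc)
print(best)  # tops out at 0.75: one point is always misclassified
```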
'highlights': ['Support from Google Cloud, including provision of free Google Cloud Credits for the class, enabling students to utilize GPUs and larger machines for assignments and course projects.', 'The importance of efficiently implementing vectorized operations using NumPy in Python for numerical computing and machine learning.', 'Data-driven approach for training machine learning classifiers involves collecting a large dataset of different object categories, training a machine learning classifier, and using the trained model for predictions.', 'The implementation of the nearest neighbor classifier in Python makes use of vectorized operations offered by NumPy, resulting in concise and efficient code.', 'The k-nearest neighbor algorithm is applied for classification, where k represents the number of nearest neighbors considered for prediction.', 'The decision boundaries are impacted by the value of k, as larger k values lead to smoother decision boundaries and better results.', 'Different distance metrics, such as L1 and L2, make different assumptions about the underlying geometry or topology of the space, influencing the shape of decision boundaries.', 'The k-nearest neighbor algorithm can be generalized to various data types by specifying different distance metrics, making it a versatile algorithm for classification tasks.', 'The importance of selecting hyperparameters like k and distance metrics for the k-nearest neighbors algorithm is emphasized, as these choices significantly impact its performance and are not directly learned from training data.', 'Understanding the impact of changing k and distance metrics on the decision boundary is highlighted as a key aspect of building intuition about the k-nearest neighbors algorithm.', 'The need to make informed choices about hyperparameters such as k and distance metrics for practical application is emphasized, highlighting the problem-dependent nature of these decisions.', 'The significance of a strict separation between validation and test data: maintaining this separation is essential to accurately assess algorithm performance on unseen data.', 'Preference of L1 distance over L2: L1 can be preferable in cases where the individual elements of the vector have meaning, such as in employee classification based on features like salary and years of employment.', 'Flaws of choosing hyperparameters based on training data accuracy: algorithm performance should instead be evaluated on unseen data after training.', 'Use of cross-validation for small datasets: a strategy to gain higher confidence about the performance of hyperparameters.', 'K-nearest neighbor classifiers are almost never used in practice for image classification due to inefficiency at test time, inadequacy of distance metrics in capturing perceptual differences, and the curse of dimensionality, which requires a potentially exponential number of training examples to cover the space.', 'The curse of dimensionality in k-nearest neighbor classifiers requires a potentially exponential number of training examples to densely cover the space, making it impractical in high-dimensional problems.', 'The number of training examples needed to densely cover the space grows exponentially with the dimension, e.g., 16 training examples for two dimensions, emphasizing the impact of dimensionality on training requirements.', 'The limitations of L2 distance as a measure of similarity between images are demonstrated through carefully constructed examples, illustrating how the distance metric may not capture the full description of distance or difference between images.', 'Euclidean and L1 distance metrics are ineffective in measuring distances between images, failing to capture perceptual similarity, as demonstrated by the equal L2 distances between original and distorted images.', "The k-nearest neighbor algorithm relies on having a dense sample of training points to perform properly, highlighting the importance of sufficient training data for the algorithm's performance.",
'Retraining on the entire dataset after finding the best hyperparameters is a matter of time and computational resources, with a potential performance gain of 1%.', 'Linear classification as a basic building block for building neural networks: the chapter emphasizes the importance of linear classification as a basic building block that helps in building up to whole neural networks and whole convolutional networks.', 'Functional form of matrix-vector multiplication for template matching: the functional form of matrix-vector multiplication corresponds to the idea of template matching and learning a single template for each category in the data.']}