Coursnap

title
Lesson 3: Deep Learning 2019 - Data blocks; Multi-label classification; Segmentation

description
Lots to cover today! We start lesson 3 looking at an interesting dataset: Planet's Understanding the Amazon from Space. In order to get this data into the shape we need it for modeling, we'll use one of fastai's most powerful (and unique!) tools: the data block API (https://docs.fast.ai/data_block.html). We'll be coming back to this API many times over the coming lessons, and mastery of it will make you a real fastai superstar! Once you've finished this lesson, if you're ready to learn more about the data block API, have a look at this great article: https://blog.usejournal.com/finding-data-block-nirvana-a-journey-through-the-fastai-data-block-api-c38210537fe4. One important feature of the Planet dataset is that it is a *multi-label* dataset. That is: each satellite image can contain *multiple* labels, whereas previous datasets we've looked at have had exactly one label per image. We'll look at what changes we need to make to work with multi-label datasets. Next, we will look at *image segmentation*, which is the process of labeling every pixel in an image with a category that shows what kind of object is portrayed by that pixel. We will use similar techniques to the earlier image classification models, with a few tweaks. fastai makes image segmentation modeling and interpretation just as easy as image classification, so there won't be too many tweaks required. We will be using the popular Camvid dataset for this part of the lesson. In future lessons, we will come back to it and show a few extra tricks. Our final Camvid model will have dramatically lower error than any model we've been able to find in the academic literature! What if your dependent variable is a continuous value, instead of a category? We answer that question next, looking at a keypoint dataset, and building a model that predicts face keypoints with high accuracy.

detail
{'title': 'Lesson 3: Deep Learning 2019 - Data blocks; Multi-label classification; Segmentation', 'heatmap': [{'end': 1126.102, 'start': 896.22, 'weight': 0.701}, {'end': 1501.258, 'start': 1197.761, 'weight': 0.738}, {'end': 1652.184, 'start': 1573.737, 'weight': 0.761}, {'end': 2476.956, 'start': 2398.81, 'weight': 0.702}, {'end': 3897.021, 'start': 3814.497, 'weight': 0.776}, {'end': 4281.491, 'start': 4121.076, 'weight': 0.716}, {'end': 6147.088, 'start': 6064.582, 'weight': 0.712}, {'end': 6375.185, 'start': 6293.005, 'weight': 0.739}, {'end': 6745.42, 'start': 6669.111, 'weight': 0.729}], 'summary': 'Covers a range of topics including correction of citation error, recommendation of ml courses with 4.9/5 stars rating, promotion of their own ml course at course.fast.ai, showcasing ai applications, deep learning success, data handling with kaggle and pytorch, satellite image recognition, model fine-tuning, transfer learning, image segmentation techniques, optimizing segmentation model training, mixed precision for better segmentation, creating image regression model, and nlp classification in deep learning.', 'chapters': [{'end': 314.91, 'segs': [{'end': 62.2, 'src': 'embed', 'start': 22.942, 'weight': 0, 'content': [{'end': 26.985, 'text': "But in exchange, let's talk about Andrew Ng's excellent machine learning course on Coursera.", 'start': 22.942, 'duration': 4.043}, {'end': 29.414, 'text': "It's really great.", 'start': 28.453, 'duration': 0.961}, {'end': 32.676, 'text': 'As you can see, people gave it 4.9 out of 5 stars.', 'start': 29.754, 'duration': 2.922}, {'end': 35.557, 'text': "In some ways, it's a little dated.", 'start': 34.076, 'duration': 1.481}, {'end': 43.802, 'text': 'But a lot of the content really is as appropriate as ever and taught in a more bottom-up style.', 'start': 36.258, 'duration': 7.544}, {'end': 49.746, 'text': "So it can be quite nice to combine Andrew's bottom-up style and our top-down style and meet somewhere in the middle.", 
'start': 43.902, 'duration': 5.844}, {'end': 56.23, 'text': "Also, if you're interested in more machine learning foundations, you should check out our machine learning course as well.", 'start': 50.987, 'duration': 5.243}, {'end': 62.2, 'text': 'If you go to course.fast.ai and click on the machine learning button, that will take you to our course,', 'start': 56.758, 'duration': 5.442}], 'summary': "Andrew Ng's machine learning course on Coursera has a 4.9/5 rating, while fast.ai offers a complementary machine learning course.", 'duration': 39.258, 'max_score': 22.942, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM22942.jpg'}, {'end': 134.808, 'src': 'embed', 'start': 107.855, 'weight': 2, 'content': [{'end': 115.741, 'text': "there's a production section where right now we have one platform, but more will be added by the time this video comes out,", 'start': 107.855, 'duration': 7.886}, {'end': 119.685, 'text': 'showing you how to deploy your web app really, really easily.', 'start': 115.741, 'duration': 3.944}, {'end': 129.613, 'text': "And when I say easily, for example, here's the how to deploy on Zeit guide created by San Francisco study group member Navjot.", 'start': 120.685, 'duration': 8.928}, {'end': 131.906, 'text': "As you can see, it's just a page.", 'start': 130.365, 'duration': 1.541}, {'end': 133.667, 'text': "There's almost nothing to do.", 'start': 132.627, 'duration': 1.04}, {'end': 134.808, 'text': "And it's free.", 'start': 134.308, 'duration': 0.5}], 'summary': 'One platform available now, more to be added, showing easy web app deployment with a guide by Navjot, free of charge.', 'duration': 26.953, 'max_score': 107.855, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM107855.jpg'}, {'end': 227.594, 'src': 'embed', 'start': 205.136, 'weight': 3, 'content': [{'end': 213.322, 'text': 'So yeah, it should be a good way to get a sense of 
how to build a web app which talks to a PyTorch model.', 'start': 205.136, 'duration': 8.186}, {'end': 220.148, 'text': 'So examples of web apps people have built during the week.', 'start': 217.065, 'duration': 3.083}, {'end': 224.131, 'text': 'Edward Ross built the What Car Is That?', 'start': 221.409, 'duration': 2.722}, {'end': 227.594, 'text': 'app, or more specifically, the What Australian Car Is That?', 'start': 224.771, 'duration': 2.823}], 'summary': 'Web app examples include What Car Is That? and What Australian Car Is That?', 'duration': 22.458, 'max_score': 205.136, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM205136.jpg'}], 'start': 1.547, 'title': 'Correction, ML courses, web app deployment & examples', 'summary': "Corrects a citation error, recommends Andrew Ng's ML course with 4.9/5 stars rating, and promotes their own ML course at course.fast.ai, which is twice as long as the deep learning course. It also discusses the ease of web app deployment using a simple guide like Zeit and provides examples of student-created web apps, including a car recognition app and a guitar classifier.", 'chapters': [{'end': 83.947, 'start': 1.547, 'title': 'Correction and machine learning courses', 'summary': "Corrects a previous citation error and recommends Andrew Ng's machine learning course on Coursera, which has a 4.9 out of 5 stars rating, and also promotes their own machine learning course at course.fast.ai, which is twice as long as the deep learning course.", 'duration': 82.4, 'highlights': ["The chapter corrects a citation error regarding the source of a chart and recommends Andrew Ng's machine learning course on Coursera, which has a 4.9 out of 5 stars rating.", 'The chapter promotes their own machine learning course at course.fast.ai, which is twice as long as the deep learning course and covers foundational concepts in machine learning.']}, {'end': 314.91, 'start': 84.067, 'title': 'Web app 
deployment and examples', 'summary': 'Discusses the ease of deploying a web app using a simple guide, such as the Zeit guide, and provides examples of web apps created by students, including a car recognition app and a guitar classifier.', 'duration': 230.843, 'highlights': ["The chapter discusses the ease of deploying a web app using a simple guide, such as the Zeit guide. The course V3 website offers a production section with a guide on how to deploy web apps easily, like the 'how to deploy on Zeit' guide created by a study group member.", "Examples of web apps created by students are provided, including a car recognition app and a guitar classifier. Students have created various web apps like the 'What Car Is That?' app and a guitar classifier, showcasing the diverse applications of deploying web apps.", 'The ease of deploying web apps is highlighted through the simplicity and speed of the process. Deploying a web app is described as being fast, uncomplicated, and a feasible option for creating an MVP, with the potential to handle 1,000 simultaneous requests.']}], 'duration': 313.363, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM1547.jpg', 'highlights': ['The chapter promotes their own machine learning course at course.fast.ai, which is twice as long as the deep learning course and covers foundational concepts in machine learning.', "The chapter corrects a citation error regarding the source of a chart and recommends Andrew Ng's machine learning course on Coursera, which has a 4.9 out of 5 stars rating.", 'The ease of deploying web apps is highlighted through the simplicity and speed of the process. Deploying a web app is described as being fast, uncomplicated, and a feasible option for creating an MVP, with the potential to handle 1,000 simultaneous requests.', "Examples of web apps created by students are provided, including a car recognition app and a guitar classifier. 
Students have created various web apps like the 'What Car Is That??' app and a guitar classifier, showcasing the diverse applications of deploying web apps."]}, {'end': 658.835, 'segs': [{'end': 352.106, 'src': 'embed', 'start': 314.93, 'weight': 4, 'content': [{'end': 321.632, 'text': 'So, you know, a fairly niche application, but, you know, apparently there are 36 people who will appreciate this at least.', 'start': 314.93, 'duration': 6.702}, {'end': 324.471, 'text': 'I have no cousins.', 'start': 323.71, 'duration': 0.761}, {'end': 325.451, 'text': "That's a lot of cousins.", 'start': 324.691, 'duration': 0.76}, {'end': 334.316, 'text': 'This is an example of an app which actually takes a video feed and turns it into a motion classifier.', 'start': 327.032, 'duration': 7.284}, {'end': 338.138, 'text': "That's pretty cool.", 'start': 334.336, 'duration': 3.802}, {'end': 340.8, 'text': 'I like it.', 'start': 340.44, 'duration': 0.36}, {'end': 344.822, 'text': 'Team 26, good job.', 'start': 343.361, 'duration': 1.461}, {'end': 352.106, 'text': "Here's a similar one for American Sign Language.", 'start': 348.604, 'duration': 3.502}], 'summary': 'An app with a motion classifier has 36 potential users. 
also, a similar app for american sign language exists.', 'duration': 37.176, 'max_score': 314.93, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM314930.jpg'}, {'end': 458.528, 'src': 'embed', 'start': 431.589, 'weight': 2, 'content': [{'end': 435.85, 'text': "But he says he's getting close to state of the art results for univariate time series modeling.", 'start': 431.589, 'duration': 4.261}, {'end': 437.452, 'text': 'by turning it into a picture.', 'start': 436.331, 'duration': 1.121}, {'end': 442.916, 'text': "And so I like this idea of turning stuff that's not a picture into a picture.", 'start': 437.792, 'duration': 5.124}, {'end': 452.684, 'text': 'So something really interesting about this project, which was looking at emotion classification from faces,', 'start': 445.919, 'duration': 6.765}, {'end': 458.528, 'text': 'was that he was specifically asking the question how well does it go without changing anything, just using the default settings?', 'start': 452.684, 'duration': 5.844}], 'summary': 'Achieving state-of-the-art results for time series modeling by turning non-pictures into pictures and exploring emotion classification from faces with default settings.', 'duration': 26.939, 'max_score': 431.589, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM431589.jpg'}, {'end': 525.86, 'src': 'embed', 'start': 470.093, 'weight': 0, 'content': [{'end': 474.115, 'text': 'And he looked at this facial expression recognition data set.', 'start': 470.093, 'duration': 4.022}, {'end': 477.616, 'text': 'There was a 2017 paper that he compared his results to.', 'start': 474.135, 'duration': 3.481}, {'end': 486.839, 'text': 'And he got equal or slightly better results than the state of the art paper on emotion recognition,', 'start': 477.776, 'duration': 9.063}, {'end': 489.34, 'text': 'without doing any custom hyperparameter tuning at all.', 'start': 486.839, 
'duration': 2.501}, {'end': 490.34, 'text': 'So that was really cool.', 'start': 489.72, 'duration': 0.62}, {'end': 500.317, 'text': 'And then Elena Harley, who I featured one of her works last week, has done another really cool work in the genomic space,', 'start': 491.894, 'duration': 8.423}, {'end': 512.361, 'text': 'which is looking at variant analysis, looking at false positives in these kinds of pictures.', 'start': 500.317, 'duration': 12.044}, {'end': 525.86, 'text': 'And she found she was able to decrease the number of false positives coming out of the kind of industry standard software she was using by 500% by using a deep learning workflow.', 'start': 513.182, 'duration': 12.678}], 'summary': 'Achieved equal or slightly better results in emotion recognition without hyperparameter tuning. reduced false positives by 500% in genomic variant analysis using deep learning workflow.', 'duration': 55.767, 'max_score': 470.093, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM470093.jpg'}, {'end': 614.312, 'src': 'embed', 'start': 587.25, 'weight': 3, 'content': [{'end': 590.211, 'text': "I think that'll be a really good exercise in making sure you understand the material.", 'start': 587.25, 'duration': 2.961}, {'end': 596.572, 'text': "So the first one we're going to look at is a data set of satellite images.", 'start': 591.631, 'duration': 4.941}, {'end': 603.968, 'text': 'And satellite imaging is a really fertile area for deep learning.', 'start': 597.065, 'duration': 6.903}, {'end': 609.83, 'text': "It's certainly a lot of people already using deep learning and satellite imaging, but only scratching the surface.", 'start': 604.888, 'duration': 4.942}, {'end': 614.312, 'text': "And the dataset that we're going to look at looks like this.", 'start': 610.691, 'duration': 3.621}], 'summary': 'Deep learning in satellite imaging is a fertile area with untapped potential.', 'duration': 27.062, 'max_score': 
587.25, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM587250.jpg'}], 'start': 314.93, 'title': 'Ai applications and deep learning success', 'summary': 'Showcases innovative ai applications including motion classification, satellite image recognition, time series analysis, and emotion classification, highlighting the ease of model training and diverse applications. it also discusses the success of deep learning in achieving improved results in emotion recognition and decreasing false positives by 500% in genomic variant analysis, and outlines its potential in satellite image classification and multi-label classification.', 'chapters': [{'end': 469.533, 'start': 314.93, 'title': 'Innovative ai applications showcase', 'summary': 'Highlights innovative ai applications including motion classification, satellite image recognition, time series analysis, and emotion classification, showcasing the ease of model training and diverse applications.', 'duration': 154.603, 'highlights': ['An app that turns video feed into a motion classifier, appreciated by 36 people, showcasing a niche yet impactful application.', 'Building an effective model for satellite image recognition by carefully constructing the validation set and acquiring more data, demonstrating the importance of thoughtful data curation for model accuracy and effectiveness.', 'Converting univariate time series into a picture to achieve state-of-the-art results for time series modeling, revealing the innovation in transforming non-image data into a visual format.', 'Conducting an experiment on emotion classification from faces using default settings to demonstrate the ease of model training without extensive specific knowledge, challenging the notion of difficulty in model training.']}, {'end': 658.835, 'start': 470.093, 'title': 'Applications of deep learning in various domains', 'summary': 'Discusses the success of deep learning in achieving improved 
results in emotion recognition and decreasing false positives by 500% in genomic variant analysis, as well as outlining the potential for deep learning in satellite image classification and multi-label classification.', 'duration': 188.742, 'highlights': ['Elena Harley decreased the number of false positives in genomic variant analysis by 500% using a deep learning workflow. Elena Harley achieved a 500% reduction in false positives in genomic variant analysis by employing a deep learning workflow.', 'Deep learning achieved equal or slightly better results than the state of the art paper on emotion recognition without custom hyperparameter tuning. Deep learning achieved comparable or enhanced results in emotion recognition without custom hyperparameter tuning when compared to the state of the art paper.', 'The chapter introduces the potential of deep learning in satellite image classification and multi-label classification. The chapter introduces the potential of deep learning in satellite image classification and multi-label classification, highlighting the possibility of employing deep learning in diverse domains.']}], 'duration': 343.905, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM314930.jpg', 'highlights': ['Elena Harley achieved a 500% reduction in false positives in genomic variant analysis using a deep learning workflow.', 'Deep learning achieved comparable or enhanced results in emotion recognition without custom hyperparameter tuning when compared to the state of the art paper.', 'Converting univariate time series into a picture to achieve state-of-the-art results for time series modeling, revealing the innovation in transforming non-image data into a visual format.', 'Building an effective model for satellite image recognition by carefully constructing the validation set and acquiring more data, demonstrating the importance of thoughtful data curation for model accuracy and effectiveness.', 'An 
app that turns video feed into a motion classifier, appreciated by 36 people, showcasing a niche yet impactful application.', 'The chapter introduces the potential of deep learning in satellite image classification and multi-label classification, highlighting the possibility of employing deep learning in diverse domains.', 'Conducting an experiment on emotion classification from faces using default settings to demonstrate the ease of model training without extensive specific knowledge, challenging the notion of difficulty in model training.']}, {'end': 1898.269, 'segs': [{'end': 724.381, 'src': 'embed', 'start': 680.102, 'weight': 0, 'content': [{'end': 684.67, 'text': 'I tend to think the goal is to try and get in the top 10%, and in my experience,', 'start': 680.102, 'duration': 4.568}, {'end': 689.012, 'text': "all the people in the top 10% of a competition really know what they're doing.", 'start': 684.67, 'duration': 4.342}, {'end': 692.714, 'text': "So if you can get in the top 10%, then that's a really good sign.", 'start': 689.632, 'duration': 3.082}, {'end': 700.938, 'text': 'Pretty much every Kaggle data set is not available for download outside of Kaggle, at least the competition data sets.', 'start': 694.795, 'duration': 6.143}, {'end': 702.719, 'text': 'So you have to download it through Kaggle.', 'start': 701.318, 'duration': 1.401}, {'end': 709.302, 'text': 'And the good news is that Kaggle provides a Python-based downloader tool, which you can use.', 'start': 703.139, 'duration': 6.163}, {'end': 713.364, 'text': "So we've got a quick description here of how to download stuff from Kaggle.", 'start': 709.802, 'duration': 3.562}, {'end': 722.861, 'text': 'So to install stuff, to download stuff from Kaggle, you first have to install the Kaggle download tool.', 'start': 715.758, 'duration': 7.103}, {'end': 724.381, 'text': 'So just pip install kaggle.', 'start': 722.941, 'duration': 1.44}], 'summary': 'Goal: achieve top 10% in competitions, access kaggle 
data using python downloader.', 'duration': 44.279, 'max_score': 680.102, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM680102.jpg'}, {'end': 807.942, 'src': 'embed', 'start': 775.923, 'weight': 3, 'content': [{'end': 782.584, 'text': "So once you've got that module installed, you can then go ahead and download the data.", 'start': 775.923, 'duration': 6.661}, {'end': 791.446, 'text': "And basically, it's as simple as saying Kaggle competitions download, the competition name, and then the files that you want.", 'start': 782.944, 'duration': 8.502}, {'end': 797.007, 'text': 'The only other steps before you do that is that you have to authenticate yourself.', 'start': 792.226, 'duration': 4.781}, {'end': 807.942, 'text': "And you'll see there's a little bit of information here on exactly how you can go about downloading from Kaggle the file containing your API authentication information.", 'start': 797.777, 'duration': 10.165}], 'summary': 'After installing the module, download data from kaggle competitions after authentication.', 'duration': 32.019, 'max_score': 775.923, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM775923.jpg'}, {'end': 1126.102, 'src': 'heatmap', 'start': 896.22, 'weight': 0.701, 'content': [{'end': 906.945, 'text': 'So in this case, because we have multiple labels for each tile,', 'start': 896.22, 'duration': 10.725}, {'end': 912.299, 'text': "we clearly can't have a different folder for each image telling us what the label is.", 'start': 906.945, 'duration': 5.354}, {'end': 914.081, 'text': 'We need some different way to label it.', 'start': 912.379, 'duration': 1.702}, {'end': 923.328, 'text': 'And so the way that Kaggle did it was they provided a CSV file that had each file name along with a list of all of the labels.', 'start': 914.661, 'duration': 8.667}, {'end': 930.094, 'text': 'In order to just take a look at that CSV 
file, we can read it using the pandas library.', 'start': 925.63, 'duration': 4.464}, {'end': 939.525, 'text': "If you haven't used pandas before, it's kind of the standard way of dealing with tabular data in Python.", 'start': 931.134, 'duration': 8.391}, {'end': 942.867, 'text': 'It pretty much always appears in the PD namespace.', 'start': 940.085, 'duration': 2.782}, {'end': 948.131, 'text': "In this case, we're not really doing anything with it other than just showing you the contents of this file.", 'start': 943.388, 'duration': 4.743}, {'end': 951.874, 'text': 'So we can read it, we can take a look at the first few lines, and there it is.', 'start': 948.672, 'duration': 3.202}, {'end': 958.62, 'text': 'So we want to turn this into something we can use for modeling.', 'start': 953.355, 'duration': 5.265}, {'end': 965.759, 'text': 'So the kind of object that we use for modeling is an object of the data bunch class.', 'start': 959.4, 'duration': 6.359}, {'end': 968.74, 'text': 'So we have to somehow create a data bunch out of this.', 'start': 966.38, 'duration': 2.36}, {'end': 974.281, 'text': "Once we have a data bunch, we'll be able to go dot show batch to take a look at it.", 'start': 970.06, 'duration': 4.221}, {'end': 977.382, 'text': "And then we'll be able to go create CNN with it.", 'start': 975.301, 'duration': 2.081}, {'end': 978.702, 'text': "And then we'll be able to start training.", 'start': 977.482, 'duration': 1.22}, {'end': 988.944, 'text': 'So really, the trickiest step previously in deep learning has often been getting your data into a form that you can get it into a model.', 'start': 979.562, 'duration': 9.382}, {'end': 995.496, 'text': "So far, we've been showing you how to do that using various factory methods.", 'start': 990.652, 'duration': 4.844}, {'end': 1001.902, 'text': 'So methods where you basically say, I want to create this kind of data from this kind of source with these kinds of options.', 'start': 995.977, 'duration': 5.925}, 
{'end': 1005.445, 'text': 'The problem is, I mean, that works fine sometimes.', 'start': 1002.642, 'duration': 2.803}, {'end': 1008.207, 'text': 'And we showed you a few ways of doing it over the last couple of weeks.', 'start': 1005.505, 'duration': 2.702}, {'end': 1012.611, 'text': 'But sometimes you want more flexibility.', 'start': 1009.528, 'duration': 3.083}, {'end': 1020.92, 'text': "Because there's so many choices that you have to make about where do the files live and what's the structure they're in, and how do the labels appear,", 'start': 1013.137, 'duration': 7.783}, {'end': 1024.641, 'text': 'and how do you split out the validation set and how do you transform it, and so forth.', 'start': 1020.92, 'duration': 3.721}, {'end': 1031.143, 'text': "So we've got this unique API that I'm really proud of called the DataBlock API.", 'start': 1025.06, 'duration': 6.083}, {'end': 1036.54, 'text': 'And the DataBlock API makes each one of those decisions a separate decision that you make.', 'start': 1031.604, 'duration': 4.936}, {'end': 1043.748, 'text': "There's separate methods and with their own parameters for every choice that you make around how do I create, you know, set up my data.", 'start': 1036.601, 'duration': 7.147}, {'end': 1052.838, 'text': "So for example, to grab the planet data, we would say we've got a list of image files that are in a folder.", 'start': 1044.669, 'duration': 8.169}, {'end': 1056.971, 'text': "And they're labeled based on a CSV with this name.", 'start': 1053.55, 'duration': 3.421}, {'end': 1058.711, 'text': 'They have this separator.', 'start': 1057.531, 'duration': 1.18}, {'end': 1064.232, 'text': "Remember I showed you back here that there's a space between them? 
So by passing in separator, it's going to create multiple labels.", 'start': 1058.731, 'duration': 5.501}, {'end': 1066.393, 'text': 'The images are in this folder.', 'start': 1064.832, 'duration': 1.561}, {'end': 1067.473, 'text': 'They have this suffix.', 'start': 1066.473, 'duration': 1}, {'end': 1071.654, 'text': "We're going to randomly split out a validation set with 20% of the data.", 'start': 1067.973, 'duration': 3.681}, {'end': 1078.115, 'text': "We're going to create data sets from that, which we're then going to transform with these transformations.", 'start': 1071.674, 'duration': 6.441}, {'end': 1081.876, 'text': "And then we're going to create a data bunch out of that, which we'll then normalize.", 'start': 1078.135, 'duration': 3.741}, {'end': 1083.582, 'text': 'using these statistics.', 'start': 1082.682, 'duration': 0.9}, {'end': 1085.883, 'text': "So there's all these different steps.", 'start': 1084.063, 'duration': 1.82}, {'end': 1091.305, 'text': 'So, to give you a sense of what that looks like,', 'start': 1087.124, 'duration': 4.181}, {'end': 1102.749, 'text': "the first thing I'm going to do is kind of go back and explain what are all of the PyTorch and FastAI kind of classes that you need to know about that are going to appear in this process.", 'start': 1091.305, 'duration': 11.444}, {'end': 1107.35, 'text': "Because you're going to see them all the time in the FastAI docs and the PyTorch docs.", 'start': 1102.769, 'duration': 4.581}, {'end': 1113.911, 'text': 'So the first one you need to know about is a class called a data set.', 'start': 1110.071, 'duration': 3.84}, {'end': 1117.755, 'text': 'And the data set class is part of PyTorch.', 'start': 1115.373, 'duration': 2.382}, {'end': 1121.798, 'text': 'And this is the source code for the data set class.', 'start': 1118.415, 'duration': 3.383}, {'end': 1126.102, 'text': 'As you can see, it actually does nothing at all.', 'start': 1122.659, 'duration': 3.443}], 'summary': 'Kaggle provided 
a csv file with image labels, which is read using pandas. the datablock api allows for flexible data setup and transformations.', 'duration': 229.882, 'max_score': 896.22, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM896220.jpg'}, {'end': 1501.258, 'src': 'heatmap', 'start': 1197.761, 'weight': 0.738, 'content': [{'end': 1206.347, 'text': 'So, in other words, your data, the starting point for your data, is something where you can say what is the third item of data in my data set?', 'start': 1197.761, 'duration': 8.586}, {'end': 1208.028, 'text': "So that's what getItem does.", 'start': 1207.027, 'duration': 1.001}, {'end': 1211.05, 'text': "And how big is my data set? That's what the length does.", 'start': 1208.328, 'duration': 2.722}, {'end': 1220.136, 'text': 'So Fast.ai has lots of data set subclasses that do that for all different kinds of stuff.', 'start': 1212.191, 'duration': 7.945}, {'end': 1224.838, 'text': "And so, so far, you've been seeing image classification data sets.", 'start': 1220.576, 'duration': 4.262}, {'end': 1233.505, 'text': "So they're data sets where get item will return an image and a single label of what is that image.", 'start': 1225.539, 'duration': 7.966}, {'end': 1236.587, 'text': "So that's what a data set is.", 'start': 1235.366, 'duration': 1.221}, {'end': 1240.05, 'text': 'Now, a data set is not enough to train a model.', 'start': 1237.208, 'duration': 2.842}, {'end': 1254.712, 'text': 'The first thing we know we have to do if you think back to the gradient descent tutorial last week is we have to have a few images or a few items at a time so that our GPU can work in parallel.', 'start': 1240.71, 'duration': 14.002}, {'end': 1256.833, 'text': 'Remember, we do this thing called a mini batch.', 'start': 1254.932, 'duration': 1.901}, {'end': 1262.157, 'text': 'A mini batch is a few items that we present to the model at a time that it can train from in parallel.', 'start': 
1256.913, 'duration': 5.244}, {'end': 1272.204, 'text': 'So to create a mini batch, we use another PyTorch class called a data loader.', 'start': 1263.378, 'duration': 8.826}, {'end': 1278.768, 'text': 'And so a data loader takes a data set in its constructor.', 'start': 1272.724, 'duration': 6.044}, {'end': 1283.425, 'text': "So it's now saying oh, this is something I can get, the third item and the fifth item and the ninth item.", 'start': 1279.244, 'duration': 4.181}, {'end': 1295.329, 'text': "And it's going to grab items at random and create a batch of whatever size you ask for and pop it on the GPU and send it off to your model for you.", 'start': 1283.905, 'duration': 11.424}, {'end': 1303.151, 'text': 'So a data loader is something that grabs individual items, combines them into a mini batch, pops them on the GPU for modeling.', 'start': 1295.829, 'duration': 7.322}, {'end': 1306.712, 'text': "So that's called a data loader, and it comes from a data set.", 'start': 1304.271, 'duration': 2.441}, {'end': 1311.059, 'text': "So you can see already there's kind of choices you have to make.", 'start': 1308.018, 'duration': 3.041}, {'end': 1313.02, 'text': 'You know, what kind of data set am I creating?', 'start': 1311.139, 'duration': 1.881}, {'end': 1314.041, 'text': 'What is the data for it?', 'start': 1313.14, 'duration': 0.901}, {'end': 1315.081, 'text': 'Where is it going to come from?', 'start': 1314.061, 'duration': 1.02}, {'end': 1318.823, 'text': 'And then, when I create my data loader, what batch size do I want to use right?', 'start': 1315.101, 'duration': 3.722}, {'end': 1321.044, 'text': "There still isn't enough to train a model.", 'start': 1319.643, 'duration': 1.401}, {'end': 1325.766, 'text': "Not really, because we've got no way to validate the model.", 'start': 1322.225, 'duration': 3.541}, {'end': 1332.969, 'text': "If all we have is a training set, then we have no way to know how we're doing, because we need a separate set of held out data,", 
'start': 1325.926, 'duration': 7.043}, {'end': 1335.07, 'text': "a validation set to see how we're getting along.", 'start': 1332.969, 'duration': 2.101}, {'end': 1340.584, 'text': 'So for that, we use a fastai class called a data bunch.', 'start': 1335.791, 'duration': 4.793}, {'end': 1347.686, 'text': 'And a data bunch is something which, as it says here, binds together a training data loader and a valid data loader.', 'start': 1341.144, 'duration': 6.542}, {'end': 1356.328, 'text': 'And when you look at the fastai docs, when you see these kind of monospace font things,', 'start': 1348.506, 'duration': 7.822}, {'end': 1358.929, 'text': "they're always referring to some symbol you can look up elsewhere.", 'start': 1356.328, 'duration': 2.601}, {'end': 1360.63, 'text': 'So in this case, you can see train_dl is here.', 'start': 1358.949, 'duration': 1.681}, {'end': 1369.36, 'text': "And there's no point knowing that there's an argument with a certain name unless you know what that argument is.", 'start': 1362.898, 'duration': 6.462}, {'end': 1374.182, 'text': 'So you should always look after the colon to find out that it is a data loader.', 'start': 1369.38, 'duration': 4.802}, {'end': 1381.344, 'text': "So when you create a data bunch, you're basically giving it a training set data loader and a validation set data loader.", 'start': 1374.742, 'duration': 6.602}, {'end': 1388.987, 'text': "And that's now an object that you can send off to a learner and start loading, start fitting.", 'start': 1381.845, 'duration': 7.142}, {'end': 1391.808, 'text': 'So those are the basic pieces.', 'start': 1389.827, 'duration': 1.981}, {'end': 1405.265, 'text': 'So coming back to here, this stuff plus this line is all the stuff which is creating the data set.', 'start': 1392.802, 'duration': 12.463}, {'end': 1410.767, 'text': "So it's saying, where did the images come from? 
Because the data set, the index returns two things.", 'start': 1405.706, 'duration': 5.061}, {'end': 1414.828, 'text': "It returns the image and the labels, assuming it's an image data set.", 'start': 1410.807, 'duration': 4.021}, {'end': 1416.148, 'text': 'So where do the images come from?', 'start': 1414.868, 'duration': 1.28}, {'end': 1417.969, 'text': 'Where do the labels come from?', 'start': 1416.749, 'duration': 1.22}, {'end': 1421.93, 'text': "And then I'm going to create two separate data sets, the training and the validation.", 'start': 1418.489, 'duration': 3.441}, {'end': 1425.295, 'text': 'This is the thing that actually turns them into PyTorch data sets.', 'start': 1422.674, 'duration': 2.621}, {'end': 1427.417, 'text': 'This is the thing that transforms them.', 'start': 1425.876, 'duration': 1.541}, {'end': 1435.881, 'text': 'And then this is actually going to create the data loader and the data bunch in one go.', 'start': 1429.238, 'duration': 6.643}, {'end': 1440.244, 'text': "So let's look at some examples of this data block API.", 'start': 1437.222, 'duration': 3.022}, {'end': 1448.048, 'text': "Because once you understand the data block API, you'll never be lost for how to convert your data set into something you can start modeling with.", 'start': 1440.264, 'duration': 7.784}, {'end': 1452.956, 'text': "So here's some examples of using the DataBlock API.", 'start': 1450.695, 'duration': 2.261}, {'end': 1466.76, 'text': "So for example, if you're looking at MNIST, which remember is the pictures and classes of handwritten numerals, you can do something like this.", 'start': 1453.476, 'duration': 13.284}, {'end': 1470.601, 'text': 'What kind of data set is this going to be?', 'start': 1468.501, 'duration': 2.1}, {'end': 1482.113, 'text': "It's going to come from a list of image files which are in some folder and they're labeled according to the folder name that they're in.", 'start': 1470.721, 'duration': 11.392}, {'end': 1489.895, 'text': "And 
then we're gonna split it into train and validation according to the folder that they're in, train and validation.", 'start': 1483.694, 'duration': 6.201}, {'end': 1493.296, 'text': 'You can optionally add a test set.', 'start': 1491.715, 'duration': 1.581}, {'end': 1495.936, 'text': "We're gonna be talking more about test sets later in the course.", 'start': 1493.356, 'duration': 2.58}, {'end': 1501.258, 'text': "Okay, we'll convert those into PyTorch data sets now that that's all set up.", 'start': 1495.956, 'duration': 5.302}], 'summary': 'Fast.ai provides data set subclasses for image classification. data loaders and data bunches are used to create mini batches and validate the model for training.', 'duration': 303.497, 'max_score': 1197.761, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM1197761.jpg'}, {'end': 1318.823, 'src': 'embed', 'start': 1295.829, 'weight': 2, 'content': [{'end': 1303.151, 'text': 'So a data loader is something that grabs individual items, combines them into a mini batch, pops them on the GPU for modeling.', 'start': 1295.829, 'duration': 7.322}, {'end': 1306.712, 'text': "So that's called a data loader, and it comes from a data set.", 'start': 1304.271, 'duration': 2.441}, {'end': 1311.059, 'text': "So you can see already there's kind of choices you have to make.", 'start': 1308.018, 'duration': 3.041}, {'end': 1313.02, 'text': 'You know, what kind of data set am I creating?', 'start': 1311.139, 'duration': 1.881}, {'end': 1314.041, 'text': 'What is the data for it?', 'start': 1313.14, 'duration': 0.901}, {'end': 1315.081, 'text': 'Where is it going to come from?', 'start': 1314.061, 'duration': 1.02}, {'end': 1318.823, 'text': 'And then, when I create my data loader, what batch size do I want to use right?', 'start': 1315.101, 'duration': 3.722}], 'summary': 'A data loader combines items into mini batches for modeling, offering choices like data set creation and batch size 
selection.', 'duration': 22.994, 'max_score': 1295.829, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM1295829.jpg'}, {'end': 1466.76, 'src': 'embed', 'start': 1437.222, 'weight': 4, 'content': [{'end': 1440.244, 'text': "So let's look at some examples of this data block API.", 'start': 1437.222, 'duration': 3.022}, {'end': 1448.048, 'text': "Because once you understand the data block API, you'll never be lost for how to convert your data set into something you can start modeling with.", 'start': 1440.264, 'duration': 7.784}, {'end': 1452.956, 'text': "So here's some examples of using the DataBlock API.", 'start': 1450.695, 'duration': 2.261}, {'end': 1466.76, 'text': "So for example, if you're looking at MNIST, which remember is the pictures and classes of handwritten numerals, you can do something like this.", 'start': 1453.476, 'duration': 13.284}], 'summary': 'Understanding the data block api is crucial for modeling datasets, as demonstrated with mnist examples.', 'duration': 29.538, 'max_score': 1437.222, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM1437222.jpg'}, {'end': 1652.184, 'src': 'heatmap', 'start': 1573.737, 'weight': 0.761, 'content': [{'end': 1575.258, 'text': "Again, we're grabbing it from a folder.", 'start': 1573.737, 'duration': 1.521}, {'end': 1578.36, 'text': "This time we're labeling it based on a CSV file.", 'start': 1576.058, 'duration': 2.302}, {'end': 1579.5, 'text': "We're randomly splitting it.", 'start': 1578.42, 'duration': 1.08}, {'end': 1585.504, 'text': "By default, it's 20%, creating data sets, transforming it using these transforms.", 'start': 1579.941, 'duration': 5.563}, {'end': 1590.347, 'text': "We're going to use a smaller size and then create a data bunch.", 'start': 1587.785, 'duration': 2.562}, {'end': 1592.388, 'text': 'There it is.', 'start': 1592.048, 'duration': 0.34}, {'end': 1598.249, 'text': 'And 
so data bunches know how to draw themselves, amongst other things.', 'start': 1594.386, 'duration': 3.863}, {'end': 1601.791, 'text': "So here's some more examples we're going to be seeing later today.", 'start': 1598.649, 'duration': 3.142}, {'end': 1608.135, 'text': 'What if we look at this data set called Camvid? Camvid looks like this.', 'start': 1602.832, 'duration': 5.303}, {'end': 1613.899, 'text': 'It contains pictures and every pixel in the picture is color coded, right?', 'start': 1608.755, 'duration': 5.144}, {'end': 1622.605, 'text': "So in this case we have a list of files in a folder and we're going to label them, in this case using a function.", 'start': 1614.619, 'duration': 7.986}, {'end': 1629.608, 'text': "And so this function is basically the thing, we're going to see it later, which tells it whereabouts the color coding for each pixel is.", 'start': 1623.504, 'duration': 6.104}, {'end': 1630.669, 'text': "It's in a different place.", 'start': 1629.889, 'duration': 0.78}, {'end': 1637.054, 'text': 'Randomly split it in some way, create some data sets in some way.', 'start': 1631.69, 'duration': 5.364}, {'end': 1645.64, 'text': 'We can tell it for our particular list of classes, you know, how do we know what pixel, you know, value one versus pixel value two is.', 'start': 1638.034, 'duration': 7.606}, {'end': 1648.782, 'text': 'And that was something that we can basically read in like so.', 'start': 1646.14, 'duration': 2.642}, {'end': 1652.184, 'text': 'Again, some transforms.', 'start': 1650.183, 'duration': 2.001}], 'summary': 'Data is labeled from a CSV file, randomly split with 20% held out by default, transformed, and turned into a data bunch.', 'duration': 78.447, 'max_score': 1573.737, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM1573737.jpg'}], 'start': 659.796, 'title': 'Data handling with kaggle and pytorch', 'summary': 'Covers downloading data from kaggle, aiming for top 10%, utilizing python-based downloader, 
installing kaggle, preparing data for modeling, creating data sets, using pytorch and fast.ai, and various data set transformations.', 'chapters': [{'end': 743.981, 'start': 659.796, 'title': 'Downloading data from kaggle', 'summary': 'Discusses the process of downloading data from kaggle, emphasizing the significance of learning from competition data, aiming for the top 10%, and utilizing the python-based downloader tool provided by kaggle.', 'duration': 84.185, 'highlights': ['You can see how would I have gone in that competition, which is a good way to assess your knowledge and skills.', 'The goal is to try and get in the top 10%, as it indicates a strong understanding of the subject matter.', 'Kaggle provides a Python-based downloader tool for downloading data, simplifying the process for users.', "To download data from Kaggle, you first have to install the Kaggle download tool using 'pip install Kaggle'.", 'A tip is shared on how to easily uncomment and re-comment lines in the notebook using a keyboard shortcut.']}, {'end': 1183.093, 'start': 744.982, 'title': 'Kaggle installation and data preparation', 'summary': 'Explains the process of installing kaggle and preparing data for modeling, including steps for installation, data downloading, unzipping, csv reading, and creating a data bunch for modeling using the pandas library and the datablock api.', 'duration': 438.111, 'highlights': ['The chapter explains the process of installing Kaggle and preparing data for modeling The transcript covers the installation process for Kaggle and the steps for preparing the data for modeling.', 'Steps for installation, data downloading, unzipping, and CSV reading are outlined The transcript provides instructions for installing Kaggle, downloading data, unzipping files, and reading CSV using the Pandas library.', 'Creating a data bunch for modeling using the Pandas library and the DataBlock API The chapter explains the creation of a data bunch for modeling and introduces the 
DataBlock API for making data preparation decisions.']}, {'end': 1898.269, 'start': 1183.153, 'title': 'Creating data sets and data loaders', 'summary': 'Explains the process of creating data sets using pytorch and fast.ai, including the role of data loaders and data bunches, with examples of using the datablock api and various data set transformations.', 'duration': 715.116, 'highlights': ['Understanding the purpose of data loaders and data sets Data loaders are used to create mini batches from data sets, allowing the GPU to work in parallel, while data sets define the structure and content of the data. Mini batches enable parallel training on the GPU.', 'Utilizing the DataBlock API to convert data sets for modeling The DataBlock API provides a convenient way to convert data sets into a format suitable for modeling, offering examples of its use with different data sets such as MNIST, Planet, Camvid, and object detection data sets.', 'Customizing data set transformations using the DataBlock API The DataBlock API allows for customization of data set transformations, such as flipping images horizontally or vertically, performing perspective warping, and defining various symmetries, with specific settings that work well for the Planet data set.']}], 'duration': 1238.473, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM659796.jpg', 'highlights': ['Kaggle provides a Python-based downloader tool for downloading data, simplifying the process for users.', 'The goal is to try and get in the top 10%, as it indicates a strong understanding of the subject matter.', 'Understanding the purpose of data loaders and data sets Data loaders are used to create mini batches from data sets, allowing the GPU to work in parallel, while data sets define the structure and content of the data. 
Mini batches enable parallel training on the GPU.', 'The chapter explains the process of installing Kaggle and preparing data for modeling The transcript covers the installation process for Kaggle and the steps for preparing the data for modeling.', 'Utilizing the DataBlock API to convert data sets for modeling The DataBlock API provides a convenient way to convert data sets into a format suitable for modeling, offering examples of its use with different data sets such as MNIST, Planet, Camvid, and object detection data sets.']}, {'end': 2570.379, 'segs': [{'end': 1940.353, 'src': 'embed', 'start': 1915.59, 'weight': 1, 'content': [{'end': 1923.592, 'text': "then that kind of change of shape is certainly something that you would want to include as you're creating your training batches.", 'start': 1915.59, 'duration': 8.002}, {'end': 1926.132, 'text': 'You want to modify it a little bit each time.', 'start': 1923.632, 'duration': 2.5}, {'end': 1929.033, 'text': 'Not true for satellite images.', 'start': 1927.893, 'duration': 1.14}, {'end': 1932.514, 'text': 'A satellite always points straight down at the planet.', 'start': 1929.353, 'duration': 3.161}, {'end': 1939.173, 'text': "So if you added perspective warping, you would be making changes that aren't going to be there in real life.", 'start': 1933.494, 'duration': 5.679}, {'end': 1940.353, 'text': 'So I turn that off.', 'start': 1939.613, 'duration': 0.74}], 'summary': 'Training batches should include shape changes, but not for satellite images.', 'duration': 24.763, 'max_score': 1915.59, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM1915590.jpg'}, {'end': 1992.371, 'src': 'embed', 'start': 1963.606, 'weight': 0, 'content': [{'end': 1966.808, 'text': "you know data where there isn't really an up or a down.", 'start': 1963.606, 'duration': 3.202}, {'end': 1971.431, 'text': 'turning on flip_vert=True is generally going to make your models 
generalize better.', 'start': 1966.808, 'duration': 4.623}, {'end': 1978.075, 'text': "Okay So here's the steps necessary to create our data bunch.", 'start': 1972.792, 'duration': 5.283}, {'end': 1990.151, 'text': "And so now to create a satellite imagery model multi-label classifier that's going to figure out for each satellite tile,", 'start': 1979.476, 'duration': 10.675}, {'end': 1992.371, 'text': "what's the weather and what else can I see in it.", 'start': 1990.151, 'duration': 2.22}], 'summary': 'Using flip vert equals true improves model generalization. steps to create satellite image multi-label classifier for weather and other features.', 'duration': 28.765, 'max_score': 1963.606, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM1963606.jpg'}, {'end': 2043.73, 'src': 'embed', 'start': 2016.708, 'weight': 4, 'content': [{'end': 2019.81, 'text': 'I found ResNet-50 helped a little bit, and I had some time to run it.', 'start': 2016.708, 'duration': 3.102}, {'end': 2021.63, 'text': 'So in this case, I was using ResNet-50.', 'start': 2019.89, 'duration': 1.74}, {'end': 2027.913, 'text': "There's one more change I make, which is metrics.", 'start': 2023.831, 'duration': 4.082}, {'end': 2034.356, 'text': 'Now, to remind you, a metric has got nothing to do with how the model trains.', 'start': 2029.034, 'duration': 5.322}, {'end': 2039.026, 'text': 'Changing your metrics will not change your resulting model at all.', 'start': 2034.983, 'duration': 4.043}, {'end': 2043.73, 'text': 'The only thing that we use metrics for is we print them out during training.', 'start': 2039.467, 'duration': 4.263}], 'summary': 'Resnet-50 had a slight impact, with a focus on metrics during training.', 'duration': 27.022, 'max_score': 2016.708, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2016708.jpg'}, {'end': 2136.567, 'src': 'embed', 'start': 2101.183, 'weight': 6, 
'content': [{'end': 2106.605, 'text': "How do you weigh up those two things to kind of create a single number? There's lots of different ways of doing that.", 'start': 2101.183, 'duration': 5.422}, {'end': 2113.367, 'text': 'And something called the F score is basically a nice way of combining that into a single number.', 'start': 2107.005, 'duration': 6.362}, {'end': 2118.389, 'text': 'And there are various kinds of F scores, F1, F2, and so forth.', 'start': 2114.388, 'duration': 4.001}, {'end': 2123.799, 'text': "And Kaggle said, in the competition rules, we're going to use a metric called F2.", 'start': 2118.889, 'duration': 4.91}, {'end': 2136.567, 'text': "So we have a metric called F beta, which in other words, it's F with 1 or 2 or whatever, depending on the value of beta.", 'start': 2125.48, 'duration': 11.087}], 'summary': 'F score combines multiple metrics into one number, such as f2 used in kaggle competition.', 'duration': 35.384, 'max_score': 2101.183, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2101183.jpg'}, {'end': 2184.104, 'src': 'embed', 'start': 2152.854, 'weight': 5, 'content': [{'end': 2157.397, 'text': "But there's one other thing that I need to set, which is a threshold.", 'start': 2152.854, 'duration': 4.543}, {'end': 2161.379, 'text': "What does that mean? 
Well, here's the thing.", 'start': 2158.317, 'duration': 3.062}, {'end': 2167.963, 'text': 'Do you remember we had a little look the other day at the source code for the accuracy metric?', 'start': 2162.38, 'duration': 5.583}, {'end': 2170.645, 'text': 'So if you put two question marks, you get the source code.', 'start': 2168.443, 'duration': 2.202}, {'end': 2174.347, 'text': 'And we found that it used this thing called argmax.', 'start': 2171.505, 'duration': 2.842}, {'end': 2184.104, 'text': 'And the reason for that, if you remember, was we kind of had this input image that came in.', 'start': 2175.819, 'duration': 8.285}], 'summary': 'Setting a threshold for the accuracy metric using argmax in source code.', 'duration': 31.25, 'max_score': 2152.854, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2152854.jpg'}, {'end': 2269.964, 'src': 'embed', 'start': 2233.012, 'weight': 2, 'content': [{'end': 2240.319, 'text': 'And then it compared that to the actual and then took the average.', 'start': 2233.012, 'duration': 7.307}, {'end': 2242.98, 'text': 'And that was the accuracy.', 'start': 2241.379, 'duration': 1.601}, {'end': 2250.962, 'text': "We can't do that for satellite recognition in this case, because there isn't one label we're looking for.", 'start': 2243.44, 'duration': 7.522}, {'end': 2251.862, 'text': "There's lots.", 'start': 2251.382, 'duration': 0.48}, {'end': 2269.964, 'text': "So instead, what we do is we look at, so in this case, So I don't know if you remember, but a data bunch has a special attribute called c.", 'start': 2253.003, 'duration': 16.961}], 'summary': 'Average accuracy was computed for comparison, but not applicable for satellite recognition due to multiple labels.', 'duration': 36.952, 'max_score': 2233.012, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2233012.jpg'}, {'end': 2476.956, 'src': 'heatmap', 'start': 2398.81, 
'weight': 0.702, 'content': [{'end': 2403.913, 'text': "But it's so common that you want to, kind of say, create a new function", 'start': 2398.81, 'duration': 5.103}, {'end': 2408.133, 'text': "that's just like that other function, but we're always going to call it with a particular parameter.", 'start': 2403.913, 'duration': 4.22}, {'end': 2410.453, 'text': 'Computer science has a term for that.', 'start': 2408.673, 'duration': 1.78}, {'end': 2411.594, 'text': "It's called a partial.", 'start': 2410.733, 'duration': 0.861}, {'end': 2413.394, 'text': "It's called a partial function application.", 'start': 2411.794, 'duration': 1.6}, {'end': 2428.138, 'text': 'And so Python 3 has something called partial that takes some function and some list of keywords and values and creates a new function that is exactly the same as this function,', 'start': 2413.534, 'duration': 14.604}, {'end': 2431.259, 'text': 'but is always going to call it with that keyword argument.', 'start': 2428.138, 'duration': 3.121}, {'end': 2434.84, 'text': 'So here, this is exactly the same thing as the thing I just typed in.', 'start': 2431.699, 'duration': 3.141}, {'end': 2440.767, 'text': 'acc_02 is now a new function that calls accuracy_thresh with a threshold of 0.2.', 'start': 2435.262, 'duration': 5.505}, {'end': 2449.173, 'text': "And so this is a really common thing to do, particularly with the fastai library, because there's lots of places where you have to pass in functions.", 'start': 2440.767, 'duration': 8.406}, {'end': 2453.377, 'text': 'And you very often want to pass in a slightly customized version of a function.', 'start': 2449.594, 'duration': 3.783}, {'end': 2454.518, 'text': "So here's how you do it.", 'start': 2453.777, 'duration': 0.741}, {'end': 2458.341, 'text': "So here I've got an accuracy threshold of 0.2.", 'start': 2454.958, 'duration': 3.383}, {'end': 2461.524, 'text': "I've got an F-beta threshold of 0.2.", 'start': 2458.341, 'duration': 3.183}, {'end': 2463.045, 'text': 
'I can pass them both in as metrics.', 'start': 2461.524, 'duration': 1.521}, {'end': 2467.154, 'text': 'And I can then go ahead and do all the normal stuff.', 'start': 2465.054, 'duration': 2.1}, {'end': 2472.956, 'text': 'lr_find, recorder.plot, find the thing with the steepest slope.', 'start': 2467.354, 'duration': 5.602}, {'end': 2476.956, 'text': "So I don't know, somewhere around 1e-2.", 'start': 2473.516, 'duration': 3.44}], 'summary': "Python 3's partial function creates a customized version of a function for common usage with fastai library.", 'duration': 78.146, 'max_score': 2398.81, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2398810.jpg'}, {'end': 2526.653, 'src': 'embed', 'start': 2495.217, 'weight': 3, 'content': [{'end': 2499.659, 'text': 'at planet, leaderboard, private leaderboard.', 'start': 2495.217, 'duration': 4.442}, {'end': 2505.622, 'text': 'And so the top 50th is about 0.93.', 'start': 2500.679, 'duration': 4.943}, {'end': 2511.204, 'text': "So we kind of say like, oh, we're on the right track with something we're doing fine.", 'start': 2505.622, 'duration': 5.582}, {'end': 2519.948, 'text': "So as you can see, once you get to a point that the data's there, there's very little extra to do most of the time.", 'start': 2511.484, 'duration': 8.464}, {'end': 2526.653, 'text': 'So, when your model makes an incorrect prediction in a deployed app,', 'start': 2522.972, 'duration': 3.681}], 'summary': 'The top 50th score on the Planet private leaderboard is about 0.93, so the model is on the right track.', 'duration': 31.436, 'max_score': 2495.217, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2495217.jpg'}], 'start': 1898.549, 'title': 'Satellite image recognition', 'summary': 'Introduces data augmentation for satellite imagery, emphasizing the significance of flip_vert=True for orientation-less data. 
it explores implementing resnet-50 in a cnn model with an f2 score metric and threshold setting. it also covers achieving about 96% accuracy and an f beta of about 0.926 for satellite recognition.', 'chapters': [{'end': 1992.371, 'start': 1898.549, 'title': 'Data augmentation and satellite image classifier', 'summary': 'Introduces the concept of data augmentation, emphasizing the importance of adapting training batches for photography but not for satellite imagery, and highlights the significance of flip vert equals true for models dealing with data lacking orientation. it also outlines the steps to create a multi-label classifier for satellite imagery.', 'duration': 93.822, 'highlights': ["The importance of modifying training batches for photography due to changes in shape when taking photos from different perspectives. It's crucial to modify training batches for photography to account for changes in shape when taking photos from different perspectives.", 'The insignificance of perspective warping for satellite images as satellites always point straight down at the planet. Perspective warping is not necessary for satellite images as satellites always point straight down at the planet.', 'The importance of flip vert equals true for models dealing with data lacking orientation, such as astronomical data or pathology digital slide data. Enabling flip vert equals true generally improves model generalization for data lacking orientation, like astronomical or pathology digital slide data.', 'Outline of steps to create a multi-label classifier for satellite imagery to identify weather and other features in each satellite tile. 
The steps to create a multi-label classifier for satellite imagery are outlined to identify weather and other features in each satellite tile.']}, {'end': 2232.632, 'start': 1993.351, 'title': 'Implementing resnet-50 and metrics in cnn', 'summary': 'Explores implementing resnet-50 in a cnn model, highlighting the use of f2 score as a metric and the significance of setting a threshold in the context of model training and evaluation.', 'duration': 239.281, 'highlights': ['Implementing ResNet-50 in a CNN model The speaker initially used ResNet-34 but found that implementing ResNet-50 helped improve the model slightly, showcasing the practical application of different architectures in CNN.', 'Significance of F2 score as a metric The speaker explains the importance of F2 score as the metric for evaluation, emphasizing its relevance in the context of Kaggle competition rules and its function in weighing false positives and false negatives in a classifier.', 'Importance of setting a threshold in model training The speaker discusses the significance of setting a threshold in the context of model evaluation, emphasizing its role in determining the class ID pet in the pet detector model and its connection to the accuracy function.']}, {'end': 2570.379, 'start': 2233.012, 'title': 'Satellite recognition accuracy', 'summary': 'Explains the process of determining accuracy for satellite recognition, including the use of thresholds, creating customized functions, and achieving an accuracy of about 96% and an f beta of about 0.926, leading to a top 50th position of about 0.93 on the leaderboard.', 'duration': 337.367, 'highlights': ['The process of determining accuracy for satellite recognition involves comparing probabilities to a threshold and assuming the model has a feature if the probability is higher than the threshold.', "Creating a customized function using Python 3's 'partial' for calling the 'AccuracyThresh' function with a threshold of 0.2 is a common practice in the 
FastAI library.", 'Achieving an accuracy of about 96% and an F beta of about 0.926, leading to a top 50th position of about 0.93 on the leaderboard is a significant result in the context of satellite recognition.']}], 'duration': 671.83, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM1898549.jpg', 'highlights': ['Enabling flip vert equals true generally improves model generalization for data lacking orientation, like astronomical or pathology digital slide data.', 'The importance of modifying training batches for photography due to changes in shape when taking photos from different perspectives.', 'The process of determining accuracy for satellite recognition involves comparing probabilities to a threshold and assuming the model has a feature if the probability is higher than the threshold.', 'Achieving an accuracy of about 96% and an F beta of about 0.926, leading to a top 50th position of about 0.93 on the leaderboard is a significant result in the context of satellite recognition.', 'Implementing ResNet-50 in a CNN model helped improve the model slightly, showcasing the practical application of different architectures in CNN.', 'The importance of setting a threshold in the context of model evaluation, emphasizing its role in determining the class ID pet in the pet detector model and its connection to the accuracy function.', 'The significance of F2 score as a metric for evaluation, emphasizing its relevance in the context of Kaggle competition rules and its function in weighing false positives and false negatives in a classifier.', 'The insignificance of perspective warping for satellite images as satellites always point straight down at the planet.']}, {'end': 2913.024, 'segs': [{'end': 2597.254, 'src': 'embed', 'start': 2571.099, 'weight': 0, 'content': [{'end': 2576.301, 'text': 'And then at the end of the day or at the end of the week, you could set up a little job to run something.', 'start': 
2571.099, 'duration': 5.202}, {'end': 2578.562, 'text': 'Or you can manually run something.', 'start': 2576.441, 'duration': 2.121}, {'end': 2582.164, 'text': "And what are you going to do? You're going to do some fine tuning.", 'start': 2578.582, 'duration': 3.582}, {'end': 2585.606, 'text': 'What does fine tuning look like? Good segue, Rachel.', 'start': 2582.844, 'duration': 2.762}, {'end': 2586.667, 'text': 'It looks like this.', 'start': 2585.926, 'duration': 0.741}, {'end': 2590.149, 'text': "So let's pretend here's your saved model.", 'start': 2587.687, 'duration': 2.462}, {'end': 2593.351, 'text': 'And so then we unfreeze.', 'start': 2592.09, 'duration': 1.261}, {'end': 2597.254, 'text': 'And then we fit a little bit more.', 'start': 2594.792, 'duration': 2.462}], 'summary': 'The process involves setting up a job to run or manually running a task for fine-tuning a saved model.', 'duration': 26.155, 'max_score': 2571.099, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2571099.jpg'}, {'end': 2694.758, 'src': 'embed', 'start': 2665.589, 'weight': 2, 'content': [{'end': 2668.512, 'text': "And it's basically the order that you see in the example of use right?", 'start': 2665.589, 'duration': 2.923}, {'end': 2674.026, 'text': 'What kind of data do you have?', 'start': 2671.024, 'duration': 3.002}, {'end': 2675.867, 'text': 'Where does it come from?', 'start': 2675.066, 'duration': 0.801}, {'end': 2677.428, 'text': 'How do you label it?', 'start': 2676.627, 'duration': 0.801}, {'end': 2679.029, 'text': 'How do you split it?', 'start': 2678.248, 'duration': 0.781}, {'end': 2680.97, 'text': 'What kind of data sets do you want?', 'start': 2679.669, 'duration': 1.301}, {'end': 2682.951, 'text': 'Optionally, how do I transform it?', 'start': 2681.41, 'duration': 1.541}, {'end': 2685.213, 'text': 'And then, how do I create a data bunch from it?', 'start': 2683.732, 'duration': 1.481}, {'end': 2687.194, 'text': 
"So they're the steps.", 'start': 2686.453, 'duration': 0.741}, {'end': 2694.758, 'text': 'I mean, we invented this API.', 'start': 2688.094, 'duration': 6.664}], 'summary': 'Steps to use the api: order, data type, source, labeling, splitting, desired datasets, optional transformation, and creating a data bunch.', 'duration': 29.169, 'max_score': 2665.589, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2665589.jpg'}, {'end': 2745.843, 'src': 'embed', 'start': 2721.488, 'weight': 3, 'content': [{'end': 2727.852, 'text': 'You tend to see it more in, like ETL software, like extraction, transformation and loading software,', 'start': 2721.488, 'duration': 6.364}, {'end': 2729.954, 'text': "where there's kind of particular stages in a pipeline.", 'start': 2727.852, 'duration': 2.102}, {'end': 2732.514, 'text': "It's been inspired by a bunch of things.", 'start': 2730.973, 'duration': 1.541}, {'end': 2735.296, 'text': 'But yeah,', 'start': 2732.594, 'duration': 2.702}, {'end': 2745.843, 'text': 'all you need to know is kind of use this example to guide you and then look up the documentation to see you know which particular kind of thing you want.', 'start': 2735.296, 'duration': 10.547}], 'summary': 'Etl software involves stages in a pipeline, inspired by various sources, and can be understood using examples and documentation.', 'duration': 24.355, 'max_score': 2721.488, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2721488.jpg'}, {'end': 2913.024, 'src': 'embed', 'start': 2888.361, 'weight': 4, 'content': [{'end': 2894.603, 'text': "So you can grab the frames with the web API and then they're just images which you can pass along.", 'start': 2888.361, 'duration': 6.242}, {'end': 2897.243, 'text': "If you're doing it client side.", 'start': 2895.923, 'duration': 1.32}, {'end': 2901.602, 'text': 'I guess most people tend to use OpenCV for that,', 'start': 
2897.901, 'duration': 3.701}, {'end': 2913.024, 'text': 'but maybe people during the week who are doing these video apps can tell us what have you used and found useful and we can start to prepare something in the lesson wiki with a list of video resources,', 'start': 2901.602, 'duration': 11.422}], 'summary': 'Web api allows grabbing frames as images. opencv commonly used for video apps.', 'duration': 24.663, 'max_score': 2888.361, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2888361.jpg'}], 'start': 2571.099, 'title': 'Model fine-tuning and data block ideology', 'summary': 'Covers fine-tuning model for improved performance through unfreezing and adjusting learning rate, as well as discusses data block ideology, creation of data bunch, and application of fast ai functions, including handling video frames using web apis and opencv.', 'chapters': [{'end': 2638.949, 'start': 2571.099, 'title': 'Model fine-tuning for improved performance', 'summary': 'Discusses the process of fine-tuning a model by unfreezing, fitting the misclassified instances with a slightly higher learning rate or running them through a few more epochs, ultimately leading to improved model performance.', 'duration': 67.85, 'highlights': ['You can fit misclassified instances with a slightly higher learning rate or run them through a few more epochs, ultimately leading to improved model performance.', 'Fine-tuning a model involves unfreezing and fitting with the original or new data bunch containing misclassified instances.', 'At the end of the day or week, setting up a job to run or manually running fine-tuning processes for model optimization is recommended.']}, {'end': 2913.024, 'start': 2640.29, 'title': 'Data block ideology and fast ai functions', 'summary': 'Discusses the use of data blocks in a certain order, the steps involved in creating a data bunch, and the availability of fast ai functions for different applications. 
it also touches on the use of web apis and opencv for handling video frames.', 'duration': 272.734, 'highlights': ['The data blocks need to be in a certain order, as seen in the example of use.', 'The steps involved in creating a data bunch include determining the kind of data, its source, labeling, splitting, desired data sets, optional transformations, and creating the data bunch.', 'The API for data blocks has been inspired by a pipeline of things that dot into each other, commonly seen in ETL software, particularly in JavaScript.', 'The availability of specific data block API pieces for different applications, such as text and vision, and the possibility of creating custom stages using Fast AI functions.', 'The use of web APIs for grabbing frames and OpenCV for client-side video frame handling.']}], 'duration': 341.925, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2571099.jpg', 'highlights': ['Fine-tuning model involves unfreezing and fitting with original or new data bunch.', 'Setting up a job to run or manually running fine-tuning processes for model optimization is recommended.', 'Steps for creating a data bunch include determining data kind, source, labeling, splitting, desired data sets, optional transformations, and creating the data bunch.', 'Data block API inspired by a pipeline of things that dot into each other, commonly seen in ETL software.', 'Use of web APIs for grabbing frames and OpenCV for client-side video frame handling.']}, {'end': 3368.732, 'segs': [{'end': 2946.726, 'src': 'embed', 'start': 2913.024, 'weight': 3, 'content': [{'end': 2915.105, 'text': 'since it sounds like some people are interested.', 'start': 2913.024, 'duration': 2.081}, {'end': 2929.672, 'text': 'Okay So just like usual, we unfreeze our model, and then we fit some more, and we get down 929-ish.', 'start': 2919.926, 'duration': 9.746}, {'end': 2940.961, 'text': "So one thing to notice here is that before we 
unfreeze, you'll tend to get this shape pretty much all the time.", 'start': 2931.433, 'duration': 9.528}, {'end': 2944.003, 'text': "If you do your learning rate finder before you unfreeze, it's pretty easy.", 'start': 2941.021, 'duration': 2.982}, {'end': 2946.726, 'text': 'Find the steepest slope, not the bottom.', 'start': 2944.304, 'duration': 2.422}], 'summary': 'Unfreezing the model and fitting leads to a decrease to around 929.', 'duration': 33.702, 'max_score': 2913.024, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2913024.jpg'}, {'end': 3017.974, 'src': 'embed', 'start': 2990.388, 'weight': 4, 'content': [{'end': 2997.734, 'text': 'And then for the second half of my slice, I normally do whatever learning rate I used for the frozen part.', 'start': 2990.388, 'duration': 7.346}, {'end': 3006.1, 'text': 'So LR, which was 0.01, kind of divided by 5 or divided by 10, somewhere around that.', 'start': 2998.174, 'duration': 7.926}, {'end': 3010.463, 'text': "So that's kind of my rule of thumb, right? 
Look for the bit kind of at the bottom.", 'start': 3007.241, 'duration': 3.222}, {'end': 3012.204, 'text': 'Find about 10x smaller.', 'start': 3010.623, 'duration': 1.581}, {'end': 3014.126, 'text': "That's the number that I put here.", 'start': 3012.805, 'duration': 1.321}, {'end': 3017.974, 'text': 'And then LR over 5 or LR over 10 is kind of what I put there.', 'start': 3014.673, 'duration': 3.301}], 'summary': 'For the second half of the slice, use lr around 0.01 divided by 5 or 10.', 'duration': 27.586, 'max_score': 2990.388, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2990388.jpg'}, {'end': 3122.844, 'src': 'embed', 'start': 3085.931, 'weight': 0, 'content': [{'end': 3086.851, 'text': "But there's a second reason.", 'start': 3085.931, 'duration': 0.92}, {'end': 3094.174, 'text': "I now have a model that's pretty good at recognizing the contents of 128 by 128 satellite images.", 'start': 3087.752, 'duration': 6.422}, {'end': 3098.574, 'text': 'So what am I going to do?', 'start': 3097.614, 'duration': 0.96}, {'end': 3103.316, 'text': "if I now want to create a model that's pretty good at 256 by 256 satellite images?", 'start': 3098.574, 'duration': 4.742}, {'end': 3105.997, 'text': "Well, why don't I use transfer learning?", 'start': 3104.257, 'duration': 1.74}, {'end': 3113.08, 'text': "Why don't I start with the model that's good at 128 by 128 images and fine-tune that?", 'start': 3106.597, 'duration': 6.483}, {'end': 3114.761, 'text': "so I don't start again right?", 'start': 3113.08, 'duration': 1.681}, {'end': 3122.844, 'text': "And that's actually going to be really interesting, because if I'm trained quite a lot, if I'm on the verge of overfitting,", 'start': 3115.101, 'duration': 7.743}], 'summary': 'Using transfer learning to upscale model from 128x128 to 256x256 satellite images.', 'duration': 36.913, 'max_score': 3085.931, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM3085931.jpg'}, {'end': 3268.553, 'src': 'embed', 'start': 3239.892, 'weight': 1, 'content': [{'end': 3242.734, 'text': "That's, you know, seems well before it shoots up.", 'start': 3239.892, 'duration': 2.842}, {'end': 3246.076, 'text': "And so let's fit a little bit more.", 'start': 3243.454, 'duration': 2.622}, {'end': 3247.997, 'text': "Okay, so we've frozen again.", 'start': 3246.616, 'duration': 1.381}, {'end': 3250.758, 'text': "So we're just training the last few layers and fit a little bit more.", 'start': 3248.217, 'duration': 2.541}, {'end': 3257.582, 'text': 'And as you can see, I very quickly, remember kind of 928 was where we got to before or after quite a few epochs.', 'start': 3251.178, 'duration': 6.404}, {'end': 3260.423, 'text': "We're straight up there and suddenly we've passed 0.93.", 'start': 3257.662, 'duration': 2.761}, {'end': 3268.553, 'text': "All right, so we're now already kind of into the top 10%.", 'start': 3260.423, 'duration': 8.13}], 'summary': 'Training quickly reached 93% accuracy, surpassing 0.93 in few epochs.', 'duration': 28.661, 'max_score': 3239.892, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM3239892.jpg'}], 'start': 2913.024, 'title': 'Model fine-tuning and transfer learning', 'summary': 'Covers setting learning rates for model fine-tuning, emphasizing unfreezing, identifying steepest slope, and using transfer learning to improve model performance to 0.9315, achieving top 10% out of 1,000 teams through resizing images, freezing layers, and fine-tuning the model.', 'chapters': [{'end': 3017.974, 'start': 2913.024, 'title': 'Fine-tuning model learning rates', 'summary': 'Explains the process of finding and setting learning rates for model fine-tuning, emphasizing the importance of unfreezing and observing changes in the learning rate shape, with particular focus on 
identifying the steepest slope and adjusting the learning rate accordingly.', 'duration': 104.95, 'highlights': ['Before unfreezing, the learning rate shape typically exhibits a steep slope, which can be easily identified through a learning rate finder, generally around 1e neg 3, while after unfreezing, the shape changes to a more gradual incline and decline, requiring the identification of a point just before a steep incline and adjusting the learning rate to about 10x smaller as a rule of thumb.', 'The recommended learning rate for the first half of the slice after unfreezing is approximately 1e neg 5, while for the second half, it is usually the learning rate used for the frozen part divided by 5 or 10.']}, {'end': 3368.732, 'start': 3018.495, 'title': 'Improving model performance with transfer learning', 'summary': 'Discusses using transfer learning to improve model performance from 0.929 to 0.9315 in a satellite imagery recognition competition, aiming for the top 10% out of 1,000 teams, achieved through resizing images, freezing layers, and fine-tuning the model.', 'duration': 350.237, 'highlights': ['Using transfer learning to fine-tune a model for 256 by 256 satellite images, achieving a performance improvement from 0.929 to 0.9315, aiming for the top 10% in a competition with about 1,000 teams.', 'Utilizing the resizing of images from 128 by 128 to 256 by 256 for improved model recognition, effectively creating a new dataset and preventing overfitting.', 'Freezing layers and training only the last few layers to further improve model performance, achieving a competitive edge in a challenging competition.']}], 'duration': 455.708, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM2913024.jpg', 'highlights': ['Using transfer learning to fine-tune a model for 256 by 256 satellite images, achieving a performance improvement from 0.929 to 0.9315, aiming for the top 10% in a competition with about 1,000 
teams.', 'Freezing layers and training only the last few layers to further improve model performance, achieving a competitive edge in a challenging competition.', 'Utilizing the resizing of images from 128 by 128 to 256 by 256 for improved model recognition, effectively creating a new dataset and preventing overfitting.', 'Before unfreezing, the learning rate shape typically exhibits a steep slope, which can be easily identified through a learning rate finder, generally around 1e neg 3, while after unfreezing, the shape changes to a more gradual incline and decline, requiring the identification of a point just before a steep incline and adjusting the learning rate to about 10x smaller as a rule of thumb.', 'The recommended learning rate for the first half of the slice after unfreezing is approximately 1e neg 5, while for the second half, it is usually the learning rate used for the frozen part divided by 5 or 10.']}, {'end': 4683.658, 'segs': [{'end': 3580.243, 'src': 'embed', 'start': 3546.046, 'weight': 0, 'content': [{'end': 3548.307, 'text': 'you know what objects are around and where are they.', 'start': 3546.046, 'duration': 2.261}, {'end': 3557.089, 'text': "So in this case, there's a nice data set called CamVid, which we can download.", 'start': 3548.968, 'duration': 8.121}, {'end': 3564.873, 'text': 'And they have already got a whole bunch of images and segment masks prepared for us, which is pretty cool.', 'start': 3557.529, 'duration': 7.344}, {'end': 3580.243, 'text': 'And remember, pretty much all of the data sets that we have provided inbuilt URLs for, you can see their details at course.fast.ai slash data sets.', 'start': 3566.174, 'duration': 14.069}], 'summary': 'The CamVid data set provides images and segment masks; details of the data sets with inbuilt URLs are available at course.fast.ai/datasets.', 'duration': 34.197, 'max_score': 3546.046, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM3546046.jpg'}, {'end': 3747.852, 
'src': 'embed', 'start': 3719.393, 'weight': 1, 'content': [{'end': 3723.082, 'text': 'try a few numbers and find out which ones work best.', 'start': 3719.393, 'duration': 3.689}, {'end': 3731.405, 'text': "And within a small number of weeks you will find that you're picking the best learning rate most of the time.", 'start': 3723.483, 'duration': 7.922}, {'end': 3733.606, 'text': "So I don't know.", 'start': 3732.446, 'duration': 1.16}, {'end': 3740.749, 'text': 'So at this stage, it still requires a bit of playing around to get a sense of the different kinds of shapes that you see and how to respond to them.', 'start': 3733.886, 'duration': 6.863}, {'end': 3747.852, 'text': 'Maybe by the time this video comes out, someone will have a pretty reliable auto learning rate finder.', 'start': 3742.43, 'duration': 5.422}], 'summary': 'Experiment with different numbers to find the best learning rate; a reliable auto learning rate finder may be developed soon.', 'duration': 28.459, 'max_score': 3719.393, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM3719393.jpg'}, {'end': 3897.021, 'src': 'heatmap', 'start': 3799.415, 'weight': 4, 'content': [{'end': 3805.038, 'text': 'So I always start by untarring my data, do an LS, see what I was given.', 'start': 3799.415, 'duration': 5.623}, {'end': 3809.12, 'text': "In this case, there's a folder called labels and a folder called images.", 'start': 3805.058, 'duration': 4.062}, {'end': 3811.656, 'text': "So I'll create paths for each of those.", 'start': 3809.795, 'duration': 1.861}, {'end': 3814.477, 'text': "We'll take a look inside each of those.", 'start': 3812.836, 'duration': 1.641}, {'end': 3826.341, 'text': "And at this point, you can see there's some kind of coded file names for the images and some kind of coded file names for the segment masks.", 'start': 3814.497, 'duration': 11.844}, {'end': 3829.282, 'text': 'And then you kind of have to figure out 
how to map from one to the other.', 'start': 3826.881, 'duration': 2.401}, {'end': 3834.543, 'text': 'Normally, these kind of data sets will come with a README you can look at, or you can look at their website.', 'start': 3830.482, 'duration': 4.061}, {'end': 3836.444, 'text': "Often, it's kind of obvious.", 'start': 3835.544, 'duration': 0.9}, {'end': 3841.589, 'text': 'In this case, I can see, like, these ones always have this kind of particular format.', 'start': 3837.048, 'duration': 4.541}, {'end': 3845.71, 'text': 'These ones always have exactly the same format with an underscore P.', 'start': 3842.009, 'duration': 3.701}, {'end': 3848.411, 'text': 'So I kind of, when I did this, honestly, I just guessed.', 'start': 3845.71, 'duration': 2.701}, {'end': 3851.912, 'text': "I thought, oh, it's probably the same thing, underscore P.", 'start': 3848.531, 'duration': 3.381}, {'end': 3860.214, 'text': 'And so I created a little function that basically took the file name and added the underscore P and put it in a different place.', 'start': 3851.912, 'duration': 8.302}, {'end': 3863.014, 'text': 'And I tried opening it, and I noticed it worked.', 'start': 3860.614, 'duration': 2.4}, {'end': 3871.538, 'text': "So, you know, so I've created this little function that converts from the image file names to the equivalent label file names.", 'start': 3863.334, 'duration': 8.204}, {'end': 3874.279, 'text': 'I opened up that to make sure it works.', 'start': 3872.238, 'duration': 2.041}, {'end': 3881.881, 'text': 'Normally, we use open image to open a file, and then you can go .show to take a look at it.', 'start': 3875.619, 'duration': 6.262}, {'end': 3887.403, 'text': 'But this, as we described, this is not a usual image file.', 'start': 3883.962, 'duration': 3.441}, {'end': 3889.704, 'text': 'It contains integers.', 'start': 3887.743, 'duration': 1.961}, {'end': 3893.145, 'text': 'So you have to use open mask.', 'start': 3891.344, 'duration': 1.801}, {'end': 3897.021, 
'text': 'rather than open image because we want to return integers, not floats.', 'start': 3893.7, 'duration': 3.321}], 'summary': 'Analyzing and mapping image and label files, creating a function to convert file names, and using open mask for integer data.', 'duration': 26.926, 'max_score': 3799.415, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM3799415.jpg'}, {'end': 4084.829, 'src': 'embed', 'start': 4057.783, 'weight': 2, 'content': [{'end': 4062.804, 'text': 'Remember, I told you that, for example, sometimes we randomly flip an image, right?', 'start': 4057.783, 'duration': 5.021}, {'end': 4071.987, 'text': "What if we randomly flip the independent variable image but we don't also randomly flip this one?", 'start': 4063.485, 'duration': 8.502}, {'end': 4074.747, 'text': "They're now not matching anymore, right?", 'start': 4072.447, 'duration': 2.3}, {'end': 4080.229, 'text': 'So we need to tell FastAI that I want to transform the y.', 'start': 4075.068, 'duration': 5.161}, {'end': 4082.648, 'text': 'So X is our independent variable.', 'start': 4081.106, 'duration': 1.542}, {'end': 4083.268, 'text': 'Y is dependent.', 'start': 4082.648, 'duration': 0.62}, {'end': 4084.829, 'text': 'I want to transform the Y as well.', 'start': 4083.288, 'duration': 1.541}], 'summary': 'When x is transformed, fastai must apply the same transform to y so that the image and its mask still match.', 'duration': 27.046, 'max_score': 4057.783, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM4057783.jpg'}, {'end': 4285.967, 'src': 'heatmap', 'start': 4108.629, 'weight': 6, 'content': [{'end': 4109.55, 'text': 'And this is quite nice.', 'start': 4108.629, 'duration': 0.921}, {'end': 4115.593, 'text': "Fast AI because it knows that you've given it a segmentation problem.", 'start': 4110.43, 'duration': 5.163}, {'end': 4120.595, 'text': 'when you call show batch, it actually combines the two pieces for you and it will 
color code the photo.', 'start': 4115.593, 'duration': 5.002}, {'end': 4131.24, 'text': "Isn't that nice? So you can see here the green on the trees and the red on the lines and this kind of color on the walls and so forth.", 'start': 4121.076, 'duration': 10.164}, {'end': 4134.062, 'text': 'So you can see here, here are the pedestrians.', 'start': 4132.001, 'duration': 2.061}, {'end': 4136.063, 'text': "This is the pedestrian's backpack.", 'start': 4134.702, 'duration': 1.361}, {'end': 4139.184, 'text': 'So this is what the ground truth data looks like.', 'start': 4136.502, 'duration': 2.682}, {'end': 4150.148, 'text': "So once we've got that, we can go ahead and create a learner.", 'start': 4140.12, 'duration': 10.028}, {'end': 4151.849, 'text': "I'll show you some more details in a moment.", 'start': 4150.368, 'duration': 1.481}, {'end': 4158.054, 'text': 'Call lr_find, find the sharpest bit, which looks about 1e-2.', 'start': 4152.529, 'duration': 5.525}, {'end': 4168.542, 'text': 'Call fit, passing in slice lr, and see the accuracy, and save the model, and unfreeze, and train a little bit more.', 'start': 4158.054, 'duration': 10.488}, {'end': 4171.555, 'text': "So that's the basic idea.", 'start': 4170.515, 'duration': 1.04}, {'end': 4173.917, 'text': "And so we're going to have a break.", 'start': 4172.377, 'duration': 1.54}, {'end': 4178.761, 'text': "And when we come back, I'm going to show you some little tweaks that we can do.", 'start': 4174.218, 'duration': 4.543}, {'end': 4182.825, 'text': "And I'm also going to explain this custom metric that we've created.", 'start': 4179.261, 'duration': 3.564}, {'end': 4186.207, 'text': "And then we'll be able to go on and look at some other cool things.", 'start': 4183.725, 'duration': 2.482}, {'end': 4189.609, 'text': "So let's all come back at 8 o'clock.", 'start': 4186.688, 'duration': 2.921}, {'end': 4191.011, 'text': 'Six minutes.', 'start': 4190.611, 'duration': 0.4}, {'end': 4196.475, 'text': 'OK 
Welcome back, everybody.', 'start': 4195.154, 'duration': 1.321}, {'end': 4199.538, 'text': "And we're going to start off with a question we got during the break.", 'start': 4196.636, 'duration': 2.902}, {'end': 4211.781, 'text': 'Could you use unsupervised learning here pixel classification with the bike example to avoid needing a human to label a heap of images?', 'start': 4203.598, 'duration': 8.183}, {'end': 4223.725, 'text': "We're not exactly unsupervised learning, but you can certainly get a sense of where things are without needing these kind of labels.", 'start': 4211.801, 'duration': 11.924}, {'end': 4228.666, 'text': "And time permitting, we'll try and see some examples of how to do that.", 'start': 4225.025, 'duration': 3.641}, {'end': 4237.873, 'text': "You're certainly not going to get such a quality and such a specific output as what you see here, though.", 'start': 4231.192, 'duration': 6.681}, {'end': 4246.195, 'text': 'If you want to get this level of segmentation mask, you need a pretty good segmentation mask ground truth to work with.', 'start': 4238.433, 'duration': 7.762}, {'end': 4258.537, 'text': "And is there a reason we shouldn't deliberately make a lot of smaller data sets to step up from in tuning?", 'start': 4252.136, 'duration': 6.401}, {'end': 4264.74, 'text': "Let's say 64 by 64, 128 by 128, 256 by 256 and so on.", 'start': 4258.956, 'duration': 5.784}, {'end': 4268.582, 'text': 'Yes You should totally do that.', 'start': 4266.721, 'duration': 1.861}, {'end': 4269.643, 'text': 'It works great.', 'start': 4269.083, 'duration': 0.56}, {'end': 4271.324, 'text': 'Try it.', 'start': 4271.024, 'duration': 0.3}, {'end': 4281.491, 'text': 'I found this idea is something that I first came up with in the course a couple of years ago,', 'start': 4272.185, 'duration': 9.306}, {'end': 4285.967, 'text': 'and I kind thought it seemed obvious and just presented it as a good idea.', 'start': 4281.491, 'duration': 4.476}], 'summary': 'Using fastai, a 
segmentation learner is created and trained at a learning rate of about 1e-2, and a custom metric is introduced.', 'duration': 177.338, 'max_score': 4108.629, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM4108629.jpg'}, {'end': 4390.458, 'src': 'embed', 'start': 4355.73, 'weight': 3, 'content': [{'end': 4366.65, 'text': "What does accuracy mean for pixel-wise segmentation? Is it correctly classified pixels divided by the total number of pixels? Yep, that's it.", 'start': 4355.73, 'duration': 10.92}, {'end': 4374.253, 'text': "So if you imagined each pixel was a separate object you're classifying, it's exactly the same accuracy.", 'start': 4367.21, 'duration': 7.043}, {'end': 4383.236, 'text': 'And so you actually can just pass in accuracy as your metric.', 'start': 4375.513, 'duration': 7.723}, {'end': 4386.017, 'text': "But in this case, we actually don't.", 'start': 4384.336, 'duration': 1.681}, {'end': 4390.458, 'text': "We've created a new metric called accuracy CamVid.", 'start': 4386.477, 'duration': 3.981}], 'summary': 'Accuracy for pixel-wise segmentation is the number of correctly classified pixels divided by the total number of pixels; here a custom metric, accuracy CamVid, is used instead.', 'duration': 34.728, 'max_score': 4355.73, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM4355730.jpg'}, {'end': 4618.126, 'src': 'embed', 'start': 4573.287, 'weight': 7, 'content': [{'end': 4577.67, 'text': "dropout and data augmentation will be the key things that we'll be talking about.", 'start': 4573.287, 'duration': 4.383}, {'end': 4591.159, 'text': "Okay For segmentation, we don't just create a convolutional neural network.", 'start': 4580.352, 'duration': 10.807}, {'end': 4592.34, 'text': 'We can.', 'start': 4592, 'duration': 0.34}, {'end': 4599.357, 'text': 'But actually, an architecture called UNET turns out to be better.', 'start': 4594.495, 
'duration': 4.862}, {'end': 4603.039, 'text': "And actually, let's find it.", 'start': 4600.077, 'duration': 2.962}, {'end': 4612.283, 'text': 'Okay, so this is what a UNET looks like.', 'start': 4610.622, 'duration': 1.661}, {'end': 4618.126, 'text': 'And this is from the university website where they talk about the UNET.', 'start': 4612.883, 'duration': 5.243}], 'summary': 'Key topics: dropout, data augmentation, unet architecture for segmentation.', 'duration': 44.839, 'max_score': 4573.287, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM4573287.jpg'}], 'start': 3368.792, 'title': 'Image segmentation techniques', 'summary': 'Introduces the CamVid dataset and its implications, discusses the learning rate finder process, delves into fast ai image segmentation, and addresses accuracy in pixel-wise segmentation, including the introduction of unet architecture for segmentation.', 'chapters': [{'end': 3632.896, 'start': 3368.792, 'title': 'Image segmentation with CamVid dataset', 'summary': 'Introduces the CamVid dataset for image segmentation, explaining the process of creating color-coded images for different objects and the implications of using segmented datasets in various domains such as medicine and self-driving cars.', 'duration': 264.104, 'highlights': ['The CamVid dataset is used for image segmentation to create color-coded images for different objects. This dataset is used to create color-coded images for different objects such as bicycles, road lines, trees, buildings, and sky, by assigning unique numbers to each pixel.', 'Segmentation involves labeling every single pixel in an image, requiring a significant amount of work and specialized datasets. 
Segmentation involves classifying every single pixel in an image, which necessitates labeled datasets where every pixel has been marked, commonly found in domains such as medicine, life sciences, and self-driving cars.', 'The CamVid dataset provides pre-prepared images and segment masks, making it accessible for use in projects. The CamVid dataset offers a collection of prepared images and segment masks, eliminating the need to create a segmentation dataset from scratch and providing a valuable resource for various projects.']}, {'end': 4057.303, 'start': 3632.896, 'title': 'Learning rate finder and image segmentation', 'summary': 'Discusses the limitations of finding a suggested number directly for learning rates and the process of experimenting with different learning rates to find the best one, and then delves into the process of image segmentation, including creating paths, mapping image and label file names, and creating a data bunch with non-contiguous validation and training sets.', 'duration': 424.407, 'highlights': ['The process of experimenting with different learning rates to find the best one. By trying different learning rates and experimenting, within a small number of weeks, one can find the best learning rate most of the time.', 'The limitations of finding a suggested number directly for learning rates. The process of finding a suggested number directly for learning rates is still a bit more artisanal and requires a certain amount of experimentation, as it depends on the stage and shape of the learning rate graph.', 'Creating a data bunch with non-contiguous validation and training sets. In the case of non-contiguous parts of the video, the validation and training sets can be split using a file name file to avoid having two frames next to each other, preventing cheating.', 'The process of image segmentation, including creating paths, mapping image and label file names. The process of image segmentation involves creating paths for images and labels, 
mapping file names, and creating a function to convert image file names to label file names.']}, {'end': 4350.888, 'start': 4057.783, 'title': 'Fast ai image segmentation', 'summary': "Discusses the importance of transforming both independent (x) and dependent (y) variables in image segmentation, utilizing a smaller batch size of 8 for creating a classifier, leveraging fast ai's ability to combine and color code segmented images, and the effectiveness of progressive resizing in training models for better generalization and faster training.", 'duration': 293.105, 'highlights': ['The importance of transforming both independent (X) and dependent (Y) variables in image segmentation It is crucial to transform both X and Y variables to ensure matching in image segmentation tasks, emphasizing the need for Y transformation alongside X transformation.', 'Utilizing a smaller batch size of 8 for creating a classifier A smaller batch size of 8 is used for creating a classifier to handle the computational load of creating a classifier for every pixel in image segmentation.', "Leveraging Fast AI's ability to combine and color code segmented images Fast AI has the capability to combine and color code segmented images, simplifying visualization and understanding of the segmented data.", 'The effectiveness of progressive resizing in training models for better generalization and faster training Progressive resizing, a technique involving training with smaller datasets and gradually increasing the size, has been found to result in faster training, better generalization, and was a key factor in winning the ImageNet competition.']}, {'end': 4683.658, 'start': 4355.73, 'title': 'Accuracy in pixel-wise segmentation', 'summary': 'Discusses the creation of the accuracy canvid metric for pixel-wise segmentation, addressing the removal of void pixels and underfitting issues in training loss, and introduces the unet architecture for segmentation.', 'duration': 327.928, 'highlights': ['The 
creation of the CamVid accuracy metric for pixel-wise segmentation, addressing the removal of void pixels. The chapter explains the creation of the CamVid accuracy metric, which involves removing void pixels from the accuracy calculation for pixel-wise segmentation.', 'Addressing underfitting issues in training loss and potential solutions. Underfitting issues in training loss are discussed, with suggestions provided such as training for longer, adjusting learning rates, and decreasing regularization.', 'Introduction of the U-Net architecture for segmentation and its significance. The U-Net architecture is introduced as a better alternative for segmentation, featuring a U-shaped structure and being widely cited and useful beyond biomedical image segmentation.']}], 'duration': 1314.866, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM3368792.jpg', 'highlights': ['The CamVid dataset provides pre-prepared images and segment masks, making it accessible for use in projects.', 'The process of experimenting with different learning rates to find the best one By trying different learning rates and experimenting, within a small number of weeks, one can find the best learning rate most of the time.', 'The importance of transforming both independent (X) and dependent (Y) variables in image segmentation It is crucial to transform both X and Y variables to ensure matching in image segmentation tasks, emphasizing the need for Y transformation alongside X transformation.', 'The creation of the CamVid accuracy metric for pixel-wise segmentation, addressing the removal of void pixels. 
The chapter explains the creation of the CamVid accuracy metric, which involves removing void pixels from the accuracy calculation for pixel-wise segmentation.', 'The process of image segmentation, including creating paths, mapping image and label file names The process of image segmentation involves creating paths for images and labels, mapping file names, and creating a function to convert image file names to label file names.', 'The limitations of finding a suggested number directly for learning rates The process of finding a suggested number directly for learning rates is still a bit more artisanal and requires a certain amount of experimentation, as it depends on the stage and shape of the learning rate graph.', "Leveraging fastai's ability to combine and color code segmented images fastai has the capability to combine and color code segmented images, simplifying visualization and understanding of the segmented data.", 'Introduction of the U-Net architecture for segmentation and its significance. The U-Net architecture is introduced as a better alternative for segmentation, featuring a U-shaped structure and being widely cited and useful beyond biomedical image segmentation.']}, {'end': 5564.689, 'segs': [{'end': 4717.035, 'src': 'embed', 'start': 4684.099, 'weight': 1, 'content': [{'end': 4694.782, 'text': 'All you need to know is if you want to create a segmentation model, you want to be saying Learner.create_unet rather than create_cnn.', 'start': 4684.099, 'duration': 10.683}, {'end': 4706.866, 'text': 'But you pass it the normal stuff, your data bunch, an architecture, and some metrics, okay? 
So having done that, everything else works the same.', 'start': 4695.762, 'duration': 11.104}, {'end': 4717.035, 'text': 'You can do the LR finder, find the slope, train it for a while, watch the accuracy go up, save it from time to time.', 'start': 4707.226, 'duration': 9.809}], 'summary': 'To create a segmentation model, use Learner.create_unet instead of create_cnn, and pass it a data bunch, an architecture, and metrics. Then run the LR finder, find the slope, train, watch the accuracy, and save the model periodically.', 'duration': 32.936, 'max_score': 4684.099, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM4684099.jpg'}, {'end': 4774.392, 'src': 'embed', 'start': 4745.087, 'weight': 2, 'content': [{'end': 4749.37, 'text': 'And this plots your training loss and your validation loss.', 'start': 4745.087, 'duration': 4.283}, {'end': 4755.814, 'text': "And you'll see quite often they actually go up a bit before they go down.", 'start': 4750.611, 'duration': 5.203}, {'end': 4768.348, 'text': "Why is that? That's because you can also plot your learning rate over time, and you'll see that your learning rate goes up and then it goes down.", 'start': 4757.235, 'duration': 11.113}, {'end': 4774.392, 'text': "Why is that? 
Because we said fit one cycle, and that's what fit one cycle does.", 'start': 4768.969, 'duration': 5.423}], 'summary': 'Plot training and validation loss, observe learning rate cycle.', 'duration': 29.305, 'max_score': 4745.087, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM4745087.jpg'}, {'end': 5089.011, 'src': 'embed', 'start': 5061.024, 'weight': 3, 'content': [{'end': 5063.106, 'text': 'And eventually, the learning rate starts to come down again.', 'start': 5061.024, 'duration': 2.082}, {'end': 5067.69, 'text': "And so it'll tend to find its way to these flat areas.", 'start': 5065.388, 'duration': 2.302}, {'end': 5086.41, 'text': "It turns out that gradually increasing the learning rate is a really good way of helping the model to explore the whole function surface and try and find areas where both the loss is low and also it's not bumpy.", 'start': 5069.884, 'duration': 16.526}, {'end': 5089.011, 'text': 'Because if it was bumpy, it would get kicked out again.', 'start': 5086.69, 'duration': 2.321}], 'summary': 'Gradually increasing the learning rate helps model explore function surface and find low-loss, smooth areas.', 'duration': 27.987, 'max_score': 5061.024, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM5061024.jpg'}, {'end': 5335.716, 'src': 'embed', 'start': 5299.398, 'weight': 0, 'content': [{'end': 5310.926, 'text': 'The best paper I know of for segmentation was a paper called the Hundred Layers Tiramisu, which developed a convolutional dense net,', 'start': 5299.398, 'duration': 11.528}, {'end': 5312.267, 'text': 'came out about two years ago.', 'start': 5310.926, 'duration': 1.341}, {'end': 5320.633, 'text': 'So after I trained this today, I went back and looked at the paper to find their state of the art accuracy.', 'start': 5312.807, 'duration': 7.826}, {'end': 5335.716, 'text': 'Here it is, and I looked it up, and their best 
was 91.5, and we got 92.1.', 'start': 5325.031, 'duration': 10.685}], 'summary': 'The Hundred Layers Tiramisu paper, which developed a convolutional dense net, reported a best accuracy of 91.5; the model trained in the lesson got 92.1.', 'duration': 36.318, 'max_score': 5299.398, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM5299398.jpg'}, {'end': 5482.976, 'src': 'embed', 'start': 5459.97, 'weight': 4, 'content': [{'end': 5472.748, 'text': "There's another trick you can use if you're running out of memory a lot, which is you can actually do something called mixed precision training.", 'start': 5459.97, 'duration': 12.778}, {'end': 5476.311, 'text': 'And mixed precision training means that,', 'start': 5473.669, 'duration': 2.642}, {'end': 5482.976, 'text': 'instead of using for those of you that have done a little bit of computer science instead of using single precision floating point numbers,', 'start': 5476.311, 'duration': 6.665}], 'summary': 'Mixed precision training can optimize memory usage by using lower precision floating point numbers.', 'duration': 23.006, 'max_score': 5459.97, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM5459970.jpg'}], 'start': 4684.099, 'title': 'Optimizing segmentation model training', 'summary': 'Explains best practices for training a segmentation model, emphasizing the importance of using Learner.create_unet over create_cnn, adjusting learning rates for faster convergence, and decreasing the learning rate during training to take smaller steps. 
It also discusses the benefits of gradually increasing the learning rate, leading to faster training and more generalizable solutions, and achieving an accuracy of 92.15% with fastai for segmentation, surpassing the state-of-the-art accuracy of 91.5%.', 'chapters': [{'end': 4981.164, 'start': 4684.099, 'title': 'Optimizing segmentation model training', 'summary': 'Explains the best practices for training a segmentation model, including the importance of using Learner.create_unet over create_cnn, adjusting learning rates for faster convergence, and the significance of decreasing the learning rate during training to take smaller steps as the model gets closer to the best answer.', 'duration': 297.065, 'highlights': ['The importance of using Learner.create_unet over create_cnn for creating a segmentation model, and the process for passing data bunch, architecture, and metrics to it.', 'The significance of adjusting the learning rate, demonstrated by learn.recorder plotting the training loss and validation loss, and the behavior of learning rate during fit one cycle.', "The impact of different learning rates on model training, illustrated by Jose Fernandez Portal's project, showing the effects of learning rates of 0.1, 0.7, and extremely high values on weights convergence and loss improvement.", 'The rationale behind gradually decreasing the learning rate during training to take smaller steps as the model gets closer to the best answer, and the concept of learning rate annealing and the recent idea of gradually increasing it at the start, attributed to Leslie Smith.']}, {'end': 5260.016, 'start': 4983.025, 'title': 'Gradual learning rate increase', 'summary': 'Discusses the benefits of gradually increasing the learning rate, leading to faster training and more generalizable solutions, with a particular emphasis on exploring the entire function surface and finding areas with low loss and minimal bumpiness.', 'duration': 276.991, 'highlights': ['Gradually increasing the 
learning rate helps the model explore the entire function surface and find areas with both low loss and minimal bumpiness, leading to more generalizable solutions and faster problem-solving.', 'Using a maximum learning rate that results in a slight worsening followed by significant improvement indicates a good maximum learning rate, leading to faster training and better generalization.', 'Experimenting with different learning rates, epochs, and batch sizes while observing validation set results helps in understanding the impact on training speed and generalization.', 'Adjusting batch size and image size based on GPU memory constraints to ensure effective problem-solving and model training.', 'Restarting the kernel, creating a new learner, and loading saved weights can help address GPU memory issues and ensure continuity in the learning process.']}, {'end': 5564.689, 'start': 5260.836, 'title': 'Improving segmentation accuracy with fast ai', 'summary': 'Discusses using fast ai for segmentation, achieving an accuracy of 92.15%, surpassing the state-of-the-art accuracy of 91.5%, and utilizing mixed precision training for faster and efficient gpu utilization.', 'duration': 303.853, 'highlights': ["Achieving an accuracy of 92.15% in segmentation using Fast AI, surpassing the state-of-the-art accuracy of 91.5% from the paper 'Hundred Layers Tiramisu'. The achieved accuracy of 92.15% using Fast AI in segmentation, exceeding the state-of-the-art accuracy of 91.5% from the 'Hundred Layers Tiramisu' paper, demonstrating the effectiveness of the approach.", 'Utilizing mixed precision training for faster and efficient GPU utilization, resulting in training being about twice as fast and using less GPU RAM. 
The use of mixed precision training in Fast AI for faster and efficient GPU utilization, leading to about twice as fast training and reduced GPU RAM usage, especially beneficial for recent GPUs like 2080 TI.']}], 'duration': 880.59, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM4684099.jpg', 'highlights': ["Achieving an accuracy of 92.15% in segmentation using Fast AI, surpassing the state-of-the-art accuracy of 91.5% from the paper 'Hundred Layers Tiramisu'.", 'The importance of using Learner.CreateUnit over CreateCNN for creating a segmentation model, and the process for passing data bunch, architecture, and metrics to it.', 'The significance of adjusting the learning rate, demonstrated by learning.recorder plotting the training loss and validation loss, and the behavior of learning rate during fit one cycle.', 'Gradually increasing the learning rate helps the model explore the entire function surface and find areas with both low loss and minimal bumpiness, leading to more generalizable solutions and faster problem-solving.', 'Utilizing mixed precision training for faster and efficient GPU utilization, resulting in training being about twice as fast and using less GPU RAM.']}, {'end': 6056.124, 'segs': [{'end': 5594.74, 'src': 'embed', 'start': 5565.97, 'weight': 0, 'content': [{'end': 5573.374, 'text': 'Now, I actually have never seen people use mixed precision floating point for segmentation before.', 'start': 5565.97, 'duration': 7.404}, {'end': 5582.74, 'text': 'Just for a bit of a laugh, I tried it and actually discovered that I got an even better result.', 'start': 5574.095, 'duration': 8.645}, {'end': 5589.877, 'text': "So I only found this this morning, so I don't have anything more to add here other than Quite.", 'start': 5583.46, 'duration': 6.417}, {'end': 5594.74, 'text': 'often when you make things a little bit less precise in deep learning, it generalizes a little bit better.', 'start': 
5589.877, 'duration': 4.863}], 'summary': 'Using mixed precision floating point for segmentation yielded better results, demonstrating improved generalization in deep learning.', 'duration': 28.77, 'max_score': 5565.97, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM5565970.jpg'}, {'end': 5701.671, 'src': 'embed', 'start': 5674.812, 'weight': 5, 'content': [{'end': 5678.635, 'text': "And so we're going to try and create a model that can find the center of a face.", 'start': 5674.812, 'duration': 3.823}, {'end': 5683.879, 'text': 'So for this data set.', 'start': 5679.936, 'duration': 3.943}, {'end': 5691.504, 'text': "there's a few data set, specific things we have to do, which I don't really even understand, but I just know from the readme that you have to.", 'start': 5683.879, 'duration': 7.625}, {'end': 5694.346, 'text': 'They used some kind of depth sensing camera.', 'start': 5692.145, 'duration': 2.201}, {'end': 5696.928, 'text': 'I think they actually used a Kinect, you know, Xbox Kinect.', 'start': 5694.366, 'duration': 2.562}, {'end': 5701.671, 'text': "There's some kind of calibration numbers that they provide in a little file, which I had to read in.", 'start': 5697.568, 'duration': 4.103}], 'summary': 'Creating a model to find the center of a face using kinect data.', 'duration': 26.859, 'max_score': 5674.812, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM5674812.jpg'}, {'end': 6056.124, 'src': 'embed', 'start': 6027.324, 'weight': 1, 'content': [{'end': 6032.847, 'text': "So we're trying to predict something which is somewhere around a few hundred.", 'start': 6027.324, 'duration': 5.523}, {'end': 6038.378, 'text': "And we're getting a squared error on average of 0.0004.", 'start': 6033.917, 'duration': 4.461}, {'end': 6040.619, 'text': 'So we can feel pretty confident that this is a really good model.', 'start': 6038.378, 'duration': 
2.241}, {'end': 6043.4, 'text': 'And then we can look at the results by learn.show_results.', 'start': 6041.079, 'duration': 2.321}, {'end': 6046.541, 'text': 'And we can see predictions, ground truth.', 'start': 6043.44, 'duration': 3.101}, {'end': 6051.543, 'text': "It's doing a nearly perfect job.", 'start': 6047.781, 'duration': 3.762}, {'end': 6056.124, 'text': "So that's how you can do image regression models.", 'start': 6051.923, 'duration': 4.201}], 'summary': 'The model predicts values around a few hundred with an average squared error of 0.0004, a nearly perfect image regression model.', 'duration': 28.8, 'max_score': 6027.324, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM6027324.jpg'}], 'start': 5565.97, 'title': 'Mixed precision for better segmentation and creating image regression model', 'summary': 'Explores the utilization of mixed precision floating point for segmentation, achieving 92.5% accuracy on CamVid and potential performance improvements. 
Additionally, it discusses building an image regression model with a mean squared error of 0.0004, demonstrating highly accurate predictions.', 'chapters': [{'end': 5617.355, 'start': 5565.97, 'title': 'Mixed precision for better segmentation', 'summary': 'Discusses the use of mixed precision floating point for segmentation, resulting in a 92.5% accuracy on CamVid and potentially faster processing and improved results, indicating the benefits of reduced precision in deep learning.', 'duration': 51.385, 'highlights': ['Using mixed precision floating point for segmentation led to a 92.5% accuracy on CamVid, showcasing the effectiveness of this approach.', 'Reducing precision in deep learning can lead to better generalization, potentially improving results and enabling the use of larger batch sizes.', 'The discovery of improved results through mixed precision floating point for segmentation demonstrates the practical benefits of this approach.']}, {'end': 6056.124, 'start': 5618.362, 'title': 'Creating image regression model', 'summary': 'Discusses creating an image regression model to predict the center of a face using a specific dataset, achieving a mean squared error of 0.0004 and showing nearly perfect results.', 'duration': 437.762, 'highlights': ['Creating an image regression model to predict the center of a face using a specific dataset The discussion revolves around creating a model to predict the center of a face using the BIWI head pose dataset, which involves dataset-specific requirements and depth sensing camera calibration numbers.', 'Achieving a mean squared error of 0.0004 The model achieves a mean squared error of 0.0004, indicating a high level of accuracy in predicting the center of the face.', 'Showing nearly perfect results The results of the model are nearly perfect, as indicated by the predictions and ground truth shown, demonstrating the effectiveness of the image regression model.']}], 'duration': 490.154, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM5565970.jpg', 'highlights': ['Using mixed precision floating point for segmentation led to a 92.5% accuracy on CamVid, showcasing the effectiveness of this approach.', 'Achieving a mean squared error of 0.0004, indicating a high level of accuracy in predicting the center of the face.', 'Reducing precision in deep learning can lead to better generalization, potentially improving results and enabling the use of larger batch sizes.', 'The discovery of improved results through mixed precision floating point for segmentation demonstrates the practical benefits of this approach.', 'The results of the model are nearly perfect, as indicated by the predictions and ground truth shown, demonstrating the effectiveness of the image regression model.', 'Creating an image regression model to predict the center of a face using a specific dataset involving specific data set requirements and depth sensing camera calibration numbers.']}, {'end': 7490.267, 'segs': [{'end': 6147.088, 'src': 'heatmap', 'start': 6064.582, 'weight': 0.712, 'content': [{'end': 6073.488, 'text': 'So last example before we look at some kind of more foundational theory stuff, NLP.', 'start': 6064.582, 'duration': 8.906}, {'end': 6077.271, 'text': "And next week we're going to be looking at a lot more NLP.", 'start': 6074.069, 'duration': 3.202}, {'end': 6087.038, 'text': "But let's now do the same thing, but rather than creating a classification of pictures, let's try and classify documents.", 'start': 6077.911, 'duration': 9.127}, {'end': 6092.261, 'text': "And so we're going to go through this in a lot more detail next week, but let's do the quick version.", 'start': 6088.719, 'duration': 3.542}, {'end': 6098.634, 'text': 'Rather than importing from fastai.vision, I now import for the first time from fastai.text.', 'start': 6093.792, 'duration': 4.842}, {'end': 6103.016, 'text': "That's where you'll find all the 
application-specific stuff for analyzing text documents.", 'start': 6098.934, 'duration': 4.082}, {'end': 6107.278, 'text': "And in this case, we're going to use a dataset called imdb.", 'start': 6104.236, 'duration': 3.042}, {'end': 6112.06, 'text': 'And imdb has lots of movie reviews.', 'start': 6108.138, 'duration': 3.922}, {'end': 6114.981, 'text': "They're generally about a couple of thousand words.", 'start': 6113.34, 'duration': 1.641}, {'end': 6122.224, 'text': 'And each movie review has been classified as either negative or positive.', 'start': 6117.042, 'duration': 5.182}, {'end': 6125.098, 'text': "So it's just in a CSV file.", 'start': 6123.277, 'duration': 1.821}, {'end': 6126.338, 'text': 'So we can use pandas to read it.', 'start': 6125.138, 'duration': 1.2}, {'end': 6127.359, 'text': 'We can take a little look.', 'start': 6126.379, 'duration': 0.98}, {'end': 6129.7, 'text': 'We can take a look at a review.', 'start': 6127.799, 'duration': 1.901}, {'end': 6140.765, 'text': 'And basically, as per usual, we can either use factory methods or the data block API to create a data bunch.', 'start': 6132.281, 'duration': 8.484}, {'end': 6147.088, 'text': "So here's the quick way to create a data bunch from a CSV of texts, data bunch from CSV.", 'start': 6141.366, 'duration': 5.722}], 'summary': 'Introduction to classifying movie reviews as negative or positive using nlp with fastai.text and the imdb dataset.', 'duration': 82.506, 'max_score': 6064.582, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM6064582.jpg'}, {'end': 6375.185, 'src': 'heatmap', 'start': 6282.798, 'weight': 1, 'content': [{'end': 6290.562, 'text': 'So through tokenization and numericalization, this is the standard way in NLP of turning a document into a list of numbers.', 'start': 6282.798, 'duration': 7.764}, {'end': 6299.291, 'text': "We can do that with the data block API, right? 
So this time it's not image files list.", 'start': 6293.005, 'duration': 6.286}, {'end': 6310.721, 'text': "It's text split data from a CSV, convert them to data sets, tokenize them, numericalize them, create a data bunch.", 'start': 6299.831, 'duration': 10.89}, {'end': 6317.847, 'text': 'And at that point, we can start to create a model.', 'start': 6312.102, 'duration': 5.745}, {'end': 6325.648, 'text': "As we'll learn about next week, when we do NLP classification, we actually create two models.", 'start': 6319.385, 'duration': 6.263}, {'end': 6335.131, 'text': 'The first model is something called a language model, which, as you can see, we train in a kind of a usual way.', 'start': 6326.428, 'duration': 8.703}, {'end': 6337.192, 'text': 'We say we want to create a language model learner.', 'start': 6335.171, 'duration': 2.021}, {'end': 6338.632, 'text': 'We train it.', 'start': 6337.892, 'duration': 0.74}, {'end': 6340.773, 'text': 'We can save it.', 'start': 6339.993, 'duration': 0.78}, {'end': 6341.654, 'text': 'We unfreeze.', 'start': 6340.853, 'duration': 0.801}, {'end': 6342.714, 'text': 'We train some more.', 'start': 6341.754, 'duration': 0.96}, {'end': 6348.019, 'text': "And then after we've created a language model, We fine tune it to create the classifier.", 'start': 6343.414, 'duration': 4.605}, {'end': 6350.72, 'text': "So here's the thing where we create the data bunch for the classifier.", 'start': 6348.239, 'duration': 2.481}, {'end': 6353.821, 'text': 'We create a learner.', 'start': 6352.92, 'duration': 0.901}, {'end': 6356.301, 'text': 'We train it.', 'start': 6355.661, 'duration': 0.64}, {'end': 6361.422, 'text': 'And we end up with some accuracy.', 'start': 6360.422, 'duration': 1}, {'end': 6364.383, 'text': "So that's the really quick version.", 'start': 6362.382, 'duration': 2.001}, {'end': 6366.203, 'text': "We're going to go through it in more detail next week.", 'start': 6364.423, 'duration': 1.78}, {'end': 6375.185, 'text': "But you 
can see the basic idea of training an NLP classifier is very, very, very similar to creating every other model we've seen so far.", 'start': 6366.243, 'duration': 8.942}], 'summary': 'NLP model training involves tokenization, numericalization, and creating language and classifier models with data bunch, achieving accuracy.', 'duration': 59.916, 'max_score': 6282.798, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM6282798.jpg'}, {'end': 6433.95, 'src': 'embed', 'start': 6390.732, 'weight': 0, 'content': [{'end': 6396.136, 'text': 'And this basically what I just showed you is pretty much the state of the art algorithm with some minor tweaks.', 'start': 6390.732, 'duration': 5.404}, {'end': 6400.199, 'text': 'You can get this up to about 95% if you try really hard.', 'start': 6396.636, 'duration': 3.563}, {'end': 6404.783, 'text': 'So this is very close to the state of the art accuracy that we developed.', 'start': 6400.339, 'duration': 4.444}, {'end': 6408.492, 'text': "That's a question.", 'start': 6407.932, 'duration': 0.56}, {'end': 6409.893, 'text': "Okay, now's a great time for a question.", 'start': 6408.572, 'duration': 1.321}, {'end': 6421.461, 'text': 'For a data set very different than ImageNet, like the satellite images or genomic images shown in Lesson 2, we should use our own stats.', 'start': 6413.836, 'duration': 7.625}, {'end': 6426.805, 'text': "Jeremy once said if you're using a pre-trained model, you need to use the same stats it was trained with.", 'start': 6421.681, 'duration': 5.124}, {'end': 6428.346, 'text': 'Why is that?', 'start': 6427.605, 'duration': 0.741}, {'end': 6433.95, 'text': "Isn't it that normalized data with its own stats will have roughly the same distribution like ImageNet?", 'start': 6428.746, 'duration': 5.204}], 'summary': 'The approach shown is close to state of the art and can reach about 95% accuracy with effort. A student asks why a pre-trained model needs the stats it was trained with, even for data very different from ImageNet, such as satellite or genomic images.', 'duration': 43.218, 
'max_score': 6390.732, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM6390732.jpg'}, {'end': 6595.521, 'src': 'embed', 'start': 6565.561, 'weight': 4, 'content': [{'end': 6567.883, 'text': "That's far more than a linear classifier can do.", 'start': 6565.561, 'duration': 2.322}, {'end': 6574.587, 'text': 'Now, we know these are deep neural networks, and deep neural networks contain lots of these matrix multiplications.', 'start': 6568.643, 'duration': 5.944}, {'end': 6580.311, 'text': 'But every matrix multiplication is just a linear model.', 'start': 6575.408, 'duration': 4.903}, {'end': 6585.915, 'text': 'And a linear function on top of a linear function is just another linear function.', 'start': 6580.952, 'duration': 4.963}, {'end': 6595.521, 'text': 'If you remember back to your, you know high school math you might remember that if you, you know, have a y equals ax plus b,', 'start': 6587.499, 'duration': 8.022}], 'summary': 'Every matrix multiplication in a deep neural network is a linear model, and a linear function on top of a linear function is just another linear function.', 'duration': 29.96, 'max_score': 6565.561, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM6565561.jpg'}, {'end': 6745.42, 'src': 'heatmap', 'start': 6669.111, 'weight': 0.729, 'content': [{'end': 6672.601, 'text': 'And they have, you know, particular mathematical definitions.', 'start': 6669.111, 'duration': 3.49}, {'end': 6680.105, 'text': 'Nowadays, we almost never use those for these between each matrix multiply.', 'start': 6673.502, 'duration': 6.603}, {'end': 6684.547, 'text': 'Nowadays, we nearly always use this one.', 'start': 6680.665, 'duration': 3.882}, {'end': 6687.249, 'text': "It's called a rectified linear unit.", 'start': 6685.167, 'duration': 2.082}, {'end': 6692.151, 'text': "It's very important when you're doing deep learning to use big, long words that sound impressive.", 
'start': 6687.969, 'duration': 4.182}, {'end': 6694.632, 'text': 'Otherwise, normal people might think they can do it too.', 'start': 6692.371, 'duration': 2.261}, {'end': 6703.953, 'text': 'But just between you and me, a rectified linear unit is defined using the following function.', 'start': 6695.333, 'duration': 8.62}, {'end': 6707.716, 'text': "That's it.", 'start': 6707.396, 'duration': 0.32}, {'end': 6712.96, 'text': 'Okay, So, and if you want to be really exclusive, of course,', 'start': 6708.817, 'duration': 4.143}, {'end': 6719.305, 'text': "you then shorten the long version and you call it a relu to show that you're really in the exclusive team.", 'start': 6712.96, 'duration': 6.345}, {'end': 6722.728, 'text': 'So, this is a relu activation.', 'start': 6719.725, 'duration': 3.003}, {'end': 6724.789, 'text': "So, here's the crazy thing.", 'start': 6723.688, 'duration': 1.101}, {'end': 6728.312, 'text': 'If you take your red, green,', 'start': 6725.89, 'duration': 2.422}, {'end': 6738.098, 'text': 'blue pixel inputs and you chuck them through a matrix multiplication and then you replace the negatives with zero and you put it through another matrix multiplication.', 'start': 6728.312, 'duration': 9.786}, {'end': 6742.679, 'text': 'replace the negatives with zero and you keep doing that again and again and again.', 'start': 6738.098, 'duration': 4.581}, {'end': 6744.38, 'text': 'you have a deep learning neural network.', 'start': 6742.679, 'duration': 1.701}, {'end': 6745.42, 'text': "That's it.", 'start': 6745.14, 'duration': 0.28}], 'summary': 'Rectified linear unit (relu) is crucial for deep learning; replacing negatives with zero yields a neural network.', 'duration': 76.309, 'max_score': 6669.111, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM6669111.jpg'}, {'end': 6824.435, 'src': 'embed', 'start': 6793.447, 'weight': 5, 'content': [{'end': 6806.677, 'text': 'Right? 
And so this idea that these combinations of linear functions and nonlinearities can create arbitrary shapes actually has a name.', 'start': 6793.447, 'duration': 13.23}, {'end': 6810.5, 'text': 'And this name is the universal approximation theorem.', 'start': 6807.217, 'duration': 3.283}, {'end': 6818.034, 'text': 'And what it says is that if you have stacks of linear functions and nonlinearities,', 'start': 6811.04, 'duration': 6.994}, {'end': 6824.435, 'text': 'the thing you end up with can approximate any function arbitrarily closely.', 'start': 6818.034, 'duration': 6.401}], 'summary': 'The universal approximation theorem states that stacks of linear functions and nonlinearities can approximate any function closely.', 'duration': 30.988, 'max_score': 6793.447, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM6793447.jpg'}, {'end': 6898.186, 'src': 'embed', 'start': 6874.197, 'weight': 6, 'content': [{'end': 6881.84, 'text': "People often come up to me after this lesson and they say, what's the rest? 
Please explain to me the rest of deep learning.", 'start': 6874.197, 'duration': 7.643}, {'end': 6883.761, 'text': "But there's no rest.", 'start': 6882.5, 'duration': 1.261}, {'end': 6888.122, 'text': 'We have a function where we take our input pixels or whatever.', 'start': 6884.941, 'duration': 3.181}, {'end': 6890.323, 'text': 'we multiply them by some weight matrix.', 'start': 6888.122, 'duration': 2.201}, {'end': 6892.244, 'text': 'we replace the negatives with zeros.', 'start': 6890.323, 'duration': 1.921}, {'end': 6893.644, 'text': 'we multiply it by another weight matrix.', 'start': 6892.244, 'duration': 1.4}, {'end': 6894.605, 'text': 'replace the negatives with zeros.', 'start': 6893.644, 'duration': 0.961}, {'end': 6895.525, 'text': 'We do that a few times.', 'start': 6894.625, 'duration': 0.9}, {'end': 6898.186, 'text': 'We see how close it is to our target.', 'start': 6896.325, 'duration': 1.861}], 'summary': 'Deep learning involves iterative processing of input data through weight matrices, replacing negatives with zeros, and evaluating closeness to target.', 'duration': 23.989, 'max_score': 6874.197, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM6874197.jpg'}, {'end': 7099.563, 'src': 'embed', 'start': 7072.092, 'weight': 7, 'content': [{'end': 7077.433, 'text': 'Like with many things, a lot of the complex feature engineering disappears when you do deep learning.', 'start': 7072.092, 'duration': 5.341}, {'end': 7087.878, 'text': "So, with deep learning, each token is literally just a word, or in the case that the word really consists of two words like you're,", 'start': 7077.854, 'duration': 10.024}, {'end': 7088.838, 'text': 'you split it into two words.', 'start': 7087.878, 'duration': 0.96}, {'end': 7099.563, 'text': "And then what we're going to do is we're going to then let the deep learning model figure out how best to combine words together.", 'start': 7090.399, 'duration': 9.164}], 
'summary': 'Deep learning simplifies feature engineering, treating each token as a word and allowing the model to combine words efficiently.', 'duration': 27.471, 'max_score': 7072.092, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM7072092.jpg'}, {'end': 7355.503, 'src': 'embed', 'start': 7314.144, 'weight': 8, 'content': [{'end': 7327.355, 'text': 'But the cool thing is the exact same steps we use to do single-label classification you can also do to do multi-label classification,', 'start': 7314.144, 'duration': 13.211}, {'end': 7328.856, 'text': 'such as in the planet.', 'start': 7327.355, 'duration': 1.501}, {'end': 7336.448, 'text': 'Or you could use to do segmentation.', 'start': 7332.144, 'duration': 4.304}, {'end': 7345.275, 'text': 'Or you could use to do..', 'start': 7338.67, 'duration': 6.605}, {'end': 7348.197, 'text': 'Or you could use to do any kind of image regression.', 'start': 7345.275, 'duration': 2.922}, {'end': 7355.503, 'text': 'Or, this is probably a bit early for you to actually try this yet, you could do for NLP classification and a lot more.', 'start': 7350.118, 'duration': 5.385}], 'summary': 'The same steps for single-label classification can be used for multi-label classification, segmentation, image regression, and more.', 'duration': 41.359, 'max_score': 7314.144, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM7314144.jpg'}], 'start': 6056.464, 'title': 'Nlp classification and deep learning', 'summary': 'Covers nlp classification using fast.ai with about 95% accuracy, the importance of pre-trained models and neural networks, the universal approximation theorem, and the process of tokenization and model training in deep learning.', 'chapters': [{'end': 6409.893, 'start': 6056.464, 'title': 'Nlp classification with fast.ai', 'summary': 'Introduces the process of nlp classification using fast.ai, demonstrating tokenization, 
numericalization, and model training, achieving an accuracy of about 95% for imdb classification.', 'duration': 353.429, 'highlights': ['The state of the art accuracy for IMDB classification using the demonstrated algorithm is about 95%. The accuracy achieved for IMDB classification using the demonstrated algorithm is approximately 95%, which is close to the state of the art.', 'The process involves tokenization and numericalization of text documents to create a data bunch for model training. The process includes tokenization and numericalization of text documents to create a data bunch for training the NLP classifier.', 'NLP classification using Fast.ai follows a similar process to creating other models, involving the creation of a language model and a classifier model. NLP classification using Fast.ai follows a similar process to creating other models, involving the creation of a language model and a classifier model for training.']}, {'end': 6769.23, 'start': 6413.836, 'title': 'Pre-trained models and neural networks', 'summary': 'Discusses the importance of using pre-trained models with the same statistics they were trained with for different datasets, the limitations of linear models in deep neural networks, and the role of activation functions like relu in deep learning.', 'duration': 355.394, 'highlights': ['The importance of using pre-trained models with the same statistics they were trained with for different datasets Jeremy emphasizes the importance of using the same statistics that pre-trained models were trained with for different datasets to maintain the unique characteristics of the dataset, such as using ImageNet stats for classifying green frogs to preserve their color distribution.', "The limitations of linear models in deep neural networks The chapter discusses the limitations of linear models in deep neural networks, where stacking matrix multiplications or convolutions without nonlinearity functions does not significantly enhance the model's 
capabilities beyond linear functions.", 'The role of activation functions like ReLU in deep learning The role of activation functions like ReLU in deep learning is highlighted, showing how applying nonlinearity functions like ReLU after matrix multiplications forms the basis of deep learning neural networks and enhances their capabilities.']}, {'end': 6983.712, 'start': 6770.17, 'title': 'Universal approximation theorem and deep learning process', 'summary': 'Explains the universal approximation theorem, stating that stacks of linear functions and nonlinearities can create arbitrary shapes and how deep learning involves the process of multiplying input pixels by weight matrices, replacing negatives with zeros, and using gradient descent to update weight matrices to achieve classification or recognition tasks.', 'duration': 213.542, 'highlights': ['The universal approximation theorem states that stacks of linear functions and nonlinearities can create arbitrary shapes. This theorem emphasizes the capability of combinations of linear functions and nonlinearities to approximate any function arbitrarily closely.', 'Deep learning involves multiplying input pixels by weight matrices, replacing negatives with zeros, and using gradient descent to update weight matrices for classification or recognition tasks. The process of deep learning entails the iterative use of weight matrices, nonlinearities, and gradient descent to find values of weight matrices that solve specific problems, ultimately achieving classification or recognition.', 'The intuition about multiplying something by a linear model and replacing the negatives with zeros multiple times does not hold when dealing with large weight matrices. 
The explanation of the challenge in intuitively understanding the impact of large weight matrices and the repeated application of linear models and nonlinearities, leading to the need for empirical acceptance of their effectiveness.']}, {'end': 7490.267, 'start': 6983.712, 'title': 'Tokenization and model training', 'summary': 'Discusses tokenization in nlp, the use of deep learning for feature engineering, handling multi-channel data in pre-trained models, and the versatility of deep learning in solving various problems, emphasizing the ease of model training and the upcoming topics on nlp and training techniques.', 'duration': 506.555, 'highlights': ['Deep learning eliminates the need for N-grams and complex feature engineering in tokenization, allowing the model to figure out word combinations using weight matrices and gradient descent. Deep learning simplifies NLP by eliminating the need for N-grams and complex feature engineering, allowing the model to figure out word combinations using weight matrices and gradient descent.', "Handling multi-channel data in pre-trained models may involve creating a third channel by averaging existing channels or modifying the model's weight tensors to accommodate additional channels. Handling multi-channel data in pre-trained models may involve creating a third channel by averaging existing channels or modifying the model's weight tensors to accommodate additional channels.", "The versatility of deep learning allows for solving problems like multi-label classification, image regression, and segmentation, emphasizing the use of gradient descent and non-linearities for model training. 
Deep learning's versatility enables solving problems like multi-label classification, image regression, and segmentation using gradient descent and non-linearities for model training."]}], 'duration': 1433.803, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/MpZxV6DVsmM/pics/MpZxV6DVsmM6056464.jpg', 'highlights': ['The state of the art accuracy for IMDB classification using the demonstrated algorithm is about 95%.', 'The process involves tokenization and numericalization of text documents to create a data bunch for model training.', 'NLP classification using Fast.ai follows a similar process to creating other models, involving the creation of a language model and a classifier model for training.', 'The importance of using pre-trained models with the same statistics they were trained with for different datasets.', 'The limitations of linear models in deep neural networks.', 'The universal approximation theorem states that stacks of linear functions and nonlinearities can create arbitrary shapes.', 'Deep learning involves multiplying input pixels by weight matrices, replacing negatives with zeros, and using gradient descent to update weight matrices for classification or recognition tasks.', 'Deep learning eliminates the need for N-grams and complex feature engineering in tokenization, allowing the model to figure out word combinations using weight matrices and gradient descent.', 'The versatility of deep learning allows for solving problems like multi-label classification, image regression, and segmentation, emphasizing the use of gradient descent and non-linearities for model training.']}], 'highlights': ['The chapter promotes their own machine learning course at course.fast.ai, which is twice as long as the deep learning course and covers foundational concepts in machine learning.', "The chapter corrects a citation error regarding the source of a chart and recommends Andrew Ng's machine learning course on Coursera, which has a 4.9 out 
of 5 stars rating.", 'Elena Harley achieved a 500% reduction in false positives in genomic variant analysis using a deep learning workflow.', 'Deep learning achieved comparable or enhanced results in emotion recognition without custom hyperparameter tuning when compared to the state of the art paper.', 'Kaggle provides a Python-based downloader tool for downloading data, simplifying the process for users.', 'Setting flip_vert=True generally improves model generalization for data lacking orientation, like astronomical or pathology digital slide data.', 'Fine-tuning the model involves unfreezing it and fitting with the original or a new data bunch.', 'Using transfer learning to fine-tune a model for 256 by 256 satellite images, achieving a performance improvement from 0.929 to 0.9315, aiming for the top 10% in a competition with about 1,000 teams.', 'The CamVid dataset provides pre-prepared images and segmentation masks, making it accessible for use in projects.', "Achieving an accuracy of 92.15% in segmentation using fastai, surpassing the state-of-the-art accuracy of 91.5% from the paper 'The One Hundred Layers Tiramisu'.", 'Using mixed precision floating point for segmentation led to a 92.5% accuracy on CamVid, showcasing the effectiveness of this approach.', 'The state of the art accuracy for IMDB classification using the demonstrated algorithm is about 95%.']}
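
The transcript describes a network as "multiply by a weight matrix, replace the negatives with zeros, repeat", and notes that stacking linear functions without a nonlinearity gains nothing. The following is a minimal NumPy sketch of that idea, not code from the lesson: the helper names (relu, forward), the layer sizes and the random inputs are all assumptions chosen for illustration.

```python
import numpy as np


def relu(x):
    """Replace the negatives with zeros."""
    return np.maximum(x, 0.0)


def forward(x, weights, nonlinearity=relu):
    """Multiply by each weight matrix in turn, applying the
    nonlinearity between layers (but not after the last one)."""
    for w in weights[:-1]:
        x = nonlinearity(x @ w)
    return x @ weights[-1]


# Illustrative (assumed) sizes: 5 inputs of 4 features, 8 hidden units, 3 outputs.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))
w1 = rng.normal(size=(4, 8))
w2 = rng.normal(size=(8, 3))

# Two linear layers with no nonlinearity in between...
linear_stack = forward(x, [w1, w2], nonlinearity=lambda a: a)
# ...are equivalent to one linear layer whose weights are w1 @ w2.
collapsed = x @ (w1 @ w2)

# With a ReLU in between, the result is no longer a single matrix multiply.
relu_stack = forward(x, [w1, w2])

print(np.allclose(linear_stack, collapsed))
print(np.allclose(relu_stack, collapsed))
```

The first comparison should report that the purely linear stack matches the single collapsed matrix, while the ReLU version should not, which is the sense in which the nonlinearity is what lets the stack represent more than a linear function. Training (measuring closeness to the target and updating the weights by gradient descent) is left out of this sketch.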