Coursnap

title
Lesson 2 - Deep Learning for Coders (2020)

description
NB: We recommend watching these videos through https://course.fast.ai rather than directly on YouTube, to get access to the searchable transcript, interactive notebooks, setup guides, questionnaires, and so forth.
In today's lesson we finish covering chapter 1 of the book, looking more at test/validation sets, avoiding machine learning project failures, and the foundations of transfer learning. Then we move on to looking at the critical machine learning topic of evidence, including discussing confidence intervals, priors, and the use of visualization to better understand evidence. Finally, we begin our look into productionization of models (chapter 2 of the book), including discussing the overall project plan for model development, and how to create your own datasets.
0:00 - Lesson 1 recap
2:10 - Classification vs Regression
4:50 - Validation data set
6:42 - Epoch, metrics, error rate and accuracy
9:07 - Overfitting, training, validation and testing data set
12:10 - How to choose your training set
15:55 - Transfer learning
21:50 - Fine tuning
22:23 - Why transfer learning works so well
28:26 - Vision techniques used for sound
29:30 - Using pictures to create fraud detection at Splunk
30:38 - Detecting viruses using CNN
31:20 - List of most important terms used in this course
31:50 - Arthur Samuel’s overall approach to neural networks
32:35 - End of Chapter 1 of the Book
40:04 - Where to find pretrained models
41:20 - The state of deep learning
44:30 - Recommendation vs Prediction
45:50 - Interpreting Models - P value
57:20 - Null Hypothesis Significance Testing
1:02:48 - Turn predictive model into something useful in production
1:14:06 - Practical exercise with Bing Image Search
1:16:25 - Bing Image Sign up
1:21:38 - Data Block API
1:28:48 - Lesson Summary

detail
{'title': 'Lesson 2 - Deep Learning for Coders (2020)', 'heatmap': [{'end': 3994.576, 'start': 3933.175, 'weight': 1}], 'summary': 'Covers deep learning training, model deployment, classification, regression models, transfer learning, visualization, parameters vs hyperparameters, covid-19 model interpretation, practical model significance, creating image dataset, fastai for image processing, achieving a 1% error rate with 450 images using a smaller resnet 18 model.', 'chapters': [{'end': 129.24, 'segs': [{'end': 110.897, 'src': 'embed', 'start': 33.328, 'weight': 0, 'content': [{'end': 43.237, 'text': 'And we realized that based on how machine learning worked, that there are some fundamental limitations on what it can do.', 'start': 33.328, 'duration': 9.909}, {'end': 45.759, 'text': 'And we talked about some of those limitations.', 'start': 44.177, 'duration': 1.582}, {'end': 49.842, 'text': "And we also talked about how, after you've trained a machine learning model,", 'start': 46.199, 'duration': 3.643}, {'end': 57.348, 'text': 'you end up with a program which behaves much like a normal program or something with inputs and a thing in the middle and outputs.', 'start': 49.842, 'duration': 7.506}, {'end': 70.639, 'text': "So today we're going to finish up talking about that and we're going to then look at how we get those models into production and what some of the issues with doing that might be.", 'start': 58.849, 'duration': 11.79}, {'end': 79.456, 'text': 'I wanted to remind you that there are two sets of notebooks available to you.', 'start': 73.17, 'duration': 6.286}, {'end': 89.305, 'text': "One is the fastbook repo, the full actual notebooks containing all the text of the O'Reilly book.", 'start': 81.017, 'duration': 8.288}, {'end': 95.971, 'text': "And so this lets you see everything that I'm telling you in much more detail.", 'start': 91.306, 'duration': 4.665}, {'end': 106.755, 'text': "And then as well as that there's the course v4 repo which contains 
exactly the same notebooks but with all the prose stripped away to help you study.", 'start': 97.171, 'duration': 9.584}, {'end': 110.897, 'text': "So that's where you really want to be doing your experiment and your practice.", 'start': 107.515, 'duration': 3.382}], 'summary': 'Machine learning has fundamental limitations, and we discuss getting models into production.', 'duration': 77.569, 'max_score': 33.328, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ833328.jpg'}], 'start': 0.928, 'title': 'Deep learning training and model deployment', 'summary': 'Delves into the limitations of machine learning, the process of model training, and deploying models into production. It highlights the availability of two sets of notebooks for detailed learning and practice.', 'chapters': [{'end': 129.24, 'start': 0.928, 'title': 'Deep learning training and model deployment', 'summary': 'Discusses the fundamental limitations of machine learning, the process of training models, and transitioning them into production, emphasizing the availability of two sets of notebooks for detailed learning and practice.', 'duration': 128.312, 'highlights': ['The chapter explores the fundamental limitations of machine learning and the behavior of trained machine learning models, equating them to normal programs with inputs, a process, and outputs.', "The chapter emphasizes the availability of two sets of notebooks, the fastbook repo containing detailed O'Reilly book content and the course v4 repo with the prose stripped to facilitate learning and practice.", 'The chapter introduces the process of training models and the transition into production, highlighting the importance of practicing with the course v4 notebooks to understand the concepts and experiment with code.']}], 'duration': 128.312, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ8928.jpg', 'highlights': ['The chapter introduces 
the process of training models and the transition into production, highlighting the importance of practicing with the course v4 notebooks to understand the concepts and experiment with code.', "The chapter emphasizes the availability of two sets of notebooks, the fastbook repo containing detailed O'Reilly book content and the course v4 repo with the prose stripped to facilitate learning and practice.", 'The chapter explores the fundamental limitations of machine learning and the behavior of trained machine learning models, equating them to normal programs with inputs, a process, and outputs.']}, {'end': 926.174, 'segs': [{'end': 214.755, 'src': 'embed', 'start': 160.557, 'weight': 0, 'content': [{'end': 163.22, 'text': "That's just how this data set works, as they tell you in the README.", 'start': 160.557, 'duration': 2.663}, {'end': 172.227, 'text': 'And we also looked particularly at this idea of valid_pct equals 0.2, and like what does that mean? It creates a validation set.', 'start': 164.761, 'duration': 7.466}, {'end': 175.93, 'text': 'And that was something I wanted to talk more about.', 'start': 173.248, 'duration': 2.682}, {'end': 187.662, 'text': "The first thing I do want to do though is point out that this particular labeling function returns something that's either true or false.", 'start': 177.171, 'duration': 10.491}, {'end': 197.572, 'text': "And actually this data set, as we'll see later, also contains the actual breed of 37 different cat and dog breeds.", 'start': 188.843, 'duration': 8.729}, {'end': 200.435, 'text': 'So you can also grab that from the file name.', 'start': 198.313, 'duration': 2.122}, {'end': 206.049, 'text': "In each of those two cases, we're trying to predict a category.", 'start': 202.406, 'duration': 3.643}, {'end': 207.87, 'text': 'Is it a cat or is it a dog?', 'start': 206.629, 'duration': 1.241}, {'end': 214.755, 'text': 'Or is it a German Shepherd or a Beagle or a Ragdoll cat or whatever?', 'start': 208.47, 'duration': 
6.285}], 'summary': 'Analyzing a dataset with 37 cat and dog breeds, creating validation set with 0.2 valid percent.', 'duration': 54.198, 'max_score': 160.557, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ8160557.jpg'}, {'end': 271.264, 'src': 'embed', 'start': 245.08, 'weight': 2, 'content': [{'end': 249.582, 'text': 'Okay, so those are the two main types of model, classification and regressions.', 'start': 245.08, 'duration': 4.502}, {'end': 252.243, 'text': 'This is very important jargon to know about.', 'start': 249.622, 'duration': 2.621}, {'end': 259.411, 'text': 'So, the regression model attempts to predict one or more numeric quantities such as temperature or location or whatever.', 'start': 252.783, 'duration': 6.628}, {'end': 267.099, 'text': 'This is a bit confusing, because sometimes people use the word regression as a shortcut to a particular,', 'start': 260.791, 'duration': 6.308}, {'end': 271.264, 'text': 'like an abbreviation for a particular kind of model called linear regression.', 'start': 267.099, 'duration': 4.165}], 'summary': 'Two main types of models: classification and regression. regression predicts numeric quantities like temperature or location.', 'duration': 26.184, 'max_score': 245.08, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ8245080.jpg'}, {'end': 347.516, 'src': 'embed', 'start': 320.413, 'weight': 3, 'content': [{'end': 328.119, 'text': 'So if you train for too long and or with not enough data and or a model with too many parameters,', 'start': 320.413, 'duration': 7.706}, {'end': 331.422, 'text': 'after a while the accuracy of your model will actually get worse.', 'start': 328.119, 'duration': 3.303}, {'end': 339.528, 'text': "And this is called overfitting, right? 
So we use the validation set to ensure that we're not overfitting.", 'start': 332.523, 'duration': 7.005}, {'end': 347.516, 'text': 'The next line of code that we looked at is this one, where we created something called a learner.', 'start': 342.13, 'duration': 5.386}], 'summary': 'Overfitting can occur with excessive training, insufficient data, or too many model parameters, leading to decreased accuracy. validation sets are used to prevent overfitting.', 'duration': 27.103, 'max_score': 320.413, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ8320413.jpg'}, {'end': 460.412, 'src': 'embed', 'start': 432.76, 'weight': 4, 'content': [{'end': 437.362, 'text': 'And the most important thing we print out is the result of calling these metrics.', 'start': 432.76, 'duration': 4.602}, {'end': 439.923, 'text': 'So error rate is the name of a metric,', 'start': 437.942, 'duration': 1.981}, {'end': 446.527, 'text': "and it's a function that just prints out what percent of the validation set are being incorrectly classified by your model.", 'start': 439.923, 'duration': 6.604}, {'end': 454.868, 'text': "So a metric's a function that measures the quality of the predictions using the validation set.", 'start': 449.925, 'duration': 4.943}, {'end': 456.85, 'text': "So error rate's one.", 'start': 455.789, 'duration': 1.061}, {'end': 460.412, 'text': 'Another common metric is accuracy, which is just one minus error rate.', 'start': 457.01, 'duration': 3.402}], 'summary': 'Metrics like error rate and accuracy measure prediction quality.', 'duration': 27.652, 'max_score': 432.76, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ8432760.jpg'}, {'end': 505.168, 'src': 'embed', 'start': 477.605, 'weight': 5, 'content': [{'end': 484.711, 'text': 'so that when we change the parameters, we can figure out which set of parameters make that performance measurement get better or 
worse.', 'start': 477.605, 'duration': 7.106}, {'end': 487.833, 'text': 'That performance measurement is called the loss.', 'start': 485.571, 'duration': 2.262}, {'end': 493.518, 'text': 'The loss is not necessarily the same as your metric.', 'start': 488.594, 'duration': 4.924}, {'end': 500.643, 'text': "The reason why is a bit subtle and we'll be seeing it in a lot of detail once we delve into the math in the coming lessons.", 'start': 495.079, 'duration': 5.564}, {'end': 505.168, 'text': 'but basically you need a function.', 'start': 500.643, 'duration': 4.525}], 'summary': 'Analyzing parameter changes to improve performance (loss) in machine learning.', 'duration': 27.563, 'max_score': 477.605, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ8477605.jpg'}, {'end': 689.752, 'src': 'embed', 'start': 656.75, 'weight': 6, 'content': [{'end': 658.732, 'text': 'But now we might have fit to the validation set.', 'start': 656.75, 'duration': 1.982}, {'end': 668.098, 'text': 'So, if you want to be really rigorous about this, you should actually set aside a third bit of data called the test set.', 'start': 660.293, 'duration': 7.805}, {'end': 671.54, 'text': "that is not used for training and it's not used for your metrics.", 'start': 668.098, 'duration': 3.442}, {'end': 675.323, 'text': "It's actually, you don't look at it until the whole project's finished.", 'start': 672.741, 'duration': 2.582}, {'end': 679.165, 'text': "And this is what's used on competition platforms like Kaggle.", 'start': 676.324, 'duration': 2.841}, {'end': 689.752, 'text': 'On Kaggle, after the competition finishes, your performance will be measured against a data set that you have never seen.', 'start': 680.546, 'duration': 9.206}], 'summary': 'For rigorous validation, set aside a third of the data for the test set, not used for training or metrics, as done on platforms like kaggle.', 'duration': 33.002, 'max_score': 656.75, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ8656750.jpg'}], 'start': 132.214, 'title': 'Understanding classification and regression models', 'summary': 'Discusses the importance of labeling in data sets, creating validation sets, and the distinction between classification and regression models, emphasizing their significance in machine learning. it also highlights the role of validation sets in preventing overfitting, the use of error rate and accuracy as metrics, and the significance of loss function in measuring model performance, emphasizing the need for a test set to evaluate model performance objectively.', 'chapters': [{'end': 288.705, 'start': 132.214, 'title': 'Understanding classification and regression models', 'summary': 'Discusses the importance of labeling in data sets, the creation of validation sets, and the distinction between classification and regression models, highlighting the significance of these concepts in machine learning.', 'duration': 156.491, 'highlights': ['The data set contains the actual breed of 37 different cat and dog breeds, allowing for specific categorization.', 'The labeling function returns a binary output, either true or false, which is significant in classification models.', 'The concept of regression models is explained, emphasizing the prediction of numeric quantities such as temperature or location.']}, {'end': 926.174, 'start': 291.249, 'title': 'Machine learning: overfitting and validation sets', 'summary': 'Discusses the importance of validation sets in preventing overfitting, the role of error rate and accuracy as metrics, and the significance of loss function in measuring model performance. 
it emphasizes the need for a test set to evaluate model performance objectively, especially in competition platforms like kaggle.', 'duration': 634.925, 'highlights': ["The chapter emphasizes the significance of validation sets in preventing overfitting by ensuring that the model doesn't memorize the specific training data, with a clear explanation of how the overfitting occurs and the need to use validation sets to measure it. The chapter explains how overfitting occurs when the model memorizes specific data and stresses the importance of using validation sets to measure overfitting.", 'The role of error rate and accuracy as metrics is highlighted, with error rate identified as a function that measures the percentage of incorrectly classified data in the validation set, emphasizing the importance of these metrics in evaluating model predictions. The chapter emphasizes the role of error rate and accuracy as metrics, with error rate defined as a function measuring the percentage of incorrectly classified data in the validation set.', "The significance of the loss function in measuring model performance is explained, with a clear distinction made between the loss function and the evaluation metric, emphasizing the need for a loss function that reflects the model's performance when parameters are adjusted. The chapter explains the importance of the loss function in measuring model performance, highlighting the need for a function that reflects the model's performance when parameters are adjusted.", 'The importance of a test set to objectively evaluate model performance, especially in competition platforms like Kaggle, is emphasized, with a clear explanation of how the test set is used after the project is finished to measure model performance against unseen data. 
The chapter emphasizes the need for a test set to objectively evaluate model performance, particularly in competition platforms like Kaggle.']}], 'duration': 793.96, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ8132214.jpg', 'highlights': ['The labeling function returns a binary output, either true or false, significant in classification models.', 'The data set contains the actual breed of 37 different cat and dog breeds, allowing for specific categorization.', 'The concept of regression models is explained, emphasizing the prediction of numeric quantities such as temperature or location.', "The chapter emphasizes the significance of validation sets in preventing overfitting by ensuring that the model doesn't memorize the specific training data.", 'The role of error rate and accuracy as metrics is highlighted, with error rate identified as a function that measures the percentage of incorrectly classified data in the validation set.', 'The significance of the loss function in measuring model performance is explained, with a clear distinction made between the loss function and the evaluation metric.', 'The importance of a test set to objectively evaluate model performance, especially in competition platforms like Kaggle, is emphasized.']}, {'end': 2117.718, 'segs': [{'end': 1024.587, 'src': 'embed', 'start': 1000.74, 'weight': 1, 'content': [{'end': 1009.412, 'text': "There's a big data set called ImageNet that contains 1.3 million pictures of a thousand different types of thing, whether it be mushrooms or animals,", 'start': 1000.74, 'duration': 8.672}, {'end': 1012.636, 'text': 'or airplanes or hammers or whatever.', 'start': 1009.412, 'duration': 3.224}, {'end': 1021.324, 'text': "There's a competition or there used to be a competition that runs every year to see who could get the best accuracy on the ImageNet competition.", 'start': 1015.079, 'duration': 6.245}, {'end': 1024.587, 'text': 'And the models 
that did really well.', 'start': 1022.065, 'duration': 2.522}], 'summary': 'Imagenet contains 1.3m pictures of 1000 types, with annual competition for best accuracy.', 'duration': 23.847, 'max_score': 1000.74, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ81000740.jpg'}, {'end': 1101.227, 'src': 'embed', 'start': 1071.909, 'weight': 2, 'content': [{'end': 1073.65, 'text': "And we'll see why in just a moment, right?", 'start': 1071.909, 'duration': 1.741}, {'end': 1078.894, 'text': "But this idea of transfer learning it's kind of it makes intuitive sense, right?", 'start': 1074.991, 'duration': 3.903}, {'end': 1084.678, 'text': "ImageNet already has some cats and some dogs in it and it's you know.", 'start': 1080.455, 'duration': 4.223}, {'end': 1090.902, 'text': "it can say this is a cat and this is a dog, but you want to maybe do something that recognizes lots of breeds that aren't in ImageNet.", 'start': 1084.678, 'duration': 6.224}, {'end': 1101.227, 'text': 'Well, for it to be able to recognize cats versus dogs, versus airplanes versus hammers, it has to understand things like what does metal look like?', 'start': 1091.902, 'duration': 9.325}], 'summary': 'Transfer learning uses existing data to recognize diverse objects beyond imagenet.', 'duration': 29.318, 'max_score': 1071.909, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ81071909.jpg'}, {'end': 1141.58, 'src': 'embed', 'start': 1112.913, 'weight': 0, 'content': [{'end': 1117.655, 'text': 'So all these kinds of concepts get implicitly learnt by a pre-trained model.', 'start': 1112.913, 'duration': 4.742}, {'end': 1124.496, 'text': "So if you start with a pre-trained model, then you don't have to learn all these features from scratch.", 'start': 1118.855, 'duration': 5.641}, {'end': 1135.939, 'text': 'And so transfer learning is the single most important thing for being able to use less data 
and less compute and get better accuracy.', 'start': 1125.417, 'duration': 10.522}, {'end': 1141.58, 'text': "So that's a key focus for the Fast.ai library and a key focus for this course.", 'start': 1136.479, 'duration': 5.101}], 'summary': 'Transfer learning is crucial for using less data and compute to achieve better accuracy.', 'duration': 28.667, 'max_score': 1112.913, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ81112913.jpg'}, {'end': 1213.24, 'src': 'embed', 'start': 1181.956, 'weight': 3, 'content': [{'end': 1183.257, 'text': 'So that would be a metric.', 'start': 1181.956, 'duration': 1.301}, {'end': 1192.545, 'text': "On the other hand, if you're trying to predict whether this is a cat or a dog, your metric would be what percentage of the time am I wrong.", 'start': 1184.178, 'duration': 8.367}, {'end': 1195.828, 'text': 'So that latter metric is called the error rate.', 'start': 1193.626, 'duration': 2.202}, {'end': 1199.19, 'text': 'Okay, so error is one particular metric.', 'start': 1196.388, 'duration': 2.802}, {'end': 1205.955, 'text': "It's a thing that measures how well you're doing, and it's like it should be the thing that you most care about.", 'start': 1199.69, 'duration': 6.265}, {'end': 1213.24, 'text': "So you write a function or use one of FastAI's predefined ones, which measures how well you're doing.", 'start': 1206.335, 'duration': 6.905}], 'summary': 'Metrics like error rate measure model performance in classification tasks.', 'duration': 31.284, 'max_score': 1181.956, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ81181956.jpg'}, {'end': 1285.442, 'src': 'embed', 'start': 1259.609, 'weight': 4, 'content': [{'end': 1267.714, 'text': 'we call this the loss function, and the loss function is the measure of performance that the algorithm uses to try to make the parameters better,', 'start': 1259.609, 'duration': 8.105}, 
{'end': 1275.358, 'text': "and it's something which should kind of track pretty closely to the metric you care about, but it's something which,", 'start': 1267.714, 'duration': 7.644}, {'end': 1278.92, 'text': 'as you change the parameters a bit, the loss should always change a bit.', 'start': 1275.358, 'duration': 3.562}, {'end': 1285.442, 'text': "And so there's a lot of hand waving there, because we need to look at some of the math of how that works,", 'start': 1280, 'duration': 5.442}], 'summary': 'The loss function measures algorithm performance and should closely track the metric of interest as parameters change.', 'duration': 25.833, 'max_score': 1259.609, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ81259609.jpg'}, {'end': 1344.248, 'src': 'embed', 'start': 1311.491, 'weight': 5, 'content': [{'end': 1316.656, 'text': 'So fine-tuning is a transfer learning technique where the weights, this is not quite the right word,', 'start': 1311.491, 'duration': 5.165}, {'end': 1318.218, 'text': 'we should say the parameters,', 'start': 1316.656, 'duration': 1.562}, {'end': 1325.665, 'text': 'where the parameters of a pre-trained model are updated by training for additional epochs using a different task to that used for pre-training.', 'start': 1318.218, 'duration': 7.447}, {'end': 1333.633, 'text': 'So the pre-training task might have been ImageNet classification, and then our different task might be recognizing cats versus dogs.', 'start': 1326.105, 'duration': 7.528}, {'end': 1344.248, 'text': 'So the way, by default, FastAI does fine-tuning is that we use one epoch which, remember,', 'start': 1336.423, 'duration': 7.825}], 'summary': 'Fine-tuning updates pre-trained model parameters for additional epochs using a different task, such as recognizing cats versus dogs.', 'duration': 32.757, 'max_score': 1311.491, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ81311491.jpg'}, {'end': 1405.617, 'src': 'embed', 'start': 1374.198, 'weight': 6, 'content': [{'end': 1377.16, 'text': 'So why does transfer learning work and why does it work so well?', 'start': 1374.198, 'duration': 2.962}, {'end': 1390.168, 'text': 'The best way, in my opinion, to look at this is to see this paper by Zeiler and Fergus, who were actually 2012 ImageNet winners, and, interestingly,', 'start': 1378.461, 'duration': 11.707}, {'end': 1395.191, 'text': "their key insights came from their ability to visualize what's going on inside a model.", 'start': 1390.168, 'duration': 5.023}, {'end': 1399.674, 'text': 'So visualization very often turns out to be super important to getting great results.', 'start': 1395.751, 'duration': 3.923}, {'end': 1405.617, 'text': 'What they were able to do was they looked remember, I told you like a ResNet34 has 34 layers?', 'start': 1400.894, 'duration': 4.723}], 'summary': 'Transfer learning works well due to insights from visualization, as demonstrated by 2012 imagenet winners zeiler and fergus.', 'duration': 31.419, 'max_score': 1374.198, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ81374198.jpg'}, {'end': 1621.858, 'src': 'embed', 'start': 1598.732, 'weight': 8, 'content': [{'end': 1605.654, 'text': 'So the further we get, Layer 3 then gets to combine all the kinds of features in Layer 2.', 'start': 1598.732, 'duration': 6.922}, {'end': 1610.895, 'text': "And remember we're only seeing, so we're only seeing here 12 of the features, but actually there's probably hundreds of them.", 'start': 1605.654, 'duration': 5.241}, {'end': 1613.396, 'text': "I don't remember exactly in AlexNet, but there's lots.", 'start': 1610.995, 'duration': 2.401}, {'end': 1621.858, 'text': 'But by the time we get to Layer 3, by combining features from Layer 2, it already has something which is 
finding text.', 'start': 1614.156, 'duration': 7.702}], 'summary': 'Layer 3 combines features from layer 2, detecting text with hundreds of features.', 'duration': 23.126, 'max_score': 1598.732, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ81598732.jpg'}, {'end': 1759.497, 'src': 'embed', 'start': 1727.666, 'weight': 9, 'content': [{'end': 1734.608, 'text': 'One important thing to realize then is that these techniques for computer vision are not just good at recognizing photos.', 'start': 1727.666, 'duration': 6.942}, {'end': 1737.81, 'text': "There's all kinds of things you can turn into pictures.", 'start': 1735.509, 'duration': 2.301}, {'end': 1748.034, 'text': 'For example, these are sounds that have been turned into pictures by representing their frequencies over time.', 'start': 1738.57, 'duration': 9.464}, {'end': 1759.497, 'text': 'And it turns out that if you convert a sound into these kinds of pictures, you can get basically state-of-the-art results at sound detection,', 'start': 1749.254, 'duration': 10.243}], 'summary': 'Computer vision techniques can convert sounds into pictures, achieving state-of-the-art sound detection results.', 'duration': 31.831, 'max_score': 1727.666, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ81727666.jpg'}], 'start': 926.174, 'title': 'Transfer learning and visualization', 'summary': "Emphasizes the importance of transfer learning and pre-trained models for improving accuracy and reducing resource requirements, explains the distinctions between loss, error, and metric in machine learning, and discusses the effectiveness of transfer learning in deep learning, highlighting insights from zeiler and fergus' paper on visualizing model layers and features.", 'chapters': [{'end': 1141.58, 'start': 926.174, 'title': 'Transfer learning and pre-trained models', 'summary': 'Elaborates on the importance of transfer 
learning and pre-trained models, explaining how utilizing pre-trained models for transfer learning can lead to a significant improvement in accuracy by implicitly learning features and concepts from a large dataset, thus reducing the need for extensive data and compute resources.', 'duration': 215.406, 'highlights': ['Transfer learning allows for significant improvement in accuracy by implicitly learning features and concepts from a large dataset, reducing the need for extensive data and compute resources. By utilizing pre-trained models for transfer learning, one can achieve a far more accurate model by implicitly learning features and concepts from a large dataset, thus reducing the need for extensive data and compute resources.', 'Pre-trained models, such as those from ImageNet, can recognize a thousand categories of things in images, providing a starting point for training more epochs on custom data. Models pretrained on datasets like ImageNet can recognize a thousand categories of things in images, serving as a foundation for training additional epochs on custom data to achieve improved accuracy.', "Transfer learning enables the understanding of various concepts such as different breeds of animals by leveraging pre-trained models' implicit knowledge. 
Through transfer learning, pre-trained models can implicitly learn concepts such as distinguishing different breeds of animals, which is crucial for achieving accuracy in image recognition tasks."]}, {'end': 1372.431, 'start': 1144.156, 'title': 'Loss, error, and metric in machine learning', 'summary': 'Explains the differences between loss, error, and metric in machine learning, where error is a particular metric that measures how well the model is doing, and fine-tuning is a transfer learning technique that updates parameters of a pre-trained model for a different task.', 'duration': 228.275, 'highlights': ["Error is a particular metric that measures how well you're doing, and it's the thing that you most care about. Error is a key metric that measures the performance of a model, reflecting how often the model is wrong, which is crucial for model evaluation and improvement.", 'Loss function is the measure of performance that the algorithm uses to try to make the parameters better, and it should closely track the metric you care about. The loss function is critical for parameter optimization in machine learning, as it guides the model to improve its performance and should align closely with the desired metric.', 'Fine-tuning is a transfer learning technique that updates the parameters of a pre-trained model by training for additional epochs using a different task. 
Fine-tuning is an advanced transfer learning technique that updates the parameters of a pre-trained model by training for additional epochs using a different task, such as updating a model pre-trained on ImageNet for recognizing cats versus dogs.']}, {'end': 2117.718, 'start': 1374.198, 'title': 'Transfer learning & visualization in deep learning', 'summary': "Discusses the effectiveness of transfer learning in deep learning, citing the insights from zeiler and fergus' paper on visualizing model layers and features, highlighting the significant impact of visualization and the progressive complexity of features in each layer, and emphasizing the wide applicability of computer vision techniques beyond photo recognition, such as sound detection and virus detection.", 'duration': 743.52, 'highlights': ["Zeiler and Fergus' paper provided key insights on visualizing model layers and features, leading to the understanding of the effectiveness of transfer learning in deep learning. The paper by Zeiler and Fergus, 2012 ImageNet winners, demonstrated the importance of visualizing model layers and features, providing key insights into the effectiveness of transfer learning.", 'Visualization is crucial in understanding model behavior and achieving superior results in deep learning tasks. The significance of visualization in understanding model behavior and achieving superior results in deep learning tasks was emphasized, highlighting its crucial role in obtaining great outcomes.', 'The progressive complexity of features in each layer of the model contributes to its effectiveness, with each layer being capable of recognizing more sophisticated patterns and concepts. 
The progressive complexity of features in each layer of the model was highlighted, showcasing the capability of each layer to recognize increasingly sophisticated patterns and concepts.', 'Computer vision techniques extend beyond photo recognition, with applications in sound detection and virus detection by converting sounds and virus program signatures into images for analysis. The wide applicability of computer vision techniques beyond photo recognition was discussed, including their effectiveness in sound detection and virus detection through the conversion of sounds and virus program signatures into images for analysis.']}], 'duration': 1191.544, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ8926174.jpg', 'highlights': ['Transfer learning enables implicit learning of features from a large dataset, reducing resource requirements.', 'Pre-trained models like ImageNet recognize a thousand image categories, serving as a foundation for training custom data.', "Transfer learning allows understanding of concepts like animal breeds through pre-trained models' implicit knowledge.", 'Error is a crucial metric reflecting model performance and the frequency of incorrect predictions.', 'Loss function guides parameter optimization and should align closely with the desired metric.', 'Fine-tuning updates pre-trained model parameters by training for additional epochs using a different task.', "Zeiler and Fergus' paper provided insights into visualizing model layers and features, emphasizing transfer learning's effectiveness.", 'Visualization is crucial for understanding model behavior and achieving superior results in deep learning tasks.', "Progressive complexity of features in each layer contributes to the model's effectiveness in recognizing sophisticated patterns.", 'Computer vision techniques extend beyond photo recognition, including sound and virus detection through image analysis.']}, {'end': 2736.136, 'segs': 
[{'end': 2146.644, 'src': 'embed', 'start': 2119.239, 'weight': 0, 'content': [{'end': 2127.3, 'text': 'So if you want to fine-tune something which is good at a new task but also continues to be good at the previous task,', 'start': 2119.239, 'duration': 8.061}, {'end': 2130.821, 'text': 'you need to keep putting in examples of the previous task as well.', 'start': 2127.3, 'duration': 3.521}, {'end': 2139.042, 'text': 'What are the differences between parameters and hyperparameters?', 'start': 2134.382, 'duration': 4.66}, {'end': 2146.644, 'text': 'If I am feeding an image of a dog as an input and then changing the hyperparameters of batch size in the model,', 'start': 2140.563, 'duration': 6.081}], 'summary': 'To maintain performance on new and previous tasks, continue providing examples. consider differences between parameters and hyperparameters.', 'duration': 27.405, 'max_score': 2119.239, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ82119239.jpg'}, {'end': 2275.099, 'src': 'embed', 'start': 2241.891, 'weight': 1, 'content': [{'end': 2254.284, 'text': "They're the numbers which change what the model does to be something that recognizes malignant tumors versus cats, versus dogs versus colorizes.", 'start': 2241.891, 'duration': 12.393}, {'end': 2255.065, 'text': 'black and white pictures.', 'start': 2254.284, 'duration': 0.781}, {'end': 2269.713, 'text': 'Whereas the hyperparameter is the choices about what numbers do you pass to the actual fitting function to decide how that fitting process happens.', 'start': 2257.118, 'duration': 12.595}, {'end': 2275.099, 'text': "There's a question, I'm curious about the pacing of this course.", 'start': 2272.356, 'duration': 2.743}], 'summary': 'Discussion on numbers and hyperparameters in machine learning models.', 'duration': 33.208, 'max_score': 2241.891, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ82241891.jpg'}, {'end': 2494.712, 'src': 'embed', 'start': 2464.579, 'weight': 3, 'content': [{'end': 2466.781, 'text': "For example, medical imaging, there's hardly any.", 'start': 2464.579, 'duration': 2.202}, {'end': 2473.144, 'text': "There's a lot of opportunities for people to create domain-specific pre-trained models.", 'start': 2469.342, 'duration': 3.802}, {'end': 2477.506, 'text': "It's still an area that's really underdone, because not enough people are working on transfer learning.", 'start': 2473.244, 'duration': 4.262}, {'end': 2487.506, 'text': "Okay, so as I was mentioning, we've kind of got these four applications that we've talked about a bit.", 'start': 2480.359, 'duration': 7.147}, {'end': 2494.712, 'text': 'And deep learning is pretty, you know, pretty good at all of those.', 'start': 2489.548, 'duration': 5.164}], 'summary': "Limited medical imaging, ample opportunities for domain-specific pre-trained models, underexplored area due to lack of focus on transfer learning, and deep learning's proficiency in mentioned applications.", 'duration': 30.133, 'max_score': 2464.579, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ82464579.jpg'}, {'end': 2549.889, 'src': 'embed', 'start': 2516.251, 'weight': 2, 'content': [{'end': 2519.574, 'text': 'Deep learning is really pretty great for those in particular.', 'start': 2516.251, 'duration': 3.323}, {'end': 2527.643, 'text': "For text, it's pretty great at things like classification and translation.", 'start': 2523.142, 'duration': 4.501}, {'end': 2534.245, 'text': "It's actually terrible for conversation, and so that's been something that's been a huge disappointment for a lot of companies.", 'start': 2528.463, 'duration': 5.782}, {'end': 2542.127, 'text': "They tried to create these like conversation bots, but actually deep learning isn't good at providing 
accurate information.", 'start': 2534.285, 'duration': 7.842}, {'end': 2549.889, 'text': "It's good at providing things that sound accurate and sound compelling, but we don't really have great ways yet of actually making sure it's correct.", 'start': 2542.767, 'duration': 7.122}], 'summary': 'Deep learning excels in text tasks but struggles in conversation, causing disappointment for companies.', 'duration': 33.638, 'max_score': 2516.251, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ82516251.jpg'}, {'end': 2598.647, 'src': 'embed', 'start': 2570.143, 'weight': 4, 'content': [{'end': 2572.745, 'text': 'Deep learning is also good at multimodal.', 'start': 2570.143, 'duration': 2.602}, {'end': 2577.788, 'text': "That means things where you've got multiple different types of data.", 'start': 2573.645, 'duration': 4.143}, {'end': 2585.893, 'text': 'So you might have some tabular data, including a text column and an image, and some collaborative filtering data,', 'start': 2577.808, 'duration': 8.085}, {'end': 2589.696, 'text': 'and combining that all together is something that deep learning is really good at.', 'start': 2585.893, 'duration': 3.803}, {'end': 2598.647, 'text': 'So, for example, putting captions on photos is something which deep learning is pretty good at.', 'start': 2590.917, 'duration': 7.73}], 'summary': 'Deep learning excels at handling multimodal data, including tabular, text, and image data, as well as collaborative filtering data, making it effective for tasks such as adding captions to photos.', 'duration': 28.504, 'max_score': 2570.143, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ82570143.jpg'}], 'start': 2119.239, 'title': 'Parameters vs hyperparameters and deep learning applications', 'summary': 'Explains the distinction between parameters and hyperparameters, as well as the significance of incorporating previous task examples 
for fine-tuning. It also delves into the limitations and applications of deep learning, pre-trained models, and the need for domain-specific pre-trained models in computer vision, text, tabular data, and recommendation systems.', 'chapters': [{'end': 2275.099, 'start': 2119.239, 'title': 'Parameters vs hyperparameters', 'summary': "Explains the difference between parameters and hyperparameters, emphasizing that parameters are the numbers that change the model's behavior, while hyperparameters are the choices for the fitting process. It also touches on the importance of including examples of the previous task to fine-tune a model for a new task.", 'duration': 155.86, 'highlights': ['Parameters are the learned numbers that change what the model does, for example recognizing malignant tumors versus cats versus dogs, while hyperparameters are the choices about which numbers are passed to the fitting function to decide how the fitting process happens.', 'Importance of including examples of the previous task to fine-tune a model for a new task. To fine-tune a model for a new task while maintaining performance on the previous task, it is important to include examples of the previous task as well.', 'Explanation of parameters in neural networks and their abstract nature compared to non-neural net examples. 
Parameters in neural networks are described as more abstract compared to non-neural net examples, with a detailed understanding to be covered in the upcoming lessons.']}, {'end': 2736.136, 'start': 2275.239, 'title': 'Deep learning applications and pre-trained models', 'summary': 'Covers the limitations and applications of deep learning, pre-trained models, and the need for domain-specific pre-trained models, with a focus on four key areas: computer vision, text, tabular data, and recommendation systems.', 'duration': 460.897, 'highlights': ['Deep learning is particularly good for high cardinality variables like zip codes, product IDs, and is great for text classification and translation, but not for conversation bots. high cardinality variables, text classification, translation, limitations of conversation bots', "There's a scarcity of domain-specific pre-trained models, particularly in medical imaging, indicating opportunities for creating more diverse pre-trained models. scarcity of domain-specific pre-trained models, lack of diversity in pre-trained models in medical imaging", 'The need for more people to work on transfer learning and the creation of domain-specific pre-trained models is highlighted, particularly in areas such as medical imaging. importance of transfer learning, need for domain-specific pre-trained models in medical imaging', 'Deep learning is good at multimodal tasks, combining different types of data like tabular, text, and collaborative filtering data, making it suitable for tasks like putting captions on photos. 
multimodal tasks, combining different data types, example of putting captions on photos']}], 'duration': 616.897, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ82119239.jpg', 'highlights': ['Importance of including examples of the previous task to fine-tune a model for a new task.', 'Parameters are the numbers that change what the model does, representing input and learned parameters, while hyperparameters are the choices for the fitting process.', 'Deep learning is particularly good for high cardinality variables like zip codes, product IDs, and is great for text classification and translation, but not for conversation bots.', 'The need for more people to work on transfer learning and the creation of domain-specific pre-trained models is highlighted, particularly in areas such as medical imaging.', 'Deep learning is good at multimodal tasks, combining different types of data like tabular, text, and collaborative filtering data, making it suitable for tasks like putting captions on photos.']}, {'end': 3737.864, 'segs': [{'end': 2784.627, 'src': 'embed', 'start': 2755.769, 'weight': 0, 'content': [{'end': 2763.813, 'text': "And for a case study for this, I thought let's pick something that's actually super important right now, which is a model in this paper.", 'start': 2755.769, 'duration': 8.044}, {'end': 2766.995, 'text': "One of the things we're going to try and do in this course is learn how to read papers.", 'start': 2764.033, 'duration': 2.962}, {'end': 2775.859, 'text': 'So here is a paper, which I would love for everybody to read, called High Temperature and High Humidity Reduce the Transmission of COVID-19.', 'start': 2768.135, 'duration': 7.724}, {'end': 2784.627, 'text': 'Now this is a very important issue, because if the claim of this paper is true, then that would mean that this is going to be a seasonal disease.', 'start': 2776.659, 'duration': 7.968}], 'summary': 'Learning to read a paper on 
COVID-19 transmission and seasonal impact.', 'duration': 28.858, 'max_score': 2755.769, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ82755769.jpg'}, {'end': 2828.07, 'src': 'embed', 'start': 2804.343, 'weight': 1, 'content': [{'end': 2814.046, 'text': "And what they've done here is they've taken a hundred cities in China and they've plotted the temperature on one axis in Celsius and R on the other axis,", 'start': 2804.343, 'duration': 9.703}, {'end': 2817.087, 'text': 'where R is a measure of transmissibility.', 'start': 2814.046, 'duration': 3.041}, {'end': 2824.349, 'text': 'It says for each person that has this disease, how many people on average will they infect.', 'start': 2817.227, 'duration': 7.122}, {'end': 2828.07, 'text': 'So if R is under one, then the disease will not spread.', 'start': 2824.989, 'duration': 3.081}], 'summary': 'Analysis of temperature and transmissibility in 100 Chinese cities.', 'duration': 23.727, 'max_score': 2804.343, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ82804343.jpg'}, {'end': 3206.272, 'src': 'embed', 'start': 3178.948, 'weight': 3, 'content': [{'end': 3181.969, 'text': 'And so one way to measure that is we use something called a p-value.', 'start': 3178.948, 'duration': 3.021}, {'end': 3185.23, 'text': "So a p-value, here's how a p-value works.", 'start': 3183.01, 'duration': 2.22}, {'end': 3188.511, 'text': 'We start out with something called a null hypothesis.', 'start': 3185.85, 'duration': 2.661}, {'end': 3194.613, 'text': "And the null hypothesis is basically what's our starting point assumption.", 'start': 3188.651, 'duration': 5.962}, {'end': 3200.395, 'text': "So our starting point assumption might be, oh there's no relationship between temperature and R.", 'start': 3195.073, 'duration': 5.322}, {'end': 3201.575, 'text': 'And then we gather some data.', 'start': 3200.395, 'duration': 1.18}, 
{'end': 3204.25, 'text': 'And have you explained what R is? I have, yes.', 'start': 3201.929, 'duration': 2.321}, {'end': 3206.272, 'text': 'R is the transmissibility of the virus.', 'start': 3204.551, 'duration': 1.721}], 'summary': 'Using p-value to measure relationship between temperature and virus transmissibility.', 'duration': 27.324, 'max_score': 3178.948, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ83178948.jpg'}, {'end': 3253.028, 'src': 'embed', 'start': 3224.342, 'weight': 2, 'content': [{'end': 3226.444, 'text': "There's the data that was gathered in this example.", 'start': 3224.342, 'duration': 2.102}, {'end': 3235.547, 'text': 'And then we say what percentage of the time would we see this amount of relationship, which is a slope of 0.023, by chance?', 'start': 3227.424, 'duration': 8.123}, {'end': 3242.79, 'text': "And, as we've seen, one way to do that is by what we would call a simulation, which is by generating random numbers,", 'start': 3236.508, 'duration': 6.282}, {'end': 3250.073, 'text': 'a hundred pairs of random numbers, a bunch of times, and seeing how often you see this, this relationship.', 'start': 3242.79, 'duration': 7.283}, {'end': 3253.028, 'text': "We don't actually have to do it that though.", 'start': 3251.565, 'duration': 1.463}], 'summary': 'Analyzing relationship data to determine chance occurrence percentage of a slope of 0.023, using simulation and random numbers.', 'duration': 28.686, 'max_score': 3224.342, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ83224342.jpg'}, {'end': 3441.949, 'src': 'embed', 'start': 3412.63, 'weight': 5, 'content': [{'end': 3419.614, 'text': 'So therefore, conclusions and policy decisions should not be based on whether a p-value passes some threshold.', 'start': 3412.63, 'duration': 6.984}, {'end': 3426.497, 'text': 'P-value does not measure the importance of a result.', 'start': 
3422.335, 'duration': 4.162}, {'end': 3434.302, 'text': "Because again, it could just tell you that you collected lots of data, which doesn't tell you that the results actually have any practical import.", 'start': 3427.818, 'duration': 6.484}, {'end': 3438.284, 'text': 'And so by itself, it does not provide a good measure of evidence.', 'start': 3434.942, 'duration': 3.342}, {'end': 3441.949, 'text': 'So Frank Harrell,', 'start': 3440.908, 'duration': 1.041}], 'summary': 'P-value should not be the sole basis for conclusions or policy decisions as it does not measure result importance or provide good evidence.', 'duration': 29.319, 'max_score': 3412.63, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ83412630.jpg'}, {'end': 3676.272, 'src': 'embed', 'start': 3646.191, 'weight': 4, 'content': [{'end': 3655.654, 'text': 'And the reasons include denser cities are going to have higher transmission, for instance, and probably more humid will have less transmission.', 'start': 3646.191, 'duration': 9.463}, {'end': 3666.717, 'text': 'So when you do a multivariate model, it actually allows you to be more confident of your results, right?', 'start': 3656.394, 'duration': 10.323}, {'end': 3676.272, 'text': 'But the p-value, as noted by the American Statistical Association, does not tell us whether this is of practical importance.', 'start': 3669.71, 'duration': 6.562}], 'summary': 'Denser cities have higher transmission, while humidity reduces it. 
Multivariate models offer more confidence, but p-values may not indicate practical importance.', 'duration': 30.081, 'max_score': 3646.191, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ83646191.jpg'}], 'start': 2736.136, 'title': 'Interpreting COVID-19 model and p-values in research', 'summary': 'Delves into interpreting a model on COVID-19 transmission, analyzing the relationship between temperature and transmissibility (R), and discussing the limitations of p-values in medical research. It raises skepticism about the claimed relationship between temperature and COVID-19 transmission, introduces the concept of p-value as a measure, and emphasizes the need for more comprehensive statistical analysis.', 'chapters': [{'end': 3059.913, 'start': 2736.136, 'title': 'Interpreting a model of COVID-19 transmission', 'summary': "Discusses the importance of interpreting models, focusing on a paper that claims high temperature and humidity reduce the transmission of COVID-19, raising concerns about its potential seasonal nature and policy implications. It delves into a case study analyzing a model's relationship between temperature and transmissibility (R) and raises skepticism about the claimed relationship by demonstrating a random distribution of data in a spreadsheet.", 'duration': 323.777, 'highlights': ["The paper 'High Temperature and High Humidity Reduce the Transmission of COVID-19' is discussed, emphasizing its potential policy implications if the claim holds true. The paper's potential impact on policy is emphasized, as its claim could indicate a seasonal nature of COVID-19, with significant policy implications.", "A case study is presented, analyzing the relationship between temperature and transmissibility (R), highlighting concerns about the claimed relationship and potential randomness in the data. 
The analysis delves into the relationship between temperature and transmissibility (R), expressing concerns about the claimed relationship's validity and demonstrating potential randomness in the data through a spreadsheet.", 'A demonstration of random distribution of data is provided in a spreadsheet, raising skepticism about the claimed relationship between temperature and transmissibility (R). The use of a spreadsheet to demonstrate a random distribution of data raises skepticism about the claimed relationship between temperature and transmissibility (R), potentially challenging the findings of the study.']}, {'end': 3223.402, 'start': 3059.913, 'title': 'Temperature and R relationship analysis', 'summary': 'Discusses the analysis of the relationship between temperature and R, highlighting the need to measure relationships confidently and introducing the concept of p-value as a measure, using examples and randomly generated data.', 'duration': 163.489, 'highlights': ['The p-value is introduced as a measure to assess the relationship between temperature and R, emphasizing the need to measure relationships confidently. Introduction of p-value as a measure for relationship assessment, emphasizing the need for confidence in measuring relationships.', 'The analysis of randomly generated data illustrates that apparent relationships can arise by coincidence, emphasizing the importance of considering a larger sample size for more confident measurements. Illustration of coincidental relationships in randomly generated data, emphasizing the need for larger sample sizes for confident measurements.', 'Calculation of slopes for different examples using the slope function in Microsoft Excel, demonstrating the presence of random and negligible slopes, indicating no real relationship between temperature and R. 
Calculation of slopes using Microsoft Excel, demonstrating random and negligible slopes, indicating the absence of a real relationship.']}, {'end': 3737.864, 'start': 3224.342, 'title': 'P-values in medical research', 'summary': 'Discusses the limitations of p-values, highlighting that they do not measure the probability that a hypothesis is true and should not be the basis for conclusions or policy decisions. It emphasizes the importance of considering the practical significance of results and the need for more comprehensive statistical analysis. The chapter also explains the relevance of multivariate models in increasing confidence in research findings.', 'duration': 513.522, 'highlights': ['P-values do not measure the probability that a hypothesis is true or the importance of a result. P-values are highlighted as inadequate for measuring the probability that a hypothesis is true or the importance of a result, as they can be influenced by the amount of data collected.', 'The limitations of p-values are emphasized by the American Statistical Association and Frank Harrell, who notes that they have done significant harm to science. The American Statistical Association and Frank Harrell emphasize the limitations of p-values, stating that they do not measure the probability that a hypothesis is true and have done significant harm to science.', 'Multivariate models increase confidence in research findings by considering multiple factors and their impact on the results. Multivariate models are highlighted as increasing confidence in research findings by considering multiple factors such as temperature, humidity, GDP per capita, and population density, which can impact the results.', 'The practical importance of results is determined by considering the actual slope found in the research, and not solely relying on p-values. 
The chapter emphasizes that the practical importance of results is determined by considering the actual slope found in the research, rather than solely relying on p-values, highlighting the need for a more comprehensive statistical analysis.']}], 'duration': 1001.728, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ82736136.jpg', 'highlights': ["The paper 'High Temperature and High Humidity Reduce the Transmission of COVID-19' has potential policy implications if the claim holds true.", 'A demonstration of random distribution of data raises skepticism about the claimed relationship between temperature and transmissibility (R).', 'The analysis of randomly generated data illustrates that apparent relationships can arise by coincidence, emphasizing the importance of considering a larger sample size for more confident measurements.', 'Introduction of p-value as a measure for relationship assessment, emphasizing the need for confidence in measuring relationships.', 'Multivariate models increase confidence in research findings by considering multiple factors and their impact on the results.', 'The practical importance of results is determined by considering the actual slope found in the research, and not solely relying on p-values.']}, {'end': 4401.199, 'segs': [{'end': 3794.74, 'src': 'embed', 'start': 3762.81, 'weight': 2, 'content': [{'end': 3767.635, 'text': "It's not with p-values, but with looking at kind of actual outcomes.", 'start': 3762.81, 'duration': 4.825}, {'end': 3780.411, 'text': 'So how do you think about the practical importance of a model and how do you turn a predictive model into something useful in production?', 'start': 3769.304, 'duration': 11.107}, {'end': 3788.636, 'text': 'So I spent many, many years thinking about this and I actually created,', 'start': 3781.412, 'duration': 7.224}, {'end': 3794.74, 'text': 'with some other great folks I actually created a paper about it Designing Great Data 
Products.', 'start': 3788.636, 'duration': 6.104}], 'summary': 'Practical importance of a model, creating great data products', 'duration': 31.93, 'max_score': 3762.81, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ83762810.jpg'}, {'end': 3866.287, 'src': 'embed', 'start': 3838.416, 'weight': 0, 'content': [{'end': 3847.179, 'text': 'what we did was we decided to use a different approach, which I ended up calling the drive train approach, which is described here,', 'start': 3838.416, 'duration': 8.763}, {'end': 3851.64, 'text': 'to set insurance prices and indeed to do all kinds of other things.', 'start': 3847.179, 'duration': 4.461}, {'end': 3862.585, 'text': "And so for the insurance example, the objective would be for an insurance company would be how do I maximize my, let's say, five-year profit?", 'start': 3852.26, 'duration': 10.325}, {'end': 3866.287, 'text': 'And then what inputs can we control?', 'start': 3863.806, 'duration': 2.481}], 'summary': 'Implemented drive train approach to set insurance prices, aiming to maximize five-year profit.', 'duration': 27.871, 'max_score': 3838.416, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ83838416.jpg'}, {'end': 3994.576, 'src': 'heatmap', 'start': 3933.175, 'weight': 1, 'content': [{'end': 3942.021, 'text': 'over many years took this basic process and tried to help lots of companies figure out how to use it to turn predictive models into actions.', 'start': 3933.175, 'duration': 8.846}, {'end': 3953.508, 'text': "So the starting point in like actually getting value in a predictive model is thinking about what is it you're trying to do and you know what are the sources of value in that thing you're trying to do.", 'start': 3943.542, 'duration': 9.966}, {'end': 3961.932, 'text': "levers? what are the things you can change? 
Like, what's the point of a predictive model if you can't do anything about it right?", 'start': 3954.748, 'duration': 7.184}, {'end': 3966.995, 'text': "Figuring out ways to find what data you you know have, which one's suitable, what's available, then?", 'start': 3962.352, 'duration': 4.643}, {'end': 3969.596, 'text': 'thinking about what approaches to analytics you can then take.', 'start': 3966.995, 'duration': 2.601}, {'end': 3977.06, 'text': 'And then super important, like well, can you actually implement, you know, those changes?', 'start': 3970.777, 'duration': 6.283}, {'end': 3982.143, 'text': 'And super, super important, how do you actually change things as the environment changes?', 'start': 3977.761, 'duration': 4.382}, {'end': 3987.428, 'text': "And you know, interestingly, a lot of these things are areas where there's not very much academic research.", 'start': 3983.024, 'duration': 4.404}, {'end': 3994.576, 'text': "There's a little bit and some of the papers that have been particularly around maintenance of like.", 'start': 3988.029, 'duration': 6.547}], 'summary': 'Helped companies turn predictive models into actions, focusing on value, data sources, analytics, implementation, and adaptability.', 'duration': 61.401, 'max_score': 3933.175, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ83933175.jpg'}, {'end': 3994.576, 'src': 'embed', 'start': 3962.352, 'weight': 1, 'content': [{'end': 3966.995, 'text': "Figuring out ways to find what data you you know have, which one's suitable, what's available, then?", 'start': 3962.352, 'duration': 4.643}, {'end': 3969.596, 'text': 'thinking about what approaches to analytics you can then take.', 'start': 3966.995, 'duration': 2.601}, {'end': 3977.06, 'text': 'And then super important, like well, can you actually implement, you know, those changes?', 'start': 3970.777, 'duration': 6.283}, {'end': 3982.143, 'text': 'And super, super important, how do you 
actually change things as the environment changes?', 'start': 3977.761, 'duration': 4.382}, {'end': 3987.428, 'text': "And you know, interestingly, a lot of these things are areas where there's not very much academic research.", 'start': 3983.024, 'duration': 4.404}, {'end': 3994.576, 'text': "There's a little bit and some of the papers that have been particularly around maintenance of like.", 'start': 3988.029, 'duration': 6.547}], 'summary': 'Challenges in finding suitable data, implementing changes, and adapting to environment with limited academic research.', 'duration': 32.224, 'max_score': 3962.352, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ83962352.jpg'}, {'end': 4122.917, 'src': 'embed', 'start': 4071.948, 'weight': 3, 'content': [{'end': 4076.852, 'text': 'how you actually get value from machine learning in practice and what you actually have to ask.', 'start': 4071.948, 'duration': 4.904}, {'end': 4079.315, 'text': "So please check it out because hopefully you'll find it helpful.", 'start': 4076.933, 'duration': 2.382}, {'end': 4083.019, 'text': 'So when we think about, like.', 'start': 4080.316, 'duration': 2.703}, {'end': 4090.005, 'text': 'think about this, for the question of how should people think about the relationship between seasonality and transmissibility of COVID-19..', 'start': 4083.019, 'duration': 6.986}, {'end': 4098.571, 'text': 'you kind of need to dig really deeply into the questions about like.', 'start': 4093.55, 'duration': 5.021}, {'end': 4102.532, 'text': 'not just what are those numbers in the data, but what does it really look like right?', 'start': 4098.571, 'duration': 3.961}, {'end': 4112.474, 'text': 'So one of the things in the paper that they show is actual maps right, of temperature and humidity and R right?', 'start': 4102.872, 'duration': 9.602}, {'end': 4122.917, 'text': 'And you can see, like not surprisingly, that humidity and temperature in China are what 
we would call autocorrelated.', 'start': 4113.755, 'duration': 9.162}], 'summary': 'Exploring the relationship between seasonality and transmissibility of covid-19 using machine learning and data analysis.', 'duration': 50.969, 'max_score': 4071.948, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ84071948.jpg'}, {'end': 4221.072, 'src': 'embed', 'start': 4198.477, 'weight': 4, 'content': [{'end': 4205.982, 'text': "or so that's what the right hand side is or there is no real relationship between temperature and R.", 'start': 4198.477, 'duration': 7.505}, {'end': 4213.307, 'text': "And we might act on the assumption that there is a relationship, or we might act on the assumption that there isn't a relationship.", 'start': 4205.982, 'duration': 7.325}, {'end': 4221.072, 'text': 'And so you kind of want to look at each of these four possibilities and say like well what would be the economic and societal consequences.', 'start': 4213.787, 'duration': 7.285}], 'summary': 'Examining the economic and societal consequences of temperature-r relationship assumptions.', 'duration': 22.595, 'max_score': 4198.477, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ84198477.jpg'}, {'end': 4350.434, 'src': 'embed', 'start': 4327.785, 'weight': 5, 'content': [{'end': 4335.328, 'text': 'If you assume that there will be seasonality and that summer will fix things, then it could lead you to be apathetic now.', 'start': 4327.785, 'duration': 7.543}, {'end': 4340.23, 'text': "If you assume there's no seasonality, and then there is,", 'start': 4335.828, 'duration': 4.402}, {'end': 4350.214, 'text': 'then you could end up kind of creating a larger level of expectation of distraction than actually happens and end up with your population being even more apathetic,', 'start': 4340.23, 'duration': 9.984}, {'end': 4350.434, 'text': 'you know.', 'start': 4350.214, 'duration': 0.22}], 
'summary': 'Seasonal assumptions can lead to apathy or unrealistic expectations.', 'duration': 22.649, 'max_score': 4327.785, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ84327785.jpg'}], 'start': 3737.864, 'title': 'Practical model significance and implementation', 'summary': "Discusses the practical significance of models in setting insurance prices, turning predictive models into actionable insights, based on the author's 20 years of experience, which revolutionized conventional predictive modeling in insurance, and the relationship between seasonality and transmissibility of covid-19, analyzing temperature and humidity maps to understand their impact on transmission, and evaluating potential policy impacts and consequences of different assumptions.", 'chapters': [{'end': 4071.948, 'start': 3737.864, 'title': 'Practical model significance and implementation', 'summary': "Discusses the practical significance of models, focusing on the drive train approach in setting insurance prices and the process of turning predictive models into actionable insights, based on the author's 20 years of experience, which revolutionized the conventional predictive modeling in insurance.", 'duration': 334.084, 'highlights': ["The drive train approach revolutionized the conventional predictive modeling in insurance, focusing on maximizing the company's five-year profit and controlling levers such as setting prices, which led to significant value and insights. The drive train approach in setting insurance prices aimed to maximize the company's five-year profit and control levers, leading to significant value and insights.", 'The process of turning predictive models into actionable insights is essential for practical model significance and implementation, emphasizing the importance of identifying sources of value, available data, approaches to analytics, implementation, and adaptation to changing environments. 
Turning predictive models into actionable insights requires identifying sources of value, available data, approaches to analytics, implementation, and adaptation to changing environments.', 'The chapter emphasizes the importance of practical significance of models, highlighting the need to assess the actual outcomes instead of relying solely on p-values, and the significance of considering the impact of models on real-world decisions and actions. The chapter stresses the importance of practical significance and emphasizes the need to assess actual outcomes, considering the impact of models on real-world decisions and actions.']}, {'end': 4401.199, 'start': 4071.948, 'title': 'Seasonality and transmissibility of covid-19', 'summary': 'Discusses the relationship between seasonality and transmissibility of covid-19, analyzing temperature and humidity maps to understand their impact on transmission, and evaluating the potential policy impacts and consequences of different assumptions.', 'duration': 329.251, 'highlights': ['The paper shows maps of temperature and humidity in China, indicating autocorrelation between geographically close places, which challenges the interpretation of p-values and suggests grouping cities into larger geographies. ', 'The chapter emphasizes the importance of considering policy impacts and consequences of different assumptions, highlighting the potential economic and societal consequences of assuming a relationship between temperature and R-value or not. ', 'The discussion includes the potential policy impacts of assuming seasonality and its effect on public behavior, warning against apathy based on assumptions of seasonality and emphasizing the risks of being wrong in either direction. ', 'The chapter discusses the use of priors in modeling, suggesting the consideration of known climate relationships and previous flu epidemics to form initial guesses and understand the likelihood of seasonality in COVID-19 transmission. 
']}], 'duration': 663.335, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ83737864.jpg', 'highlights': ["The drive train approach revolutionized the conventional predictive modeling in insurance, focusing on maximizing the company's five-year profit and controlling levers such as setting prices, which led to significant value and insights.", 'The process of turning predictive models into actionable insights is essential for practical model significance and implementation, emphasizing the importance of identifying sources of value, available data, approaches to analytics, implementation, and adaptation to changing environments.', 'The chapter emphasizes the importance of practical significance of models, highlighting the need to assess the actual outcomes instead of relying solely on p-values, and the significance of considering the impact of models on real-world decisions and actions.', 'The paper shows maps of temperature and humidity in China, indicating autocorrelation between geographically close places, which challenges the interpretation of p-values and suggests grouping cities into larger geographies.', 'The chapter emphasizes the importance of considering policy impacts and consequences of different assumptions, highlighting the potential economic and societal consequences of assuming a relationship between temperature and R-value or not.', 'The discussion includes the potential policy impacts of assuming seasonality and its effect on public behavior, warning against apathy based on assumptions of seasonality and emphasizing the risks of being wrong in either direction.', 'The chapter discusses the use of priors in modeling, suggesting the consideration of known climate relationships and previous flu epidemics to form initial guesses and understand the likelihood of seasonality in COVID-19 transmission.']}, {'end': 4733.287, 'segs': [{'end': 4513.102, 'src': 'embed', 'start': 4401.58, 'weight': 0, 
'content': [{'end': 4403.742, 'text': "So maybe we'd say, well, prior belief.", 'start': 4401.58, 'duration': 2.162}, {'end': 4411.41, 'text': "is that this thing is probably seasonal and so then we'd say well this particular paper adds some evidence to that.", 'start': 4404.503, 'duration': 6.907}, {'end': 4423.761, 'text': 'So, like it shows, like how incredibly complex it is to use a model in practice for, in this case, policy discussions,', 'start': 4412.651, 'duration': 11.11}, {'end': 4431.927, 'text': "but also for like organizational decisions, because you know there's always complexities, there's always uncertainties,", 'start': 4423.761, 'duration': 8.166}, {'end': 4439.79, 'text': 'and so you actually have to think about the utilities you know and your best guesses and try to combine everything together as best as you can.', 'start': 4431.927, 'duration': 7.863}, {'end': 4454.931, 'text': "Okay so, with all that said, it's still nice to be able to get our models up and running, even if, you know,", 'start': 4440.711, 'duration': 14.22}, {'end': 4458.953, 'text': 'even just a predictive model is sometimes useful of its own.', 'start': 4454.931, 'duration': 4.022}, {'end': 4465.397, 'text': "sometimes it's useful to prototype something, and sometimes it's just, it's going to be part of some bigger picture.", 'start': 4458.953, 'duration': 6.444}, {'end': 4469.079, 'text': 'So, rather than try to create some huge end-to-end model here,', 'start': 4465.857, 'duration': 3.222}, {'end': 4481.668, 'text': 'we thought we would just show you how to get your PyTorch FastAI model up and running in as raw a form as possible,', 'start': 4469.079, 'duration': 12.589}, {'end': 4485.231, 'text': 'so that from there you can kind of build on top of it as you like.', 'start': 4481.668, 'duration': 3.563}, {'end': 4493.694, 'text': "So, to do that, we are going to download and curate our own dataset, and you're going to do the same thing.", 'start': 4486.031, 'duration': 
7.663}, {'end': 4501.737, 'text': "You're going to train your own model on that dataset, and then you're going to create an application, and then you're going to host it.", 'start': 4493.714, 'duration': 8.023}, {'end': 4508.28, 'text': "Now, there's lots of ways to create an image dataset.", 'start': 4503.538, 'duration': 4.742}, {'end': 4510.581, 'text': 'You might have some photos on your own computer.', 'start': 4508.4, 'duration': 2.181}, {'end': 4513.102, 'text': 'There might be stuff at work you can use.', 'start': 4511.481, 'duration': 1.621}], 'summary': 'Complexities in using models for policy and organizational decisions. demonstrating pytorch fastai model setup and application hosting.', 'duration': 111.522, 'max_score': 4401.58, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ84401580.jpg'}, {'end': 4634.822, 'src': 'embed', 'start': 4607.096, 'weight': 3, 'content': [{'end': 4611.697, 'text': 'but they kind of limit it to like three transactions per second or something, which is still plenty.', 'start': 4607.096, 'duration': 4.601}, {'end': 4616.579, 'text': "You can still do thousands for free, so it's at the moment it's pretty great even for free.", 'start': 4612.277, 'duration': 4.302}, {'end': 4626.259, 'text': "So what will happen is when you sign up for Bing Image Search or any of these kind of services, they'll give you an API key.", 'start': 4618.877, 'duration': 7.382}, {'end': 4631.661, 'text': 'So just replace the XXX here with the API key that they give you.', 'start': 4626.699, 'duration': 4.962}, {'end': 4634.822, 'text': "Okay, so that's now going to be called key.", 'start': 4631.681, 'duration': 3.141}], 'summary': 'Bing image search offers thousands of free transactions, limited to three per second.', 'duration': 27.726, 'max_score': 4607.096, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ84607096.jpg'}], 'start': 
4401.58, 'title': 'Modeling complexities in practice and creating image dataset for model training', 'summary': 'Discusses complexities and uncertainties in using models for policy and organizational decisions, emphasizing combining utilities and best guesses. it also covers the process of creating an image dataset using bing image search api with a free trial period of seven days and a limit of three transactions per second.', 'chapters': [{'end': 4485.231, 'start': 4401.58, 'title': 'Modeling complexities in practice', 'summary': 'Discusses the complexities and uncertainties in using models for policy and organizational decisions, emphasizing the importance of combining utilities and best guesses, while also highlighting the usefulness of predictive models and the process of prototyping.', 'duration': 83.651, 'highlights': ['The chapter emphasizes the complexities and uncertainties in using models for policy and organizational decisions, stressing the importance of combining utilities and best guesses (quantifiable: complexities, uncertainties).', 'It mentions the usefulness of predictive models for policy and organizational decisions, underscoring their potential utility (quantifiable: usefulness of predictive models).', 'The chapter discusses the process of prototyping and the value of getting models up and running in their raw form, highlighting the potential for further development and building on top of the initial model (quantifiable: value of prototyping and model development).']}, {'end': 4733.287, 'start': 4486.031, 'title': 'Creating image dataset for model training', 'summary': 'Covers the process of downloading and curating a dataset using bing image search api, obtaining an api key, and creating a function to return a list of urls for model training, with a free trial period of seven days and a limit of three transactions per second.', 'duration': 247.256, 'highlights': ['The chapter covers the process of downloading and curating a dataset using Bing 
Image Search API. This includes using Bing Image Search to download images and creating a function to return a list of URLs that match a search term for model training.', 'Obtaining an API key and creating a function to return a list of URLs for model training. The chapter details the process of obtaining an API key for Bing Image Search and using a function, search_images_bing, to return a list of URLs matching a search term for model training.', 'Free trial period of seven days and a limit of three transactions per second for using the Bing Image Search API. The Bing Image Search API offers a free trial period of seven days with a high quota, followed by a limit of three transactions per second, allowing thousands of transactions for free.']}], 'duration': 331.707, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ84401580.jpg', 'highlights': ['The chapter emphasizes the complexities and uncertainties in using models for policy and organizational decisions, stressing the importance of combining utilities and best guesses.', 'The chapter discusses the process of prototyping and the value of getting models up and running in their raw form, highlighting the potential for further development and building on top of the initial model.', 'The chapter covers the process of downloading and curating a dataset using Bing Image Search API, including obtaining an API key and creating a function to return a list of URLs for model training.', 'The Bing Image Search API offers a free trial period of seven days with a high quota, followed by a limit of three transactions per second, allowing thousands of transactions for free.', 'It mentions the usefulness of predictive models for policy and organizational decisions, underscoring their potential utility.']}, {'end': 5463.84, 'segs': [{'end': 4787.712, 'src': 'embed', 'start': 4733.287, 'weight': 0, 'content': [{'end': 4746.162, 'text': 'create a directory with the name of grizzly 
or black or teddy bear, search Bing for that particular search term along with bear, and download.', 'start': 4733.287, 'duration': 12.875}, {'end': 4748.665, 'text': 'And so download_images is a fastai function.', 'start': 4746.562, 'duration': 2.103}, {'end': 4760.16, 'text': 'So after that, I can call get_image_files, which is a fastai function that will just return recursively all of the image files inside this path.', 'start': 4749.165, 'duration': 10.995}, {'end': 4764.786, 'text': "And you can see it's given me bears, slash, black, slash, and then lots of numbers.", 'start': 4760.8, 'duration': 3.986}, {'end': 4775.218, 'text': 'So one of the things you have to be careful of is that a lot of the stuff you download will turn out to be like not images at all and will break.', 'start': 4767.649, 'duration': 7.569}, {'end': 4782.105, 'text': 'So you can call verify_images to check that all of these file names are actual images.', 'start': 4775.638, 'duration': 6.467}, {'end': 4787.712, 'text': "And in this case I didn't have any failed, so it's empty.", 'start': 4783.387, 'duration': 4.325}], 'summary': 'Use fastai to download bear images from Bing and verify that the downloaded files are actual images.', 'duration': 54.425, 'max_score': 4733.287, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ84733287.jpg'}, {'end': 4932.775, 'src': 'embed', 'start': 4898.958, 'weight': 3, 'content': [{'end': 4906.021, 'text': 'And so the DataBlock API looks like this.', 'start': 4898.958, 'duration': 7.063}, {'end': 4907.942, 'text': "Here's the DataBlock API.", 'start': 4906.041, 'duration': 1.901}, {'end': 4915.626, 'text': 'You tell fastai what your independent variable is and what your dependent variable is.', 'start': 4909.683, 'duration': 5.943}, {'end': 4918.187, 'text': 'So what your labels are and what your input data is.', 'start': 4915.686, 'duration': 2.501}, {'end': 4924.85, 'text': 'So in this case our input data are images and our 
labels are categories.', 'start': 4918.727, 'duration': 6.123}, {'end': 4930.113, 'text': 'So the category is going to be either grizzly or black or teddy.', 'start': 4925.491, 'duration': 4.622}, {'end': 4932.775, 'text': "So that's the first thing you tell it.", 'start': 4931.573, 'duration': 1.202}], 'summary': 'The datablock api defines input data as images and labels as categories such as grizzly, black, or teddy.', 'duration': 33.817, 'max_score': 4898.958, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ84898958.jpg'}, {'end': 5229.223, 'src': 'embed', 'start': 5200.306, 'weight': 2, 'content': [{'end': 5203.889, 'text': "We're going to create a smaller ResNet this time, a ResNet 18.", 'start': 5200.306, 'duration': 3.583}, {'end': 5205.15, 'text': 'Again asking for error rate.', 'start': 5203.889, 'duration': 1.261}, {'end': 5207.672, 'text': 'We can then call dot fine tune again.', 'start': 5205.73, 'duration': 1.942}, {'end': 5210.294, 'text': "So you see it's all the same lines of code we've already seen.", 'start': 5207.732, 'duration': 2.562}, {'end': 5214.617, 'text': 'And you can see our error rate goes down from nine to one.', 'start': 5211.295, 'duration': 3.322}, {'end': 5215.898, 'text': "So we've got one percent error.", 'start': 5214.697, 'duration': 1.201}, {'end': 5218.34, 'text': 'And after training for about 25 seconds.', 'start': 5216.979, 'duration': 1.361}, {'end': 5222.782, 'text': "So you can see, you know, we've only got 450 images.", 'start': 5219.421, 'duration': 3.361}, {'end': 5225.863, 'text': "We've trained for well less than a minute.", 'start': 5224.062, 'duration': 1.801}, {'end': 5229.223, 'text': "And we only have, let's look at the confusion matrix.", 'start': 5226.583, 'duration': 2.64}], 'summary': 'Trained a resnet 18 with 450 images, achieving 1% error in under a minute.', 'duration': 28.917, 'max_score': 5200.306, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ85200306.jpg'}, {'end': 5355.417, 'src': 'embed', 'start': 5324.579, 'weight': 4, 'content': [{'end': 5334.547, 'text': 'So that is now something that you can copy over to a server somewhere and treat it as a predefined program, right?', 'start': 5324.579, 'duration': 9.968}, {'end': 5344.974, 'text': 'So then, so the process of using your trained model on new data kind of in production is called inference.', 'start': 5335.048, 'duration': 9.926}, {'end': 5351.096, 'text': "So here I've created an inference learner by loading that learner back again, right?", 'start': 5345.794, 'duration': 5.302}, {'end': 5355.417, 'text': "And so obviously it doesn't make sense to do it right after.", 'start': 5351.616, 'duration': 3.801}], 'summary': 'The exported model can be copied to a server and used like a predefined program; using a trained model on new data in production is called inference.', 'duration': 30.838, 'max_score': 5324.579, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ85324579.jpg'}], 'start': 4733.287, 'title': 'Fastai for image processing', 'summary': 'Explains downloading and verifying bear images using fastai functions, ensuring the absence of failed files. 
It also provides an overview of using the fastai DataBlock API to structure data for model training, achieving a one percent error rate with 450 images using a smaller ResNet 18 model.', 'chapters': [{'end': 4787.712, 'start': 4733.287, 'title': 'Download and verify bear images', 'summary': 'Explains how to use fastai functions to download bear images from Bing and verify that they are actual images, ensuring the absence of failed files.', 'duration': 54.425, 'highlights': ["You can use fastai functions to download images from Bing by creating a directory with the name of the bear, searching Bing for that term along with 'bear', and then calling get_image_files to recursively return all image files inside the path.", 'It is important to verify the downloaded images to check for actual images and prevent potential file breakage, which can be done using the verify_images function to ensure the absence of failed files.']}, {'end': 5463.84, 'start': 4788.232, 'title': 'Fastai datablock api and model training', 'summary': 'Provides an overview of using the fastai DataBlock API to structure data for model training, demonstrating the process of creating a model, training it, and evaluating its performance, achieving a one percent error rate with 450 images using a smaller ResNet 18 model.', 'duration': 675.608, 'highlights': ['Creating a model using the fastai DataBlock API and training it, achieving a one percent error rate with 450 images using a smaller ResNet 18 model. The process of creating a model, training it, and evaluating its performance, achieving a one percent error rate with 450 images using a smaller ResNet 18 model.', 'Overview of the DataBlock API, including specifying independent and dependent variables, data splitting, labeling, and item transformations. 
An overview of the DataBlock API, specifying independent and dependent variables, data splitting, labeling, and item transformations.', 'Explanation of the process of using a trained model for inference and exporting the model for production. Explanation of using a trained model for inference, exporting the model for production, and the process of inference.']}], 'duration': 730.553, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/BvHmRx14HQ8/pics/BvHmRx14HQ84733287.jpg', 'highlights': ["You can use fastai functions to download images from Bing by creating a directory with the name of the bear, searching Bing for that term along with 'bear', and then calling get_image_files to recursively return all image files inside the path.", 'It is important to verify the downloaded images to check for actual images and prevent potential file breakage, which can be done using the verify_images function to ensure the absence of failed files.', 'Creating a model using the fastai DataBlock API and training it, achieving a one percent error rate with 450 images using a smaller ResNet 18 model.', 'Overview of the DataBlock API, including specifying independent and dependent variables, data splitting, labeling, and item transformations.', 'Explanation of the process of using a trained model for inference and exporting the model for production.']}], 'highlights': ['Creating a model using the fastai DataBlock API and training it, achieving a one percent error rate with 450 images using a smaller ResNet 18 model.', "The drive train approach revolutionized the conventional predictive modeling in insurance, focusing on maximizing the company's five-year profit and controlling levers such as setting prices, which led to significant value and insights.", 'The process of turning predictive models into actionable insights is essential for practical model significance and implementation, emphasizing the importance of identifying sources of value, available data, 
approaches to analytics, implementation, and adaptation to changing environments.', 'The chapter introduces the process of training models and the transition into production, highlighting the importance of practicing with the course v4 notebooks to understand the concepts and experiment with code.', 'The labeling function returns a binary output, either true or false, significant in classification models.', 'The dataset labels each image with one of 37 different cat and dog breeds, allowing for specific categorization.', 'Transfer learning enables implicit learning of features from a large dataset, reducing resource requirements.', 'Models pre-trained on ImageNet recognize a thousand image categories, serving as a foundation for training on custom data.', "The paper 'High Temperature and High Humidity Reduce the Transmission of COVID-19' has potential policy implications if the claim holds true.", 'The chapter emphasizes the complexities and uncertainties in using models for policy and organizational decisions, stressing the importance of combining utilities and best guesses.']}
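The closing highlights about priors, the four assumption/reality possibilities, and combining utilities with best guesses can be sketched as a small expected-utility calculation. This is a minimal illustration only: the prior, the likelihoods, and every utility value below are hypothetical placeholders chosen for the example, not numbers from the lesson or from the paper.

```python
# Hypothetical sketch: combine a prior belief, one piece of evidence, and
# utilities for the four (assumption, reality) combinations.
# Every number here is a made-up placeholder.


def posterior(prior: float, p_evidence_if_true: float, p_evidence_if_false: float) -> float:
    """Bayes' rule for a yes/no hypothesis after seeing one piece of evidence."""
    evidence = prior * p_evidence_if_true + (1 - prior) * p_evidence_if_false
    return prior * p_evidence_if_true / evidence


def expected_utility(p_true: float, utility_if_true: float, utility_if_false: float) -> float:
    """Average the two possible outcomes, weighted by how likely each one is."""
    return p_true * utility_if_true + (1 - p_true) * utility_if_false


# Prior belief that transmission is seasonal (assumed, e.g. from past flu epidemics).
prior_seasonal = 0.7

# Assumed chance of seeing a result like the paper's under each hypothesis.
p_seasonal = posterior(prior_seasonal, p_evidence_if_true=0.8, p_evidence_if_false=0.4)

# utilities[assumption][reality]: cost of acting on an assumption in each world.
utilities = {
    "assume_seasonal": {"seasonal": 0.0, "not_seasonal": -10.0},
    "assume_not_seasonal": {"seasonal": -2.0, "not_seasonal": 0.0},
}

scores = {
    assumption: expected_utility(p_seasonal, u["seasonal"], u["not_seasonal"])
    for assumption, u in utilities.items()
}
best_assumption = max(scores, key=scores.get)

print(f"posterior P(seasonal) = {p_seasonal:.3f}")
for assumption, score in scores.items():
    print(f"{assumption}: expected utility {score:.3f}")
print("best assumption to act on:", best_assumption)
```

With these placeholder values the evidence raises the belief in seasonality, yet acting as if there were no seasonality still scores better, because wrongly assuming seasonality was given a much larger cost. Changing the utilities changes the answer, which is the point of writing them down explicitly.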