title
Image classification using CNN (CIFAR10 dataset) | Deep Learning Tutorial 24 (Tensorflow & Python)

description
In this video we classify small images from the CIFAR10 dataset using TensorFlow. We first train a simple artificial neural network (ANN) and check how it performs, then train a convolutional neural network (CNN) and see how the model accuracy improves. This tutorial will help you understand why a CNN is preferred over an ANN for image classification. Code: https://github.com/codebasics/deep-learning-keras-tf-tutorial/blob/master/16_cnn_cifar10_small_image_classification/cnn_cifar10_dataset.ipynb Exercise: Scroll to the very end of the notebook above. You will find the exercise description and a solution link. Deep learning playlist: https://www.youtube.com/playlist?list=PLeo1K3hjS3uu7CxAacxVndI4bE_o3BDtO Machine learning playlist: https://www.youtube.com/playlist?list=PLeo1K3hjS3uvCeTYTeyfe0-rN5r8zn9rw
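One way to see why a CNN is preferred over an ANN for this task is to count trainable parameters. The sketch below uses plain Python arithmetic only. The ANN sizes (3000, 1000 and 10 neurons on a flattened 32x32x3 input) and the first convolution (32 filters of size 3x3, followed by 2x2 max pooling) come from the video. The 64 filters in the second convolution and the 64-unit dense layer are assumptions made for illustration, as are the helper names and the choice of unpadded, stride-1 convolutions.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv2d_params(kernel, in_channels, filters):
    """Weights plus biases of a 2D convolution with square kernels."""
    return kernel * kernel * in_channels * filters + filters


def conv_out(size, kernel):
    """Output width/height of an unpadded, stride-1 convolution (assumed)."""
    return size - kernel + 1


# ANN from the video: Flatten(32x32x3) -> Dense(3000) -> Dense(1000) -> Dense(10)
flat = 32 * 32 * 3
ann_total = (
    dense_params(flat, 3000)
    + dense_params(3000, 1000)
    + dense_params(1000, 10)
)

# CNN sketch: Conv(32, 3x3) -> MaxPool(2x2) -> Conv(64, 3x3) -> MaxPool(2x2)
#             -> Flatten -> Dense(64) -> Dense(10)
# ASSUMPTION: 64 filters in the second convolution and 64 dense units.
size = conv_out(32, 3) // 2    # 32 -> 30 after conv, 15 after pooling
size = conv_out(size, 3) // 2  # 15 -> 13 after conv, 6 after pooling
cnn_total = (
    conv2d_params(3, 3, 32)
    + conv2d_params(3, 32, 64)
    + dense_params(size * size * 64, 64)
    + dense_params(64, 10)
)

print("ANN parameters:", ann_total)
print("CNN parameters:", cnn_total)
```

Under these assumptions the dense network carries roughly seventy times more weights than the convolutional one, because every pixel is connected to every neuron in the first dense layer, while a convolution reuses the same small filter across the whole image.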

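The transcript in the detail section distinguishes categorical cross entropy (labels one hot encoded) from sparse categorical cross entropy (labels given directly as an integer such as 8). The following is a minimal pure-Python sketch of that distinction, not the Keras implementation: the function names mirror the Keras loss names for readability, and the logits are made-up numbers. It shows that both forms give the same loss for the same prediction, so the choice depends only on how the labels are stored.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(label, probs):
    """Loss when the label is a plain class index, e.g. 8."""
    return -math.log(probs[label])


def categorical_crossentropy(one_hot, probs):
    """Loss when the label is a one hot encoded vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


def to_one_hot(label, num_classes):
    """Convert a class index into a one hot encoded list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


# Made-up scores for the 10 CIFAR-10 classes (illustrative only).
logits = [0.1, 0.2, 0.0, 0.3, 0.1, 0.0, 0.2, 0.1, 2.0, 0.4]
probs = softmax(logits)

label = 8  # class index 8 is "ship" in CIFAR-10
sparse_loss = sparse_categorical_crossentropy(label, probs)
dense_loss = categorical_crossentropy(to_one_hot(label, 10), probs)

print("sparse label loss: ", sparse_loss)
print("one hot label loss:", dense_loss)
```

Keeping the labels as integers avoids building a 10-element vector per sample, which is why the sparse variant is the convenient choice when the labels come as plain class indices.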
detail
{'title': 'Image classification using CNN (CIFAR10 dataset) | Deep Learning Tutorial 24 (Tensorflow & Python)', 'heatmap': [{'end': 172.714, 'start': 100.771, 'weight': 0.801}, {'end': 358.64, 'start': 318.126, 'weight': 0.938}, {'end': 426.611, 'start': 395.214, 'weight': 0.772}, {'end': 734.324, 'start': 522.123, 'weight': 0.76}, {'end': 847.266, 'start': 804.13, 'weight': 0.728}, {'end': 935.943, 'start': 909.149, 'weight': 0.815}, {'end': 1019.653, 'start': 961.394, 'weight': 0.772}, {'end': 1118.982, 'start': 1097.129, 'weight': 0.802}, {'end': 1189.055, 'start': 1148.732, 'weight': 0.787}, {'end': 1372.612, 'start': 1356.499, 'weight': 0.71}], 'summary': 'This tutorial on image classification using cnn focuses on the cifar-10 dataset with 60,000 32x32 colored images across 10 categories. it covers dataset analysis, normalization, neural network training, cnn basics, achieving 83% accuracy after 10 epochs, and a 70% accuracy on a challenging test set.', 'chapters': [{'end': 72.989, 'segs': [{'end': 52.971, 'src': 'embed', 'start': 0.55, 'weight': 0, 'content': [{'end': 4.434, 'text': 'In the last video, we looked at what is convolutional neural network.', 'start': 0.55, 'duration': 3.884}, {'end': 9.859, 'text': 'If you have not seen it, I highly recommend watching that before you continue on this video.', 'start': 4.734, 'duration': 5.125}, {'end': 16.004, 'text': 'Today we are going to do image classification of 60, 000 small images.', 'start': 10.6, 'duration': 5.404}, {'end': 19.409, 'text': 'And this dataset is coming from TensorFlow library itself.', 'start': 16.626, 'duration': 2.783}, {'end': 21.631, 'text': "It's called CIFAR-10 database.", 'start': 19.509, 'duration': 2.122}, {'end': 28.214, 'text': 'it has various objects, such as aeroplane, ship, frog horse, etc.', 'start': 22.452, 'duration': 5.762}, {'end': 36.878, 'text': "and we will be doing image classification using convolutional neural network, so we'll straight away jump into coding and 
then, in the end,", 'start': 28.214, 'duration': 8.664}, {'end': 41.5, 'text': 'i have an exercise for you, so make sure you watch till then and you do the exercise on your own.', 'start': 36.878, 'duration': 4.622}, {'end': 52.971, 'text': 'The data set that we are going to use in this video has 60, 000 32 by 32 colored images with three RGB channels,', 'start': 42.34, 'duration': 10.631}], 'summary': 'Video covers image classification of 60,000 small images from cifar-10 database using convolutional neural network.', 'duration': 52.421, 'max_score': 0.55, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA550.jpg'}], 'start': 0.55, 'title': 'Image classification with cnn', 'summary': 'Covers image classification using a convolutional neural network on a dataset of 60,000 32x32 colored images from the cifar-10 database, encompassing 10 categories, and includes an exercise for practical application.', 'chapters': [{'end': 72.989, 'start': 0.55, 'title': 'Image classification with cnn', 'summary': 'Covers image classification using a convolutional neural network on a dataset of 60,000 32x32 colored images with three rgb channels, sourced from the cifar-10 database, encompassing 10 categories, and includes an exercise for practical application.', 'duration': 72.439, 'highlights': ['The dataset consists of 60,000 32x32 colored images with three RGB channels, sourced from the CIFAR-10 database, involving classification into one of 10 categories.', 'The chapter emphasizes image classification using a convolutional neural network, with an exercise provided for practical application.', 'Recommendation to watch the previous video on convolutional neural network for better understanding.']}], 'duration': 72.439, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA550.jpg', 'highlights': ['The dataset consists of 60,000 32x32 colored images with three RGB channels, 
sourced from the CIFAR-10 database, involving classification into one of 10 categories.', 'The chapter emphasizes image classification using a convolutional neural network, with an exercise provided for practical application.', 'Recommendation to watch the previous video on convolutional neural network for better understanding.']}, {'end': 447.117, 'segs': [{'end': 172.714, 'src': 'heatmap', 'start': 72.989, 'weight': 0, 'content': [{'end': 89.622, 'text': 'so when you call this load data method, what you get as a return value is this so you have X train, you have Y train and then you have x test, y test,', 'start': 72.989, 'duration': 16.633}, {'end': 100.771, 'text': "x test, y test, and i'll just quickly check the shape of X train.", 'start': 89.622, 'duration': 11.149}, {'end': 104.414, 'text': 'you see that the training samples are 50,000.', 'start': 100.771, 'duration': 3.643}, {'end': 112.36, 'text': 'each sample is 32 by 32 image and 3 is for rgb channels.', 'start': 104.414, 'duration': 7.946}, {'end': 115.223, 'text': "now let's look at the test.", 'start': 112.36, 'duration': 2.863}, {'end': 123.849, 'text': 'So in the test we have 10, 000 images, so this data set is kind of decent size.', 'start': 118.706, 'duration': 5.143}, {'end': 125.13, 'text': 'You know it is not too small.', 'start': 123.869, 'duration': 1.261}, {'end': 127.512, 'text': 'Now I want to check.', 'start': 126.571, 'duration': 0.941}, {'end': 130.354, 'text': 'Each of the training samples.', 'start': 128.493, 'duration': 1.861}, {'end': 132.275, 'text': 'So when you of course do this.', 'start': 130.674, 'duration': 1.601}, {'end': 136.318, 'text': "X train let's say 0.", 'start': 133.316, 'duration': 3.002}, {'end': 140.42, 'text': 'You know you get this three dimensional array.', 'start': 136.318, 'duration': 4.102}, {'end': 144.083, 'text': '32 by 32 into three you know RGB channels.', 'start': 140.56, 'duration': 3.523}, {'end': 150.628, 'text': 'I want to just quickly plot 
this to see.', 'start': 145.704, 'duration': 4.924}, {'end': 154.271, 'text': 'you know how this thing looks, so you can do matplotlib.', 'start': 150.628, 'duration': 3.643}, {'end': 161.697, 'text': 'you know, matplotlib we have already imported here as plt and this has a function called imshow.', 'start': 154.271, 'duration': 7.426}, {'end': 172.714, 'text': 'And you get this is actually a frog and if you do one, this is a truck, but the image is pretty big.', 'start': 165.942, 'duration': 6.772}], 'summary': 'Dataset contains 50,000 training samples and 10,000 test images, each 32x32 with 3 rgb channels.', 'duration': 77.639, 'max_score': 72.989, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA72989.jpg'}, {'end': 214.42, 'src': 'embed', 'start': 179.522, 'weight': 3, 'content': [{'end': 186.644, 'text': 'So you know when the image is little smaller you can clearly see that this is a truck and this is a frog.', 'start': 179.522, 'duration': 7.122}, {'end': 193.305, 'text': "Just for convenience, I'm going to write a function called plotSample here,", 'start': 187.424, 'duration': 5.881}, {'end': 204.85, 'text': 'and this function is taking x and y and index and printing that particular image sample.', 'start': 193.305, 'duration': 11.545}, {'end': 208.974, 'text': 'so here it will be this, and I want to also on X label.', 'start': 204.85, 'duration': 4.124}, {'end': 214.42, 'text': "I want to print the label basically whether it's a frog or a ship.", 'start': 208.974, 'duration': 5.446}], 'summary': "Function 'plotsample' prints image samples with corresponding labels.", 'duration': 34.898, 'max_score': 179.522, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA179522.jpg'}, {'end': 358.64, 'src': 'heatmap', 'start': 318.126, 'weight': 0.938, 'content': [{'end': 324.189, 'text': 'And the way you want to reshape is the first dimension which is 10, 000 
you want to keep it as it is.', 'start': 318.126, 'duration': 6.063}, {'end': 329.218, 'text': "So when you don't want to change that dimension you just say minus 1.", 'start': 324.729, 'duration': 4.489}, {'end': 333.379, 'text': 'the second dimension is you want to flatten this.', 'start': 329.218, 'duration': 4.161}, {'end': 338.5, 'text': 'so instead of 6 being an array, you want a simple 6.', 'start': 333.379, 'duration': 5.121}, {'end': 353.924, 'text': 'so then you will just leave this blank and as a result, what will happen is now, when you do y train 5, okay, it is saying, oh, I need to do reshape,', 'start': 338.5, 'duration': 15.424}, {'end': 358.64, 'text': 'actually okay, so you notice the difference.', 'start': 353.924, 'duration': 4.716}], 'summary': 'Reshape y train with -1 to flatten it, so a label stored as the array [6] becomes a plain 6.', 'duration': 40.514, 'max_score': 318.126, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA318126.jpg'}, {'end': 447.117, 'src': 'heatmap', 'start': 395.214, 'weight': 4, 'content': [{'end': 407.841, 'text': "and now, when I say plot sample, let's say, in my training sample, and here I will do Y train, let's say I give zero.", 'start': 395.214, 'duration': 12.627}, {'end': 409.122, 'text': 'so for zero I get frog.', 'start': 407.841, 'duration': 1.281}, {'end': 419.706, 'text': "see this X label is just printing this particular label and if you do, let's say one, You get a truck.", 'start': 409.122, 'duration': 10.584}, {'end': 420.726, 'text': 'This is a truck.', 'start': 420.186, 'duration': 0.54}, {'end': 421.407, 'text': "It's not a ship.", 'start': 420.746, 'duration': 0.661}, {'end': 422.968, 'text': 'Maybe I misspoke, but.', 'start': 421.467, 'duration': 1.501}, {'end': 426.611, 'text': "It's a truck, so 9 number is truck 8 number is ship.", 'start': 423.889, 'duration': 2.722}, {'end': 429.053, 'text': 'So you can.', 'start': 428.513, 'duration': 0.54}, 
{'end': 432.015, 'text': 'Check various samples.', 'start': 430.434, 'duration': 1.581}, {'end': 433.657, 'text': 'What the sample? This is also a truck.', 'start': 432.095, 'duration': 1.562}, {'end': 435.118, 'text': 'See this is a truck.', 'start': 433.677, 'duration': 1.441}, {'end': 437.74, 'text': 'Then this is a deer and so on.', 'start': 436.139, 'duration': 1.601}, {'end': 442.424, 'text': 'So this is a just quick data exploration part that we did.', 'start': 438.701, 'duration': 3.723}, {'end': 447.117, 'text': 'Now we want to normalize our data.', 'start': 443.536, 'duration': 3.581}], 'summary': 'Data exploration: plotted samples with their labels, where label 9 is truck and label 8 is ship.', 'duration': 26.371, 'max_score': 395.214, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA395214.jpg'}], 'start': 72.989, 'title': 'Image data analysis and normalization', 'summary': 'Covers loading and examining image data, revealing that the training dataset contains 50,000 samples of 32x32 images with 3 rgb channels, while the test dataset contains 10,000 images. 
it also explores image data visualization, reshaping, and normalizing the data, with a focus on categorization and reshaping of y train, yielding a one-dimensional array and normalizing the data.', 'chapters': [{'end': 150.628, 'start': 72.989, 'title': 'Image data analysis', 'summary': 'Covers loading and examining image data, revealing that the training dataset contains 50,000 samples of 32x32 images with 3 rgb channels, while the test dataset contains 10,000 images, indicating a decently sized dataset.', 'duration': 77.639, 'highlights': ["The training dataset consists of 50,000 samples, each being a 32x32 image with 3 RGB channels, providing insight into the dataset's size and structure.", 'The test dataset contains 10,000 images, indicating a substantial amount of data available for testing and validation.', 'The load data method returns X train, Y train, x test, and y test, providing a clear understanding of the returned values and their significance for further analysis.']}, {'end': 447.117, 'start': 150.628, 'title': 'Data exploration and normalization', 'summary': 'Explores image data visualization using matplotlib, reshaping and normalizing the data, and identifying and printing image labels, with a focus on categorization and reshaping of y train, yielding a one-dimensional array and normalizing the data.', 'duration': 296.489, 'highlights': ['The chapter explores image data visualization using matplotlib and printing image labels, with a focus on categorization and reshaping of Y train.', 'Reshaping and normalizing the data are important steps in the data exploration process.']}], 'duration': 374.128, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA72989.jpg', 'highlights': ["The training dataset consists of 50,000 samples, each being a 32x32 image with 3 RGB channels, providing insight into the dataset's size and structure.", 'The test dataset contains 10,000 images, indicating a substantial 
amount of data available for testing and validation.', 'The load data method returns X train, Y train, x test, and y test, providing a clear understanding of the returned values and their significance for further analysis.', 'The chapter explores image data visualization using matplotlib and printing image labels, with a focus on categorization and reshaping of Y train.', 'Reshaping and normalizing the data are important steps in the data exploration process.']}, {'end': 915.032, 'segs': [{'end': 734.324, 'src': 'heatmap', 'start': 447.757, 'weight': 2, 'content': [{'end': 460.222, 'text': 'We saw in our previous tutorials that whenever you have an image, you want to divide each pixel value by 255,', 'start': 447.757, 'duration': 12.465}, {'end': 467.444, 'text': 'because the pixel value ranges from 0 to 255 for each of the channels R, G and B.', 'start': 460.222, 'duration': 7.222}, {'end': 475.867, 'text': 'And if you divide it by 255, you will be normalizing it into a 0 to 1 range.', 'start': 467.444, 'duration': 8.423}, {'end': 481.169, 'text': "so let's again quickly check this.", 'start': 475.867, 'duration': 5.302}, {'end': 483.99, 'text': 'you see, the values here are 103, 59 and so on.', 'start': 481.169, 'duration': 2.821}, {'end': 496.967, 'text': 'okay, now, if I do this, You see, this is the power of NumPy array.', 'start': 483.99, 'duration': 12.977}, {'end': 502.51, 'text': 'You can just divide it by 255 and it will divide every element in the entire array.', 'start': 497.007, 'duration': 5.503}, {'end': 509.813, 'text': 'So now I can simply say xtrain is xtrain divided by 255.', 'start': 503.47, 'duration': 6.343}, {'end': 521.361, 'text': 'xtest is OK, so the values are normalized.', 'start': 509.813, 'duration': 11.548}, {'end': 527.746, 'text': "Now we'll build a simple artificial neural network first to train the model.", 'start': 522.123, 'duration': 5.623}, {'end': 536.791, 'text': 'I want to see the performance of how artificial neural 
network works and then we will do a convolutional neural network.', 'start': 527.806, 'duration': 8.985}, {'end': 544.394, 'text': 'In one of the previous tutorials, we did GPU performance for the same CIFAR-10 dataset.', 'start': 538.092, 'duration': 6.302}, {'end': 547.195, 'text': "So I'm going to use that notebook.", 'start': 544.975, 'duration': 2.22}, {'end': 551.577, 'text': 'It is on my GitHub here and I will just copy paste some code from there.', 'start': 547.716, 'duration': 3.861}, {'end': 557.039, 'text': 'So here we build a simple artificial neural network.', 'start': 552.918, 'duration': 4.121}, {'end': 562.141, 'text': 'So this is the same neural network I will build and see how it performs.', 'start': 557.299, 'duration': 4.842}, {'end': 568.801, 'text': "So here You see, it's very simple.", 'start': 563.482, 'duration': 5.319}, {'end': 571.942, 'text': 'The input layer is a flattened layer.', 'start': 569.581, 'duration': 2.361}, {'end': 577.365, 'text': "It's a first layer, which accepts the shape of 32 by 32 by 3.", 'start': 572.022, 'duration': 5.343}, {'end': 584.028, 'text': 'Then we have two deep layers, one having 3, 000 neurons, the other having 1, 000 neurons.', 'start': 577.365, 'duration': 6.663}, {'end': 592.532, 'text': 'And the last layer is having 10 categories, because we have total 10 categories, right? 
See, this is like 10.', 'start': 584.608, 'duration': 7.924}, {'end': 600.241, 'text': 'So when you train this neural network, it will use a dense artificial neural network with all these parameters.', 'start': 592.532, 'duration': 7.709}, {'end': 608.091, 'text': 'Optimizer is SGD and there is sparse categorical cross entropy value as an input.', 'start': 601.123, 'duration': 6.968}, {'end': 610.272, 'text': 'that tutorial.', 'start': 608.711, 'duration': 1.561}, {'end': 620.956, 'text': 'by the way, I use a categorical cross entropy and if you want to know the difference between categorical and sparse categorical,', 'start': 610.272, 'duration': 10.684}, {'end': 625.157, 'text': 'then see I have this nice image that can explain you.', 'start': 620.956, 'duration': 4.201}, {'end': 633.48, 'text': 'so here, whenever you have, let me just go in our presentation mode.', 'start': 625.157, 'duration': 8.323}, {'end': 649.053, 'text': "so whenever you have a your y as one hot encoded vector, so let's say you have ship here, which is number nine, and if your y is something like this,", 'start': 633.48, 'duration': 15.573}, {'end': 653.614, 'text': 'which is one hot encoded, you will use categorical cross entropy.', 'start': 649.053, 'duration': 4.561}, {'end': 661.056, 'text': 'but if y is directly a value which is number eight, you use sparse categorical cross entropy.', 'start': 653.614, 'duration': 7.442}, {'end': 665.437, 'text': 'in that video of gpu performance we converted into categorical.', 'start': 661.056, 'duration': 4.381}, {'end': 667.017, 'text': "that's why we use categorical.", 'start': 665.437, 'duration': 1.58}, {'end': 673.203, 'text': 'But here we are directly using the value 8 and so on.', 'start': 667.833, 'duration': 5.37}, {'end': 677.471, 'text': "And that's why we are using sparse categorical cross entropy.", 'start': 674.165, 'duration': 3.306}, {'end': 687.656, 'text': 'Here after the training you see that accuracy, I am running just 5 epochs, 
but accuracy is pretty low 48%.', 'start': 679.81, 'duration': 7.846}, {'end': 691.959, 'text': 'You see 48.58% on training samples.', 'start': 687.656, 'duration': 4.303}, {'end': 697.203, 'text': 'When you evaluate it on test samples, it is 47%.', 'start': 692.66, 'duration': 4.543}, {'end': 704.648, 'text': 'So artificial neural network is performing really bad on this data set with 5 epochs.', 'start': 697.203, 'duration': 7.445}, {'end': 710.113, 'text': 'I have also printed here a classification report.', 'start': 706.809, 'duration': 3.304}, {'end': 715.979, 'text': 'And this classification report gives precision recall and F1 score on each of the classes.', 'start': 710.954, 'duration': 5.025}, {'end': 717.401, 'text': 'So for example, this 9.', 'start': 716.099, 'duration': 1.302}, {'end': 719.203, 'text': '9 is what? Truck.', 'start': 717.401, 'duration': 1.802}, {'end': 723.037, 'text': 'a ship, maybe ship.', 'start': 720.715, 'duration': 2.322}, {'end': 724.678, 'text': "no, it's a truck, actually okay.", 'start': 723.037, 'duration': 1.641}, {'end': 734.324, 'text': "so for truck class, the precision is 59 percent, recall is 48 percent, and you can see this matrix is, if you don't know about precision,", 'start': 724.678, 'duration': 9.646}], 'summary': 'Normalized pixel values, built simple artificial neural network, achieved 48% accuracy in 5 epochs.', 'duration': 109.282, 'max_score': 447.757, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA447757.jpg'}, {'end': 753.174, 'src': 'embed', 'start': 697.203, 'weight': 0, 'content': [{'end': 704.648, 'text': 'So artificial neural network is performing really bad on this data set with 5 epochs.', 'start': 697.203, 'duration': 7.445}, {'end': 710.113, 'text': 'I have also printed here a classification report.', 'start': 706.809, 'duration': 3.304}, {'end': 715.979, 'text': 'And this classification report gives precision recall and F1 score on each of 
the classes.', 'start': 710.954, 'duration': 5.025}, {'end': 717.401, 'text': 'So for example, this 9.', 'start': 716.099, 'duration': 1.302}, {'end': 719.203, 'text': '9 is what? Truck.', 'start': 717.401, 'duration': 1.802}, {'end': 723.037, 'text': 'a ship, maybe ship.', 'start': 720.715, 'duration': 2.322}, {'end': 724.678, 'text': "no, it's a truck, actually okay.", 'start': 723.037, 'duration': 1.641}, {'end': 734.324, 'text': "so for truck class, the precision is 59 percent, recall is 48 percent, and you can see this matrix is, if you don't know about precision,", 'start': 724.678, 'duration': 9.646}, {'end': 735.925, 'text': 'recall and f1 score.', 'start': 734.324, 'duration': 1.601}, {'end': 741.389, 'text': 'again, in my this tutorial series there is a video on what these terms are.', 'start': 735.925, 'duration': 5.464}, {'end': 743.871, 'text': 'so it is better if you watch these videos in sequence.', 'start': 741.389, 'duration': 2.482}, {'end': 751.033, 'text': 'now we are going to use cnn to improve the performance of this model.', 'start': 745.029, 'duration': 6.004}, {'end': 753.174, 'text': 'so how do you use cnn?', 'start': 751.033, 'duration': 2.141}], 'summary': 'Artificial neural network has 59% precision and 48% recall for truck class with 5 epochs; considering cnn for improvement.', 'duration': 55.971, 'max_score': 697.203, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA697203.jpg'}, {'end': 847.266, 'src': 'heatmap', 'start': 804.13, 'weight': 0.728, 'content': [{'end': 816.377, 'text': 'okay, so here this will be CNN, and so you will have some CNN layers here.', 'start': 804.13, 'duration': 12.247}, {'end': 820.38, 'text': 'okay, and this will be your dense network.', 'start': 816.377, 'duration': 4.003}, {'end': 832.754, 'text': "so when, uh, you are in the middle layer, you don't need to specify the shape because the network can figure it out automatically.", 'start': 820.38, 
'duration': 12.374}, {'end': 841.101, 'text': "and just to keep things simple, i'm just going to keep only one dense network, because my cnn would have done most of the work.", 'start': 832.754, 'duration': 8.347}, {'end': 844.083, 'text': "so i don't need so many neurons and so many deep layers.", 'start': 841.101, 'duration': 2.982}, {'end': 847.266, 'text': "okay, and i'm going to use a softmax function here.", 'start': 844.083, 'duration': 3.183}], 'summary': 'Designing a cnn with one dense network, using softmax function.', 'duration': 43.136, 'max_score': 804.13, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA804130.jpg'}], 'start': 447.757, 'title': 'Image normalization and neural network training', 'summary': "Explains image normalization using numpy arrays and introduces a simple artificial neural network for model training. it also explores the performance of the neural network on the cifar-10 dataset, achieving 47% accuracy on test samples, with a precision of 59% and recall of 48% for the 'truck' class. 
additionally, it outlines the plan to use a convolutional neural network to improve performance.", 'chapters': [{'end': 527.746, 'start': 447.757, 'title': 'Image normalization and neural network training', 'summary': 'Explains the process of normalizing image pixel values by dividing them by 255 using numpy arrays, and then proceeds to build a simple artificial neural network for model training.', 'duration': 79.989, 'highlights': ['The pixel values in an image range from 0 to 255 for each channel (R, G, and B), and dividing by 255 normalizes them to a 0 to 1 range.', 'Using NumPy arrays, dividing the entire array by 255 normalizes every element, simplifying the process of image normalization.', 'The process of dividing pixel values by 255 enables the normalization of image data, facilitating the training of a simple artificial neural network.']}, {'end': 915.032, 'start': 527.806, 'title': 'Neural network performance analysis', 'summary': "Explores the performance of a simple artificial neural network on the cifar-10 dataset, achieving an accuracy of 47% on test samples and a precision of 59% and recall of 48% for the 'truck' class. 
it also introduces the plan to use a convolutional neural network to improve performance.", 'duration': 387.226, 'highlights': ['The accuracy of the simple artificial neural network on the CIFAR-10 dataset is 47% on test samples after 5 epochs.', "The precision for the 'truck' class is 59% and the recall is 48%.", 'The plan to use a convolutional neural network to improve the performance of the model is introduced.']}], 'duration': 467.275, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA447757.jpg', 'highlights': ['The accuracy of the simple artificial neural network on the CIFAR-10 dataset is 47% on test samples after 5 epochs.', "The precision for the 'truck' class is 59% and the recall is 48%.", 'Using NumPy arrays, dividing the entire array by 255 normalizes every element, simplifying the process of image normalization.', 'The pixel values in an image range from 0 to 255 for each channel (R, G, and B), and dividing by 255 normalizes them to a 0 to 1 range.', 'The process of dividing pixel values by 255 enables the normalization of image data, facilitating the training of a simple artificial neural network.', 'The plan to use a convolutional neural network to improve the performance of the model is introduced.']}, {'end': 1254.029, 'segs': [{'end': 1019.653, 'src': 'heatmap', 'start': 941.524, 'weight': 2, 'content': [{'end': 952.93, 'text': 'now, when we saw this presentation for the image of nine in the previous video, we saw that we can have this kind of three filters.', 'start': 941.524, 'duration': 11.406}, {'end': 961.394, 'text': 'so the first one is detecting the loopy pattern and the second one is detecting the vertical edge, which is the middle part.', 'start': 952.93, 'duration': 8.464}, {'end': 965.637, 'text': 'third one is detecting the tail.', 'start': 961.394, 'duration': 4.243}, {'end': 972.051, 'text': "the best thing about convolutional neural network is you don't need to tell it what, 
what the filters are.", 'start': 965.637, 'duration': 6.414}, {'end': 975.312, 'text': 'it will figure out the filters for you.', 'start': 972.051, 'duration': 3.261}, {'end': 980.935, 'text': 'you only need to tell the filter size and how many filters you want see.', 'start': 975.312, 'duration': 5.623}, {'end': 984.517, 'text': 'in this case we had three filters one, two, three.', 'start': 980.935, 'duration': 3.582}, {'end': 990.159, 'text': 'so if you look at this image, we get this three filter stack filters, feature maps.', 'start': 984.517, 'duration': 5.642}, {'end': 994.922, 'text': 'here we will use, you know, just random, like maybe 32 filters.', 'start': 990.159, 'duration': 4.763}, {'end': 1012.172, 'text': 'so this can detect 32 different features or different edges in your image and then the actual filter size is specified here.', 'start': 997.469, 'duration': 14.703}, {'end': 1019.653, 'text': "so let's say we are using 3x3 filter see here also, we had 3x3 filter, this green box.", 'start': 1012.172, 'duration': 7.481}], 'summary': 'Cnn can detect 32 different features using 3x3 filters, without specifying the filters.', 'duration': 30.527, 'max_score': 941.524, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA941524.jpg'}, {'end': 1118.982, 'src': 'heatmap', 'start': 1081.392, 'weight': 4, 'content': [{'end': 1085.515, 'text': 'And for activation also you know ReLU is quite popular.', 'start': 1081.392, 'duration': 4.123}, {'end': 1087.597, 'text': 'It is less expensive to calculate.', 'start': 1085.595, 'duration': 2.002}, {'end': 1092.584, 'text': 'Now you can have only one layer or you can have multiple.', 'start': 1089.201, 'duration': 3.383}, {'end': 1093.605, 'text': "Doesn't matter.", 'start': 1093.045, 'duration': 0.56}, {'end': 1096.708, 'text': 'You know, like you kind of figure this out by trial and error.', 'start': 1093.625, 'duration': 3.083}, {'end': 1104.696, 'text': "So just 
for fun, I'm going to have another set of convolution and max pooling layer.", 'start': 1097.129, 'duration': 7.567}, {'end': 1112.344, 'text': 'So see here we had like convolution pooling, convolution pooling.', 'start': 1108.28, 'duration': 4.064}, {'end': 1113.745, 'text': "So that's what we have now.", 'start': 1112.784, 'duration': 0.961}, {'end': 1118.982, 'text': 'Now we do our usual model compile thing.', 'start': 1116.741, 'duration': 2.241}], 'summary': 'Relu activation is popular, less expensive. using multiple layers for trial and error, adding convolution and max pooling. compiling the model.', 'duration': 30.952, 'max_score': 1081.392, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA1081392.jpg'}, {'end': 1225.293, 'src': 'heatmap', 'start': 1148.732, 'weight': 0, 'content': [{'end': 1168.663, 'text': "okay and then i will do uh cnn.fit and i will run it for ten epochs There are 50, 000 images to train, so it's going to take some time.", 'start': 1148.732, 'duration': 19.931}, {'end': 1171.024, 'text': 'So please have some patience.', 'start': 1168.743, 'duration': 2.281}, {'end': 1176.588, 'text': 'Here you saw that after 10 epochs, it gave me 83% accuracy.', 'start': 1171.845, 'duration': 4.743}, {'end': 1182.692, 'text': 'Actually, if you compare it with ANN, see, after 5 epochs, I get 73% accuracy.', 'start': 1177.129, 'duration': 5.563}, {'end': 1189.055, 'text': 'in ann, after five i got only 48.', 'start': 1185.174, 'duration': 3.881}, {'end': 1193.796, 'text': 'so you can see that using cnn helps you tremendously.', 'start': 1189.055, 'duration': 4.741}, {'end': 1210.889, 'text': 'and now i will test this out on my test set and here you know i got 70% accuracy, which is Pretty good.', 'start': 1193.796, 'duration': 17.093}, {'end': 1218.591, 'text': 'OK, if you training for more epochs, you can probably get more accuracy and you can do little fine tuning.', 'start': 1211.429, 'duration': 
7.162}, {'end': 1225.293, 'text': 'But for the images which are like this, which are kind of you see these images are like kind of random.', 'start': 1219.171, 'duration': 6.122}], 'summary': 'Trained cnn for 50,000 images, achieved 83% accuracy in 10 epochs, outperforming ann.', 'duration': 81.525, 'max_score': 1148.732, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA1148732.jpg'}], 'start': 915.032, 'title': 'Convolutional neural networks', 'summary': 'Covers the basics and architecture of convolutional neural networks, including feature detection, advantages of convolutional layers, max pooling, relu activation, and the adam optimizer. it also showcases the effectiveness of cnn in image classification tasks, achieving 83% accuracy after 10 epochs and 70% accuracy on a challenging dataset test set.', 'chapters': [{'end': 1049.374, 'start': 915.032, 'title': 'Convolutional neural network basics', 'summary': 'Introduces the basics of convolutional neural networks, explaining the process of feature detection using filters and the advantages of using convolutional layers in image processing, including the automatic determination of filters and the detection of multiple features using multiple filters.', 'duration': 134.342, 'highlights': ['Convolutional neural networks automatically determine filters for feature detection, eliminating the need to specify the filters explicitly and only requiring the input shape and the number of filters, as demonstrated by the usage of 32 filters for detecting 32 different features in a 32x32x3 image.', 'The process of convolution involves detecting various features in images, with the example of detecting loopy patterns, vertical edges, and tails using different filters, showcasing the capability of CNNs to identify diverse features without explicit instructions.', 'The introduction of convolutional layers and max pooling layers in the code is emphasized, illustrating the 
straightforward process of integrating these layers for feature detection and extraction in images.']}, {'end': 1142.928, 'start': 1049.374, 'title': 'Convolutional neural network architecture', 'summary': 'Discusses the architecture of a convolutional neural network, including the use of max pooling, relu activation, and the popular adam optimizer for achieving good accuracy in image recognition.', 'duration': 93.554, 'highlights': ['The architecture includes convolutional layers, max pooling layers, ReLU activation, and the use of the popular Adam optimizer for achieving good accuracy.', 'Max pooling of 2 by 2 is specified, which is a very popular pooling method in image recognition.', 'The ReLU activation function is highlighted as quite popular and less expensive to calculate compared to other activation functions.', 'The use of the Adam optimizer is emphasized due to its popularity in providing good accuracy for image recognition tasks.']}, {'end': 1254.029, 'start': 1143.768, 'title': 'Cnn for image classification', 'summary': "Demonstrates the use of cnn for image classification, achieving 83% accuracy after 10 epochs, a significant improvement over ann's 48% accuracy after five epochs, and obtaining 70% accuracy on the test set for a challenging dataset, showcasing the effectiveness of cnn in image classification tasks.", 'duration': 110.261, 'highlights': ["Achieving 83% accuracy after 10 epochs, a significant improvement over ANN's 48% accuracy after five epochs", 'Obtaining 70% accuracy on the test set for a challenging dataset, showcasing the effectiveness of CNN in image classification tasks', 'Training with 50,000 images and potential for increased accuracy through fine-tuning and more epochs']}], 'duration': 338.997, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA915032.jpg', 'highlights': ["Achieving 83% accuracy after 10 epochs, a significant improvement over ANN's 48% accuracy after five 
epochs", 'Obtaining 70% accuracy on the test set for a challenging dataset, showcasing the effectiveness of CNN in image classification tasks', 'Convolutional neural networks automatically determine filters for feature detection, eliminating the need to specify the filters explicitly and only requiring the input shape and the number of filters, as demonstrated by the usage of 32 filters for detecting 32 different features in a 32x32x3 image', 'The process of convolution involves detecting various features in images, with the example of detecting loopy patterns, vertical edges, and tails using different filters, showcasing the capability of CNNs to identify diverse features without explicit instructions', 'The architecture includes convolutional layers, max pooling layers, ReLU activation, and the use of the popular Adam optimizer for achieving good accuracy']}, {'end': 1690.578, 'segs': [{'end': 1287.073, 'src': 'embed', 'start': 1255.53, 'weight': 3, 'content': [{'end': 1258.612, 'text': 'Now I will do plotting of some samples.', 'start': 1255.53, 'duration': 3.082}, {'end': 1261.534, 'text': "So let's see.", 'start': 1260.233, 'duration': 1.301}, {'end': 1269.349, 'text': 'OK, we got some error and I think this is happening because we did not reshape.', 'start': 1265.225, 'duration': 4.124}, {'end': 1271.371, 'text': 'You know we had to reshape our ytest.', 'start': 1269.389, 'duration': 1.982}, {'end': 1275.836, 'text': 'Cause our ytest OK before executing this let me show you.', 'start': 1272.973, 'duration': 2.863}, {'end': 1279.38, 'text': 'It is a two dimensional array.', 'start': 1278.199, 'duration': 1.181}, {'end': 1282.583, 'text': 'I want to convert it into one dimension.', 'start': 1280, 'duration': 2.583}, {'end': 1284.165, 'text': 'So if you do this.', 'start': 1282.623, 'duration': 1.542}, {'end': 1287.073, 'text': 'This is now one dimensional array.', 'start': 1285.612, 'duration': 1.461}], 'summary': 'Reshaped ytest from two-dimensional to 
one-dimensional array to resolve error during plotting of samples.', 'duration': 31.543, 'max_score': 1255.53, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA1255530.jpg'}, {'end': 1380.598, 'src': 'heatmap', 'start': 1356.499, 'weight': 0.71, 'content': [{'end': 1363.885, 'text': 'You know why it gave 1? Because 12 is the maximum element and the index of 12 is 1.', 'start': 1356.499, 'duration': 7.386}, {'end': 1365.146, 'text': 'See if I do this one.', 'start': 1363.885, 'duration': 1.261}, {'end': 1369.75, 'text': "Let's say I make this one.", 'start': 1367.148, 'duration': 2.602}, {'end': 1372.612, 'text': 'Then say it is getting 2.', 'start': 1370.771, 'duration': 1.841}, {'end': 1373.713, 'text': "So that's what it is doing.", 'start': 1372.612, 'duration': 1.101}, {'end': 1380.598, 'text': "So here if I supply let's say y pred 0.", 'start': 1373.933, 'duration': 6.665}], 'summary': 'Explanation of why the given value is 1, based on maximum element and its index.', 'duration': 24.099, 'max_score': 1356.499, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA1356499.jpg'}, {'end': 1438.234, 'src': 'embed', 'start': 1403.735, 'weight': 2, 'content': [{'end': 1407.058, 'text': "So the way you do that is, it's like running a for loop.", 'start': 1403.735, 'duration': 3.323}, {'end': 1413.363, 'text': "So for element in y predicted, you're computing argmax for each of these.", 'start': 1407.078, 'duration': 6.285}, {'end': 1420.389, 'text': 'So what you get as a result is something that you can compare with ytest.', 'start': 1413.703, 'duration': 6.686}, {'end': 1431.392, 'text': 'So now if I have ytest here, you see? 
This is how the first sample it got wrong.', 'start': 1422.191, 'duration': 9.201}, {'end': 1435.313, 'text': 'It was five got three was the second, third, fourth.', 'start': 1431.892, 'duration': 3.421}, {'end': 1436.434, 'text': 'It got right.', 'start': 1435.833, 'duration': 0.601}, {'end': 1438.234, 'text': 'And then it will make some errors.', 'start': 1436.474, 'duration': 1.76}], 'summary': 'The process involves computing argmax for elements in y predicted, leading to some correct and incorrect predictions.', 'duration': 34.499, 'max_score': 1403.735, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA1403735.jpg'}, {'end': 1571.739, 'src': 'embed', 'start': 1513.796, 'weight': 0, 'content': [{'end': 1514.916, 'text': 'Third image is aeroplane.', 'start': 1513.796, 'duration': 1.12}, {'end': 1516.416, 'text': "See, it's flying like aeroplane.", 'start': 1515.096, 'duration': 1.32}, {'end': 1518.977, 'text': 'And if you look at 3, see, aeroplane.', 'start': 1517.176, 'duration': 1.801}, {'end': 1521.117, 'text': 'So here it did not make a mistake.', 'start': 1519.097, 'duration': 2.02}, {'end': 1527.296, 'text': 'I also printed a classification report using y-test and y-classes.', 'start': 1522.952, 'duration': 4.344}, {'end': 1530.999, 'text': "So now using CNN, you're getting better numbers.", 'start': 1527.796, 'duration': 3.203}, {'end': 1533.681, 'text': 'See your F1 score is overall better here, 81%.', 'start': 1531.559, 'duration': 2.122}, {'end': 1534.582, 'text': "Here it's less, 50%, 70%, and so on.", 'start': 1533.681, 'duration': 0.901}, {'end': 1545.708, 'text': 'But when you looked at simple ANN the score was quite low.', 'start': 1539.526, 'duration': 6.182}, {'end': 1554.55, 'text': 'I know I did only 5 epochs but even if you try 10 epochs you still get lower score in ANN and better score in CNN.', 'start': 1546.268, 'duration': 8.282}, {'end': 1562.152, 'text': 'In CNN also computation is 
less because we are using max pooling layer and that reduces the dimension.', 'start': 1555.05, 'duration': 7.102}, {'end': 1568.597, 'text': 'The Jupyter Notebook that we covered in this video is uploaded on my GitHub.', 'start': 1564.713, 'duration': 3.884}, {'end': 1571.739, 'text': "I'm going to provide this link in the video description.", 'start': 1569.037, 'duration': 2.702}], 'summary': 'Using cnn improves f1 score to 81% compared to 50-70% in ann, with less computation and uploaded on github.', 'duration': 57.943, 'max_score': 1513.796, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA1513796.jpg'}, {'end': 1648.722, 'src': 'embed', 'start': 1622.064, 'weight': 6, 'content': [{'end': 1631.871, 'text': 'so what i want to do, what i want you to do, is take this notebook and do the same digit classification using cnn.', 'start': 1622.064, 'duration': 9.807}, {'end': 1635.693, 'text': 'this one is using ann okay, artificial neural network,', 'start': 1631.871, 'duration': 3.822}, {'end': 1648.722, 'text': 'see here and you have to just use a cnn and just see how the accuracy and the classification report improves when you use cnn.', 'start': 1635.693, 'duration': 13.029}], 'summary': 'Compare digit classification accuracy using cnn vs ann.', 'duration': 26.658, 'max_score': 1622.064, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA1622064.jpg'}], 'start': 1255.53, 'title': 'Model prediction and cnn classification', 'summary': 'Demonstrates reshaping data for model prediction and cnn classification, including converting a two-dimensional array into a one-dimensional array, using argmax function, list comprehension, accuracy testing, and comparison of cnn and ann. 
it also reports a 70% test accuracy for the cifar10 cnn and includes an exercise for handwritten digit classification using cnn.', 'chapters': [{'end': 1314.869, 'start': 1255.53, 'title': 'Model prediction and plotting', 'summary': 'The chapter demonstrates reshaping of data for model prediction, converting a two-dimensional array into a one-dimensional array, and using the model to predict image classifications.', 'duration': 59.339, 'highlights': ['Reshaping ytest from a two-dimensional to a one-dimensional array resolves the error raised while plotting samples.', 'Predicting image classifications using the model enhances understanding of model performance.', 'Converting the two-dimensional ytest array into one dimension lets it be compared directly with the predicted classes.']}, {'end': 1690.578, 'start': 1321.189, 'title': 'Cnn classification in deep learning', 'summary': 'The chapter discusses the use of argmax function, list comprehension in python, accuracy testing (70% on the test set), comparison of cnn and ann, and an exercise for handwritten digit classification using cnn.', 'duration': 369.389, 'highlights': ['The test accuracy of the CNN model is 70%.', 'The F1 score for the CNN model is 81%, indicating better performance compared to the ANN model.', 'The exercise involves implementing handwritten digit classification using CNN and comparing the accuracy and classification report with the ANN model.', 'List comprehension in Python is used to compute argmax for each element in y predicted, resulting in a comparison with ytest.', "The Jupyter Notebook with the covered content is uploaded on the instructor's GitHub."]}], 'duration': 435.048, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/7HPwo4wnJeA/pics/7HPwo4wnJeA1255530.jpg', 'highlights': ['The test accuracy of the CNN model is 70%.', 'The F1 score for the CNN model is 81%, indicating better performance compared to the ANN model.', 'List comprehension in Python is used to compute argmax for each element in y predicted, resulting in a comparison with 
ytest.', 'Reshaping ytest from a two-dimensional to a one-dimensional array resolves the error raised while plotting samples.', 'Predicting image classifications using the model enhances understanding of model performance.', 'Converting the two-dimensional ytest array into one dimension lets it be compared directly with the predicted classes.', 'The exercise involves implementing handwritten digit classification using CNN and comparing the accuracy and classification report with the ANN model.', "The Jupyter Notebook with the covered content is uploaded on the instructor's GitHub."]}], 'highlights': ["Achieving 83% accuracy after 10 epochs, a significant improvement over ANN's 48% accuracy after five epochs", 'Obtaining 70% accuracy on the test set for a challenging dataset, showcasing the effectiveness of CNN in image classification tasks', 'The F1 score for the CNN model is 81%, indicating better performance compared to the ANN model', 'The test accuracy of the CNN model is 70%', 'The accuracy of the simple artificial neural network on the CIFAR-10 dataset is 47% on test samples after 5 epochs', "The precision for the 'truck' class is 59% and the recall is 48%", "The training dataset consists of 50,000 samples, each being a 32x32 image with 3 RGB channels, providing insight into the dataset's size and structure", 'The test dataset contains 10,000 images, indicating a substantial amount of data available for testing and validation', 'Using NumPy arrays, dividing the entire array by 255 normalizes every element, simplifying the process of image normalization', 'The process of dividing pixel values by 255 enables the normalization of image data, facilitating the training of a simple artificial neural network', 'The load data method returns X train, Y train, x test, and y test, providing a clear understanding of the returned values and their significance for further analysis', 'The architecture includes convolutional layers, max pooling layers, ReLU activation, and the use of the popular Adam optimizer for 
achieving good accuracy', 'The plan to use a convolutional neural network to improve the performance of the model is introduced', 'The dataset consists of 60,000 32x32 colored images with three RGB channels, sourced from the CIFAR-10 database, involving classification into one of 10 categories', 'The chapter emphasizes image classification using a convolutional neural network, with an exercise provided for practical application', 'Recommendation to watch the previous video on convolutional neural network for better understanding']}