title

C4W1L07 One Layer of a Convolutional Net

description

Take the Deep Learning Specialization: http://bit.ly/2IlGB9n
Check out all our courses: https://www.deeplearning.ai
Subscribe to The Batch, our weekly newsletter: https://www.deeplearning.ai/thebatch
Follow us:
Twitter: https://twitter.com/deeplearningai_
Facebook: https://www.facebook.com/deeplearningHQ/
Linkedin: https://www.linkedin.com/company/deeplearningai

detail

{'title': 'C4W1L07 One Layer of a Convolutional Net', 'heatmap': [{'end': 85.509, 'start': 49.164, 'weight': 0.81}, {'end': 781.366, 'start': 737.694, 'weight': 0.777}], 'summary': 'Discusses building a convolutional neural network layer, explaining the concept of convolutional neural nets and the advantage of using a small number of parameters to detect features in large images. it also introduces notation for a convolutional layer, including filter size, padding, and stride, and discusses the dimensions, size, and number of channels of input and output volumes, as well as the conventions and variations in notation used in deep learning literature.', 'chapters': [{'end': 414.737, 'segs': [{'end': 47.643, 'src': 'embed', 'start': 3.919, 'weight': 0, 'content': [{'end': 7.922, 'text': "You're now ready to see how to build one layer of a convolutional neural network.", 'start': 3.919, 'duration': 4.003}, {'end': 9.463, 'text': "Let's go through an example.", 'start': 8.303, 'duration': 1.16}, {'end': 23.915, 'text': "You've seen in the previous video how to take a 3D volume and convolve it with, say, two different filters in order to get, in this example,", 'start': 12.506, 'duration': 11.409}, {'end': 25.376, 'text': 'two different 4x4 outputs.', 'start': 23.915, 'duration': 1.461}, {'end': 47.643, 'text': "So let's say convolving with the first filter gives this first 4x4 output, and convolving with this second filter gives a different 4x4 output.", 'start': 30.959, 'duration': 16.684}], 'summary': 'Learn building one layer of a cnn, convolving with 2 filters to get 4x4 outputs.', 'duration': 43.724, 'max_score': 3.919, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jPOAS7uCODQ/pics/jPOAS7uCODQ3919.jpg'}, {'end': 85.509, 'src': 'heatmap', 'start': 49.164, 'weight': 0.81, 'content': [{'end': 60.594, 'text': "The final thing to turn this into a convolutional neural net layer is that for each of these, we're going to add a bias.", 
'start': 49.164, 'duration': 11.43}, {'end': 63.015, 'text': 'So this is going to be a real number.', 'start': 61.414, 'duration': 1.601}, {'end': 75.123, 'text': 'And with Python broadcasting, you kind of add the same number to every you know one of these 16 elements and then apply a non-linearity, which,', 'start': 63.816, 'duration': 11.307}, {'end': 77.524, 'text': 'for illustration, this is a ReLU non-linearity.', 'start': 75.123, 'duration': 2.401}, {'end': 85.509, 'text': 'And this gives you a 4x4 output after applying a bias and a non-linearity.', 'start': 77.984, 'duration': 7.525}], 'summary': 'Creating a convolutional neural net layer with a bias and non-linearity, resulting in 4x4 output.', 'duration': 36.345, 'max_score': 49.164, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jPOAS7uCODQ/pics/jPOAS7uCODQ49164.jpg'}, {'end': 241.914, 'src': 'embed', 'start': 181.069, 'weight': 1, 'content': [{'end': 183.79, 'text': "We're taking all of these numbers and multiplying them.", 'start': 181.069, 'duration': 2.721}, {'end': 189.552, 'text': "So you're really computing a linear function to get this 4 by 4 matrix.", 'start': 183.85, 'duration': 5.702}, {'end': 199.356, 'text': 'So that 4 by 4 matrix, the output of the convolution operation, that plays a role similar to w1 times a0.', 'start': 189.592, 'duration': 9.764}, {'end': 204.438, 'text': "That's really maybe the output of this 4x4 as well as that 4x4.", 'start': 199.416, 'duration': 5.022}, {'end': 208.159, 'text': 'And then the other thing you do is add the bias.', 'start': 205.258, 'duration': 2.901}, {'end': 219.003, 'text': 'So this thing here, before applying ReLU, this plays a role similar to z.', 'start': 209.68, 'duration': 9.323}, {'end': 223.697, 'text': "And then it's finally by applying the nonlinearity, you know, this kind of this, I guess.", 'start': 219.003, 'duration': 4.694}, {'end': 228.626, 'text': 'So this output, right, plays a role.', 'start': 224.338, 
'duration': 4.288}, {'end': 232.629, 'text': 'This really becomes your activation at the next layer.', 'start': 229.687, 'duration': 2.942}, {'end': 235.49, 'text': 'So this is how you go from A0 to A1.', 'start': 233.449, 'duration': 2.041}, {'end': 241.914, 'text': "That's first the linear operation and then convolution has all these multiplies.", 'start': 236.01, 'duration': 5.904}], 'summary': 'Computing linear function to get a 4x4 matrix and applying nonlinearity to obtain activation at the next layer.', 'duration': 60.845, 'max_score': 181.069, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jPOAS7uCODQ/pics/jPOAS7uCODQ181069.jpg'}, {'end': 339.6, 'src': 'embed', 'start': 305.672, 'weight': 4, 'content': [{'end': 308.893, 'text': "So to make sure you understand this, let's go through an exercise.", 'start': 305.672, 'duration': 3.221}, {'end': 316.394, 'text': "Let's suppose you have 10 filters, not just two filters, that are 3x3x3 in one layer of a neural network.", 'start': 309.593, 'duration': 6.801}, {'end': 321.915, 'text': "How many parameters does this layer have? Well, let's figure this out.", 'start': 316.974, 'duration': 4.941}, {'end': 326.116, 'text': 'Each filter is a 3x3x3 volume, so 3x3x3.', 'start': 323.095, 'duration': 3.021}, {'end': 336.858, 'text': "So each filter has 27 So there's 27 numbers to be learned.", 'start': 326.136, 'duration': 10.722}, {'end': 339.6, 'text': 'Oh, and then plus the bias.', 'start': 337.819, 'duration': 1.781}], 'summary': 'A neural network layer with 10 3x3x3 filters has 270 parameters to be learned, in addition to the bias.', 'duration': 33.928, 'max_score': 305.672, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jPOAS7uCODQ/pics/jPOAS7uCODQ305672.jpg'}, {'end': 391.266, 'src': 'embed', 'start': 370.911, 'weight': 3, 'content': [{'end': 381.478, 'text': 'Notice. 
one nice thing about this is that, No matter how big the input image is, the input image could be 1,000 by 1,000 or 5,000 by 5,000,', 'start': 370.911, 'duration': 10.567}, {'end': 391.266, 'text': 'but the number of parameters you have still remains fixed as 280, and you can use these 10 filters to detect features you know vertical edges,', 'start': 381.479, 'duration': 9.787}], 'summary': "Input image size doesn't affect fixed 280 parameters, 10 filters detect features.", 'duration': 20.355, 'max_score': 370.911, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jPOAS7uCODQ/pics/jPOAS7uCODQ370911.jpg'}], 'start': 3.919, 'title': 'Convolutional neural networks', 'summary': 'Covers building a convolutional neural network layer, including convolving a 3d volume with filters, adding bias and non-linearity, and mapping it to forward propagation, along with explaining the concept of convolutional neural nets, demonstrating the fixed number of parameters regardless of input image size and the advantage of using a small number of parameters to detect features in large images.', 'chapters': [{'end': 241.914, 'start': 3.919, 'title': 'Building convolutional neural network layer', 'summary': 'Explains the process of building a convolutional neural network layer, including convolving a 3d volume with filters, adding bias and non-linearity, and mapping it to forward propagation in a standard neural network.', 'duration': 237.995, 'highlights': ['The process of building a convolutional neural network layer involves convolving a 3D volume with different filters to obtain multiple 4x4 outputs, adding a bias to each output, applying a non-linearity such as ReLU, and stacking the outputs to form the final layer. 
', 'The convolutional operation in the layer plays a role similar to computing a linear function, and the output of the convolution operation is analogous to the product of the filter and the input, akin to the w1 times a0 operation in a standard neural network.', 'The process involves adding a bias, which plays a role similar to z, and applying non-linearity to obtain the activation at the next layer, mirroring the steps in forward propagation of a non-convolutional neural network.']}, {'end': 414.737, 'start': 241.934, 'title': 'Convolutional neural nets', 'summary': 'Explains the concept of convolutional neural nets, demonstrating how the number of parameters remains fixed regardless of the input image size and highlighting the advantage of using a small number of parameters to detect features in large images.', 'duration': 172.803, 'highlights': ['The number of parameters remains fixed as 280, allowing the use of 10 filters to detect features in large images, making convolutional neural nets less prone to overfitting. 
No matter how big the input image is, the number of parameters remains fixed at 280, enabling detection of features in large images with a small number of parameters.', 'Each 3x3x3 filter has 27 weights plus one bias to be learned, so 10 filters give 280 parameters in total, which illustrates how parameters are counted in convolutional neural nets.', 'Explains the process of applying filters and stacking them up to form an output volume, demonstrating the transformation of dimensions in a convolutional net. Illustrates the process of applying filters and stacking them up to form an output volume, showing the transformation of dimensions from 6x6x3 to 4x4x2 in a convolutional net.']}], 'duration': 410.818, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jPOAS7uCODQ/pics/jPOAS7uCODQ3919.jpg', 'highlights': ['The process of building a convolutional neural network layer involves convolving a 3D volume with different filters, adding a bias to each output, applying a non-linearity such as ReLU, and stacking the outputs to form the final layer.', 'The convolutional operation in the layer plays a role similar to computing a linear function, and the output of the convolution operation is analogous to the product of the filter and the input, akin to the w1 times a0 operation in a standard neural network.', 'The process involves adding a bias, which plays a role similar to z, and applying non-linearity to obtain the activation at the next layer, mirroring the steps in forward propagation of a non-convolutional neural network.', 'The number of parameters remains fixed as 280, allowing the use of 10 filters to detect features in large images, making convolutional neural nets less prone to overfitting.', 'Each filter in one layer has 27 weights plus one bias to be learned, resulting in 280 parameters when 10 filters are used, providing insight into the 
calculation of parameters in convolutional neural nets.', 'Explains the process of applying filters and stacking them up to form an output volume, demonstrating the transformation of dimensions in a convolutional net.']}, {'end': 970.196, 'segs': [{'end': 459.838, 'src': 'embed', 'start': 414.737, 'weight': 0, 'content': [{'end': 418.86, 'text': 'and the number of parameters you know still remains fixed and relatively small as 280 in this example.', 'start': 414.737, 'duration': 4.123}, {'end': 419.16, 'text': 'All right.', 'start': 418.88, 'duration': 0.28}, {'end': 427.841, 'text': "So, to wrap up this video, let's just summarize the notation we're going to use to describe one layer,", 'start': 422.938, 'duration': 4.903}, {'end': 431.423, 'text': 'to describe a convolutional layer in a convolutional neural network.', 'start': 427.841, 'duration': 3.582}, {'end': 434.085, 'text': 'So layer L is a convolutional layer.', 'start': 432.024, 'duration': 2.061}, {'end': 438.568, 'text': "I'm going to use F superscript square bracket L to denote the filter size.", 'start': 434.425, 'duration': 4.143}, {'end': 442.671, 'text': "So previously we've been saying the filters are F by F.", 'start': 438.608, 'duration': 4.063}, {'end': 450.336, 'text': 'And now this superscript square bracket L just denotes that this is a filter size as an F by F filter in layer L.', 'start': 442.671, 'duration': 7.665}, {'end': 459.838, 'text': "And as usual, the superscript square bracket L is the notation we're using to refer to particular layer L.", 'start': 451.096, 'duration': 8.742}], 'summary': 'In this example, there are 280 fixed parameters in the convolutional layer notation.', 'duration': 45.101, 'max_score': 414.737, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jPOAS7uCODQ/pics/jPOAS7uCODQ414737.jpg'}, {'end': 567.453, 'src': 'embed', 'start': 541.123, 'weight': 1, 'content': [{'end': 548.475, 'text': "It's just that in layer L, the input to this 
layer is whatever you had from the previous layer, so that's why you have L minus 1 there.", 'start': 541.123, 'duration': 7.352}, {'end': 556.706, 'text': 'And then the neural network, excuse me, this layer, and then this layer of the neural network will output, will itself output a volume.', 'start': 549.522, 'duration': 7.184}, {'end': 565.992, 'text': 'So that will be NH of L by NW of L by NC of L.', 'start': 556.767, 'duration': 9.225}, {'end': 567.453, 'text': 'That will be the size of the output.', 'start': 565.992, 'duration': 1.461}], 'summary': 'Neural network layer outputs nhxnwxnc volume.', 'duration': 26.33, 'max_score': 541.123, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jPOAS7uCODQ/pics/jPOAS7uCODQ541123.jpg'}, {'end': 670.47, 'src': 'embed', 'start': 650.064, 'weight': 2, 'content': [{'end': 661.705, 'text': "If the output volume has this depth, well, we know from the previous examples that that's equal to the number of filters we have in that layer,", 'start': 650.064, 'duration': 11.641}, {'end': 663.266, 'text': 'right? So we had two filters,', 'start': 661.705, 'duration': 1.561}, {'end': 670.47, 'text': 'the output volume was 4x4x2, was two-dimensional, and if you had 10 filters, then the output volume was 4x4x10.', 'start': 663.266, 'duration': 7.204}], 'summary': 'With 2 filters, the output volume was 4x4x2. 
with 10 filters, it became 4x4x10.', 'duration': 20.406, 'max_score': 650.064, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jPOAS7uCODQ/pics/jPOAS7uCODQ650064.jpg'}, {'end': 781.366, 'src': 'heatmap', 'start': 737.694, 'weight': 0.777, 'content': [{'end': 741.856, 'text': "that's NHL by NW, L minus one.", 'start': 737.694, 'duration': 4.162}, {'end': 744.695, 'text': 'by NCL.', 'start': 742.674, 'duration': 2.021}, {'end': 755.361, 'text': 'And when you are using a vectorized implementation, or you know, batch gradient descent or mini batch gradient descent, then you actually output AL,', 'start': 745.616, 'duration': 9.745}, {'end': 760.524, 'text': 'which is set of M activations, if you have M examples.', 'start': 755.361, 'duration': 5.163}, {'end': 769.97, 'text': "So that will be M by NHL by NWL by NCL, right? If say you're using batch gradient descent.", 'start': 760.604, 'duration': 9.366}, {'end': 772.093, 'text': 'In the programming exercises.', 'start': 770.811, 'duration': 1.282}, {'end': 775.898, 'text': 'this would be the dimension, this would be ordering of the variables,', 'start': 772.093, 'duration': 3.805}, {'end': 781.366, 'text': 'and we have the index and the training examples first and then these three variables.', 'start': 775.898, 'duration': 5.468}], 'summary': 'Vectorized implementation outputs al, a set of m activations for m examples, with dimensions m x nhl x nwl x ncl.', 'duration': 43.672, 'max_score': 737.694, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jPOAS7uCODQ/pics/jPOAS7uCODQ737694.jpg'}], 'start': 414.737, 'title': 'Convolutional neural network layers', 'summary': 'Introduces notation for a convolutional layer, including filter size denoted by f superscript square bracket l, amount of padding denoted by pl, and stride denoted by sl. 
it explains the notations and computations involved in one layer of a convolutional neural network, discussing the dimensions, size, and number of channels of input and output volumes, the size and number of filters, weights, and biases, and introduces the conventions and variations in notation used in deep learning literature.', 'chapters': [{'end': 481.474, 'start': 414.737, 'title': 'Convolutional layer notation', 'summary': 'Introduces notation to describe a convolutional layer, including filter size denoted by f superscript square bracket l, amount of padding denoted by pl, and stride denoted by sl.', 'duration': 66.737, 'highlights': ['The notation for describing a convolutional layer includes F superscript square bracket L for filter size, PL for padding amount, and SL for stride.', 'The number of parameters in the example remains fixed and relatively small, at 280.', 'The notation F superscript square bracket L denotes the filter size as an F by F filter in layer L.']}, {'end': 970.196, 'start': 483.235, 'title': 'Convolutional neural network layers', 'summary': 'Explains the notations and computations involved in one layer of a convolutional neural network, discussing the dimensions, size, and number of channels of input and output volumes, the size and number of filters, weights, and biases, and introduces the conventions and variations in notation used in deep learning literature.', 'duration': 486.961, 'highlights': ['The size of the volume in layer L is NH by NW by NC, and the output volume is NH of L by NW of L by NC of L. It explains the size of the input and output volumes in a convolutional neural network layer, providing a clear understanding of the dimensions involved.', 'The number of channels in the output volume is equal to the number of filters used in that layer. 
It clarifies that the number of channels in the output volume is determined by the number of filters used in the layer, providing a key insight into the relationship between channels and filters.', 'Each filter has dimensions of FL by FL by NC L minus one, where NC L minus one is the number of channels in the input. It highlights the dimensions of each filter in a convolutional neural network layer, emphasizing the relationship between filter size and the number of channels in the input.']}], 'duration': 555.459, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/jPOAS7uCODQ/pics/jPOAS7uCODQ414737.jpg', 'highlights': ['The notation for describing a convolutional layer includes F superscript square bracket L for filter size, PL for padding amount, and SL for stride.', 'The size of the volume in layer L is NH by NW by NC, and the output volume is NH of L by NW of L by NC of L.', 'The number of channels in the output volume is equal to the number of filters used in that layer.', 'The number of parameters in the example remains fixed and relatively small, at 280.']}], 'highlights': ['The process of building a convolutional neural network layer involves convolving a 3D volume with different filters, adding a bias to each output, applying a non-linearity such as ReLU, and stacking the outputs to form the final layer.', 'The convolutional operation in the layer plays a role similar to computing a linear function, and the output of the convolution operation is analogous to the product of the filter and the input, akin to the w1 times a0 operation in a standard neural network.', 'The process involves adding a bias, which plays a role similar to z, and applying non-linearity to obtain the activation at the next layer, mirroring the steps in forward propagation of a non-convolutional neural network.', 'The number of parameters remains fixed as 280, allowing the use of 10 filters to detect features in large images, making convolutional neural nets 
less prone to overfitting.', 'Each filter in one layer has 27 weights plus one bias to be learned, resulting in 280 parameters when 10 filters are used, providing insight into the calculation of parameters in convolutional neural nets.', 'Explains the process of applying filters and stacking them up to form an output volume, demonstrating the transformation of dimensions in a convolutional net.', 'The notation for describing a convolutional layer includes F superscript square bracket L for filter size, PL for padding amount, and SL for stride.', 'The input volume to layer L is NH of L minus 1 by NW of L minus 1 by NC of L minus 1, and the output volume is NH of L by NW of L by NC of L.', 'The number of channels in the output volume is equal to the number of filters used in that layer.', 'The number of parameters in the example remains fixed and relatively small, at 280.']}
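
example

The steps described in the transcript (convolve the input volume with each filter, add one bias per filter, apply ReLU, stack the results) can be sketched in plain Python. This is only an illustrative sketch, not the course's implementation: the function names (conv_layer_param_count, conv_output_size, conv_layer_forward) are made up for this example, the layer uses no padding and stride 1, and the output-size formula floor((n + 2p - f) / s) + 1 is the standard one rather than something quoted from this transcript.

```python
def conv_layer_param_count(f, n_c_prev, n_filters):
    """Parameters in one conv layer: f*f*n_c_prev weights plus 1 bias per filter."""
    return (f * f * n_c_prev + 1) * n_filters


def conv_output_size(n, f, p=0, s=1):
    """Output height or width, assuming the usual floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1


def relu(x):
    """ReLU non-linearity applied to a single number."""
    return x if x > 0 else 0.0


def conv_layer_forward(a_prev, filters, biases):
    """One conv layer with no padding and stride 1 (hypothetical helper).

    a_prev:  input volume as nested lists indexed [row][col][channel]
    filters: list of filters, each indexed [row][col][channel]
    biases:  one real number per filter
    Returns the activations indexed [row][col][filter].
    """
    n_h, n_w, n_c = len(a_prev), len(a_prev[0]), len(a_prev[0][0])
    f = len(filters[0])
    out_h, out_w = conv_output_size(n_h, f), conv_output_size(n_w, f)
    out = [[[0.0] * len(filters) for _ in range(out_w)] for _ in range(out_h)]
    for k, (w, b) in enumerate(zip(filters, biases)):
        for i in range(out_h):
            for j in range(out_w):
                # Linear part: sum of elementwise products, plus the bias (plays the role of z).
                z = b
                for di in range(f):
                    for dj in range(f):
                        for c in range(n_c):
                            z += a_prev[i + di][j + dj][c] * w[di][dj][c]
                # Non-linearity: this value is the activation passed to the next layer.
                out[i][j][k] = relu(z)
    return out


# A 6x6x3 input of ones and two 3x3x3 filters (all +1 and all -1), as toy data.
a0 = [[[1.0] * 3 for _ in range(6)] for _ in range(6)]
filter_pos = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
filter_neg = [[[-1.0] * 3 for _ in range(3)] for _ in range(3)]
a1 = conv_layer_forward(a0, [filter_pos, filter_neg], [0.0, 1.0])

print(len(a1), len(a1[0]), len(a1[0][0]))  # output volume dimensions
print(conv_layer_param_count(3, 3, 10))    # parameter count for 10 filters of 3x3x3
```

The parameter count depends only on the filter size, the number of input channels, and the number of filters, which is why it stays the same whether the input image is 6 by 6 or 5,000 by 5,000.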