title
Activation Functions | Deep Learning Tutorial 8 (Tensorflow Tutorial, Keras & Python)
description
Activation functions (step, sigmoid, tanh, ReLU, leaky ReLU) are essential for building a non-linear model for a given problem. In this video we will cover the different activation functions used while building a neural network. We will discuss the following functions along with their pros and cons:
1) Step
2) Sigmoid
3) tanh
4) ReLU (rectified linear unit)
5) Leaky ReLU
We will also write Python code to implement these functions and see how they behave for sample inputs.
GitHub link for the code in this tutorial: https://github.com/codebasics/deep-learning-keras-tf-tutorial/blob/master/2_activation_functions/2_activation_functions.ipynb
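As a rough illustration of what such code can look like, here is a minimal sketch of the four smooth or piecewise functions using only the standard library. The function names, the `slope` parameter, and the sign-branching in `sigmoid` are assumptions made for this sketch and are not taken from the linked notebook; the 0.1 slope for leaky ReLU follows the value used in the video.

```python
import math

def sigmoid(x):
    """Squash x into the range 0 to 1: 1 / (1 + e^-x)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1 / (1 + math.exp(-x))
    z = math.exp(x)
    return z / (1 + z)

def tanh(x):
    """Squash x into the range -1 to 1: (e^x - e^-x) / (e^x + e^-x)."""
    # Written out to mirror the formula; math.exp overflows for |x| above
    # roughly 700, so math.tanh is the safer choice in real code.
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

def relu(x):
    """Return 0 for negative inputs and x itself otherwise."""
    return max(0, x)

def leaky_relu(x, slope=0.1):
    """Like ReLU, but negative inputs are scaled by `slope` instead of zeroed."""
    # max(slope * x, x) picks x when x >= 0 and slope * x when x < 0,
    # provided 0 < slope < 1.
    return max(slope * x, x)

print(sigmoid(100))     # 1.0
print(sigmoid(1))       # about 0.731
print(sigmoid(-56))     # a tiny positive number, on the order of 1e-25
print(tanh(-56))        # -1.0
print(tanh(50))         # 1.0
print(relu(-7))         # 0
print(relu(6))          # 6
print(leaky_relu(-10))  # -1.0
print(leaky_relu(5))    # 5
```

In practice these come ready-made with Keras and TensorFlow, so a sketch like this is mainly useful for seeing how each function maps sample inputs.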
Do you want to learn technology from me? Check https://codebasics.io/?utm_source=description&utm_medium=yt&utm_campaign=description&utm_id=description for my affordable video courses.
🔖 Hashtags 🔖
#activationfunction #activationfunctionneuralnetwork #neuralnetwork #deeplearning
Next video: https://www.youtube.com/watch?v=cT4pQT5Da0Q&list=PLeo1K3hjS3uu7CxAacxVndI4bE_o3BDtO&index=9
Previous video: https://www.youtube.com/watch?v=iqQgED9vV7k&list=PLeo1K3hjS3uu7CxAacxVndI4bE_o3BDtO&index=7
Deep learning playlist: https://www.youtube.com/playlist?list=PLeo1K3hjS3uu7CxAacxVndI4bE_o3BDtO
Machine learning playlist: https://www.youtube.com/playlist?list=PLeo1K3hjS3uvCeTYTeyfe0-rN5r8zn9rw
Prerequisites for this series:
1: Python tutorials (first 16 videos): https://www.youtube.com/playlist?list=PLeo1K3hjS3uv5U-Lmlnucd7gqF-3ehIh0
2: Pandas tutorials (first 8 videos): https://www.youtube.com/playlist?list=PLeo1K3hjS3uuASpe-1LjfG5f14Bnozjwy
3: Machine learning playlist (first 16 videos): https://www.youtube.com/playlist?list=PLeo1K3hjS3uvCeTYTeyfe0-rN5r8zn9rw
🌎 My Website For Video Courses: https://codebasics.io/?utm_source=description&utm_medium=yt&utm_campaign=description&utm_id=description
Need help building software or data analytics and AI solutions? My company https://www.atliq.com/ can help. Click on the Contact button on that website.
detail
{'title': 'Activation Functions | Deep Learning Tutorial 8 (Tensorflow Tutorial, Keras & Python)', 'heatmap': [{'end': 614.476, 'start': 543.093, 'weight': 0.705}], 'summary': 'This tutorial delves into the necessity and impact of activation functions in neural networks, addressing limitations of linear equations in classification and discussing the implementation and functionality of sigmoid, tanh, relu, and leaky relu functions in python, emphasizing their ability to convert input values to specific ranges and the preference for relu due to its impact on the speed of learning.', 'chapters': [{'end': 164.208, 'segs': [{'end': 49.797, 'src': 'embed', 'start': 21.091, 'weight': 0, 'content': [{'end': 24.574, 'text': "If you're not seeing those videos, I would highly recommend you watch that.", 'start': 21.091, 'duration': 3.483}, {'end': 32.803, 'text': 'But there we build a single neuron neural network for classification problem based on age, income, education.', 'start': 25.115, 'duration': 7.688}, {'end': 35.525, 'text': 'We want to predict if a person will buy the insurance or not.', 'start': 32.863, 'duration': 2.662}, {'end': 43.732, 'text': 'And we saw that having a sigmoid function helps you reduce output in a 0 to 1 range.', 'start': 36.346, 'duration': 7.386}, {'end': 49.797, 'text': 'And you can make a decision for your classification if you have value between 0 and 1.', 'start': 44.173, 'duration': 5.624}], 'summary': 'Built single neuron neural network for classification based on age, income, education to predict insurance purchase.', 'duration': 28.706, 'max_score': 21.091, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI21091.jpg'}, {'end': 96.565, 'src': 'embed', 'start': 66.649, 'weight': 1, 'content': [{'end': 69.69, 'text': 'When neuron is firing, it is saying that person will buy insurance.', 'start': 66.649, 'duration': 3.041}, {'end': 74.551, 'text': 'When it is not firing, it will say 
person is not buying the insurance.', 'start': 70.41, 'duration': 4.141}, {'end': 83.874, 'text': 'So you can clearly see that having a sigmoid function or an activation function is helpful in the output layer.', 'start': 75.212, 'duration': 8.662}, {'end': 90.977, 'text': 'How about hidden layers? We also saw that you can have a complex neural network with a hidden layer like this.', 'start': 84.375, 'duration': 6.602}, {'end': 96.565, 'text': 'Here also in the hidden layer, there are always two portions.', 'start': 92.401, 'duration': 4.164}], 'summary': 'Neuron firing predicts insurance purchase, sigmoid function aids output, hidden layer allows complexity.', 'duration': 29.916, 'max_score': 66.649, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI66649.jpg'}, {'end': 140.212, 'src': 'embed', 'start': 114.139, 'weight': 2, 'content': [{'end': 125.906, 'text': 'If you do the math, you will realize that you will eventually get a linear equation where the output is just a weighted sum of your input features.', 'start': 114.139, 'duration': 11.767}, {'end': 128.768, 'text': 'For that reason, you do not even need hidden layer.', 'start': 126.306, 'duration': 2.462}, {'end': 140.212, 'text': "So if you are having a complex neural network with, let's say, five hidden layers, and if you remove the activation function from all those layers,", 'start': 129.728, 'duration': 10.484}], 'summary': 'Neural networks can be simplified to a linear equation without hidden layers, even in complex networks.', 'duration': 26.073, 'max_score': 114.139, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI114139.jpg'}], 'start': 0.249, 'title': 'Importance of activation functions', 'summary': 'Discusses the necessity of activation functions in neural networks, focusing on the sigmoid function for output layer and the implications of removing activation functions from hidden 
layers in a complex neural network.', 'chapters': [{'end': 164.208, 'start': 0.249, 'title': 'Importance of activation functions in neural networks', 'summary': 'Discusses the necessity of activation functions in neural networks, focusing on the sigmoid function for output layer and the implications of removing activation functions from hidden layers in a complex neural network.', 'duration': 163.959, 'highlights': ['The sigmoid function helps reduce output in a 0 to 1 range, aiding in classification predictions based on age, income, and education. Having a sigmoid function allows for classification predictions based on age, income, and education, reducing the output to a 0 to 1 range for easier decision-making.', 'The activation function in the hidden layer is essential as it prevents the network from reducing to a simple linear equation, allowing for the complex problem-solving capability of neural networks. The activation function in the hidden layer prevents the network from reducing to a simple linear equation, enabling the complex problem-solving capability of neural networks.', 'The removal of activation functions from hidden layers in a complex neural network results in the network being reducible to a simple linear equation, rendering the hidden layers unnecessary. 
Removing activation functions from hidden layers in a complex neural network causes the network to be reducible to a simple linear equation, making the hidden layers unnecessary.']}], 'duration': 163.959, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI249.jpg', 'highlights': ['The sigmoid function allows for classification predictions based on age, income, and education, reducing the output to a 0 to 1 range for easier decision-making.', 'The activation function in the hidden layer prevents the network from reducing to a simple linear equation, enabling the complex problem-solving capability of neural networks.', 'Removing activation functions from hidden layers in a complex neural network causes the network to be reducible to a simple linear equation, making the hidden layers unnecessary.']}, {'end': 728.952, 'segs': [{'end': 321.151, 'src': 'embed', 'start': 292.323, 'weight': 1, 'content': [{'end': 293.944, 'text': 'Otherwise, person will not buy the insurance.', 'start': 292.323, 'duration': 1.621}, {'end': 298.686, 'text': 'And you already saw the problem with step function, which is it is misclassifying some data points.', 'start': 294.504, 'duration': 4.182}, {'end': 302.787, 'text': 'Here is a simple representation of step function.', 'start': 299.286, 'duration': 3.501}, {'end': 308.648, 'text': "The second problem with step function is when you're doing multi-class classification.", 'start': 303.407, 'duration': 5.241}, {'end': 312.669, 'text': 'Here I have an image of four handwritten digit, of course.', 'start': 309.628, 'duration': 3.041}, {'end': 321.151, 'text': "And if you have output classes zero to nine, if you're using step function, it will just give two output, either zero or one.", 'start': 313.769, 'duration': 7.382}], 'summary': 'Step function misclassifies data, problematic for multi-class classification.', 'duration': 28.828, 'max_score': 292.323, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI292323.jpg'}, {'end': 360.386, 'src': 'embed', 'start': 336.675, 'weight': 0, 'content': [{'end': 346.059, 'text': "And that's when sigmoid function comes in, where instead of 0 and 1 value, it will give you a smooth curve between 0 and 1.", 'start': 336.675, 'duration': 9.384}, {'end': 352.022, 'text': "And because of this, when you're doing multiclass classification, you have a number between 0 and 1.", 'start': 346.059, 'duration': 5.963}, {'end': 354.343, 'text': 'And now you can take a maximum value out of it.', 'start': 352.022, 'duration': 2.321}, {'end': 356.584, 'text': 'So 4 has 0.82.', 'start': 354.403, 'duration': 2.181}, {'end': 360.386, 'text': "That's why you can say this image is of digit 4.", 'start': 356.584, 'duration': 3.802}], 'summary': 'Sigmoid function produces a smooth curve, aiding in multiclass classification by assigning a number between 0 and 1, enabling the identification of the maximum value, such as 0.82 for digit 4.', 'duration': 23.711, 'max_score': 336.675, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI336675.jpg'}, {'end': 453.043, 'src': 'embed', 'start': 426.991, 'weight': 2, 'content': [{'end': 433.794, 'text': 'The general guideline is use sigmoid in the output layer because you already saw why we need to use sigmoid in output layer.', 'start': 426.991, 'duration': 6.803}, {'end': 436.756, 'text': 'It can be helpful in binary classification.', 'start': 434.495, 'duration': 2.261}, {'end': 440.818, 'text': 'In all other places, try to use tanh if possible.', 'start': 437.696, 'duration': 3.122}, {'end': 450.222, 'text': 'So tanh instead of sigmoid is always better because tanh will kind of calculate a mean of zero and it will center your data.', 'start': 441.078, 'duration': 9.144}, {'end': 453.043, 'text': "So it's useful to use tanh.", 'start': 450.382, 'duration': 2.661}], 
'summary': 'Use sigmoid in the output layer for binary classification; use tanh elsewhere to center the data.', 'duration': 26.052, 'max_score': 426.991, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI426991.jpg'}, {'end': 614.476, 'src': 'heatmap', 'start': 543.093, 'weight': 0.705, 'content': [{'end': 545.453, 'text': 'So the change is actually 0.', 'start': 543.093, 'duration': 2.36}, {'end': 547.973, 'text': 'Delta x is 1 between 3 and 4.', 'start': 545.453, 'duration': 2.52}, {'end': 551.574, 'text': 'So when you divide 0 by 1, of course you get 0.', 'start': 547.973, 'duration': 3.601}, {'end': 552.855, 'text': 'here on the negative range.', 'start': 551.574, 'duration': 1.281}, {'end': 562.325, 'text': 'also, when the value is higher, you get your derivative as zero, and that creates a problem in your learning process,', 'start': 552.855, 'duration': 9.47}, {'end': 571.773, 'text': 'because we saw in our gradient descent tutorial that you need to calculate derivative and back propagate your errors,', 'start': 562.325, 'duration': 9.448}, {'end': 577.618, 'text': 'and if your derivatives are close to zero, the learning becomes extremely slow.', 'start': 571.773, 'duration': 5.845}, {'end': 580.781, 'text': 'this is called the vanishing gradients problem.', 'start': 577.618, 'duration': 3.163}, {'end': 585.264, 'text': 'I will make a separate video on that, but just for now.', 'start': 580.781, 'duration': 4.483}, {'end': 594.295, 'text': 'just have this fact in mind that sigmoid and tanh have this vanishing gradient problem and for that reason they make the learning process very slow.', 'start': 585.264, 'duration': 9.031}, {'end': 599.582, 'text': 'So then they came up with this new function called ReLU which is extremely simple function by the way.', 'start': 594.735, 'duration': 4.847}, {'end': 605.77, 'text': 'If your value is less than zero then your value is zero, output is zero.', 'start': 600.703, 
'duration': 5.067}, {'end': 611.154, 'text': 'If it is more than zero, then the output is same as that value.', 'start': 606.551, 'duration': 4.603}, {'end': 614.476, 'text': 'So if I feed two, I get two as an output.', 'start': 611.694, 'duration': 2.782}], 'summary': 'Derivative close to zero causes slow learning, solved by relu function.', 'duration': 71.383, 'max_score': 543.093, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI543093.jpg'}, {'end': 599.582, 'src': 'embed', 'start': 571.773, 'weight': 3, 'content': [{'end': 577.618, 'text': 'and if your derivatives are closing to zero, the learning becomes extremely slow.', 'start': 571.773, 'duration': 5.845}, {'end': 580.781, 'text': 'this is called vanishing gradients problem.', 'start': 577.618, 'duration': 3.163}, {'end': 585.264, 'text': 'I will make a separate video on that, but just for now.', 'start': 580.781, 'duration': 4.483}, {'end': 594.295, 'text': 'just have this fact in mind that sigmoid and tanh has this vanishing gradient problem and for that reason it makes learning process very slow.', 'start': 585.264, 'duration': 9.031}, {'end': 599.582, 'text': 'So then they came up with this new function called relu which is extremely simple function by the way.', 'start': 594.735, 'duration': 4.847}], 'summary': 'Vanishing gradients in sigmoid and tanh slow learning; relu is a solution.', 'duration': 27.809, 'max_score': 571.773, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI571773.jpg'}, {'end': 648.769, 'src': 'embed', 'start': 621.019, 'weight': 4, 'content': [{'end': 624.061, 'text': 'The guideline is for hidden layers.', 'start': 621.019, 'duration': 3.042}, {'end': 627.083, 'text': 'ReLU is most popularly used function.', 'start': 624.061, 'duration': 3.022}, {'end': 633.427, 'text': 'because you think about the math behind ReLU, it is computationally very effective.', 'start': 
627.083, 'duration': 6.344}, {'end': 643.247, 'text': "sigmoid, tanh, you have to do some computation, but ReLU is very, very lightweight function and that's why it is very, very popular.", 'start': 634.324, 'duration': 8.923}, {'end': 648.769, 'text': 'if you are not sure which function to use, always go with ReLU, especially for hidden layers.', 'start': 643.247, 'duration': 5.522}], 'summary': 'ReLU is the most popularly used function for hidden layers due to its computational efficiency.', 'duration': 27.75, 'max_score': 621.019, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI621019.jpg'}], 'start': 164.729, 'title': 'Activation functions in neural networks', 'summary': 'Discusses the importance of activation functions like step, sigmoid, and tanh in addressing limitations of linear equations in classification, emphasizing their impact on misclassification, decision-making, vanishing gradient problem, and the preference for ReLU, impacting the speed of learning.', 'chapters': [{'end': 453.043, 'start': 164.729, 'title': 'Activation functions and classification', 'summary': 'Discusses the importance of activation functions like step function, sigmoid function, and tanh in addressing the limitations of linear equations in binary and multiclass classification problems, emphasizing their impact on misclassification and decision-making.', 'duration': 288.314, 'highlights': ['Sigmoid function provides a smooth curve between 0 and 1, enabling better decision-making in multiclass classification, with a clear example showing the benefit of sigmoid function in determining the final class. Sigmoid function generates a smooth curve between 0 and 1, facilitating better decision-making in multiclass classification. 
With an example showing an image being classified as digit 4 based on a sigmoid function output of 0.82, it demonstrates the advantage of sigmoid function in enabling clear decision-making.', 'The drawbacks of step function in misclassifying data points and its limitations in multiclass classification are outlined, providing a clear understanding of the challenges associated with using step function as an activation function. The limitations of step function in misclassifying data points and its challenges in multiclass classification are clearly explained, highlighting the drawbacks of using step function as an activation function.', 'The recommendation to use tanh instead of sigmoid in most cases due to its ability to center data by calculating a mean of zero, offering insights into the advantages of using tanh in various scenarios. The recommendation to prefer tanh over sigmoid in most cases is explained, emphasizing the benefits of using tanh to center data by calculating a mean of zero, illustrating its usefulness in different scenarios.']}, {'end': 728.952, 'start': 453.484, 'title': 'Activation functions and learning process', 'summary': "Explains the issues with sigmoid and tanh functions, the vanishing gradient problem, and the preference for relu as the most popularly used function for hidden layers, emphasizing the importance of choosing the right activation function for a neural network's learning process and the impact on the speed of learning.", 'duration': 275.468, 'highlights': ['The vanishing gradient problem with sigmoid and tanh functions slows down the learning process. The derivatives of sigmoid and tanh functions tend to approach zero for higher values, leading to a vanishing gradient problem that causes slow learning.', 'ReLU is recommended as the default choice for hidden layers due to its computational efficiency. 
ReLU is popular for hidden layers due to its computational effectiveness and lightweight nature compared to sigmoid and tanh functions.', 'Preference for ReLU or Leaky ReLU in hidden layers, with the choice often based on trial and error for optimal output. The decision to use ReLU or Leaky ReLU in hidden layers is often based on trial and error to determine which function provides the best output for a specific problem.']}], 'duration': 564.223, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI164729.jpg', 'highlights': ['Sigmoid function enables better decision-making in multiclass classification.', 'Drawbacks of step function in misclassifying data points and its limitations in multiclass classification are outlined.', 'Recommendation to use tanh instead of sigmoid in most cases due to its ability to center data by calculating a mean of zero.', 'Vanishing gradient problem with sigmoid and tanh functions slows down the learning process.', 'ReLU is recommended as the default choice for hidden layers due to its computational efficiency.', 'Preference for ReLU or Leaky ReLU in hidden layers, often based on trial and error for optimal output.']}, {'end': 988.004, 'segs': [{'end': 787.647, 'src': 'embed', 'start': 730.598, 'weight': 0, 'content': [{'end': 736.161, 'text': 'We already know the equation for our sigmoid function, which is 1 divided by 1 plus e, raised to minus z.', 'start': 730.598, 'duration': 5.563}, {'end': 741.504, 'text': 'I have written a simple Python function for the same equation and that function looks something like this.', 'start': 736.161, 'duration': 5.343}, {'end': 747.567, 'text': '1 divided by 1 plus this is e raised to minus z or the input which is x.', 'start': 742.144, 'duration': 5.423}, {'end': 755.886, 'text': 'And we know sigmoid will just try to convert any value in a range of 0 to 1.', 'start': 749.321, 'duration': 6.565}, {'end': 756.767, 'text': "So let's try it out.", 
'start': 755.886, 'duration': 0.881}, {'end': 762.271, 'text': "So if I try 100, let's see what happens.", 'start': 756.867, 'duration': 5.404}, {'end': 765.393, 'text': 'Okay, see 100 it converted to 1.', 'start': 763.031, 'duration': 2.362}, {'end': 769.896, 'text': "Okay, let's see what it will do to 1.", 'start': 765.393, 'duration': 4.503}, {'end': 771.678, 'text': 'So 0.73.', 'start': 769.896, 'duration': 1.782}, {'end': 776.241, 'text': 'Any output from sigmoid function will be in range 0 and 1.', 'start': 771.678, 'duration': 4.563}, {'end': 778.583, 'text': "Okay, let's give some negative value.", 'start': 776.241, 'duration': 2.342}, {'end': 782.58, 'text': "So let's say minus 56.", 'start': 781.198, 'duration': 1.382}, {'end': 787.647, 'text': 'See, e raised to minus 25, which means very, very close to zero.', 'start': 782.58, 'duration': 5.067}], 'summary': 'Demonstration of sigmoid function in python, converting 100 to 1 and 1 to 0.73, and negative value converted to close to zero.', 'duration': 57.049, 'max_score': 730.598, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI730598.jpg'}, {'end': 946.083, 'src': 'embed', 'start': 825.561, 'weight': 1, 'content': [{'end': 830.203, 'text': 'And this function will convert a value between minus one and one.', 'start': 825.561, 'duration': 4.642}, {'end': 831.064, 'text': "So let's try it out.", 'start': 830.243, 'duration': 0.821}, {'end': 838.648, 'text': 'So see, minus 56, it converted it into minus one.', 'start': 831.204, 'duration': 7.444}, {'end': 843.551, 'text': "And if you have, let's say value 50, it will convert it into one.", 'start': 839.429, 'duration': 4.122}, {'end': 854.32, 'text': "And if you have any intermediate value in between, let's say one, So again, your output will be between minus one and one.", 'start': 844.351, 'duration': 9.969}, {'end': 866.265, 'text': 'ReLU is extremely easy to implement, which is, you are just taking 
max between zero and x.', 'start': 855.081, 'duration': 11.184}, {'end': 873.689, 'text': "okay, and see here let's say, if i do any negative value, it will convert it to zero.", 'start': 866.265, 'duration': 7.424}, {'end': 879.292, 'text': "and if i type in any positive value, let's say one, it will convert it to one.", 'start': 873.689, 'duration': 5.603}, {'end': 882.114, 'text': 'if it is six, it is six.', 'start': 879.292, 'duration': 2.822}, {'end': 890.958, 'text': 'so the value remains same for positive value, but any negative value i supply becomes zero, very, very simple.', 'start': 882.114, 'duration': 8.844}, {'end': 894.602, 'text': 'and leaky ReLU is also very simple.', 'start': 890.958, 'duration': 3.644}, {'end': 903.11, 'text': 'so the leaky ReLU function is 0.1 into x, so it will convert.', 'start': 894.602, 'duration': 8.508}, {'end': 903.771, 'text': "so let's see.", 'start': 903.11, 'duration': 0.661}, {'end': 913.119, 'text': 'so leaky ReLU, if I supply minus 10, it will not convert it to 0 this time, but 0.1 into x, which is minus 1.', 'start': 903.771, 'duration': 9.348}, {'end': 918.923, 'text': 'And then if I have a positive value, of course, it will keep it same as a positive value.', 'start': 913.119, 'duration': 5.804}, {'end': 920.424, 'text': 'It will not make any change.', 'start': 918.943, 'duration': 1.481}, {'end': 928.748, 'text': 'I implemented this function just for your understanding when you are solving any machine learning problem using deep learning.', 'start': 921.584, 'duration': 7.164}, {'end': 932.691, 'text': "Most likely, you don't have to write these functions yourself.", 'start': 929.549, 'duration': 3.142}, {'end': 936.177, 'text': 'you will be using Keras TensorFlow library.', 'start': 933.755, 'duration': 2.422}, {'end': 939.279, 'text': 'And those libraries will have those functions implemented.', 'start': 936.237, 'duration': 3.042}, {'end': 943.582, 'text': 'So I gave you an idea of this function just 
for your understanding.', 'start': 939.699, 'duration': 3.883}, {'end': 946.083, 'text': "So remember, you're not going to.", 'start': 943.762, 'duration': 2.321}], 'summary': 'The tanh function converts values to between -1 and 1, ReLU converts negative values to 0 and leaves positive values unchanged, and leaky ReLU multiplies negative values by 0.1.', 'duration': 120.522, 'max_score': 825.561, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI825561.jpg'}], 'start': 730.598, 'title': 'Implementing functions in deep learning', 'summary': 'Explains the implementation of sigmoid and tanh functions in Python, showcasing how sigmoid converts input values to a range of 0 to 1, with examples such as 100 converting to 1 and -56 converting to a value very close to zero. It also discusses tanh, ReLU, and leaky ReLU activation functions, showcasing their ability to convert input values to specific ranges, such as [-1, 1] for tanh, with examples demonstrating their functionality and simplicity of usage. 
Additionally, it explains that while implementing functions for deep learning, most likely, you will not have to write these functions yourself and will be using ready-made APIs from TensorFlow and Keras.', 'chapters': [{'end': 825.081, 'start': 730.598, 'title': 'Sigmoid and tanh functions in Python', 'summary': 'Explains the sigmoid and tanh functions, demonstrating their Python implementations and showing how sigmoid converts input values to a range of 0 to 1, with examples such as 100 converting to 1 and -56 converting to a value very close to zero.', 'duration': 94.483, 'highlights': ['The sigmoid function converts input values to a range of 0 to 1, with examples such as 100 converting to 1 and -56 converting to a value very close to zero.', 'The Python implementation of the sigmoid function is demonstrated, showcasing its simple conversion of any number to a range of 0 to 1.', 'The tanh function, a variant of the sigmoid function, is explained and its Python implementation is shown, with the equation (e^z - e^(-z)) / (e^z + e^(-z)).']}, {'end': 920.424, 'start': 825.561, 'title': 'Activation functions summary', 'summary': 'Discusses the implementation of tanh, ReLU, and leaky ReLU activation functions, showcasing their ability to convert input values to specific ranges, such as [-1, 1] for tanh, with examples demonstrating their functionality and simplicity of usage.', 'duration': 94.863, 'highlights': ['The tanh function converts values to between -1 and 1, with examples showcasing conversion of -56 to -1, 50 to 1, and intermediate values to range between -1 and 1.', 'The ReLU function is described as taking the maximum value between 0 and x, effectively converting negative values to 0 and leaving positive values unchanged, as demonstrated with specific examples.', 'The Leaky ReLU function, defined as 0.1 * x for negative inputs, showcases its ability to convert negative values to a fraction of the input, demonstrated 
with an example of -10 being converted to -1, while leaving positive values unchanged.']}, {'end': 988.004, 'start': 921.584, 'title': 'Implementing functions in deep learning', 'summary': 'Explains that while implementing functions for deep learning, most likely, you will not have to write these functions yourself and will be using ready-made APIs from TensorFlow and Keras.', 'duration': 66.42, 'highlights': ['The chapter emphasizes that when solving machine learning problems using deep learning, one will most likely not have to write these functions themselves and will use the Keras and TensorFlow libraries for implementation.', 'It is mentioned that the tutorial series requires support and can be shared on social media platforms such as Facebook, LinkedIn, and WhatsApp to reach a wider audience.', 'The speaker expresses the intention of wanting the tutorials to reach as many people as possible and encourages sharing to achieve this goal.']}], 'duration': 257.406, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/icZItWxw7AI/pics/icZItWxw7AI730598.jpg', 'highlights': ['The sigmoid function converts input values to a range of 0 to 1, with examples such as 100 converting to 1 and -56 converting to a value very close to zero.', 'The tanh function converts values to between -1 and 1, with examples showcasing conversion of -56 to -1, 50 to 1, and intermediate values to range between -1 and 1.', 'The Python implementation of the sigmoid function is demonstrated, showcasing its simple conversion of any number to a range of 0 to 1.', 'The ReLU function is described as taking the maximum value between 0 and x, effectively converting negative values to 0 and leaving positive values unchanged, as demonstrated with specific examples.', 'The Leaky ReLU function, defined as 0.1 * x for negative inputs, showcases its ability to convert negative values to a fraction of the input, demonstrated with an example of -10 being converted to -1, while leaving positive values unchanged.', 
'The chapter emphasizes that when solving machine learning problems using deep learning, one will most likely not have to write these functions themselves and will use Keras TensorFlow library for implementation.']}], 'highlights': ['The preference for ReLU or Leaky ReLU in hidden layers, often based on trial and error for optimal output.', 'Recommendation to use tanh instead of sigmoid in most cases due to its ability to center data by calculating a mean of zero.', 'The activation function in the hidden layer prevents the network from reducing to a simple linear equation, enabling the complex problem-solving capability of neural networks.', 'Preference for ReLU due to its impact on the speed of learning.', 'The ReLU function is described as taking the maximum value between 0 and x, effectively converting negative values to 0 and leaving positive values unchanged, as demonstrated with specific examples.']}
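The vanishing gradient point made in the transcript can be checked numerically. The sketch below is an illustration under stated assumptions rather than code from the video: the helper names `sigmoid_grad`, `tanh_grad`, and `relu_grad` are hypothetical, and the derivatives are the standard closed forms (s(1 - s) for sigmoid, 1 - tanh(x)^2 for tanh, and a 0/1 step for ReLU).

```python
import math

def sigmoid_grad(x):
    """Derivative of sigmoid: s * (1 - s), where s = sigmoid(x)."""
    # Fine for the modest inputs used here; very negative x would overflow math.exp.
    s = 1 / (1 + math.exp(-x))
    return s * (1 - s)

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1 - math.tanh(x) ** 2

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

# As the input grows, the sigmoid and tanh slopes shrink toward zero,
# while the ReLU slope stays at 1 for any positive input.
for x in (0.5, 2, 5, 10):
    print(x, sigmoid_grad(x), tanh_grad(x), relu_grad(x))
```

The sigmoid slope peaks at 0.25 (at x = 0) and both sigmoid and tanh slopes are tiny by x = 10, which is the slowdown described above. ReLU's slope is exactly zero for negative inputs, and leaky ReLU replaces that zero with a small constant slope.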