title

Naive Bayes, Clearly Explained!!!

description

When most people want to learn about Naive Bayes, they want to learn about the Multinomial Naive Bayes Classifier, which sounds really fancy but is actually quite simple. This video walks you through it one step at a time, and by the end you'll no longer be naive about Naive Bayes!!!
Get the StatQuest Study Guide here: https://statquest.org/studyguides/
For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/
If you'd like to support StatQuest, please consider...
Buying my book, The StatQuest Illustrated Guide to Machine Learning:
PDF - https://statquest.gumroad.com/l/wvtmc
Paperback - https://www.amazon.com/dp/B09ZCKR4H6
Kindle eBook - https://www.amazon.com/dp/B09ZG79HXC
Patreon: https://www.patreon.com/statquest
...or...
YouTube Membership: https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw/join
...a cool StatQuest t-shirt or sweatshirt:
https://shop.spreadshirt.com/statquest-with-josh-starmer/
...buying one or two of my songs (or go large and get a whole album!)
https://joshuastarmer.bandcamp.com/
...or just donating to StatQuest!
https://www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
https://twitter.com/joshuastarmer
0:00 Awesome song and introduction
1:08 Histograms and conditional probabilities
4:22 Classifying "Dear Friend"
7:33 Review of concepts
9:00 Classifying "Lunch Money x 5"
10:54 Pseudocounts
12:35 Why Naive Bayes is Naive
#statquest #naivebayes
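The chapters above walk through building histograms of word counts, turning them into conditional probabilities, estimating a prior ("initial guess") from the training data, and scoring the message "Dear Friend". A minimal sketch of those steps, assuming the word counts used in the video's example (8 normal messages containing 17 words total, 4 spam messages containing 7 words total) — not the video's own code:

```python
# Word counts from the video's training data (assumed from the example):
# 8 normal messages with 17 words total, 4 spam messages with 7 words total.
normal_counts = {"dear": 8, "friend": 5, "lunch": 3, "money": 1}
spam_counts   = {"dear": 2, "friend": 1, "lunch": 0, "money": 4}

def word_probs(counts):
    """Conditional probability of each word, given the message class."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

p_word_normal = word_probs(normal_counts)   # dear: 0.47, friend: 0.29, ...
p_word_spam   = word_probs(spam_counts)     # dear: 0.29, money: 0.57, ...

# Prior probabilities ("initial guesses") estimated from the training data:
# 8 of the 12 messages are normal.
p_normal = 8 / 12   # 0.67
p_spam   = 4 / 12   # 0.33

def score(message, prior, p_word):
    """Multiply the prior by the conditional probability of each word."""
    result = prior
    for word in message:
        result *= p_word[word]
    return result

msg = ["dear", "friend"]
normal_score = score(msg, p_normal, p_word_normal)   # ~0.09
spam_score   = score(msg, p_spam, p_word_spam)       # ~0.01

# 0.09 > 0.01, so "Dear Friend" is classified as a normal message.
print("normal" if normal_score > spam_score else "spam")
```

Each score is only proportional to the posterior probability (the denominator of Bayes' theorem is dropped), but since both scores share that denominator, comparing them still picks the more likely class.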

detail

{'title': 'Naive Bayes, Clearly Explained!!!', 'heatmap': [{'end': 313.666, 'start': 283.202, 'weight': 0.874}], 'summary': 'Provides a clear explanation of naive bayes classification, focusing on multinomial naive bayes classifier and the process of calculating probabilities for words in normal and spam messages. it explains the probability calculation for normal and spam messages with a normal message score of 0.09 and a spam message score of 0.01, and demonstrates the application of naive bayes classification in classifying messages as normal or spam.', 'chapters': [{'end': 253.038, 'segs': [{'end': 80.167, 'src': 'embed', 'start': 28.471, 'weight': 0, 'content': [{'end': 33.913, 'text': 'Just add data, and their automatic machine learning algorithms will do the rest of the work for you.', 'start': 28.471, 'duration': 5.442}, {'end': 38.815, 'text': 'For more details, follow the link in the pinned comment below.', 'start': 34.914, 'duration': 3.901}, {'end': 43.957, 'text': 'When most people want to learn about Naive Bayes.', 'start': 40.856, 'duration': 3.101}, {'end': 50.019, 'text': "they want to learn about the Multinomial Naive Bayes Classifier, and that's what we talk about in this video.", 'start': 43.957, 'duration': 6.062}, {'end': 59.403, 'text': 'However, just know that there is another commonly used version of Naive Bayes called Gaussian Naive Bayes Classification,', 'start': 51.001, 'duration': 8.402}, {'end': 61.843, 'text': 'and I cover that in a follow-up stat quest.', 'start': 59.403, 'duration': 2.44}, {'end': 65.444, 'text': "So check that one out when you're done with this quest.", 'start': 63.043, 'duration': 2.401}, {'end': 72.545, 'text': 'Bam! 
Now, imagine we receive normal messages from friends and family.', 'start': 66.264, 'duration': 6.281}, {'end': 80.167, 'text': 'And we also receive spam, unwanted messages that are usually scams or unsolicited advertisements.', 'start': 73.546, 'duration': 6.621}], 'summary': 'Automatic machine learning algorithms handle data; covers naive bayes and its types.', 'duration': 51.696, 'max_score': 28.471, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/O2L2Uv9pdDA/pics/O2L2Uv9pdDA28471.jpg'}, {'end': 169.079, 'src': 'embed', 'start': 111.575, 'weight': 3, 'content': [{'end': 121.823, 'text': 'the total number of times dear occurred in normal messages, divided by 17,, the total number of words in all of the normal messages.', 'start': 111.575, 'duration': 10.248}, {'end': 127.227, 'text': 'And that gives us 0.47.', 'start': 123.444, 'duration': 3.783}, {'end': 130.568, 'text': "So let's put that over the word dear so we don't forget it.", 'start': 127.227, 'duration': 3.341}, {'end': 141.371, 'text': 'Likewise, the probability that we see the word friend, given that we saw it in a normal message, is 5,', 'start': 132.07, 'duration': 9.301}, {'end': 150.877, 'text': 'the total number of times friend occurred in normal messages, divided by 17,, the total number of words in all of the normal messages.', 'start': 141.371, 'duration': 9.506}, {'end': 156.06, 'text': 'And that gives us 0.29.', 'start': 152.237, 'duration': 3.823}, {'end': 159.142, 'text': "So let's put that over the word friend so we don't forget it.", 'start': 156.06, 'duration': 3.082}, {'end': 169.079, 'text': 'Likewise, the probability that we see the word lunch given that it is in a normal message is 0.18.', 'start': 160.632, 'duration': 8.447}], 'summary': "The probability of 'dear' in normal messages is 0.47, 'friend' is 0.29, and 'lunch' is 0.18.", 'duration': 57.504, 'max_score': 111.575, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/O2L2Uv9pdDA/pics/O2L2Uv9pdDA111575.jpg'}], 'start': 1.067, 'title': 'Naive bayes classification and spam message filtering', 'summary': 'Provides an understanding of naive bayes classification, focusing on the multinomial naive bayes classifier. it also discusses the process of calculating probabilities for words in normal and spam messages, with examples of specific word probabilities provided.', 'chapters': [{'end': 80.167, 'start': 1.067, 'title': 'Understanding naive bayes classification', 'summary': 'Discusses naive bayes classification, focusing on the multinomial naive bayes classifier, and mentions a follow-up video on gaussian naive bayes. it also introduces the concept of normal messages and spam in the context of naive bayes.', 'duration': 79.1, 'highlights': ['Multinomial Naive Bayes Classifier is the focus of the discussion.', 'Introduction of Gaussian Naive Bayes Classification for further learning.', 'Explanation of normal messages and spam in the context of Naive Bayes.']}, {'end': 253.038, 'start': 81.644, 'title': 'Spam message filtering', 'summary': "Discusses the process of calculating probabilities for words in normal and spam messages, with examples such as the word 'dear' having a probability of 0.47 in normal messages and 0.29 in spam messages.", 'duration': 171.394, 'highlights': ["The probability of seeing the word 'dear' in normal messages is 0.47, calculated from the occurrence of 8 times out of 17 words, while in spam messages it is 0.29.", "The probability of seeing the word 'friend' in normal messages is 0.29, calculated from the occurrence of 5 times out of 17 words, while in spam messages it is not specified.", "The probability of seeing the word 'lunch' in normal messages is 0.18, while the probability of seeing the word 'money' in normal messages is 0.06."]}], 'duration': 251.971, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/O2L2Uv9pdDA/pics/O2L2Uv9pdDA1067.jpg', 'highlights': ['Multinomial Naive Bayes Classifier is the focus of the discussion.', 'Explanation of normal messages and spam in the context of Naive Bayes.', 'Introduction of Gaussian Naive Bayes Classification for further learning.', "The probability of seeing the word 'dear' in normal messages is 0.47, while in spam messages it is 0.29.", "The probability of seeing the word 'friend' in normal messages is 0.29, while in spam messages it is not specified.", "The probability of seeing the word 'lunch' in normal messages is 0.18, and 'money' in normal messages is 0.06."]}, {'end': 434.052, 'segs': [{'end': 313.666, 'src': 'heatmap', 'start': 283.202, 'weight': 0.874, 'content': [{'end': 289.206, 'text': 'This guess can be any probability that we want, but a common guess is estimated from the training data.', 'start': 283.202, 'duration': 6.004}, {'end': 299.861, 'text': 'For example, since 8 of the 12 messages are normal messages, our initial guess will be 0.67.', 'start': 290.587, 'duration': 9.274}, {'end': 303.102, 'text': "So let's put that under the normal messages so we don't forget it.", 'start': 299.861, 'duration': 3.241}, {'end': 313.666, 'text': "Oh no! It's another dreaded terminology alert! 
The initial guess that we observe a normal message is called a prior probability.", 'start': 304.663, 'duration': 9.003}], 'summary': 'In the given example, the initial guess for observing a normal message is 0.67, based on 8 out of 12 messages being normal.', 'duration': 30.464, 'max_score': 283.202, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/O2L2Uv9pdDA/pics/O2L2Uv9pdDA283202.jpg'}, {'end': 345.752, 'src': 'embed', 'start': 315.586, 'weight': 0, 'content': [{'end': 326.41, 'text': 'Now we multiply the initial guess by the probability that the word dear occurs in a normal message and the probability that the word friend occurs in a normal message.', 'start': 315.586, 'duration': 10.824}, {'end': 332.22, 'text': 'Now we just plug in the values that we worked out earlier and do the math.', 'start': 328.297, 'duration': 3.923}, {'end': 337.225, 'text': 'And we get 0.09.', 'start': 335.903, 'duration': 1.322}, {'end': 345.752, 'text': 'We can think of 0.09 as the score that Dear Friend gets if it is a normal message.', 'start': 337.225, 'duration': 8.527}], 'summary': "The score for 'dear friend' in a normal message is 0.09.", 'duration': 30.166, 'max_score': 315.586, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/O2L2Uv9pdDA/pics/O2L2Uv9pdDA315586.jpg'}, {'end': 434.052, 'src': 'embed', 'start': 406.235, 'weight': 1, 'content': [{'end': 410.179, 'text': 'Now we just plugged in the values that we worked out earlier and do the math.', 'start': 406.235, 'duration': 3.944}, {'end': 412.901, 'text': 'Bip, bip, bip, boop, bip.', 'start': 411.26, 'duration': 1.641}, {'end': 417.603, 'text': 'And we get 0.01.', 'start': 413.861, 'duration': 3.742}, {'end': 424.327, 'text': 'Like before, we can think of 0.01 as the score that Dear Friend gets if it is spam.', 'start': 417.603, 'duration': 6.724}, {'end': 434.052, 'text': 'However, technically, it is proportional to the probability that the message is spam, given 
that it says Dear Friend.', 'start': 425.768, 'duration': 8.284}], 'summary': "After calculations, the probability of 'dear friend' being spam is 0.01.", 'duration': 27.817, 'max_score': 406.235, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/O2L2Uv9pdDA/pics/O2L2Uv9pdDA406235.jpg'}], 'start': 254.199, 'title': 'Naive bayes probability calculation', 'summary': 'Explains the process of calculating the probability of a message being normal or spam using initial guesses and word probabilities, with a normal message score of 0.09 and a spam message score of 0.01.', 'chapters': [{'end': 434.052, 'start': 254.199, 'title': 'Naive bayes probability calculation', 'summary': 'Explains the process of calculating the probability of a message being normal or spam using initial guesses and word probabilities, with a normal message score of 0.09 and a spam message score of 0.01.', 'duration': 179.853, 'highlights': ["The initial guess about the probability of a message being normal is calculated as 0.67, based on the training data, with a score of 0.09 for the message 'Dear Friend' being a normal message.", "The initial guess about the probability of a message being spam is calculated as 0.33, based on the training data, with a score of 0.01 for the message 'Dear Friend' being a spam message."]}], 'duration': 179.853, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/O2L2Uv9pdDA/pics/O2L2Uv9pdDA254199.jpg', 'highlights': ["Initial guess for normal message probability: 0.67, with 'Dear Friend' score 0.09", "Initial guess for spam message probability: 0.33, with 'Dear Friend' score 0.01"]}, {'end': 911.532, 'segs': [{'end': 727.227, 'src': 'embed', 'start': 672.791, 'weight': 1, 'content': [{'end': 677.255, 'text': 'In this case, alpha equals 1, but we could have set it to anything.', 'start': 672.791, 'duration': 4.464}, {'end': 685.28, 'text': 'Now when we calculate the probabilities of observing each word, we never get 
zero.', 'start': 679.855, 'duration': 5.425}, {'end': 699.672, 'text': 'For example, the probability of seeing lunch, given that it is in SPAM, is 1 divided by 7, the total number of words in SPAM plus 4,', 'start': 686.701, 'duration': 12.971}, {'end': 701.453, 'text': 'the extra counts that we added.', 'start': 699.672, 'duration': 1.781}, {'end': 707.877, 'text': 'And that gives us 0.09.', 'start': 702.975, 'duration': 4.902}, {'end': 717.282, 'text': 'adding counts to each word does not change our initial guess that a message is normal or the initial guess that the message is spam,', 'start': 707.877, 'duration': 9.405}, {'end': 727.227, 'text': 'because adding a count to each word did not change the number of messages in the training data set that are normal or the number of messages that are spam.', 'start': 717.282, 'duration': 9.945}], 'summary': "Probability of seeing 'lunch' in spam: 0.09, alpha equals 1.", 'duration': 54.436, 'max_score': 672.791, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/O2L2Uv9pdDA/pics/O2L2Uv9pdDA672791.jpg'}, {'end': 792.181, 'src': 'embed', 'start': 760.935, 'weight': 3, 'content': [{'end': 767.158, 'text': 'The thing that makes Naive Bayes so naive is that it treats all word orders the same.', 'start': 760.935, 'duration': 6.223}, {'end': 778.397, 'text': 'For example, the normal message score for the phrase dear friend is the exact same for the score for friend dear.', 'start': 768.715, 'duration': 9.682}, {'end': 786.78, 'text': 'In other words, regardless of how the words are ordered, we get 0.08.', 'start': 779.698, 'duration': 7.082}, {'end': 792.181, 'text': 'Treating all word orders equal is very different from how you and I communicate.', 'start': 786.78, 'duration': 5.401}], 'summary': 'Naive bayes treats all word orders equally, resulting in consistent scores regardless of word order.', 'duration': 31.246, 'max_score': 760.935, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/O2L2Uv9pdDA/pics/O2L2Uv9pdDA760935.jpg'}, {'end': 846.702, 'src': 'embed', 'start': 819.107, 'weight': 0, 'content': [{'end': 828.092, 'text': 'That said, even though Naive Bayes is naive, it tends to perform surprisingly well when separating normal messages from spam.', 'start': 819.107, 'duration': 8.985}, {'end': 837.297, 'text': "In machine learning lingo, we'd say that by ignoring relationships among words, Naive Bayes has high bias.", 'start': 829.693, 'duration': 7.604}, {'end': 843.62, 'text': 'But because it works well in practice, Naive Bayes has low variance.', 'start': 838.617, 'duration': 5.003}, {'end': 846.702, 'text': 'Shameless self-promotion.', 'start': 844.981, 'duration': 1.721}], 'summary': 'Naive bayes performs well in separating normal messages from spam with high bias and low variance.', 'duration': 27.595, 'max_score': 819.107, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/O2L2Uv9pdDA/pics/O2L2Uv9pdDA819107.jpg'}], 'start': 435.7, 'title': 'Naive bayes classification', 'summary': 'Introduces the concept of naive bayes classification, demonstrating its application in classifying messages as normal or spam, and addresses the issue of zero probabilities in spam detection, showcasing high bias and low variance.', 'chapters': [{'end': 552.675, 'start': 435.7, 'title': 'Naive bayes classification basics', 'summary': "Introduces the concept of naive bayes classification, demonstrating its application in classifying messages as normal or spam based on word probabilities, and concludes with a more complex example of classifying a message containing the word 'money' four times.", 'duration': 116.975, 'highlights': ['We calculated the probabilities of seeing each word in a normal message and spam, then made initial guesses about the probability of seeing a normal message and spam, based on the training data set.', "We multiplied our initial guess of a 
message being normal by the probabilities of seeing the words 'dear' and 'friend', and concluded that it was a normal message because 0.09 is greater than 0.01.", "The transcript concludes with an example of classifying a message containing the word 'money' four times, setting the stage for a more complex application of naive Bayes classification."]}, {'end': 727.227, 'start': 554.015, 'title': 'Naive bayes spam detection', 'summary': 'Explains how naive bayes algorithm is used to predict spam messages, addressing the issue of zero probabilities by adding one count to each word in the histograms and calculating the probabilities of observing each word, ultimately ensuring that no probability becomes zero.', 'duration': 173.212, 'highlights': ['Naive Bayes algorithm is used to predict spam messages The chapter discusses how the Naive Bayes algorithm is utilized for predicting whether a message is spam or not.', 'Addressing the issue of zero probabilities by adding one count to each word in the histograms To prevent zero probabilities, one count is added to each word in the histograms, ensuring that no probability becomes zero.', 'Calculating the probabilities of observing each word The process of calculating the probabilities of observing each word, which ensures that no probability becomes zero.']}, {'end': 911.532, 'start': 729.108, 'title': 'Naive bayes in spam classification', 'summary': 'Discusses how naive bayes algorithm treats all word orders equally, ignoring grammar rules and common phrases but still performs well in separating normal messages from spam, showcasing high bias and low variance.', 'duration': 182.424, 'highlights': ['Naive Bayes treats all word orders equally, ignoring grammar rules and common phrases, which results in a score of 0.08 regardless of word order.', "Despite Naive Bayes' naivety in ignoring word relationships and grammar rules, it performs well in separating normal messages from spam, exhibiting high bias and low variance.", 'The 
message classification as spam is based on the calculated scores, where the score for spam is greater than the score for a normal message, leading to the classification of the message as spam.'}]}], 'duration': 475.832, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/O2L2Uv9pdDA/pics/O2L2Uv9pdDA435700.jpg', 'highlights': ['Naive Bayes algorithm predicts spam messages, showcasing high bias and low variance.', 'Addressing zero probabilities by adding one count to each word in the histograms.', 'Calculating probabilities of observing each word to prevent zero probabilities.', 'Naive Bayes treats all word orders equally, resulting in a score of 0.08 regardless of word order.', 'Naive Bayes performs well in separating normal messages from spam, exhibiting high bias and low variance.']}], 'highlights': ['Multinomial Naive Bayes Classifier is the focus of the discussion.', 'Explanation of normal messages and spam in the context of Naive Bayes.', 'Introduction of Gaussian Naive Bayes Classification for further learning.', "Initial guess for normal message probability: 0.67, with 'Dear Friend' score 0.09", "Initial guess for spam message probability: 0.33, with 'Dear Friend' score 0.01", 'Naive Bayes algorithm predicts spam messages, showcasing high bias and low variance.', 'Addressing zero probabilities by adding one count to each word in the histograms.', 'Calculating probabilities of observing each word to prevent zero probabilities.', 'Naive Bayes treats all word orders equally, resulting in a score of 0.08 regardless of word order.', 'Naive Bayes performs well in separating normal messages from spam, exhibiting high bias and low variance.', "The probability of seeing the word 'dear' in normal messages is 0.47, while in spam messages it is 0.29.", "The probability of seeing the word 'friend' in normal messages is 0.29, while in spam messages it is not specified.", "The probability of seeing the word 'lunch' in normal messages is 0.18, and 'money' in 
normal messages is 0.06."]}
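The pseudocount chapter above explains that because 'lunch' never appears in spam, P(lunch | spam) is zero and any message containing "lunch" could never be scored as spam; adding alpha = 1 to every word count fixes this (P(lunch | spam) becomes 1/11, about 0.09) without changing the priors, since no messages are added. A hedged sketch of that smoothing step, reusing the word counts assumed from the video's example:

```python
# Same assumed counts as the video's example: 'lunch' never occurs in spam,
# so without smoothing P(lunch | spam) = 0 and the spam score collapses to 0.
normal_counts = {"dear": 8, "friend": 5, "lunch": 3, "money": 1}   # 17 words
spam_counts   = {"dear": 2, "friend": 1, "lunch": 0, "money": 4}   # 7 words

ALPHA = 1  # pseudocount added to every word ("alpha equals 1" in the video)

def smoothed_probs(counts, alpha=ALPHA):
    """Conditional word probabilities with alpha added to every count."""
    total = sum(counts.values()) + alpha * len(counts)
    return {word: (n + alpha) / total for word, n in counts.items()}

p_word_normal = smoothed_probs(normal_counts)
p_word_spam   = smoothed_probs(spam_counts)    # lunch: 1/11, ~0.09

# Priors are unchanged: pseudocounts do not alter how many messages
# in the training data are normal or spam.
p_normal, p_spam = 8 / 12, 4 / 12

def score(message, prior, p_word):
    result = prior
    for word in message:
        result *= p_word[word]
    return result

msg = ["lunch"] + ["money"] * 4   # the "Lunch Money x 5" message
normal_score = score(msg, p_normal, p_word_normal)
spam_score   = score(msg, p_spam, p_word_spam)
print("spam" if spam_score > normal_score else "normal")

# The "naive" part: word order is ignored, so "dear friend" and
# "friend dear" both get the same smoothed normal score, ~0.08.
dear_friend = score(["dear", "friend"], p_normal, p_word_normal)
```

With smoothing, the four occurrences of "money" (frequent in spam, rare in normal messages) dominate, and the spam score wins, so "lunch money money money money" is classified as spam — matching the transcript's conclusion.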