title

Solving Wordle using information theory

description

An excuse to teach a lesson on information theory and entropy.
Special thanks to these supporters: https://3b1b.co/lessons/wordle#thanks
Help fund future projects: https://www.patreon.com/3blue1brown
An equally valuable form of support is to simply share the videos.
Contents:
0:00 - What is Wordle?
2:43 - Initial ideas
8:04 - Information theory basics
18:15 - Incorporating word frequencies
27:49 - Final performance
Original wordle site:
https://www.powerlanguage.co.uk/wordle/
Music by Vincent Rubinetti.
https://www.vincentrubinetti.com/
Shannon and von Neumann artwork by Kurt Bruns.
https://www.instagram.com/p/CZpRKhMJnD6/
Code for this video:
https://github.com/3b1b/videos/tree/master/_2022/wordle
These animations are largely made using a custom python library, manim. See the FAQ comments here:
https://www.3blue1brown.com/faq#manim
https://github.com/3b1b/manim
https://github.com/ManimCommunity/manim/
You can find code for specific videos and projects here:
https://github.com/3b1b/videos/
------------------
3blue1brown is a channel about animating math, in all senses of the word animate. And you know the drill with YouTube, if you want to stay posted on new videos, subscribe: http://3b1b.co/subscribe
Various social media stuffs:
Website: https://www.3blue1brown.com
Twitter: https://twitter.com/3blue1brown
Reddit: https://www.reddit.com/r/3blue1brown
Instagram: https://www.instagram.com/3blue1brown_animations/
Patreon: https://patreon.com/3blue1brown
Facebook: https://www.facebook.com/3blue1brown

detail

{'title': 'Solving Wordle using information theory', 'heatmap': [{'end': 1786.225, 'start': 1777.839, 'weight': 0.714}], 'summary': 'Explores using wordle to teach information theory and creating an algorithm to guess the mystery word in three attempts, covering strategies, measuring information and entropy, developing a wordle solver through simulations achieving an average score of 4.124, word frequency analysis and word probability in the wordle game, and the introduction of wordlebot version two improving decision-making and reducing uncertainty.', 'chapters': [{'end': 149.869, 'segs': [{'end': 28.111, 'src': 'embed', 'start': 0.009, 'weight': 1, 'content': [{'end': 5.558, 'text': 'The game Wordle has gone pretty viral in the last month or two, and never one to overlook an opportunity for a math lesson.', 'start': 0.009, 'duration': 5.549}, {'end': 12.688, 'text': 'It occurs to me that this game makes for a very good central example in a lesson about information theory, and in particular a topic known as entropy.', 'start': 5.778, 'duration': 6.91}, {'end': 17.827, 'text': 'You see, like a lot of people, I got kind of sucked into the puzzle and, like a lot of programmers,', 'start': 13.885, 'duration': 3.942}, {'end': 22.649, 'text': 'I also got sucked into trying to write an algorithm that would play the game as optimally as it could.', 'start': 17.827, 'duration': 4.822}, {'end': 28.111, 'text': "And what I thought I'd do here is just talk through with you some of my process in that and explain some of the math that went into it,", 'start': 23.129, 'duration': 4.982}], 'summary': "Wordle game's viral popularity used to teach information theory and entropy, with a focus on algorithm optimization.", 'duration': 28.102, 'max_score': 0.009, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA9.jpg'}, {'end': 82.249, 'src': 'embed', 'start': 55.552, 'weight': 0, 'content': [{'end': 60.315, 'text': "The goal of 
Wordle is to guess a mystery five-letter word, and you're given six different chances to guess.", 'start': 55.552, 'duration': 4.763}, {'end': 64.498, 'text': 'For example, my Wordle bot suggests that I start with the guess crane.', 'start': 60.795, 'duration': 3.703}, {'end': 70.221, 'text': 'Each time that you make a guess, you get some information about how close your guess is to the true answer.', 'start': 65.158, 'duration': 5.063}, {'end': 74.024, 'text': "Here, the gray box is telling me there's no C in the actual answer.", 'start': 70.862, 'duration': 3.162}, {'end': 77.866, 'text': "The yellow box is telling me there is an R, but it's not in that position.", 'start': 74.504, 'duration': 3.362}, {'end': 82.249, 'text': "The green box is telling me that the secret word does have an A, and it's in the third position.", 'start': 78.327, 'duration': 3.922}], 'summary': "Wordle is a game to guess a 5-letter word with 6 chances. e.g., 'crane' with feedback: no 'c', 'r' but not in position, 'a' at 3rd position.", 'duration': 26.697, 'max_score': 55.552, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA55552.jpg'}, {'end': 159.413, 'src': 'embed', 'start': 131.14, 'weight': 2, 'content': [{'end': 133.702, 'text': 'Which, hooray, is the actual answer, so we got it in three.', 'start': 131.14, 'duration': 2.562}, {'end': 140.725, 'text': "If you're wondering if that's any good, the way I heard one person phrase, it is that with Wordle, four is par and three is birdie,", 'start': 134.602, 'duration': 6.123}, {'end': 142.546, 'text': 'which I think is a pretty apt analogy.', 'start': 140.725, 'duration': 1.821}, {'end': 146.988, 'text': "You have to be consistently on your game to be getting four, but it's certainly not crazy.", 'start': 142.886, 'duration': 4.102}, {'end': 149.869, 'text': 'But when you get it in three, it just feels great.', 'start': 147.408, 'duration': 2.461}, {'end': 156.012, 'text': "So if 
you're down for it, what I'd like to do here is just talk through my thought process from the beginning for how I approach the Wordle bot.", 'start': 149.889, 'duration': 6.123}, {'end': 159.413, 'text': "And like I said, really, it's an excuse for an information theory lesson.", 'start': 156.432, 'duration': 2.981}], 'summary': 'Achieving a wordle answer in three attempts is considered excellent, with four being the average. the speaker will discuss their approach to the wordle bot as an information theory lesson.', 'duration': 28.273, 'max_score': 131.14, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA131140.jpg'}], 'start': 0.009, 'title': 'Wordle and information theory', 'summary': 'Explores using wordle to teach information theory and creating an algorithm to guess the mystery word in three attempts.', 'chapters': [{'end': 149.869, 'start': 0.009, 'title': 'Wordle and information theory', 'summary': 'Discusses how the game wordle can be used as a central example in a lesson about information theory and entropy, while also explaining the process of creating an algorithm to play the game optimally, with a successful result of guessing the mystery word in three attempts.', 'duration': 149.86, 'highlights': ['The game Wordle can be used as a central example in a lesson about information theory and entropy The game Wordle is discussed as a good central example in a lesson about information theory and entropy.', 'Creating an algorithm to play the game optimally The chapter explains the process of creating an algorithm to play the game optimally, resulting in successfully guessing the mystery word in three attempts.', 'Successful result of guessing the mystery word in three attempts The algorithm successfully guesses the mystery word in three attempts, which is considered a great achievement in the game Wordle.']}], 'duration': 149.86, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA9.jpg', 'highlights': ['Creating an algorithm to play the game optimally resulting in successfully guessing the mystery word in three attempts', 'The game Wordle can be used as a central example in a lesson about information theory and entropy', 'The algorithm successfully guesses the mystery word in three attempts, considered a great achievement in the game Wordle']}, {'end': 450.307, 'segs': [{'end': 194.278, 'src': 'embed', 'start': 168.499, 'weight': 0, 'content': [{'end': 173.604, 'text': 'My first thought in approaching this was to take a look at the relative frequencies of different letters in the English language.', 'start': 168.499, 'duration': 5.105}, {'end': 177.388, 'text': 'So I thought okay, is there an opening guess or an opening pair of guesses?', 'start': 174.285, 'duration': 3.103}, {'end': 179.23, 'text': 'that hits a lot of these most frequent letters?', 'start': 177.388, 'duration': 1.842}, {'end': 183.034, 'text': 'And one that I was pretty fond of was doing other, followed by nails.', 'start': 179.831, 'duration': 3.203}, {'end': 188.816, 'text': "The thought is that if you hit a letter, you know, you get a green or a yellow, that always feels good, it feels like you're getting information.", 'start': 183.735, 'duration': 5.081}, {'end': 194.278, 'text': "But in these cases, even if you don't hit and you always get greys, that's still giving you a lot of information,", 'start': 189.377, 'duration': 4.901}], 'summary': 'Analyze letter frequencies to improve guessing strategy.', 'duration': 25.779, 'max_score': 168.499, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA168499.jpg'}, {'end': 246.388, 'src': 'embed', 'start': 217.188, 'weight': 3, 'content': [{'end': 218.97, 'text': 'But who knows, maybe that is a better opener.', 'start': 217.188, 'duration': 1.782}, {'end': 224.154, 'text': 'Is there 
some kind of quantitative score that we can give to judge the quality of a potential guess?', 'start': 219.37, 'duration': 4.784}, {'end': 231.192, 'text': "Now, to set up for the way that we're going to rank possible guesses, let's go back and add a little clarity to how exactly the game is set up.", 'start': 225.305, 'duration': 5.887}, {'end': 237.779, 'text': "So there's a list of words that it will allow you to enter that are considered valid guesses that's just about 13,000 words long.", 'start': 231.732, 'duration': 6.047}, {'end': 243.726, 'text': "But when you look at it, there's a lot of really uncommon things, things like aahed or aalii and aargh,", 'start': 238.259, 'duration': 5.467}, {'end': 246.388, 'text': 'the kind of words that bring about family arguments in a game of Scrabble.', 'start': 243.726, 'duration': 2.662}], 'summary': 'The game involves ranking potential guesses from a list of about 13,000 words.', 'duration': 29.2, 'max_score': 217.188, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA217188.jpg'}, {'end': 279.402, 'src': 'embed', 'start': 250.97, 'weight': 1, 'content': [{'end': 255.331, 'text': "And in fact, there's another list of around 2,300 words that are the possible answers.", 'start': 250.97, 'duration': 4.361}, {'end': 261.093, 'text': "And this is a human-curated list, I think specifically by the game creator's girlfriend, which is kind of fun.", 'start': 255.991, 'duration': 5.102}, {'end': 263.053, 'text': 'But what I would like to do.', 'start': 261.713, 'duration': 1.34}, {'end': 270.055, 'text': "our challenge for this project is to see if we can write a program solving Wordle that doesn't incorporate previous knowledge about this list.", 'start': 263.053, 'duration': 7.002}, {'end': 274.798, 'text': "For one thing there's plenty of pretty common five-letter words that you won't find in that list,", 'start': 270.675, 'duration': 4.123}, {'end': 279.402, 'text': "so
it would be better to write a program that's a little more resilient and would play Wordle against anyone,", 'start': 274.798, 'duration': 4.604}], 'summary': 'The challenge is to create a wordle-solving program without relying on a list of 2,300 words, aiming for resilience and inclusivity.', 'duration': 28.432, 'max_score': 250.97, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA250970.jpg'}, {'end': 366.723, 'src': 'embed', 'start': 338.79, 'weight': 2, 'content': [{'end': 342.813, 'text': "But the flip side of that, of course, is that it's very uncommon to get a pattern like this.", 'start': 338.79, 'duration': 4.023}, {'end': 351.578, 'text': 'Specifically, if each word was equally likely to be the answer, the probability of hitting this pattern would be 58 divided by around 13,000.', 'start': 343.393, 'duration': 8.185}, {'end': 353.6, 'text': "Of course, they're not equally likely to be answers.", 'start': 351.579, 'duration': 2.021}, {'end': 356.12, 'text': 'Most of these are very obscure and even questionable words.', 'start': 353.68, 'duration': 2.44}, {'end': 361.481, 'text': "But at least for our first pass at all of this, let's assume that they're all equally likely and then refine that a bit later.", 'start': 356.56, 'duration': 4.921}, {'end': 366.723, 'text': 'The point is, the pattern with a lot of information is, by its very nature, unlikely to occur.', 'start': 361.962, 'duration': 4.761}], 'summary': 'Uncommon pattern with 58 out of around 13,000 words, mostly obscure and questionable.', 'duration': 27.933, 'max_score': 338.79, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA338790.jpg'}, {'end': 423.985, 'src': 'embed', 'start': 394.281, 'weight': 4, 'content': [{'end': 400.687, 'text': 'To get a more global view here, let me show you the full distribution of probabilities across all of the different patterns that you might see.', 
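The transcript's claim that a pattern narrowing ~13,000 candidates down to 58 is highly informative can be quantified in a couple of lines (a minimal check, assuming every word is equally likely, as the video does for its first pass):

```python
import math

total_words = 13_000        # approximate size of the valid-guess list
matches = 58                # words consistent with the observed pattern

# Probability of seeing this pattern if every word were equally likely:
p = matches / total_words
# Information gained, in bits: how many times the space was cut in half.
bits = math.log2(1 / p)
print(round(bits, 2))       # about 7.81 bits
```

So a pattern that unlikely, when it does occur, is worth nearly eight halvings of the search space.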
'start': 394.281, 'duration': 6.406}, {'end': 407.072, 'text': "So each bar that you're looking at corresponds to a possible pattern of colors that could be revealed,", 'start': 401.908, 'duration': 5.164}, {'end': 412.317, 'text': "of which there are three to the fifth possibilities, and they're organized from left to right, most common to least common.", 'start': 407.072, 'duration': 5.245}, {'end': 416, 'text': 'So the most common possibility here is that you get all grays.', 'start': 412.917, 'duration': 3.083}, {'end': 417.041, 'text': 'That happens about 14% of the time.', 'start': 416.14, 'duration': 0.901}, {'end': 423.985, 'text': "And what you're hoping for when you make a guess is that you end up somewhere out in this long tail like over here,", 'start': 418.602, 'duration': 5.383}], 'summary': 'Showing distribution of color patterns with 3^5 possibilities, most common being all grays at 14%.', 'duration': 29.704, 'max_score': 394.281, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA394281.jpg'}], 'start': 149.889, 'title': 'Wordle strategies', 'summary': 'Covers an approach to solving wordle using information theory, analyzing letter frequencies, and ranking potential guesses. 
it also discusses word pattern analysis, including the selection of an opening guess, likelihood of patterns, and distribution of probabilities, emphasizing the importance of informative patterns.', 'chapters': [{'end': 305.02, 'start': 149.889, 'title': 'Approach to wordle bot', 'summary': "Discusses the approach to solving wordle using information theory, analyzing letter frequencies, and ranking potential guesses based on the game's setup and the creator's word lists.", 'duration': 155.131, 'highlights': ['The game is set up with a list of 13,000 valid guesses and around 2,300 possible answers, which are common words and a human-curated list.', 'Analyzing letter frequencies in English language to find an opening guess or pair of guesses that contains the most frequent letters.', 'Considering the order of letters in potential guesses to systematically rank them, questioning the quantitative score to judge the quality of a guess.', 'Exploring a program solving Wordle without previous knowledge about the list of possible answers to make it resilient and adaptable to any opponent.']}, {'end': 450.307, 'start': 305.02, 'title': 'Word pattern analysis', 'summary': 'Discusses the process of analyzing word patterns, covering the selection of an opening guess from a large pool of possibilities, the likelihood of certain patterns, and the distribution of probabilities across all patterns, ultimately emphasizing the importance of unlikely but informative patterns.', 'duration': 145.287, 'highlights': ['The probability of hitting a specific pattern from a pool of 13,000 words reduces to only 58, demonstrating a significant reduction in possibilities.', 'A pattern with a lot of information is unlikely to occur, making it informative, with a probability of about 11% for a specific pattern with 1400 possible matches, highlighting the least informative yet most likely outcomes.', 'The full distribution of probabilities across all possible patterns shows a range from most common 
to least common, emphasizing the significance of aiming for the long tail with fewer possibilities.', "The answers to the puzzle about English words starting with 'W', ending with 'Y', and having an 'R' somewhere in them are 'wordy', 'wormy', and 'wryly', providing an interesting example to judge word quality."]}], 'duration': 300.418, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA149889.jpg', 'highlights': ['Analyzing letter frequencies in English language to find an opening guess or pair of guesses that contains the most frequent letters.', 'The game is set up with a list of 13,000 valid guesses and around 2,300 possible answers, which are common words and a human-curated list.', 'The probability of hitting a specific pattern from a pool of 13,000 words reduces to only 58, demonstrating a significant reduction in possibilities.', 'Considering the order of letters in potential guesses to systematically rank them, questioning the quantitative score to judge the quality of a guess.', 'The full distribution of probabilities across all possible patterns shows a range from most common to least common, emphasizing the significance of aiming for the long tail with fewer possibilities.']}, {'end': 915.653, 'segs': [{'end': 484.158, 'src': 'embed', 'start': 450.307, 'weight': 0, 'content': [{'end': 455.75, 'text': "we want some kind of measure of the expected amount of information that you're going to get from this distribution.", 'start': 450.307, 'duration': 5.443}, {'end': 464.575, 'text': 'If we go through each pattern and we multiply its probability of occurring times something that measures how informative it is that can maybe give us an objective score.', 'start': 456.311, 'duration': 8.264}, {'end': 469.825, 'text': 'Now, your first instinct for what that something should be might be the number of matches.', 'start': 465.922, 'duration': 3.903}, {'end': 472.308, 'text': 'You want a lower average number of
matches.', 'start': 470.226, 'duration': 2.082}, {'end': 477.332, 'text': "But instead I'd like to use a more universal measurement that we often ascribe to information,", 'start': 472.888, 'duration': 4.444}, {'end': 484.158, 'text': "and one that will be more flexible once we have a different probability assigned to each of these 13,000 words for whether or not they're actually the answer.", 'start': 477.332, 'duration': 6.826}], 'summary': 'Objective score for expected information using flexible measurement.', 'duration': 33.851, 'max_score': 450.307, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA450307.jpg'}, {'end': 549.691, 'src': 'embed', 'start': 525.177, 'weight': 3, 'content': [{'end': 530.68, 'text': "If the observation cuts that space by a factor of eight, we say it's three bits of information, and so on and so forth.", 'start': 525.177, 'duration': 5.503}, {'end': 533.862, 'text': 'Four bits cuts it into a sixteenth, five bits cuts it into a thirty-second.', 'start': 530.821, 'duration': 3.041}, {'end': 543.008, 'text': 'So now is when you might want to take a moment and pause and ask for yourself what is the formula for information for the number of bits in terms of the probability of an occurrence?', 'start': 534.923, 'duration': 8.085}, {'end': 549.691, 'text': "Well, what we're saying here is basically that when you take 1 half to the number of bits, that's the same thing as the probability,", 'start': 544.147, 'duration': 5.544}], 'summary': 'The information is cut by 1/8 for 3 bits, 1/16 for 4 bits, and 1/32 for 5 bits. 
the formula for information is related to the probability of an occurrence.', 'duration': 24.514, 'max_score': 525.177, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA525177.jpg'}, {'end': 685.228, 'src': 'embed', 'start': 658.414, 'weight': 4, 'content': [{'end': 665.396, 'text': 'So on average, the information you get from this opening guess is as good as chopping your space of possibilities in half about five times.', 'start': 658.414, 'duration': 6.982}, {'end': 671.618, 'text': 'By contrast, an example of a guess with a higher expected information value would be something like Slate.', 'start': 665.956, 'duration': 5.662}, {'end': 675.6, 'text': "In this case, you'll notice the distribution looks a lot flatter.", 'start': 673.158, 'duration': 2.442}, {'end': 681.204, 'text': 'In particular, the most probable occurrence of all grays only has about a 6% chance of occurring.', 'start': 676.12, 'duration': 5.084}, {'end': 685.228, 'text': "So at minimum, you're getting, evidently, 3.9 bits of information.", 'start': 681.585, 'duration': 3.643}], 'summary': 'Opening guess provides 3.9 bits of information, while slate gives higher value.', 'duration': 26.814, 'max_score': 658.414, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA658414.jpg'}, {'end': 791.719, 'src': 'embed', 'start': 763.117, 'weight': 2, 'content': [{'end': 769.427, 'text': 'And for our purposes here, when I use the word entropy, I just want you to think the expected information value of a particular guess.', 'start': 763.117, 'duration': 6.31}, {'end': 773.847, 'text': 'You can think of entropy as measuring two things simultaneously.', 'start': 770.765, 'duration': 3.082}, {'end': 776.789, 'text': 'The first one is how flat is the distribution.', 'start': 774.347, 'duration': 2.442}, {'end': 780.992, 'text': 'The closer a distribution is to uniform, the higher that entropy will be.', 
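The bits formula this stretch of the transcript builds up — take 1/2 to the number of bits to get the probability, so information is log base 2 of 1 over the probability — can be sketched directly (a minimal illustration, not the video's code):

```python
import math

def information_bits(probability):
    """Information of an observation, in bits: I(p) = log2(1/p).
    An observation that halves the space of possibilities is worth 1 bit."""
    return math.log2(1 / probability)

print(information_bits(1 / 2))   # 1.0  (space cut in half)
print(information_bits(1 / 4))   # 2.0  (cut to a quarter)
print(information_bits(1 / 32))  # 5.0  (cut to a thirty-second)
```

Note how adding bits corresponds to multiplying probabilities, exactly as the transcript describes.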
'start': 777.369, 'duration': 3.623}, {'end': 786.336, 'text': 'In our case, where there are 3 to the 5th total patterns for a uniform distribution,', 'start': 781.652, 'duration': 4.684}, {'end': 791.719, 'text': 'observing any one of them would have information log base 2 of 3 to the 5th, which happens to be 7.92..', 'start': 786.336, 'duration': 5.383}], 'summary': 'Entropy measures the flatness of distribution; for 3^5 patterns, entropy is 7.92.', 'duration': 28.602, 'max_score': 763.117, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA763117.jpg'}, {'end': 910.169, 'src': 'embed', 'start': 879.055, 'weight': 5, 'content': [{'end': 883.796, 'text': 'You search through all 13,000 possibilities and you find the one that maximizes that entropy.', 'start': 879.055, 'duration': 4.741}, {'end': 886.794, 'text': 'To show you how this works in action,', 'start': 885.393, 'duration': 1.401}, {'end': 892.097, 'text': 'let me just pull up a little variant of Wordle that I wrote that shows the highlights of this analysis in the margins.', 'start': 886.794, 'duration': 5.303}, {'end': 899.703, 'text': "So after doing all its entropy calculations, on the right here it's showing us which ones have the highest expected information.", 'start': 893.939, 'duration': 5.764}, {'end': 910.169, 'text': "Turns out the top answer, at least at the moment, we'll refine this later, is tares, which means, um, of course, a vetch, the most common vetch.", 'start': 900.203, 'duration': 9.966}], 'summary': "Entropy analysis finds top word 'tares' out of 13,000 possibilities", 'duration': 31.114, 'max_score': 879.055, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA879055.jpg'}], 'start': 450.307, 'title': 'Measuring information and entropy', 'summary': 'Covers a universal approach for measuring expected information from distributions, aiming for a lower average number of matches. 
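The entropy described here — the expected information value of a guess, highest when the pattern distribution is flat — is just the probability-weighted average of the bits formula (a minimal sketch, not the video's code):

```python
import math

def entropy(distribution):
    """Expected information value, in bits: the sum of p * log2(1/p)
    over the outcomes. Flatter distributions score higher."""
    return sum(p * math.log2(1 / p) for p in distribution if p > 0)

# A uniform distribution over all 3**5 = 243 feedback patterns is the
# theoretical ceiling the transcript mentions: log2(243) = 7.92 bits.
print(round(entropy([1 / 243] * 243), 2))  # 7.92
# A lopsided distribution carries less expected information:
print(entropy([0.5, 0.25, 0.25]))          # 1.5
```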
it also explains the concept of bits as a unit of information, its relation to probability, and practical application in wordle, aiding in informed guesses.', 'chapters': [{'end': 484.158, 'start': 450.307, 'title': 'Measuring information from distributions', 'summary': 'Discusses the measurement of expected information from a distribution, proposing a universal and flexible approach based on probability and informativeness, aiming for a lower average number of matches.', 'duration': 33.851, 'highlights': ['Using a universal measurement for informativeness allows flexibility with different probabilities assigned to each of the 13,000 words, enabling a more objective and adaptable scoring system.', "Proposing a measure of expected information by multiplying the probability of occurrence with a measure of informativeness aims to provide an objective score for evaluating the distribution's information content."]}, {'end': 915.653, 'start': 490.563, 'title': 'Unit of information and entropy', 'summary': 'Explains the concept of bits as a unit of information, how it relates to probability, and the concept of entropy. it also discusses the practical application of entropy in the game wordle, showing how it helps in making informed guesses.', 'duration': 425.09, 'highlights': ['The formula for information is the log base 2 of 1 divided by the probability, with entropy measuring both the flatness of the distribution and the number of possibilities. The formula for information is the log base 2 of 1 divided by the probability, and entropy measures the flatness of the distribution and the number of possibilities.', 'Observations that cut down the space of possibilities by a factor of four provide two bits of information, and the addition of information is similar to the multiplication of probabilities. 
Observations that cut down the space of possibilities by a factor of four provide two bits of information, and the addition of information is similar to the multiplication of probabilities.', "The average information from the first guess in the game Wordle is about 4.9 bits for the word 'Weary' and about 5.8 bits for the word 'Slate'. The average information from the first guess in the game Wordle is about 4.9 bits for the word 'Weary' and about 5.8 bits for the word 'Slate'.", 'The Wordlebot uses entropy to calculate the highest expected information for making guesses in the game Wordle. The Wordlebot uses entropy to calculate the highest expected information for making guesses in the game Wordle.']}], 'duration': 465.346, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA450307.jpg', 'highlights': ['Using a universal measurement for informativeness allows flexibility with different probabilities assigned to each of the 13,000 words, enabling a more objective and adaptable scoring system.', "Proposing a measure of expected information by multiplying the probability of occurrence with a measure of informativeness aims to provide an objective score for evaluating the distribution's information content.", 'The formula for information is the log base 2 of 1 divided by the probability, with entropy measuring both the flatness of the distribution and the number of possibilities.', 'Observations that cut down the space of possibilities by a factor of four provide two bits of information, and the addition of information is similar to the multiplication of probabilities.', "The average information from the first guess in the game Wordle is about 4.9 bits for the word 'Weary' and about 5.8 bits for the word 'Slate'.", 'The Wordlebot uses entropy to calculate the highest expected information for making guesses in the game Wordle.']}, {'end': 1093.157, 'segs': [{'end': 1093.157, 'src': 'embed', 'start': 1033.478, 
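The search the transcript describes — for each candidate guess, compute the distribution of feedback patterns it could produce, score that distribution by its entropy, and take the maximizer — could be sketched like this. This is a simplified illustration under the equally-likely-answers assumption, not the video's actual code, and the function names are my own:

```python
import math
from collections import Counter

def pattern(guess, answer):
    """Wordle feedback as a tuple per letter: 2 = green, 1 = yellow, 0 = gray,
    with the standard left-to-right handling of repeated letters."""
    result = [0] * 5
    # Answer letters not already matched exactly (i.e. not green):
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = 2
        elif remaining[g] > 0:
            result[i] = 1
            remaining[g] -= 1
    return tuple(result)

def expected_information(guess, possible_answers):
    """Entropy (in bits) of the feedback-pattern distribution this guess
    induces, assuming every remaining answer is equally likely."""
    counts = Counter(pattern(guess, ans) for ans in possible_answers)
    n = len(possible_answers)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

def best_guess(allowed_guesses, possible_answers):
    """Search all allowed guesses for the one maximizing expected information."""
    return max(allowed_guesses, key=lambda g: expected_information(g, possible_answers))
```

Running `best_guess` over the full 13,000-word list is exactly the exhaustive entropy search that surfaces openers like "tares" in the video.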
'weight': 0, 'content': [{'end': 1039.724, 'text': "So it just keeps going, trying to gain as much information as it can, until it's only one possibility left, and then it guesses it.", 'start': 1033.478, 'duration': 6.246}, {'end': 1042.627, 'text': 'So obviously we need a better endgame strategy,', 'start': 1040.306, 'duration': 2.321}, {'end': 1048.074, 'text': "but let's say we call this version 1 of our Wordle solver and then we go and run some simulations to see how it does.", 'start': 1042.627, 'duration': 5.447}, {'end': 1054.11, 'text': "So the way this is working is it's playing every possible Wordle game.", 'start': 1050.669, 'duration': 3.441}, {'end': 1058.552, 'text': "It's going through all of those 2315 words that are the actual Wordle answers.", 'start': 1054.15, 'duration': 4.402}, {'end': 1060.473, 'text': "It's basically using that as a testing set.", 'start': 1058.692, 'duration': 1.781}, {'end': 1069.777, 'text': 'And with this naive method of not considering how common a word is and just trying to maximize the information at each step along the way until it gets down to one and only one choice.', 'start': 1061.273, 'duration': 8.504}, {'end': 1073.518, 'text': 'By the end of the simulation, the average score works out to be about 4.124, which..', 'start': 1070.437, 'duration': 3.081}, {'end': 1077.901, 'text': "You know, it's not bad, to be honest.", 'start': 1075.979, 'duration': 1.922}, {'end': 1079.183, 'text': 'I kind of expected to do worse.', 'start': 1077.941, 'duration': 1.242}, {'end': 1082.987, 'text': 'But the people who play Wordle will tell you that they can usually get it in 4.', 'start': 1079.663, 'duration': 3.324}, {'end': 1085.309, 'text': 'The real challenge is to get as many in 3 as you can.', 'start': 1082.987, 'duration': 2.322}, {'end': 1087.812, 'text': "It's a pretty big jump between the score of 4 and the score of 3.", 'start': 1085.509, 'duration': 2.303}, {'end': 1093.157, 'text': 'The obvious low-hanging fruit 
here is to somehow incorporate whether or not a word is common.', 'start': 1087.812, 'duration': 5.345}], 'summary': 'Version 1 of wordle solver achieves an average score of 4.124 in simulations, aiming to improve endgame strategy by considering word frequency.', 'duration': 59.679, 'max_score': 1033.478, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA1033478.jpg'}], 'start': 916.274, 'title': 'Developing a wordle solver through simulations', 'summary': 'Explains the process of developing a wordle solver through simulations, achieving an average score of 4.124, while emphasizing the challenge of achieving scores of 3 and the need to incorporate word commonality.', 'chapters': [{'end': 1093.157, 'start': 916.274, 'title': 'Wordle solver simulation', 'summary': 'Explains the process of developing a wordle solver through simulations, achieving an average score of 4.124, while emphasizing the challenge of achieving scores of 3 and the need to incorporate word commonality.', 'duration': 176.883, 'highlights': ['The chapter explains the process of developing a Wordle solver through simulations The speaker discusses the development of a Wordle solver by running simulations on all 2315 actual Wordle answers, achieving an average score of 4.124.', "Emphasizing the challenge of achieving scores of 3 The speaker mentions the difficulty in achieving Wordle scores of 3, as players usually aim for this target, indicating the challenge in improving the solver's performance.", "The need to incorporate word commonality The speaker acknowledges the need to factor in word commonality in the solver's strategy, suggesting it as an area for improvement to achieve higher scores in the Wordle game."]}], 'duration': 176.883, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA916274.jpg', 'highlights': ['The chapter explains the process of developing a Wordle solver through 
simulations, achieving an average score of 4.124.', 'Emphasizing the challenge of achieving scores of 3, as players usually aim for this target.', "The need to incorporate word commonality in the solver's strategy, suggesting it as an area for improvement."]}, {'end': 1392.071, 'segs': [{'end': 1138.504, 'src': 'embed', 'start': 1093.157, 'weight': 0, 'content': [{'end': 1094.499, 'text': 'and how exactly do we do that?', 'start': 1093.157, 'duration': 1.342}, {'end': 1107.826, 'text': 'The way I approached it is to get a list of the relative frequencies for all of the words in the English language.', 'start': 1103.103, 'duration': 4.723}, {'end': 1114.89, 'text': "And I just used Mathematica's word frequency data function, which itself pulls from the Google Books English Ngram public dataset.", 'start': 1108.426, 'duration': 6.464}, {'end': 1119.593, 'text': "And it's kind of fun to look at, for example, if we sort it from the most common words to the least common words.", 'start': 1115.37, 'duration': 4.223}, {'end': 1123.015, 'text': 'Evidently, these are the most common five-letter words in the English language.', 'start': 1120.073, 'duration': 2.942}, {'end': 1125.816, 'text': "Or rather, 'these' is the eighth most common.", 'start': 1123.695, 'duration': 2.121}, {'end': 1128.758, 'text': "First is 'which', after which there's 'there' and 'their'.", 'start': 1126.237, 'duration': 2.521}, {'end': 1133.901, 'text': "'First' itself is not first but ninth, and it makes sense that these other words could come about more often.", 'start': 1129.238, 'duration': 4.663}, {'end': 1138.504, 'text': "The words after 'first' are 'after', 'where', and 'those', being just a little bit less common.", 'start': 1134.461, 'duration': 4.043}], 'summary': "Analyzed English word frequencies using Mathematica's word frequency data function, derived from the Google Books English Ngram public dataset.", 'duration': 45.347, 'max_score': 1093.157, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA1093157.jpg'}, {'end': 1202.199, 'src': 'embed', 'start': 1179.631, 'weight': 3, 'content': [{'end': 1188.454, 'text': "the probability that I'm assigning to each word for being in the final list will be the value of the sigmoid function above wherever it sits on the x-axis.", 'start': 1179.631, 'duration': 8.823}, {'end': 1191.535, 'text': 'Now, obviously this depends on a few parameters.', 'start': 1189.475, 'duration': 2.06}, {'end': 1199.458, 'text': 'For example, how wide a space on the x-axis those words fill determines how gradually or steeply we drop off from 1 to 0,', 'start': 1191.775, 'duration': 7.683}, {'end': 1202.199, 'text': 'and where we situate them left to right determines the cutoff.', 'start': 1199.458, 'duration': 2.741}], 'summary': 'Assigning probabilities to words based on sigmoid function and parameters like x-axis space and cutoff.', 'duration': 22.568, 'max_score': 1179.631, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA1179631.jpg'}, {'end': 1250.461, 'src': 'embed', 'start': 1217.048, 'weight': 5, 'content': [{'end': 1223.811, 'text': 'Now, once we have a distribution like this across the words, it gives us another situation where entropy becomes this really useful measurement.', 'start': 1217.048, 'duration': 6.763}, {'end': 1229.634, 'text': "For example, let's say we were playing a game and we start with my old openers, which were other and nails,", 'start': 1224.412, 'duration': 5.222}, {'end': 1233.016, 'text': "and we end up with a situation where there's four possible words that match it.", 'start': 1229.634, 'duration': 3.382}, {'end': 1235.537, 'text': "And let's say we consider them all equally likely.", 'start': 1233.556, 'duration': 1.981}, {'end': 1238.879, 'text': 'Let me ask you what is the entropy of this distribution?', 'start': 1236.338, 'duration': 2.541}, {'end': 
1250.461, 'text': "Well, the information associated with each one of these possibilities is going to be the log base 2 of 4, since each one is 1 in 4, and that's 2.", 'start': 1241.055, 'duration': 9.406}], 'summary': 'Entropy measurement is useful in situations with equally likely possibilities, resulting in entropy of 2 bits.', 'duration': 33.413, 'max_score': 1217.048, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA1217048.jpg'}, {'end': 1350.156, 'src': 'embed', 'start': 1321.669, 'weight': 4, 'content': [{'end': 1324.652, 'text': 'We have these two distinct-feeling applications for entropy.', 'start': 1321.669, 'duration': 2.983}, {'end': 1329.578, 'text': "The first one telling us what's the expected information we'll get from a given guess,", 'start': 1325.253, 'duration': 4.325}, {'end': 1335.445, 'text': 'and the second one saying can we measure the remaining uncertainty among all of the words that we have possible?', 'start': 1329.578, 'duration': 5.867}, {'end': 1342.951, 'text': "And I should emphasize in that first case, where we're looking at the expected information of a guess once we have an unequal weighting to the words,", 'start': 1336.366, 'duration': 6.585}, {'end': 1344.512, 'text': 'that affects the entropy calculation.', 'start': 1342.951, 'duration': 1.561}, {'end': 1350.156, 'text': "For example, let me pull up that same case we were looking at earlier of the distribution associated with 'weary',", 'start': 1344.952, 'duration': 5.204}], 'summary': 'Applications for entropy: measuring expected information and remaining uncertainty among words with unequal weighting.', 'duration': 28.487, 'max_score': 1321.669, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA1321669.jpg'}], 'start': 1093.157, 'title': 'Word frequency analysis and word probability in wordle game', 'summary': "Discusses analyzing word frequencies in the English 
language using Mathematica's word frequency data function, sourced from the Google Books English Ngram public dataset, showcasing the most common five-letter words. It also explores the use of sigmoid function to assign probabilities to words, the impact of word distribution on entropy, and the application of entropy in measuring remaining uncertainty in the Wordle game.", 'chapters': [{'end': 1138.504, 'start': 1093.157, 'title': 'Word frequency analysis', 'summary': "Discusses analyzing word frequencies in the English language using Mathematica's word frequency data function, sourced from the Google Books English Ngram public dataset, showcasing the most common five-letter words.", 'duration': 45.347, 'highlights': ["The most common five-letter words in the English language are 'which,' 'there,' 'first,' 'after,' and 'where,' with 'which' being the most common.", "Mathematica's word frequency data function uses the Google Books English Ngram public dataset to analyze word frequencies in the English language.", 'The approach involves obtaining a list of relative frequencies for all words in the English language to analyze their frequency distribution.']}, {'end': 1392.071, 'start': 1139.224, 'title': 'Word probability and entropy in wordle game', 'summary': 'Discusses the use of sigmoid function to assign probabilities to words, the impact of word distribution on entropy, and the application of entropy in measuring remaining uncertainty in the Wordle game.', 'duration': 252.847, 'highlights': ['The use of sigmoid function to assign probabilities to words: The author uses the sigmoid function to assign probabilities to words, creating a binary cutoff for word selection based on the position of the word on the x-axis.', 'The impact of word distribution on entropy: The distribution of words across the x-axis determines the gradual or steep drop-off from 1 to 0, affecting the entropy measurement and the level of uncertainty in the final answer.', 'Application of entropy 
in measuring remaining uncertainty in Wordle game: Entropy is used to measure the remaining uncertainty among all possible words in the Wordle game, providing insight into the expected information from a given guess and the unequal weighting of words affecting the entropy calculation.']}], 'duration': 298.914, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA1093157.jpg', 'highlights': ["Mathematica's word frequency data function uses the Google Books English Ngram public dataset to analyze word frequencies in the English language.", "The most common five-letter words in the English language are 'which,' 'there,' 'first,' 'after,' and 'where,' with 'which' being the most common.", 'The approach involves obtaining a list of relative frequencies for all words in the English language to analyze their frequency distribution.', 'The use of sigmoid function to assign probabilities to words: The author uses the sigmoid function to assign probabilities to words, creating a binary cutoff for word selection based on the position of the word on the x-axis.', 'Application of entropy in measuring remaining uncertainty in Wordle game: Entropy is used to measure the remaining uncertainty among all possible words in the Wordle game, providing insight into the expected information from a given guess and the unequal weighting of words affecting the entropy calculation.', 'The impact of word distribution on entropy: The distribution of words across the x-axis determines the gradual or steep drop-off from 1 to 0, affecting the entropy measurement and the level of uncertainty in the final answer.']}], 'duration': 298.914, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA1093157.jpg', 'highlights': ["Mathematica's word frequency data function uses the Google Books English Ngram public dataset to analyze word frequencies in the English language.", "The most common five-letter words in the English language are 'which,' 'there,' 'first,' 'after,' and 'where,' with 'which' being the most common.", 'The approach involves obtaining a list of relative frequencies for all words in the English language to analyze their frequency distribution.', 'The use of sigmoid function to assign probabilities to words: The author uses the sigmoid function to assign probabilities to words, creating a binary cutoff for word selection based on the position of the word on the x-axis.', 'Application of entropy in measuring remaining uncertainty in Wordle game: Entropy is used to measure the remaining uncertainty among all possible words in the Wordle game, providing insight into the expected information from a given guess and the unequal weighting of words affecting the entropy calculation.', 'The impact of word distribution on entropy: The distribution of words across the x-axis determines the gradual or steep drop-off from 1 to 0, affecting the entropy measurement and the level of uncertainty in the final answer.']}, {'end': 1793.69, 'segs': [{'end': 1449.607, 'src': 'embed', 'start': 1411.207, 'weight': 2, 'content': [{'end': 1414.85, 'text': 'is now using the more refined distributions across the patterns.', 'start': 1411.207, 'duration': 3.643}, {'end': 1418.153, 'text': 'that incorporates the probability that a given word would actually be the 
answer.', 'start': 1414.85, 'duration': 3.303}, {'end': 1423.685, 'text': "As it happens, 'tares' is still number one, though the ones following are a bit different.", 'start': 1419.242, 'duration': 4.443}, {'end': 1430.168, 'text': "Second, when it ranks its top picks, it's now going to keep a model of the probability that each word is the actual answer,", 'start': 1424.225, 'duration': 5.943}, {'end': 1434.951, 'text': "and it'll incorporate that into its decision, which is easier to see once we have a few guesses on the table.", 'start': 1430.168, 'duration': 4.783}, {'end': 1439.701, 'text': "Again, ignoring its recommendation because we can't let machines rule our lives.", 'start': 1435.839, 'duration': 3.862}, {'end': 1445.585, 'text': 'And I suppose I should mention another thing different here is over on the left, that uncertainty value,', 'start': 1440.902, 'duration': 4.683}, {'end': 1449.607, 'text': 'that number of bits is no longer just redundant with the number of possible matches.', 'start': 1445.585, 'duration': 4.022}], 'summary': "Refined distributions used, 'tares' still top, incorporating word probability for decision making.", 'duration': 38.4, 'max_score': 1411.207, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA1411207.jpg'}, {'end': 1683.477, 'src': 'embed', 'start': 1659.941, 'weight': 0, 'content': [{'end': 1666.763, 'text': 'And remember the whole point of doing any of that is so that we can quantify this intuition that the more information we gain from a word,', 'start': 1659.941, 'duration': 6.822}, {'end': 1668.304, 'text': 'the lower the expected score will be.', 'start': 1666.763, 'duration': 1.541}, {'end': 1676.203, 'text': 'So, with this as version 2.0, if we go back and we run the same set of simulations, having it play against all 2,315 possible Wordle answers,', 'start': 1669.653, 'duration': 6.55}, {'end': 1676.564, 'text': 'how does it do?', 'start': 1676.203, 
'duration': 0.361}, {'end': 1683.477, 'text': "Well, in contrast to our first version, it's definitely better, which is reassuring.", 'start': 1680.316, 'duration': 3.161}], 'summary': 'Version 2.0 improves performance against 2,315 possible wordle answers.', 'duration': 23.536, 'max_score': 1659.941, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA1659941.jpg'}, {'end': 1725.603, 'src': 'embed', 'start': 1698.923, 'weight': 1, 'content': [{'end': 1702.863, 'text': 'So, can we do better than 3.6? We definitely can.', 'start': 1698.923, 'duration': 3.94}, {'end': 1709.35, 'text': "Now, I said at the start that it's most fun to try not incorporating the true list of Wordle answers into the way that it builds its model.", 'start': 1703.203, 'duration': 6.147}, {'end': 1715.117, 'text': 'But if we do incorporate it, the best performance I could get was around 3.43.', 'start': 1709.851, 'duration': 5.266}, {'end': 1720.8, 'text': 'So if we try to get more sophisticated than just using word frequency data to choose this prior distribution,', 'start': 1715.117, 'duration': 5.683}, {'end': 1725.603, 'text': 'this 3.43 probably gives a max at how good we could get with that, or at least how good I could get with that.', 'start': 1720.8, 'duration': 4.803}], 'summary': 'By incorporating the true list of wordle answers, the best performance achieved was around 3.43, indicating potential for improvement from the current 3.6.', 'duration': 26.68, 'max_score': 1698.923, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA1698923.jpg'}, {'end': 1765.695, 'src': 'embed', 'start': 1735.049, 'weight': 5, 'content': [{'end': 1739.912, 'text': "Originally I was planning on talking more about that, but I realize we've actually gone quite long as it is.", 'start': 1735.049, 'duration': 4.863}, {'end': 1747.139, 'text': "The one thing I'll say is, after doing this two-step search 
and then running a couple sample simulations in the top candidates so far for me at least,", 'start': 1740.512, 'duration': 6.627}, {'end': 1749.081, 'text': "it's looking like Crane is the best opener.", 'start': 1747.139, 'duration': 1.942}, {'end': 1750.022, 'text': 'Who would have guessed?', 'start': 1749.442, 'duration': 0.58}, {'end': 1757.753, 'text': 'Also, if you use the true word list to determine your space of possibilities, then the uncertainty you start with is a little over 11 bits.', 'start': 1750.891, 'duration': 6.862}, {'end': 1765.695, 'text': 'And it turns out, just from a brute force search, the maximum possible expected information after the first two guesses is around 10 bits.', 'start': 1758.213, 'duration': 7.482}], 'summary': 'After simulations, Crane is the best opener with an uncertainty of over 11 bits and a maximum expected information of around 10 bits.', 'duration': 30.646, 'max_score': 1735.049, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA1735049.jpg'}, {'end': 1786.225, 'src': 'heatmap', 'start': 1777.839, 'weight': 0.714, 'content': [{'end': 1784.404, 'text': "But I think it's fair and probably pretty conservative to say that you could never possibly write an algorithm that gets this average as low as 3,", 'start': 1777.839, 'duration': 6.565}, {'end': 1786.225, 'text': 'because with the words available to you,', 'start': 1784.404, 'duration': 1.821}], 'summary': "It's not possible to achieve an average as low as 3 with available words.", 'duration': 8.386, 'max_score': 1777.839, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA1777839.jpg'}], 'start': 1392.411, 'title': 'Wordlebot version 2 update and improved wordle strategy', 'summary': 'Introduces wordlebot version two, which improves decision-making and reduces uncertainty. 
it also discusses the development of a new wordle-playing algorithm, which has improved the average guesses to 3.6 and explores potential further refinements.', 'chapters': [{'end': 1486.071, 'start': 1392.411, 'title': 'Wordlebot version 2 update', 'summary': 'Introduces wordlebot version two, which incorporates more refined distributions across patterns and ranks its top picks based on the probability of each word being the actual answer, improving decision-making and reducing uncertainty.', 'duration': 93.66, 'highlights': ['The Wordlebot version two incorporates more refined distributions across patterns and ranks its top picks based on the probability of each word being the actual answer, improving decision-making.', 'The uncertainty value, representing the number of bits, is now more accurate and not just redundant with the number of possible matches, reducing uncertainty in decision-making.', 'The computation of entropies and expected values of information now uses more refined distributions across patterns, incorporating the probability that a given word would actually be the answer.']}, {'end': 1793.69, 'start': 1488.595, 'title': 'Improving wordle strategy', 'summary': 'Discusses the development of a new version of a wordle-playing algorithm, highlighting its improvement to an average of 3.6 guesses, and exploring the potential for further refinement.', 'duration': 305.095, 'highlights': ['The new Wordle-playing algorithm version 2.0 has improved to an average of 3.6 guesses, with a maximum of more than 6 in certain circumstances, showcasing its enhanced performance compared to the previous version.', "Incorporating the true word list into the algorithm's model could potentially lead to a further improvement, with the best performance reaching an average of 3.43 guesses.", "The algorithm uses a two-step search for expected information, leading to the identification of 'Crane' as the best opener, and suggests that with optimal play, there could be around 
one bit of uncertainty left after the first two guesses."]}], 'duration': 401.279, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/v68zYyaEmEA/pics/v68zYyaEmEA1392411.jpg', 'highlights': ['The new Wordle-playing algorithm version 2.0 has improved to an average of 3.6 guesses, with a maximum of more than 6 in certain circumstances, showcasing its enhanced performance compared to the previous version.', "Incorporating the true word list into the algorithm's model could potentially lead to a further improvement, with the best performance reaching an average of 3.43 guesses.", 'The Wordlebot version two incorporates more refined distributions across patterns and ranks its top picks based on the probability of each word being the actual answer, improving decision-making.', 'The uncertainty value, representing the number of bits, is now more accurate and not just redundant with the number of possible matches, reducing uncertainty in decision-making.', 'The computation of entropies and expected values of information now uses more refined distributions across patterns, incorporating the probability that a given word would actually be the answer.', "The algorithm uses a two-step search for expected information, leading to the identification of 'Crane' as the best opener, and suggests that with optimal play, there could be around one bit of uncertainty left after the first two guesses."]}], 'highlights': ['The algorithm successfully guesses the mystery word in three attempts, considered a great achievement in the game Wordle', 'The game Wordle can be used as a central example in a lesson about information theory and entropy', 'Creating an algorithm to play the game optimally resulting in successfully guessing the mystery word in three attempts', 'The new Wordle-playing algorithm version 2.0 has improved to an average of 3.6 guesses, with a maximum of more than 6 in certain circumstances, showcasing its enhanced performance compared to the 
previous version', "Incorporating the true word list into the algorithm's model could potentially lead to a further improvement, with the best performance reaching an average of 3.43 guesses", 'The Wordlebot version two incorporates more refined distributions across patterns and ranks its top picks based on the probability of each word being the actual answer, improving decision-making', "The average information from the first guess in the game Wordle is about 4.9 bits for the word 'Weary' and about 5.8 bits for the word 'Slate'", 'The chapter explains the process of developing a Wordle solver through simulations, achieving an average score of 4.124', 'The game is set up with a list of 13,000 valid guesses and around 2,300 possible answers, which are common words and a human-curated list', 'Using a universal measurement for informativeness allows flexibility with different probabilities assigned to each of the 13,000 words, enabling a more objective and adaptable scoring system', 'The Wordlebot uses entropy to calculate the highest expected information for making guesses in the game Wordle', 'The approach involves obtaining a list of relative frequencies for all words in the English language to analyze their frequency distribution', "The most common five-letter words in the English language are 'which,' 'there,' 'first,' 'after,' and 'where,' with 'which' being the most common", "The need to incorporate word commonality in the solver's strategy, suggesting it as an area for improvement", 'Entropy is used to measure the remaining uncertainty among all possible words in the Wordle game, providing insight into the expected information from a given guess and the unequal weighting of words affecting the entropy calculation']}
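The entropy calculation in the transcript (four equally likely remaining words, each carrying log base 2 of 4 = 2 bits) can be sketched in a few lines of Python. This is an illustrative sketch only, not the code from the video's linked repository; the function name `entropy` is our own.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Four equally likely remaining words: each outcome carries
# log2(4) = 2 bits of information, so the entropy is 2 bits.
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0

# An unequal weighting (one word far more plausible than the rest)
# leaves less remaining uncertainty than the uniform case.
print(entropy([0.85, 0.05, 0.05, 0.05]))
```

This is the "remaining uncertainty" measurement the transcript describes: once the words carry unequal probabilities of being the answer, the entropy drops below the log of the count of matches.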
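The sigmoid prior described around 18:15 — placing frequency-ranked words on the x-axis of a sigmoid so common words get probability near 1 and rare ones near 0 — might be sketched as below. The parameter names `cutoff_rank` and `width`, and the values in the example, are hypothetical illustrative choices, not the ones used in the video.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def word_priors(words_by_frequency, cutoff_rank, width):
    """Assign each word a prior probability of being a potential answer.

    `words_by_frequency` is assumed sorted from most to least common.
    A word's rank is placed on the sigmoid's x-axis: ranks well before
    `cutoff_rank` map near 1, ranks well after it map near 0, and
    `width` controls how gradually or steeply the drop-off happens.
    """
    return {
        word: sigmoid((cutoff_rank - rank) / width)
        for rank, word in enumerate(words_by_frequency)
    }

# Toy example: the word sitting exactly at the cutoff gets probability 0.5.
priors = word_priors(["which", "there", "crane", "tares", "soare"],
                     cutoff_rank=2, width=1.0)
```

Shrinking `width` pushes this toward the binary cutoff mentioned in the highlights; widening it models the fuzzier judgment that mid-frequency words are only somewhat likely to be answers.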
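The "expected information of a guess" that both solver versions maximize is the entropy of the distribution over colour patterns, with each candidate answer weighted by its prior probability, as the transcript emphasizes for the unequal-weighting case. A minimal sketch, again not the video's actual code (that is linked in the description); `wordle_pattern` and `expected_information` are our own names:

```python
import math
from collections import Counter

def wordle_pattern(guess, answer):
    """Colour pattern for a guess: 2 = green, 1 = yellow, 0 = grey.
    Greens are marked first; leftover answer letters are then consumed
    left to right to mark yellows, so repeated letters are handled."""
    pattern = [0] * len(guess)
    unused = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = 2
        else:
            unused[a] += 1
    for i, g in enumerate(guess):
        if pattern[i] == 0 and unused[g] > 0:
            pattern[i] = 1
            unused[g] -= 1
    return tuple(pattern)

def expected_information(guess, candidates, priors):
    """Expected bits gained from `guess`: the entropy of the induced
    distribution over patterns, weighting each candidate answer by its
    (normalized) prior probability of being the answer."""
    total = sum(priors[w] for w in candidates)
    buckets = Counter()
    for w in candidates:
        buckets[wordle_pattern(guess, w)] += priors[w] / total
    return -sum(p * math.log2(p) for p in buckets.values())
```

If a guess splits four equally likely candidates into four distinct patterns, this returns exactly 2 bits, matching the uniform-entropy example; skewed priors lower the value, which is why version 2's rankings differ from version 1's.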