title
What are regular expressions?

description
Google Python Class Day 2 Part 1: Regular Expressions. By Nick Parlante. Support materials and exercises: http://code.google.com/edu/languages/google-python-class

detail
{'title': 'What are regular expressions?', 'heatmap': [{'end': 781.236, 'start': 704.518, 'weight': 0.734}, {'end': 2272.524, 'start': 2213.939, 'weight': 0.826}], 'summary': 'Covers python regular expressions, including their integration with python, basic rules, usage, email extraction, python regular expression functions, and baby name analysis with python, providing comprehensive insights and practical applications.', 'chapters': [{'end': 469.248, 'segs': [{'end': 46.766, 'src': 'embed', 'start': 16.656, 'weight': 0, 'content': [{'end': 21.598, 'text': "And I'm not going to show you all of regular expressions, I'm going to show you like just enough for us to get some useful stuff done.", 'start': 16.656, 'duration': 4.942}, {'end': 26, 'text': "But regular expressions are a very powerful combination with Python, there's a nice integration there, so I want to show you that.", 'start': 21.638, 'duration': 4.362}, {'end': 33.067, 'text': 'Also the exercises later today will of course you know, have little elements which are solved nicely with Python regular expressions.', 'start': 26.3, 'duration': 6.767}, {'end': 38.777, 'text': 'Just as a, regular expressions is sort of a good news, bad news situation.', 'start': 34.029, 'duration': 4.748}, {'end': 46.766, 'text': 'They, regular expressions are very, I mean you could use the word powerful, but I also like to use the word very dense.', 'start': 41.604, 'duration': 5.162}], 'summary': "Python's integration with regular expressions is powerful and useful for solving exercises.", 'duration': 30.11, 'max_score': 16.656, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe416656.jpg'}, {'end': 95.842, 'src': 'embed', 'start': 65.093, 'weight': 4, 'content': [{'end': 66.894, 'text': "we're gonna sort of touch into a little bit of that power.", 'start': 65.093, 'duration': 1.801}, {'end': 74.122, 'text': "So, one word of warning, when messing with regular expressions, 
it's the, I tend, I try to move a little slowly.", 'start': 68.275, 'duration': 5.847}, {'end': 77.406, 'text': "Like they're very powerful, they're a little tricky, so I'm going to try to be careful.", 'start': 74.522, 'duration': 2.884}, {'end': 81.291, 'text': "And for, you know, today's discussion, like I'm going to show you just sort of basic stuff.", 'start': 77.866, 'duration': 3.425}, {'end': 85.636, 'text': "And if you're extremely familiar with regular expressions, well, you know, please just bear with me for a little bit.", 'start': 82.212, 'duration': 3.424}, {'end': 86.877, 'text': "We're not going to do this for too long.", 'start': 85.676, 'duration': 1.201}, {'end': 89.639, 'text': "And obviously, really, I'm going to emphasize Python.", 'start': 87.518, 'duration': 2.121}, {'end': 90.9, 'text': "That's the bad news.", 'start': 90.44, 'duration': 0.46}, {'end': 95.842, 'text': "The good news is also on all the exercises we're going to do later today.", 'start': 91.2, 'duration': 4.642}], 'summary': 'The discussion will cover basic regular expressions and emphasize python, with exercises included.', 'duration': 30.749, 'max_score': 65.093, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe465093.jpg'}, {'end': 122.594, 'src': 'embed', 'start': 95.842, 'weight': 6, 'content': [{'end': 103.226, 'text': "in case I forget to mention if there's a regular expression component at the very end, printed in little teeny print,", 'start': 95.842, 'duration': 7.384}, {'end': 105.387, 'text': 'I put what the regular expression solution is.', 'start': 103.226, 'duration': 2.161}, {'end': 108.929, 'text': "So it's kind of like, you can sort of flip to the back and get the answer if you're struggling with that part of it.", 'start': 105.407, 'duration': 3.522}, {'end': 110.37, 'text': 'Because really, it is a Python class.', 'start': 108.949, 'duration': 1.421}, {'end': 112.311, 'text': "So I don't want you to 
block on the regular expressions too much.", 'start': 110.39, 'duration': 1.921}, {'end': 113.591, 'text': 'All right.', 'start': 112.331, 'duration': 1.26}, {'end': 116.093, 'text': "So with that introduction, I'm going to start talking about how these things work.", 'start': 113.651, 'duration': 2.442}, {'end': 122.594, 'text': 'But first, I have to tell you a joke, which will appear later.', 'start': 117.029, 'duration': 5.565}], 'summary': 'Python class with regular expression component, solution printed at the end for struggling students.', 'duration': 26.752, 'max_score': 95.842, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe495842.jpg'}, {'end': 202.1, 'src': 'embed', 'start': 143.459, 'weight': 2, 'content': [{'end': 144.2, 'text': "So I'm just going to import that.", 'start': 143.459, 'duration': 0.741}, {'end': 147.242, 'text': "And I'm going to do a lot of stuff here in the interpreter.", 'start': 144.22, 'duration': 3.022}, {'end': 148.062, 'text': "I'm going to sort of build this up.", 'start': 147.262, 'duration': 0.8}, {'end': 155.267, 'text': "So the basic idea with regular expressions is they're a way of searching for a pattern inside of a larger text.", 'start': 148.503, 'duration': 6.764}, {'end': 158.33, 'text': 'So very much like, you know, search in Microsoft Word or wherever.', 'start': 155.488, 'duration': 2.842}, {'end': 163.353, 'text': "You have the little pattern you're looking for and it's going to look over this huge text and find the first instance of that pattern.", 'start': 158.35, 'duration': 5.003}, {'end': 166.095, 'text': "But it's this whole language where the patterns can be very powerful.", 'start': 163.994, 'duration': 2.101}, {'end': 175.292, 'text': 'So the way this works in Python, the simplest way is there is a function inside of re called search.', 'start': 166.615, 'duration': 8.677}, {'end': 177.998, 'text': "And I'll sort of spec this out.", 'start': 176.575, 
'duration': 1.423}, {'end': 180.122, 'text': "It's going to work basically this way.", 'start': 178.9, 'duration': 1.222}, {'end': 184.632, 'text': "where the first argument to search is the pattern, which I'm going to talk about a lot.", 'start': 181.291, 'duration': 3.341}, {'end': 187.274, 'text': 'The second argument is just kind of whatever text I want to search.', 'start': 184.933, 'duration': 2.341}, {'end': 194.177, 'text': 'And what it returns is actually not a Boolean, not text, but a match object.', 'start': 187.574, 'duration': 6.603}, {'end': 195.397, 'text': "So here I'll write this as match.", 'start': 194.197, 'duration': 1.2}, {'end': 200.719, 'text': "And then the match object will indicate, it'll show us a bunch of things about the found text.", 'start': 195.817, 'duration': 4.902}, {'end': 202.1, 'text': 'So let me do an example.', 'start': 201.14, 'duration': 0.96}], 'summary': "Regular expressions in python help search for patterns in text using the 'search' function inside the 're' module.", 'duration': 58.641, 'max_score': 143.459, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe4143459.jpg'}, {'end': 294.762, 'src': 'embed', 'start': 230.898, 'weight': 5, 'content': [{'end': 237.518, 'text': "So if I just type match, I mean, it's not really going to print, but it'll say, well, that's some kind of Python object.", 'start': 230.898, 'duration': 6.62}, {'end': 242, 'text': 'So it turns out for the first 20 minutes this morning.', 'start': 238.899, 'duration': 3.101}, {'end': 246.863, 'text': 'the only thing you need to know about match is that it has this response to this method called group.', 'start': 242, 'duration': 4.863}, {'end': 251.706, 'text': "If you call group on it, it shows you, here's what the matching text was.", 'start': 247.524, 'duration': 4.182}, {'end': 256.111, 'text': 'This is our first example of a regular expression.', 'start': 254.67, 'duration': 1.441}, {'end': 
265.718, 'text': 'And the simplest case in a regular expression is like the iig here, is that a character like i or g or something like that matches itself.', 'start': 256.13, 'duration': 9.588}, {'end': 268.24, 'text': 'So the lowercase i matches the lowercase i, whatever.', 'start': 265.898, 'duration': 2.342}, {'end': 271.323, 'text': "Now I'm going to build up the vocabulary to have a lot more complicated matches.", 'start': 268.26, 'duration': 3.063}, {'end': 274.165, 'text': "But that's just characters matching themselves is the simplest case.", 'start': 271.343, 'duration': 2.822}, {'end': 276.335, 'text': 'All right, another thing to point out here.', 'start': 275.435, 'duration': 0.9}, {'end': 277.916, 'text': 'So this match was successful.', 'start': 276.695, 'duration': 1.221}, {'end': 279.616, 'text': "I'm going to do one that's not successful.", 'start': 278.036, 'duration': 1.58}, {'end': 284.278, 'text': "So like let's say we're looking for the pattern IGS.", 'start': 280.417, 'duration': 3.861}, {'end': 287.139, 'text': "And that pattern, it just doesn't appear in there.", 'start': 285.359, 'duration': 1.78}, {'end': 291.281, 'text': "So if I run that, and then I look at the match object, it's none.", 'start': 287.179, 'duration': 4.102}, {'end': 294.762, 'text': "Now, in the interpreter, none just prints as nothing, but it's just not there.", 'start': 292.321, 'duration': 2.441}], 'summary': 'Introduction to regular expressions and their basic functionality explained, including successful and unsuccessful matches.', 'duration': 63.864, 'max_score': 230.898, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe4230898.jpg'}, {'end': 348.703, 'src': 'embed', 'start': 319.781, 'weight': 10, 'content': [{'end': 320.902, 'text': 'then we found that we can look at the group.', 'start': 319.781, 'duration': 1.121}, {'end': 322.243, 'text': "Otherwise, it's not there.", 'start': 321.302, 'duration': 
0.941}, {'end': 324.124, 'text': "Now what I'm going to do.", 'start': 322.463, 'duration': 1.661}, {'end': 328.108, 'text': "I'm actually going to write, just I'm going to def a little, find function just here in the interpreter,", 'start': 324.124, 'duration': 3.984}, {'end': 330.51, 'text': "just because I'm going to do so many regular expression searches today.", 'start': 328.108, 'duration': 2.402}, {'end': 331.711, 'text': 'I just want to encapsulate that behavior.', 'start': 330.51, 'duration': 1.201}, {'end': 336.775, 'text': "So what I'm going to show you here is sort of the prototypical use of re.search, and then I'll just use it for half an hour.", 'start': 331.731, 'duration': 5.044}, {'end': 341.478, 'text': "So I'm going to say, I'm just going to call this thing a find, and it'll take a pattern and some text.", 'start': 336.795, 'duration': 4.683}, {'end': 342.679, 'text': 'This is a little weird.', 'start': 342.159, 'duration': 0.52}, {'end': 344.04, 'text': "I'm doing this in the interpreter, but this works.", 'start': 342.699, 'duration': 1.341}, {'end': 346.301, 'text': "I'm going to type a colon, and I hit Return.", 'start': 344.06, 'duration': 2.241}, {'end': 348.703, 'text': "And now the interpreter is saying OK, well, what's the next line?", 'start': 346.782, 'duration': 1.921}], 'summary': 'Demonstrating the use of re.search for regular expression searches in the interpreter.', 'duration': 28.922, 'max_score': 319.781, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe4319781.jpg'}], 'start': 0.209, 'title': 'Python regular expressions', 'summary': 'Covers the integration of regular expressions with python, emphasizing their power and density, with a cautionary note on their complexity, and a promise of python-focused exercises with solutions provided. 
It also introduces the basics of using regular expressions in Python, including the concept of patterns, the search function, and match objects, with a simple example of finding a pattern within a text. Additionally, the chapter explains the use of re.search in Python, demonstrating successful and unsuccessful matches, emphasizing the importance of checking for a match before accessing the group, and encapsulating the behavior in a find function.', 'chapters': [{'end': 112.311, 'start': 0.209, 'title': 'Python and regular expressions', 'summary': 'Covers the integration of regular expressions with Python, emphasizing their power and density, with a cautionary note on their complexity, and a promise of Python-focused exercises with solutions provided.', 'duration': 112.102, 'highlights': ['Regular expressions are a powerful combination with Python', 'They are very dense and require careful handling', 'Exercises later today will involve Python regular expressions', 'A cautionary note is given about the complexity and trickiness of regular expressions', 'Solutions for regular expression components of exercises will be provided']}, {'end': 274.165, 'start': 112.331, 'title': 'Python regular expressions introduction', 'summary': 'Introduces the basics of using regular expressions in Python, including the concept of patterns, the search function, and match objects, with a simple example of finding a pattern within a text.', 'duration': 161.834, 'highlights': ['Regular expressions in Python are a way of searching for a pattern inside of a larger text.', 'The search function in the re module is used to find the first instance of a pattern within a text.', 'The match object returned by the search function provides information about the found text, such as the matching text.', "A simple example of a regular expression is matching characters like 'i' or 'g' to themselves."]}, {'end': 469.248, 'start': 275.435, 'title': 'Using re.search in python', 'summary': 'Explains the use 
of re.search in python, demonstrating a successful match and an unsuccessful match, emphasizing the importance of checking for a match before accessing the group, and encapsulating the behavior in a find function.', 'duration': 193.813, 'highlights': ['Emphasizing the importance of checking for a match before accessing the group', 'Demonstrating a successful match and an unsuccessful match', 'Encapsulating the behavior in a find function']}], 'duration': 469.039, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe4209.jpg', 'highlights': ['Regular expressions are a powerful combination with Python', 'Exercises later today will involve Python regular expressions', 'Regular expressions in Python are a way of searching for a pattern inside of a larger text', 'The search function in the re module is used to find the first instance of a pattern within a text', 'A cautionary note is given about the complexity and trickiness of regular expressions', 'Emphasizing the importance of checking for a match before accessing the group', 'Solutions for regular expression components of exercises will be provided', "A simple example of a regular expression is matching characters like 'i' or 'g' to themselves", 'Demonstrating a successful match and an unsuccessful match', 'The match object returned by the search function provides information about the found text, such as the matching text', 'Encapsulating the behavior in a find function', 'They are very dense and require careful handling']}, {'end': 1079.135, 'segs': [{'end': 526.375, 'src': 'embed', 'start': 495.43, 'weight': 4, 'content': [{'end': 497.852, 'text': 'Special characters, I made a little table up here.', 'start': 495.43, 'duration': 2.422}, {'end': 500.622, 'text': 'The dot Very special.', 'start': 498.752, 'duration': 1.87}, {'end': 502.103, 'text': 'Dot matches any character.', 'start': 500.682, 'duration': 1.421}, {'end': 505.084, 'text': 'It means you did 
anything, except it does not match new line.', 'start': 502.123, 'duration': 2.961}, {'end': 515.57, 'text': "So I could have said, well, I'm looking for, let's say, any three characters and then a G.", 'start': 505.925, 'duration': 9.645}, {'end': 517.15, 'text': "That's the pattern I'm looking for.", 'start': 515.57, 'duration': 1.58}, {'end': 519.792, 'text': "And so in this case, that's going to find pig.", 'start': 517.691, 'duration': 2.101}, {'end': 526.375, 'text': 'So you can get a little bit of a sense of how this is going to be more powerful than just regular Microsoft Word search.', 'start': 520.412, 'duration': 5.963}], 'summary': 'Using dot in regular expressions can match any character except newline, making it more powerful than regular text search.', 'duration': 30.945, 'max_score': 495.43, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe4495430.jpg'}, {'end': 598.425, 'src': 'embed', 'start': 569.174, 'weight': 5, 'content': [{'end': 571.716, 'text': 'Whatever I could use a little bit of it at the end of the start or whatever.', 'start': 569.174, 'duration': 2.542}, {'end': 573.897, 'text': 'So that is a fundamental asymmetry.', 'start': 572.276, 'duration': 1.621}, {'end': 578.658, 'text': "The other thing that's going to happen here is that the search is going to go left to right.", 'start': 574.937, 'duration': 3.721}, {'end': 581.899, 'text': "And it's satisfied as soon as it finds a solution.", 'start': 578.698, 'duration': 3.201}, {'end': 584.94, 'text': "So we could make up a case where there's maybe multiple solutions.", 'start': 582.439, 'duration': 2.501}, {'end': 588.722, 'text': "Say, for example, I'm looking for dot dot g.", 'start': 584.96, 'duration': 3.762}, {'end': 594.143, 'text': "And then I'll make this like, oh, here's a much better solution, x, y, z, g.", 'start': 588.722, 'duration': 5.421}, {'end': 598.425, 'text': "And what's going to happen is it's just not finding 
that second one.", 'start': 594.143, 'duration': 4.282}], 'summary': 'Discussion on asymmetry in search algorithms and multiple solutions.', 'duration': 29.251, 'max_score': 569.174, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe4569174.jpg'}, {'end': 643.444, 'src': 'embed', 'start': 615.942, 'weight': 6, 'content': [{'end': 621.187, 'text': "Now, the regular expression engine, without getting into too much detail, it finds all the things it's supposed to.", 'start': 615.942, 'duration': 5.245}, {'end': 621.908, 'text': "And it's smart.", 'start': 621.247, 'duration': 0.661}, {'end': 622.408, 'text': "It'll backtrack.", 'start': 621.968, 'duration': 0.44}, {'end': 633.597, 'text': "So for example, what if I said, well, I'm looking for dot dot g, and then I insist that there's an s.", 'start': 624.971, 'duration': 8.626}, {'end': 636.099, 'text': "And here, I'll go fix this one to have an s here.", 'start': 633.597, 'duration': 2.502}, {'end': 639.001, 'text': 'So you can imagine, so that succeeds.', 'start': 637.62, 'duration': 1.381}, {'end': 643.444, 'text': "So you can imagine it maybe tries to make this one work, and it doesn't work.", 'start': 639.021, 'duration': 4.423}], 'summary': 'The regular expression engine finds all occurrences, is smart, and backtracks when necessary.', 'duration': 27.502, 'max_score': 615.942, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe4615942.jpg'}, {'end': 692.672, 'src': 'embed', 'start': 665.127, 'weight': 7, 'content': [{'end': 670.011, 'text': 'This one? One more.', 'start': 665.127, 'duration': 4.884}, {'end': 670.532, 'text': 'This one.', 'start': 670.191, 'duration': 0.341}, {'end': 672.674, 'text': 'That one succeeded.', 'start': 671.933, 'duration': 0.741}, {'end': 674.816, 'text': 'The second piece.', 'start': 672.874, 'duration': 1.942}, {'end': 681.686, 'text': "Oh, why didn't it find that? 
OK, yeah, so what it does is it goes left to right.", 'start': 675.176, 'duration': 6.51}, {'end': 684.607, 'text': "And once it finds a solution, it's like, OK, I'm done.", 'start': 682.166, 'duration': 2.441}, {'end': 685.628, 'text': "It just doesn't try anymore.", 'start': 684.647, 'duration': 0.981}, {'end': 686.368, 'text': 'Yeah, question?', 'start': 685.988, 'duration': 0.38}, {'end': 692.672, 'text': 'If you were actually looking for the period character, would you just need to escape it with a??', 'start': 686.388, 'duration': 6.284}], 'summary': 'Discussion about finding a solution, stopping after finding one, and seeking the period character.', 'duration': 27.545, 'max_score': 665.127, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe4665127.jpg'}, {'end': 781.236, 'src': 'heatmap', 'start': 704.518, 'weight': 0.734, 'content': [{'end': 707.301, 'text': 'And you can always put a backslash in.', 'start': 704.518, 'duration': 2.783}, {'end': 710.225, 'text': 'And that inhibits the specialness of a character.', 'start': 707.782, 'duration': 2.443}, {'end': 712.868, 'text': 'So I could look for C dot G.', 'start': 710.265, 'duration': 2.603}, {'end': 715.651, 'text': 'Or I could look for C dot backslash dot L there.', 'start': 712.868, 'duration': 2.783}, {'end': 719.936, 'text': "Now, I'm going to introduce a slight extra bit of syntax here, which Python has.", 'start': 716.031, 'duration': 3.905}, {'end': 723.581, 'text': "which is where it's a little troubling.", 'start': 721.598, 'duration': 1.983}, {'end': 726.424, 'text': 'Like the backslash, it could be interpreted at different levels.', 'start': 724.101, 'duration': 2.323}, {'end': 730.93, 'text': 'Like maybe Python or like in Java, it might get taken out by the language.', 'start': 726.444, 'duration': 4.486}, {'end': 737.579, 'text': "So without getting into too much detail, I'm just going to say Python has an option called a raw string.", 
'start': 730.95, 'duration': 6.629}, {'end': 742.019, 'text': 'where you put a lowercase r to the left of the leading quote.', 'start': 738.456, 'duration': 3.563}, {'end': 748.105, 'text': 'And what the lowercase r means, it says, do not do any special processing with backslashes.', 'start': 742.68, 'duration': 5.425}, {'end': 752.509, 'text': 'Whatever I type, just send it through absolutely raw and uninterpreted.', 'start': 748.666, 'duration': 3.843}, {'end': 754.89, 'text': 'This feature.', 'start': 754.37, 'duration': 0.52}, {'end': 756.751, 'text': "I mean it's a little bit obscure,", 'start': 754.89, 'duration': 1.861}, {'end': 762.492, 'text': 'but it happens to be very useful for writing regular expressions because it frees us from having to worry about layers of backslash processing.', 'start': 756.751, 'duration': 5.741}, {'end': 768.193, 'text': "So in fact, even though I've done my example so far without the R, I'm just going to use the lowercase R for all of my examples from here on out.", 'start': 762.772, 'duration': 5.421}, {'end': 769.553, 'text': "So I just don't have to think about it.", 'start': 768.313, 'duration': 1.24}, {'end': 771.814, 'text': "So in this case, let's just try it.", 'start': 770.574, 'duration': 1.24}, {'end': 777.055, 'text': "Yeah, so then it's able to find this, you know, so it's matching that.", 'start': 772.714, 'duration': 4.341}, {'end': 778.215, 'text': "So that's how I'm able to put the dot.", 'start': 777.115, 'duration': 1.1}, {'end': 781.236, 'text': "You know, that's how I'm able to talk about a dot explicitly.", 'start': 779.156, 'duration': 2.08}], 'summary': 'Python has a raw string option (lowercase r) which allows for uninterpreted backslashes, making it useful for writing regular expressions.', 'duration': 76.718, 'max_score': 704.518, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe4704518.jpg'}, {'end': 768.193, 'src': 'embed', 'start': 742.68, 
'weight': 0, 'content': [{'end': 748.105, 'text': 'And what the lowercase r means, it says, do not do any special processing with backslashes.', 'start': 742.68, 'duration': 5.425}, {'end': 752.509, 'text': 'Whatever I type, just send it through absolutely raw and uninterpreted.', 'start': 748.666, 'duration': 3.843}, {'end': 754.89, 'text': 'This feature.', 'start': 754.37, 'duration': 0.52}, {'end': 756.751, 'text': "I mean it's a little bit obscure,", 'start': 754.89, 'duration': 1.861}, {'end': 762.492, 'text': 'but it happens to be very useful for writing regular expressions because it frees us from having to worry about layers of backslash processing.', 'start': 756.751, 'duration': 5.741}, {'end': 768.193, 'text': "So in fact, even though I've done my example so far without the R, I'm just going to use the lowercase R for all of my examples from here on out.", 'start': 762.772, 'duration': 5.421}], 'summary': 'Using lowercase r in regular expressions allows raw, uninterpreted input, freeing from backslash processing.', 'duration': 25.513, 'max_score': 742.68, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe4742680.jpg'}, {'end': 837.083, 'src': 'embed', 'start': 804.676, 'weight': 2, 'content': [{'end': 806.117, 'text': "and then there's some more text.", 'start': 804.676, 'duration': 1.441}, {'end': 808.298, 'text': "So let's say I want to pick that part out.", 'start': 806.417, 'duration': 1.881}, {'end': 818.111, 'text': "So, the next sort of regular expression code I'm going to talk about is backslash W.", 'start': 811.446, 'duration': 6.665}, {'end': 824.675, 'text': 'So, backslash W, which actually I have up here, backslash W matches what you would call a word character.', 'start': 818.111, 'duration': 6.564}, {'end': 828.378, 'text': 'So, that means a letter or a digit, and I think it also includes underbar.', 'start': 824.735, 'duration': 3.643}, {'end': 837.083, 'text': "So, in this case, I'm 
going to say, well, let's say I'm looking for a colon followed by three word characters.", 'start': 830.119, 'duration': 6.964}], 'summary': 'Discussion of backslash w in regular expressions, matching word characters, including letters, digits, and underbars.', 'duration': 32.407, 'max_score': 804.676, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe4804676.jpg'}, {'end': 987.589, 'src': 'embed', 'start': 956.318, 'weight': 3, 'content': [{'end': 959.04, 'text': "So suppose I'm looking for this pattern.", 'start': 956.318, 'duration': 2.722}, {'end': 963.303, 'text': "It's like, well, I want some digits, and they're separated by spaces.", 'start': 959.6, 'duration': 3.703}, {'end': 970.206, 'text': 'So the simplest way you do that is a backslash s represents a whitespace character.', 'start': 964.642, 'duration': 5.564}, {'end': 975.23, 'text': 'And the backslash s is smart that it knows about space, tab, new line.', 'start': 970.686, 'duration': 4.544}, {'end': 977.171, 'text': 'Those all count as a whitespace character.', 'start': 975.25, 'duration': 1.921}, {'end': 980.053, 'text': 'It knows about the whole sort of space of whitespace characters.', 'start': 977.191, 'duration': 2.862}, {'end': 982.575, 'text': "So hopefully that'll work so that that finds it.", 'start': 980.233, 'duration': 2.342}, {'end': 987.589, 'text': 'Yeah, so the question is if you had two spaces.', 'start': 985.668, 'duration': 1.921}], 'summary': 'Using backslash s to represent whitespace characters simplifies pattern matching.', 'duration': 31.271, 'max_score': 956.318, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe4956318.jpg'}, {'end': 1056.837, 'src': 'embed', 'start': 1028.645, 'weight': 1, 'content': [{'end': 1030.747, 'text': 'And that means, yeah, one or more.', 'start': 1028.645, 'duration': 2.102}, {'end': 1031.867, 'text': 'That element repeats.', 'start': 
1031.047, 'duration': 0.82}, {'end': 1033.089, 'text': "There's just one or more of those.", 'start': 1032.008, 'duration': 1.081}, {'end': 1035.412, 'text': "And I'll do it with this one as well.", 'start': 1034.21, 'duration': 1.202}, {'end': 1039.075, 'text': 'Oops So if I hit return there.', 'start': 1036.772, 'duration': 2.303}, {'end': 1040.257, 'text': 'So now that matches.', 'start': 1039.316, 'duration': 0.941}, {'end': 1045.54, 'text': "So adding the plus and the star, and I'll do a bunch of examples with these.", 'start': 1042.037, 'duration': 3.503}, {'end': 1051.107, 'text': 'but this makes the language really exactly what we want to start matching more complicated patterns.', 'start': 1045.54, 'duration': 5.567}, {'end': 1056.837, 'text': 'Also, remember how I was saying how per character regular expressions.', 'start': 1051.387, 'duration': 5.45}], 'summary': 'The session covers using plus and star for matching one or more elements in regular expressions.', 'duration': 28.192, 'max_score': 1028.645, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe41028645.jpg'}], 'start': 471.653, 'title': 'Regular expressions in python', 'summary': 'Discusses basic rules of regular expressions, including matching of simple and special characters, asymmetry of pattern matching, left-to-right search behavior, backtracking, and use of escape characters. it also covers the syntax of regular expressions in python, including the use of raw strings, character matching, whitespace representation, and repetition modifiers like plus and star.', 'chapters': [{'end': 692.672, 'start': 471.653, 'title': 'Regular expressions: matching rules and patterns', 'summary': 'Discusses the basic rules of regular expressions, including the matching of simple and special characters. 
It emphasizes the asymmetry of pattern matching and the left-to-right search behavior, while also touching on backtracking and the use of escape characters.', 'duration': 221.019, 'highlights': ['The dot is very special. Dot matches any character except newline.', "The search is going to go left to right. And it's satisfied as soon as it finds a solution.", "It finds all the things it's supposed to. And it's smart. It'll backtrack.", 'If you were actually looking for the period character, would you just need to escape it with a??']}, {'end': 1079.135, 'start': 693.412, 'title': 'Regular expressions in python', 'summary': 'Covers the syntax of regular expressions in Python, including the use of raw strings, character matching, whitespace representation, and repetition modifiers like plus and star.', 'duration': 385.723, 'highlights': ['The use of raw strings in Python regular expressions frees you from having to worry about backslash processing.', 'Explanation of character matching using backslash lowercase w for word characters and backslash d for digits.', 'The representation of whitespace using backslash s in regular expressions and its awareness of space, tab, and new line.', 'Demonstration of repetition modifiers like plus and star for matching one or more and zero or more occurrences.']}], 'duration': 607.482, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe4471653.jpg', 'highlights': ['The use of raw strings in Python regular expressions frees you from worrying about backslash processing.', 'Demonstration of repetition modifiers like plus and star for matching occurrences.', 'Explanation of character matching using backslash lowercase w for word characters and backslash d for digits.', 'Representation of whitespace using backslash s in regular expressions and its awareness of space, tab, and new line.', 'The dot is very special. Dot matches any character except newline.', "The search is going to go left to right. 
And it's satisfied as soon as it finds a solution.", "It finds all the things it's supposed to. And it's smart. It'll backtrack.", 'If you were actually looking for the period character, would you just need to escape it with a??']}, {'end': 1291.326, 'segs': [{'end': 1163.656, 'src': 'embed', 'start': 1101, 'weight': 0, 'content': [{'end': 1109.144, 'text': "The more typical way to do this would be, I'd say, well, there's a colon, and I'll say, and then there's just some number of word characters.", 'start': 1101, 'duration': 8.144}, {'end': 1111.585, 'text': 'So I would write that as backslash w plus.', 'start': 1109.664, 'duration': 1.921}, {'end': 1114.127, 'text': "That's a much more typical way, right?", 'start': 1112.827, 'duration': 1.3}, {'end': 1118.228, 'text': "Like there's some, a quote or a colon, there's something that sort of starts, and then you're like yeah, whatever.", 'start': 1114.167, 'duration': 4.061}, {'end': 1120.109, 'text': 'Then just take all the word characters from there.', 'start': 1118.248, 'duration': 1.861}, {'end': 1123.909, 'text': 'So if I write it that way, then it like, it just picks out the kitten part.', 'start': 1120.749, 'duration': 3.16}, {'end': 1128.63, 'text': "So that is a, it's beginning to look a little more the way these things actually work.", 'start': 1125.15, 'duration': 3.48}, {'end': 1133.452, 'text': 'Yeah, so the space is not a word character.', 'start': 1131.891, 'duration': 1.561}, {'end': 1134.612, 'text': "That's what's making it stop there.", 'start': 1133.612, 'duration': 1}, {'end': 1141.729, 'text': 'So, oh, and actually there was the question before, like, does it include digits? 
So what if it was like kitten123? That still works.', 'start': 1134.892, 'duration': 6.837}, {'end': 1147.891, 'text': "But if I have kitten123, and at some point I add a character, like let's say ampersand, then it stops at the ampersand.", 'start': 1141.969, 'duration': 5.922}, {'end': 1148.511, 'text': 'So this is the thing.', 'start': 1147.911, 'duration': 0.6}, {'end': 1152.492, 'text': 'So what the plus does is the plus is greedy.', 'start': 1148.551, 'duration': 3.941}, {'end': 1155.353, 'text': 'It goes as far as it can, and then it stops.', 'start': 1152.672, 'duration': 2.681}, {'end': 1163.656, 'text': 'So just kind of the mnemonic for regular expressions is it finds the leftmost solution, the first one, and the largest solution.', 'start': 1155.853, 'duration': 7.803}], 'summary': "Using backslash w plus, the regular expression picks out 'kitten' and stops at non-word characters like space or ampersand.", 'duration': 62.656, 'max_score': 1101, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe41101000.jpg'}, {'end': 1202.905, 'src': 'embed', 'start': 1177.186, 'weight': 1, 'content': [{'end': 1182.048, 'text': 'What if I said period plus? And the answer is, yeah, it just goes all the way to the end.', 'start': 1177.186, 'duration': 4.862}, {'end': 1186.31, 'text': 'So period matches dot, ampersand, everything except for newline.', 'start': 1182.809, 'duration': 3.501}, {'end': 1188.431, 'text': 'All righty.', 'start': 1188.051, 'duration': 0.38}, {'end': 1189.732, 'text': 'Yeah, question.', 'start': 1188.451, 'duration': 1.281}, {'end': 1195.555, 'text': 'When you say largest, do you mean that if you say kitten 1, 2, 3, 1, 2, 3, they will find.', 'start': 1189.972, 'duration': 5.583}, {'end': 1201.885, 'text': "So if I say you mean here, if I say kitten 1, 2, 3, 1, 2, 3? 
And I'll go back to this is a backslash W plus?", 'start': 1195.555, 'duration': 6.33}, {'end': 1202.905, 'text': 'OK,', 'start': 1202.705, 'duration': 0.2}], 'summary': 'Regex pattern matching discussion, with mention of period, dot, ampersand, new line, and backslash w plus.', 'duration': 25.719, 'max_score': 1177.186, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe41177186.jpg'}, {'end': 1260.08, 'src': 'embed', 'start': 1223.67, 'weight': 2, 'content': [{'end': 1223.97, 'text': 'All right.', 'start': 1223.67, 'duration': 0.3}, {'end': 1226.611, 'text': "So one more code I'm going to show you, which I'll just type in here.", 'start': 1224.21, 'duration': 2.401}, {'end': 1234.135, 'text': 'is backslash uppercase S is a non-whitespace character.', 'start': 1228.189, 'duration': 5.946}, {'end': 1235.957, 'text': "It's kind of like the opposite.", 'start': 1234.195, 'duration': 1.762}, {'end': 1243.826, 'text': "And I'm a little saddened that whoever designed regular expressions chose to have uppercase and lowercase mean something different,", 'start': 1237.199, 'duration': 6.627}, {'end': 1246.008, 'text': 'because it just makes it a little bit confusing.', 'start': 1243.826, 'duration': 2.182}, {'end': 1249.271, 'text': 'But backslash uppercase S is really pretty darn handy.', 'start': 1246.028, 'duration': 3.243}, {'end': 1260.08, 'text': "So let's say, For example, I knew that it was kitten 1, 2, 3, and A equals 1, 2, 3, ampersand, you know, yada, whatever.", 'start': 1249.732, 'duration': 10.348}], 'summary': 'Backslash uppercase s represents a non-whitespace character in regular expressions, which can be handy in certain situations.', 'duration': 36.41, 'max_score': 1223.67, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe41223670.jpg'}], 'start': 1079.875, 'title': 'Regular expressions and their usage', 'summary': "Explains the usage of regular 
expressions, covering the plus operator's greedy nature, period plus behavior, and backslash uppercase s functionality for non-whitespace character recognition.", 'chapters': [{'end': 1141.729, 'start': 1079.875, 'title': 'Regular expression: matching text patterns', 'summary': "Illustrates the use of regular expressions to locate specific patterns in a text, emphasizing the importance of using backslash w plus to capture word characters and demonstrating its ability to identify 'kitten' and 'kitten one, two, three' as a match.", 'duration': 61.854, 'highlights': ["The more typical way to do this would be, I'd say, well, there's a colon, and I'll say, and then there's just some number of word characters. So I would write that as backslash W plus. that's a much more typical way, right? Like there's some, a quote or a column, there's something that sort of starts, and then you're like yeah, whatever. Then just take all the word characters from there.", 'So if I write it that way, then it like, it just picks out the kitten part.', 'So, oh, and actually there was the question before, like, does it include digits? 
So what if it was like kitten one, two, three, That still works.']}, {'end': 1291.326, 'start': 1141.969, 'title': 'Regular expressions and their usage', 'summary': 'Discusses the usage of regular expressions, explaining the greedy nature of the plus operator, the behavior of period plus, and the functionality of backslash uppercase s for non-whitespace character recognition.', 'duration': 149.357, 'highlights': ['The plus operator in regular expressions is greedy, going as far as it can and then stopping.', 'The behavior of period plus in regular expressions, where it matches everything until the end of the line.', 'The functionality of backslash uppercase S as a non-whitespace character recognition tool in regular expressions.']}], 'duration': 211.451, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe41079875.jpg', 'highlights': ['The plus operator in regular expressions is greedy, going as far as it can and then stopping.', 'The behavior of period plus in regular expressions, where it matches everything until the end of the line.', 'The functionality of backslash uppercase S as a non-whitespace character recognition tool in regular expressions.', "The more typical way to do this would be, I'd say, well, there's a colon, and I'll say, and then there's just some number of word characters. So I would write that as backslash W plus. that's a much more typical way, right? Like there's some, a quote or a column, there's something that sort of starts, and then you're like yeah, whatever. Then just take all the word characters from there.", 'So if I write it that way, then it like, it just picks out the kitten part.', 'So, oh, and actually there was the question before, like, does it include digits? 
So what if it was like kitten one, two, three, That still works.']}, {'end': 1732.84, 'segs': [{'end': 1364.967, 'src': 'embed', 'start': 1310.417, 'weight': 0, 'content': [{'end': 1314.299, 'text': "I'm going to kind of build it up and hopefully show you pretty practical patterns you can use.", 'start': 1310.417, 'duration': 3.882}, {'end': 1316.22, 'text': 'All right.', 'start': 1315.98, 'duration': 0.24}, {'end': 1319.421, 'text': "So I'm going to make up some text here.", 'start': 1316.28, 'duration': 3.141}, {'end': 1322.062, 'text': "I'll keep the blah.", 'start': 1321.221, 'duration': 0.841}, {'end': 1332.594, 'text': "So let's say we're looking for nick.p@gmail.com, and then there's some more junky text.", 'start': 1323.427, 'duration': 9.167}, {'end': 1334.475, 'text': "And there's an at sign just by itself.", 'start': 1332.634, 'duration': 1.841}, {'end': 1336.877, 'text': "I'll just leave it like that for now.", 'start': 1335.796, 'duration': 1.081}, {'end': 1342.601, 'text': 'So the problem I want to solve is pulling email.', 'start': 1338.017, 'duration': 4.584}, {'end': 1348.305, 'text': "I want to imagine I've got this big body of text, and I want to pull email addresses out of it using regular expressions.", 'start': 1342.621, 'duration': 5.684}, {'end': 1359.484, 'text': "So first I'm going to try to write this as backslash w plus, and then there's an at sign, and then there's backslash w plus.", 'start': 1349.798, 'duration': 9.686}, {'end': 1363.946, 'text': "It's a kind of, you know, plausible first shot at this.", 'start': 1361.005, 'duration': 2.941}, {'end': 1364.967, 'text': 'So, if we run that.', 'start': 1364.286, 'duration': 0.681}], 'summary': 'Demonstrating practical patterns for extracting email addresses using regular expressions.', 'duration': 54.55, 'max_score': 1310.417, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe41310417.jpg'}, {'end': 1414.027, 
'src': 'embed', 'start': 1387, 'weight': 1, 'content': [{'end': 1390.025, 'text': "Really, it's word characters plus some other stuff.", 'start': 1387, 'duration': 3.025}, {'end': 1395.755, 'text': "So regular expressions, there's this very old syntax for indicating a set of characters.", 'start': 1390.767, 'duration': 4.988}, {'end': 1398.019, 'text': "And it's going to use the square brackets.", 'start': 1396.196, 'duration': 1.823}, {'end': 1404.32, 'text': "So inside of the square brackets, I can put, well, here's the set of characters that I'm going to allow here.", 'start': 1399.437, 'duration': 4.883}, {'end': 1409.243, 'text': "And actually, the backslash w works inside of the square bracket because it's just such a common case.", 'start': 1404.58, 'duration': 4.663}, {'end': 1414.027, 'text': 'So what I want to say here is, well, backslash w or, say, dot.', 'start': 1409.564, 'duration': 4.463}], 'summary': 'Regular expressions use square brackets to indicate a set of characters, including backslash w and dot.', 'duration': 27.027, 'max_score': 1387, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe41387000.jpg'}, {'end': 1604.779, 'src': 'embed', 'start': 1574.079, 'weight': 2, 'content': [{'end': 1575.44, 'text': "All I've been doing there is just using group.", 'start': 1574.079, 'duration': 1.361}, {'end': 1580.844, 'text': "So now what I'd like to show you is I'm going to stop using my find function.", 'start': 1575.881, 'duration': 4.963}, {'end': 1582.886, 'text': "I'm going to start doing this raw here.", 'start': 1580.864, 'duration': 2.022}, {'end': 1595.434, 'text': "And what I'd like to do is I want to imagine that I want to pick out the username and the host name separately.", 'start': 1582.906, 'duration': 12.528}, {'end': 1597.235, 'text': 'I want to sort of pick those out.', 'start': 1595.454, 'duration': 1.781}, {'end': 1601.717, 'text': 'And so I will just go back here.', 'start': 
1598.375, 'duration': 3.342}, {'end': 1604.779, 'text': "I'll just change this to m equals re.search.", 'start': 1602.077, 'duration': 2.702}], 'summary': 'Using regular expressions to separate username and hostname.', 'duration': 30.7, 'max_score': 1574.079, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe41574079.jpg'}], 'start': 1291.926, 'title': 'Using regular expressions for email extraction', 'summary': 'Discusses using regular expressions to extract email addresses from a body of text, highlighting the challenges faced, the process of refining the expression, and the practical application, ultimately achieving the extraction of email addresses. it also covers the basics of regular expressions, including the use of square brackets and the process of using parentheses in the regular expression for extracting username and host name separately.', 'chapters': [{'end': 1465.127, 'start': 1291.926, 'title': 'Regular expressions for email extraction', 'summary': 'Discusses using regular expressions to extract email addresses from a body of text, highlighting the challenges faced and the process of refining the expression to include necessary characters, ultimately achieving the extraction of email addresses.', 'duration': 173.201, 'highlights': ['The chapter discusses the process of refining regular expressions to extract email addresses from a body of text, highlighting the challenges faced and the necessary characters required to achieve successful extraction.', 'The presenter demonstrates the limitations of using backslash W plus and then an at sign to extract email addresses and explains the need to include additional characters such as dot in the expression.', 'The presenter explains the use of square brackets to indicate a set of characters in regular expressions and demonstrates the inclusion of necessary characters like backslash w or dot to successfully extract email addresses.', 'The discussion also 
touches on the complexities and necessary considerations when dealing with characters like dot within square brackets in regular expressions.']}, {'end': 1573.259, 'start': 1465.507, 'title': 'Regular expression basics', 'summary': 'Covers the basics of regular expressions, including the use of square brackets to define sets of characters and restricting the first character with a word character, as well as demonstrating the flexibility of the order inside the brackets and the practical application using an emails example.', 'duration': 107.752, 'highlights': ['The use of square brackets to define sets of characters and the application in regular expressions.', 'Restricting the first character to be a word character and the use of dot, plus, and star in the pattern.', 'Demonstrating the flexibility of the order inside the brackets and the concept of a set of characters.']}, {'end': 1732.84, 'start': 1574.079, 'title': 'Using regular expressions to extract username and host name', 'summary': 'Explains how to manually extract the username and host name separately using regular expressions, highlighting the process of using parentheses in the regular expression and accessing the extracted parts using group numbers.', 'duration': 158.761, 'highlights': ['Explaining the process of manually extracting the username and host name using regular expressions', 'Accessing the extracted parts using group numbers', 'Addressing a question about the impact of using plus or star after the parentheses on group numbering']}], 'duration': 440.914, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe41291926.jpg', 'highlights': ['The chapter discusses the process of refining regular expressions to extract email addresses from a body of text, highlighting the challenges faced and the necessary characters required to achieve successful extraction.', 'The presenter explains the use of square brackets to indicate a set of characters in 
regular expressions and demonstrates the inclusion of necessary characters like backslash w or dot to successfully extract email addresses.', 'Explaining the process of manually extracting the username and host name using regular expressions', 'The use of square brackets to define sets of characters and the application in regular expressions.', 'The presenter demonstrates the limitations of using backslash W plus and then an at sign to extract email addresses and explains the need to include additional characters such as dot in the expression.']}, {'end': 2049.496, 'segs': [{'end': 1844.459, 'src': 'embed', 'start': 1819.879, 'weight': 0, 'content': [{'end': 1828.686, 'text': 'So a pattern I always enjoy, because it just saves me so much work, is I just call f.read and I pass that in as the second argument to a find all.', 'start': 1819.879, 'duration': 8.807}, {'end': 1832.369, 'text': 'I just feed the entire file into an re.findall.', 'start': 1828.746, 'duration': 3.623}, {'end': 1833.209, 'text': 'I have a pattern.', 'start': 1832.609, 'duration': 0.6}, {'end': 1836.432, 'text': 'I just let it rip through the entire text, skip new lines, whatever.', 'start': 1833.269, 'duration': 3.163}, {'end': 1837.433, 'text': 'All that stuff it just handles.', 'start': 1836.452, 'duration': 0.981}, {'end': 1841.056, 'text': 'And it just pulls out the things I want and just returns them to me as a Python list.', 'start': 1837.693, 'duration': 3.363}, {'end': 1844.459, 'text': 'And then you could write a for loop or all the stuff we were doing yesterday.', 'start': 1841.936, 'duration': 2.523}], 'summary': 'Using f.read with re.findall to efficiently extract desired data from a file.', 'duration': 24.58, 'max_score': 1819.879, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe41819879.jpg'}, {'end': 1916.754, 'src': 'embed', 'start': 1885.395, 'weight': 2, 'content': [{'end': 1887.396, 'text': "I'm going to return tuples 
length two.", 'start': 1885.395, 'duration': 2.001}, {'end': 1892.158, 'text': 'So each tuple represents a single match and then the tuple just has the groups in there.', 'start': 1887.876, 'duration': 4.282}, {'end': 1896.104, 'text': 'So that, yeah, you can see where this can be pretty handy.', 'start': 1893.303, 'duration': 2.801}, {'end': 1898.285, 'text': "if you've got some big file and you just want to kind of.", 'start': 1896.104, 'duration': 2.181}, {'end': 1899.446, 'text': "there's some part about it you care about.", 'start': 1898.285, 'duration': 1.161}, {'end': 1901.127, 'text': 'you just want to rip it out as lazily as possible.', 'start': 1899.446, 'duration': 1.681}, {'end': 1903.628, 'text': 'So, re.findall works really well for that.', 'start': 1901.867, 'duration': 1.761}, {'end': 1906.649, 'text': 'Excuse me? You lose the format.', 'start': 1904.508, 'duration': 2.141}, {'end': 1908.71, 'text': 'Yeah, I mean, I would say, the suggestion is you lose the format.', 'start': 1906.669, 'duration': 2.041}, {'end': 1911.431, 'text': "I'd say, well, the regular expression is narrowing.", 'start': 1908.73, 'duration': 2.701}, {'end': 1913.012, 'text': 'You get to say what you want to keep.', 'start': 1911.511, 'duration': 1.501}, {'end': 1916.754, 'text': 'And so, if you want to keep more, you know, write the regular expression bigger, you know, to keep more.', 'start': 1913.452, 'duration': 3.302}], 'summary': 'Using re.findall to extract specific data from a big file, suggested to lose the format and narrow the regular expression to keep more data.', 'duration': 31.359, 'max_score': 1885.395, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe41885395.jpg'}, {'end': 1989.687, 'src': 'embed', 'start': 1963.798, 'weight': 1, 'content': [{'end': 1972.576, 'text': "all, I had said that the dot matches any character except for new line, and that's kind of a historical thing,", 'start': 1963.798, 
'duration': 8.778}, {'end': 1974.377, 'text': 'because the processing tended to go line by line.', 'start': 1972.576, 'duration': 1.801}, {'end': 1978.858, 'text': 'If you add the dot all flag, then the dot will match new line as well.', 'start': 1975.517, 'duration': 3.341}, {'end': 1983.74, 'text': "And so you could, because right now, if you use dot, your pattern can't span more than one line.", 'start': 1979.258, 'duration': 4.482}, {'end': 1989.687, 'text': "Although, if you use backslash s, where you think there's a new line, that'll span a line, but the dot will not go over a line.", 'start': 1984.801, 'duration': 4.886}], 'summary': 'Adding the dot all flag allows the dot to match new lines, enabling patterns to span more than one line.', 'duration': 25.889, 'max_score': 1963.798, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe41963798.jpg'}], 'start': 1733.26, 'title': 'Python regular expression functions', 'summary': 'Explains python regular expression functions like re.search, re.findall, and their optional arguments, and their application in processing text data.', 'chapters': [{'end': 2049.496, 'start': 1733.26, 'title': 'Python regular expression functions', 'summary': 'Explains python regular expression functions, such as re.search, re.findall, and optional arguments, and the application of these functions in processing text data.', 'duration': 316.236, 'highlights': ['The re.findall function is the favorite regular expression function and it returns all matches in a Python list, allowing for easy processing of the text data.', 'The re.findall function can be used to lazily extract specific parts of a big file by specifying the desired pattern.', 'The optional arguments in regular expressions, such as ignore case and dot all, provide additional flexibility in matching and processing text data.']}], 'duration': 316.236, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe41733260.jpg', 'highlights': ['The re.findall function is the favorite regular expression function and it returns all matches in a Python list, allowing for easy processing of the text data.', 'The optional arguments in regular expressions, such as ignore case and dot all, provide additional flexibility in matching and processing text data.', 'The re.findall function can be used to lazily extract specific parts of a big file by specifying the desired pattern.']}, {'end': 2509.012, 'segs': [{'end': 2102.515, 'src': 'embed', 'start': 2075.12, 'weight': 0, 'content': [{'end': 2078.382, 'text': 'If you do a Google search for Social Security Administration baby names,', 'start': 2075.12, 'duration': 3.262}, {'end': 2085.888, 'text': 'they do this thing where they keep track of what the popular baby names are for babies born in that year.', 'start': 2078.382, 'duration': 7.506}, {'end': 2088.37, 'text': "And they've been doing it actually for 100 years.", 'start': 2087.248, 'duration': 1.122}, {'end': 2091.752, 'text': 'So you could look at 1900, 1950, whatever.', 'start': 2089.991, 'duration': 1.761}, {'end': 2093.112, 'text': "You can just see what's there.", 'start': 2091.772, 'duration': 1.34}, {'end': 2096.053, 'text': "And it turns out for baby names, there's sort of a, there's a popularity of it.", 'start': 2093.132, 'duration': 2.921}, {'end': 2097.513, 'text': "There's sort of names kind of ebb and flow.", 'start': 2096.072, 'duration': 1.441}, {'end': 2102.515, 'text': 'So I look at this and I see assignment idea.', 'start': 2098.073, 'duration': 4.442}], 'summary': 'The social security administration tracks popular baby names for 100 years, showing the ebb and flow of name popularity, inspiring an assignment idea.', 'duration': 27.395, 'max_score': 2075.12, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe42075120.jpg'}, {'end': 2179.772, 'src': 'embed', 'start': 2148.492, 'weight': 1, 'content': [{'end': 2150.514, 'text': 'Down to the, you know, the less popular names.', 'start': 2148.492, 'duration': 2.022}, {'end': 2151.774, 'text': 'All right.', 'start': 2151.594, 'duration': 0.18}, {'end': 2156.477, 'text': 'So, what I would like to do, going back to Python here.', 'start': 2152.235, 'duration': 4.242}, {'end': 2158.279, 'text': "Let's see where I have it.", 'start': 2156.497, 'duration': 1.782}, {'end': 2162.361, 'text': "Okay So, I'm going to go into day two here.", 'start': 2158.619, 'duration': 3.742}, {'end': 2164.523, 'text': "And there's this directory babynames.", 'start': 2163.102, 'duration': 1.421}, {'end': 2168.909, 'text': "So if I look inside here, I'm going to look at baby 1990.", 'start': 2165.888, 'duration': 3.021}, {'end': 2175.671, 'text': "I've pulled this sort of, I've just sort of copied and cleaned up just a teeny bit the text from the Social Security Administration site.", 'start': 2168.909, 'duration': 6.762}, {'end': 2176.451, 'text': 'But this is very realistic.', 'start': 2175.691, 'duration': 0.76}, {'end': 2179.772, 'text': "Okay, well, there's some poorly written CSS and whatever.", 'start': 2176.511, 'duration': 3.261}], 'summary': 'Discussion about python and analyzing baby names data from 1990.', 'duration': 31.28, 'max_score': 2148.492, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe42148492.jpg'}, {'end': 2272.524, 'src': 'heatmap', 'start': 2213.939, 'weight': 0.826, 'content': [{'end': 2219.001, 'text': "And then there's some more TD stuff, and there's Michael, and there's Jessica, and then here's row two, and so on.", 'start': 2213.939, 'duration': 5.062}, {'end': 2221.143, 'text': "And it just goes on like, there's all the data.", 'start': 2219.282, 'duration': 1.861}, 
{'end': 2225.505, 'text': 'This is beginning to look like an actual problem.', 'start': 2223.624, 'duration': 1.881}, {'end': 2225.785, 'text': 'All right.', 'start': 2225.525, 'duration': 0.26}, {'end': 2236.202, 'text': "So the first thing that I want your baby names program to do is given a file like baby1990.html and I'm going to pipe this into more.", 'start': 2226.085, 'duration': 10.117}, {'end': 2239.143, 'text': 'What I want you to do is I want you to rip through that entire file.', 'start': 2236.862, 'duration': 2.281}, {'end': 2241.724, 'text': 'I want you to figure out what year it represents.', 'start': 2239.863, 'duration': 1.861}, {'end': 2244.405, 'text': 'I want you to pull out all the names and all the ranks.', 'start': 2241.964, 'duration': 2.441}, {'end': 2250.167, 'text': "I want you to organize it so that you can then produce a printout that's just in alphabetical order by name.", 'start': 2244.845, 'duration': 5.322}, {'end': 2251.347, 'text': 'So just as shown here.', 'start': 2250.627, 'duration': 0.72}, {'end': 2257.389, 'text': 'So you say, so the first you print the year and then I want to see Aaron 34, Abby 42 and so on.', 'start': 2251.647, 'duration': 5.742}, {'end': 2259.53, 'text': "So you're just showing alphabetical list, here's what all the names are.", 'start': 2257.409, 'duration': 2.121}, {'end': 2266.599, 'text': "So that'll get us through the.", 'start': 2262.376, 'duration': 4.223}, {'end': 2271.042, 'text': "Yeah, so what's going to happen, there's a strange case, but sometimes a name will appear as both male and female.", 'start': 2266.599, 'duration': 4.443}, {'end': 2272.524, 'text': "And I'm not making any distinction, male from female.", 'start': 2271.063, 'duration': 1.461}], 'summary': 'Develop a program to extract and organize baby names from a file, showing them in alphabetical order with their ranks for a specific year.', 'duration': 58.585, 'max_score': 2213.939, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe42213939.jpg'}, {'end': 2266.599, 'src': 'embed', 'start': 2236.862, 'weight': 2, 'content': [{'end': 2239.143, 'text': 'What I want you to do is I want you to rip through that entire file.', 'start': 2236.862, 'duration': 2.281}, {'end': 2241.724, 'text': 'I want you to figure out what year it represents.', 'start': 2239.863, 'duration': 1.861}, {'end': 2244.405, 'text': 'I want you to pull out all the names and all the ranks.', 'start': 2241.964, 'duration': 2.441}, {'end': 2250.167, 'text': "I want you to organize it so that you can then produce a printout that's just in alphabetical order by name.", 'start': 2244.845, 'duration': 5.322}, {'end': 2251.347, 'text': 'So just as shown here.', 'start': 2250.627, 'duration': 0.72}, {'end': 2257.389, 'text': 'So you say, so the first you print the year and then I want to see Aaron 34, Abby 42 and so on.', 'start': 2251.647, 'duration': 5.742}, {'end': 2259.53, 'text': "So you're just showing alphabetical list, here's what all the names are.", 'start': 2257.409, 'duration': 2.121}, {'end': 2266.599, 'text': "So that'll get us through the.", 'start': 2262.376, 'duration': 4.223}], 'summary': 'Rip through file, extract year, names, ranks, and print alphabetically by name.', 'duration': 29.737, 'max_score': 2236.862, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe42236862.jpg'}, {'end': 2366.032, 'src': 'embed', 'start': 2338.344, 'weight': 3, 'content': [{'end': 2342.464, 'text': 'You know, use a regular expression, find all, maybe a dictionary, I mean, just total regular work.', 'start': 2338.344, 'duration': 4.12}, {'end': 2354.209, 'text': "So for part B, What I'm going to do is there's an option called dash dash summary file.", 'start': 2343.405, 'duration': 10.804}, {'end': 2357.59, 'text': "And I'm going to run this on a star.", 'start': 2355.309, 'duration': 2.281}, 
{'end': 2360.871, 'text': "I'm going to say baby star dot HTML.", 'start': 2357.75, 'duration': 3.121}, {'end': 2366.032, 'text': 'In that case, I want you to produce no output.', 'start': 2363.392, 'duration': 2.64}], 'summary': 'Using regex to find all files with .html extension and produce no output', 'duration': 27.688, 'max_score': 2338.344, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe42338344.jpg'}, {'end': 2480.183, 'src': 'embed', 'start': 2458.592, 'weight': 4, 'content': [{'end': 2468.679, 'text': "And the question is, in what year did the movie The Matrix come out? Yeah, there's another PhD thesis you could do here.", 'start': 2458.592, 'duration': 10.087}, {'end': 2474.362, 'text': "It's sort of like, well, maybe The Matrix was reacting to a social phenomenon, or it was the other way around.", 'start': 2468.699, 'duration': 5.663}, {'end': 2475.223, 'text': "It's all very complicated.", 'start': 2474.562, 'duration': 0.661}, {'end': 2478.982, 'text': 'Yeah, Freakonomics comes in.', 'start': 2477.782, 'duration': 1.2}, {'end': 2480.183, 'text': 'There was a New York Times Magazine article about it.', 'start': 2479.002, 'duration': 1.181}], 'summary': 'The matrix movie release year is a topic of social analysis.', 'duration': 21.591, 'max_score': 2458.592, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe42458592.jpg'}], 'start': 2051.937, 'title': 'Baby name analysis with python', 'summary': "Delves into social security's baby name tracking over 100 years, presenting a python program for organizing and printing names. 
It also covers file handling, regular expressions, and analyzing baby name popularity data over a decade.", 'chapters': [{'end': 2272.524, 'start': 2051.937, 'title': 'Social security baby names analysis', 'summary': "The chapter discusses the Social Security Administration's practice of tracking popular baby names, dating back 100 years, and presents a Python program that organizes and prints baby names in alphabetical order by name for a given year.", 'duration': 220.587, 'highlights': ['The Social Security Administration tracks popular baby names for babies born in a year, dating back 100 years.', 'The presented Python program organizes and prints baby names in alphabetical order by name for a given year.', "The program's output includes an alphabetical list of names with their respective ranks for a particular year."]}, {'end': 2509.012, 'start': 2272.804, 'title': 'Baby name popularity analysis', 'summary': "The chapter covers file handling in Python, including reading and writing files, using regular expressions to extract data, and creating summary files based on a given option. 
It also delves into analyzing baby name popularity data over a decade and poses a question regarding the release year of the movie 'The Matrix'.", 'duration': 236.208, 'highlights': ['The chapter covers file handling in Python, including reading and writing files, using regular expressions to extract data, and creating summary files based on a given option.', "Analyzing baby name popularity data over a decade and posing a question regarding the release year of the movie 'The Matrix'.", 'The chapter involves a combination of coding, lunch break, and resumption of work.']}], 'duration': 457.075, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/kWyoYtvJpe4/pics/kWyoYtvJpe42051937.jpg', 'highlights': ['The Social Security Administration tracks popular baby names for babies born in a year, dating back 100 years.', 'The presented Python program organizes and prints baby names in alphabetical order by name for a given year.', "The program's output includes an alphabetical list of names with their respective ranks for a particular year.", 'The chapter covers file handling in Python, including reading and writing files, using regular expressions to extract data, and creating summary files based on a given option.', "Analyzing baby name popularity data over a decade and posing a question regarding the release year of the movie 'The Matrix'."]}], 'highlights': ['The re.findall function is the favorite regular expression function and it returns all matches in a Python list, allowing for easy processing of the text data.', 'The chapter discusses the process of refining regular expressions to extract email addresses from a body of text, highlighting the challenges faced and the necessary characters required to achieve successful extraction.', 'The Social Security Administration tracks popular baby names for babies born in a year, dating back 100 years.', 'The presented Python program organizes and prints baby names in alphabetical order by name for a given year.', "The program's 
output includes an alphabetical list of names with their respective ranks for a particular year.", 'The use of raw strings in Python regular expressions frees from backslash processing.', 'The plus operator in regular expressions is greedy, going as far as it can and then stopping.', 'The functionality of backslash uppercase S as a non-whitespace character recognition tool in regular expressions.', 'The search function in the re module is used to find the first instance of a pattern within a text', "The search is going to go left to right. And it's satisfied as soon as it finds a solution."]}
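
The patterns described in the segments above (greedy backslash w plus, backslash uppercase S, square-bracket character sets, parenthesized groups, re.findall, and the dot-all flag) can be tried out with a short sketch like the one below. This is only an illustration, not the code typed in the talk: the sample string `text` and all variable names are made up here, loosely modeled on the kitten and email examples the transcript mentions.

```python
import re

# Hypothetical sample text, loosely modeled on the examples in the talk.
text = "blah blah :kitten123&a=123 nick.p@gmail.com yada @ foo"

# \w+ is greedy: it takes letters, digits, and underscores as far as it can,
# so it stops at the ampersand.
m = re.search(r":\w+", text)
word = m.group() if m else None

# \S+ takes any run of non-whitespace characters, so it runs through "&a=123".
m = re.search(r":\S+", text)
nonspace = m.group() if m else None

# First attempt at an email pattern: \w does not match the dots,
# so only part of the address is found.
m = re.search(r"\w+@\w+", text)
naive = m.group() if m else None

# A character set in square brackets allows word characters or a dot.
# Parentheses mark groups: group(1) is the username, group(2) the host.
m = re.search(r"([\w.]+)@([\w.]+)", text)
email, user, host = m.group(), m.group(1), m.group(2)

# findall returns every match; with two groups, each match is a 2-tuple.
pairs = re.findall(r"([\w.]+)@([\w.]+)", text)

# By default the dot does not match a newline; re.DOTALL changes that.
no_flag = re.search(r"start.+end", "start\nend")
with_flag = re.search(r"start.+end", "start\nend", re.DOTALL)

print(word, nonspace, naive, email, user, host, pairs)
```

The lone at sign in the sample text is skipped by the bracketed pattern because each side of the @ needs at least one allowed character. Passing a whole file's contents (for example the result of `f.read()`) as the second argument to `re.findall` works the same way as passing `text` here.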