title
Python 3 Programming Tutorial - urllib module
description
The urllib module in Python 3 allows you to access websites from your program. This opens up as many doors for your programs as the internet opens up for you. urllib in Python 3 is organized slightly differently from urllib2 in Python 2, but they are mostly the same. Through urllib, you can access websites, download data, parse data, modify your headers, and make any GET and POST requests you might need.
Sample code for this basics series: http://pythonprogramming.net/beginner-python-programming-tutorials/
Python 3 Programming tutorial Playlist: http://www.youtube.com/watch?v=oVp1vrfL_w4&feature=share&list=PLQVvvaa0QuDe8XSftW-RAxdo6OmaeL85M
http://seaofbtc.com
http://sentdex.com
http://hkinsley.com
https://twitter.com/sentdex
Bitcoin donations: 1GV7srgR4NJx4vrk7avCmmVQQrqmv87ty6
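The techniques covered in the video (URL encoding, POST requests, and spoofing the User-Agent header to avoid a 403 Forbidden) can be sketched with the standard library alone. This is a minimal illustration, not the video's exact code: the `s`/`submit` form fields follow the pythonprogramming.net search example discussed in the tutorial, the target URL is illustrative, and the actual network call is commented out so the snippet runs offline.

```python
import urllib.parse
import urllib.request
import urllib.error

# Encode form values the way a browser would for a query string or POST body.
values = {'s': 'basic', 'submit': 'search'}
query = urllib.parse.urlencode(values)   # 's=basic&submit=search'

# POST data must be bytes in Python 3, so encode the string as UTF-8.
data = query.encode('utf-8')

# Some sites reject the default 'Python-urllib/3.x' user agent with a 403,
# so a browser-like User-Agent header is supplied instead.
headers = {'User-Agent': 'Mozilla/5.0'}

req = urllib.request.Request('https://pythonprogramming.net',
                             data=data, headers=headers)

# The actual request, commented out so this sketch needs no network access:
# try:
#     with urllib.request.urlopen(req) as resp:
#         page = resp.read().decode('utf-8')
# except urllib.error.HTTPError as e:
#     print(e)   # e.g. 'HTTP Error 403: Forbidden'

print(query)                           # s=basic&submit=search
print(req.get_method())                # POST (because data= was supplied)
print(req.get_header('User-agent'))    # Mozilla/5.0
```

Note that passing `data=` is what turns the request into a POST; omit it (and put the encoded query after a `?` in the URL) for a GET, as in the `q=test` Google search example from the video.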
detail
{'title': 'Python 3 Programming Tutorial - urllib module', 'heatmap': [{'end': 145.563, 'start': 129.685, 'weight': 1}, {'end': 1058.577, 'start': 1018.828, 'weight': 0.701}], 'summary': "This tutorial covers python 3 urllib module for internet access, data retrieval, and url encoding, along with practical examples for making http requests, handling exceptions, web scraping, and using regular expressions, providing a comprehensive understanding of python's capabilities for web interactions.", 'chapters': [{'end': 197.265, 'segs': [{'end': 28.299, 'src': 'embed', 'start': 0.149, 'weight': 0, 'content': [{'end': 3.07, 'text': 'Hello everybody, and welcome to another Python 3 tutorial video.', 'start': 0.149, 'duration': 2.921}, {'end': 8.612, 'text': "In this video, what we're going to be talking about is another one of our standard library modules, and that's going to be urllib.", 'start': 3.11, 'duration': 5.502}, {'end': 14.154, 'text': 'The idea of urllib is it allows you to, via Python, access the internet.', 'start': 9.412, 'duration': 4.742}, {'end': 23.557, 'text': 'So, just like the internet allows you to do all sorts of amazing things, urllib is going to let you do all sorts of the same amazing things,', 'start': 14.214, 'duration': 9.343}, {'end': 26.038, 'text': 'only using Python in your programming language.', 'start': 23.557, 'duration': 2.481}, {'end': 28.299, 'text': "So with that, let's go ahead and get started.", 'start': 27.078, 'duration': 1.221}], 'summary': 'Python 3 tutorial on using urllib to access the internet.', 'duration': 28.15, 'max_score': 0.149, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ149.jpg'}, {'end': 113.475, 'src': 'embed', 'start': 80.515, 'weight': 1, 'content': [{'end': 83.276, 'text': 'So an example of visiting a website will be as follows.', 'start': 80.515, 'duration': 2.761}, {'end': 85.577, 'text': "So let's say we'll define a variable as x.", 'start': 
83.316, 'duration': 2.261}, {'end': 91.199, 'text': "And we'll say x equals urllib.request.urlopen.", 'start': 85.577, 'duration': 5.622}, {'end': 97.441, 'text': 'And then in these parameters is where we specify the address that we want to visit.', 'start': 91.819, 'duration': 5.622}, {'end': 101.863, 'text': 'You always have to leave this with HTTP or HTTPS.', 'start': 97.821, 'duration': 4.042}, {'end': 113.475, 'text': "So, for example, HTTPS colon slash, slash, and let's go to www.google.com.", 'start': 102.383, 'duration': 11.092}], 'summary': "Using python's urllib library, we can visit websites by defining a variable and specifying the address, such as www.google.com, with https.", 'duration': 32.96, 'max_score': 80.515, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ80515.jpg'}, {'end': 177.027, 'src': 'heatmap', 'start': 129.685, 'weight': 2, 'content': [{'end': 132.486, 'text': "so we're reading the request.", 'start': 129.685, 'duration': 2.801}, {'end': 134.768, 'text': 'so we can now save and run this.', 'start': 132.486, 'duration': 2.282}, {'end': 138.088, 'text': 'And this is our output.', 'start': 137.026, 'duration': 1.062}, {'end': 142.377, 'text': 'just a whole bunch of gobbledygook text.', 'start': 138.088, 'duration': 4.289}, {'end': 145.563, 'text': 'But this is basically the source code of Google.com.', 'start': 142.838, 'duration': 2.725}, {'end': 148.669, 'text': 'So for example, we could open up a browser.', 'start': 145.603, 'duration': 3.066}, {'end': 161.778, 'text': 'And we could go to the top, go google.com, hit U, or control U rather.', 'start': 151.972, 'duration': 9.806}, {'end': 167.04, 'text': 'And this is the source code, right? 
So again, it is just a bunch of junk here.', 'start': 162.118, 'duration': 4.922}, {'end': 173.944, 'text': "But you get the idea that this is what we've done is we've used Python to reach this page.", 'start': 167.221, 'duration': 6.723}, {'end': 177.027, 'text': 'so we can minimize that.', 'start': 175.405, 'duration': 1.622}], 'summary': 'Python used to access google.com source code.', 'duration': 34.189, 'max_score': 129.685, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ129685.jpg'}], 'start': 0.149, 'title': 'Python 3 urllib module', 'summary': 'Introduces the python 3 urllib module, detailing its usage for internet access, data retrieval, and showcasing an example, emphasizing the importance of importing urllib.request.', 'chapters': [{'end': 197.265, 'start': 0.149, 'title': 'Python 3 urllib module', 'summary': 'Introduces the python 3 urllib module, explaining how to use it to access the internet and retrieve data from a specified url, emphasizing the importance of importing urllib.request and showcasing an example of visiting a website using urllib.', 'duration': 197.116, 'highlights': ['The chapter introduces the Python 3 urllib module, explaining how to use it to access the internet and retrieve data from a specified URL. The tutorial video focuses on the urllib module, demonstrating its use for accessing the internet and obtaining data from a specified URL.', 'The importance of importing urllib.request is emphasized, and an example of visiting a website using urllib is showcased. The tutorial emphasizes the need to import urllib.request in Python 3 and demonstrates how to visit a website using urllib by specifying the address and making a request.', 'The process of reading the request and obtaining the source code of a website using Python is explained. 
The tutorial explains the process of reading the request with urllib and obtaining the source code of a website using Python, showcasing an example with www.google.com.']}], 'duration': 197.116, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ149.jpg', 'highlights': ['The chapter introduces the Python 3 urllib module, explaining how to use it to access the internet and retrieve data from a specified URL.', 'The importance of importing urllib.request is emphasized, and an example of visiting a website using urllib is showcased.', 'The process of reading the request and obtaining the source code of a website using Python is explained.']}, {'end': 407.019, 'segs': [{'end': 245.219, 'src': 'embed', 'start': 197.265, 'weight': 0, 'content': [{'end': 202.969, 'text': "So we're gonna have to show how to how to handle that, and actually how to handle that will be using another standard library.", 'start': 197.265, 'duration': 5.704}, {'end': 203.889, 'text': 'So have no fear.', 'start': 202.969, 'duration': 0.92}, {'end': 207.052, 'text': "We'll be covering that very shortly.", 'start': 203.909, 'duration': 3.143}, {'end': 210.214, 'text': 'So the next thing I want to talk about is post.', 'start': 207.052, 'duration': 3.162}, {'end': 222.583, 'text': "so, for example, if we were to go to, let's say we go back to where we were and we do the following let's say we want to go to python programming net,", 'start': 210.214, 'duration': 12.369}, {'end': 227.092, 'text': "and And that's where we can get all of our sample code.", 'start': 222.583, 'duration': 4.509}, {'end': 227.874, 'text': "if you're not familiar,", 'start': 227.092, 'duration': 0.782}, {'end': 230.479, 'text': "But if we scroll down to the bottom, there's actually a search bar here.", 'start': 228.234, 'duration': 2.245}, {'end': 231.34, 'text': 'We could search.', 'start': 230.779, 'duration': 0.561}, {'end': 232.983, 'text': "And let's say we search 
for basic.", 'start': 231.601, 'duration': 1.382}, {'end': 238.053, 'text': 'Okay, and you get a bunch of search results for the keyword basic.', 'start': 233.809, 'duration': 4.244}, {'end': 245.219, 'text': 'But if you look at our URL, you see that we have some extra stuff added to the end of our python programming net.', 'start': 238.053, 'duration': 7.166}], 'summary': 'Demonstrating use of standard library for handling, searching and retrieving sample code from python programming net.', 'duration': 47.954, 'max_score': 197.265, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ197265.jpg'}, {'end': 314.922, 'src': 'embed', 'start': 270.36, 'weight': 2, 'content': [{'end': 271.36, 'text': 'that is true.', 'start': 270.36, 'duration': 1}, {'end': 276.061, 'text': 'so if you look at variables, or at least links that have variables in it,', 'start': 271.36, 'duration': 4.701}, {'end': 282.303, 'text': 'the first variable will have a question mark and then the variable name equals, and then all subsequent variables are gonna have this little and sign,', 'start': 276.061, 'duration': 6.242}, {'end': 285.723, 'text': 'and then the variable equals and it continues on like that.', 'start': 282.303, 'duration': 3.42}, {'end': 289.624, 'text': "so that's this is an example of a get request.", 'start': 285.723, 'duration': 3.901}, {'end': 292.085, 'text': "we're getting data based on these.", 'start': 289.624, 'duration': 2.461}, {'end': 295.006, 'text': "Well, actually it's a post right?", 'start': 293.905, 'duration': 1.101}, {'end': 298.769, 'text': "We're getting data based on these posted variables.", 'start': 295.266, 'duration': 3.503}, {'end': 301.331, 'text': "So let's say we want to make a post request.", 'start': 299.289, 'duration': 2.042}, {'end': 308.036, 'text': 'Now, first of all, you could just do a get to this URL, right? 
You could just use a request and put in this URL.', 'start': 301.371, 'duration': 6.665}, {'end': 314.922, 'text': "But the other thing that we can do, and the more Pythonic thing that we're supposed to do, is go to pythonprogramming.net.", 'start': 308.537, 'duration': 6.385}], 'summary': 'Explanation of variable usage in get and post requests.', 'duration': 44.562, 'max_score': 270.36, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ270360.jpg'}], 'start': 197.265, 'title': "Python's standard libraries and requests", 'summary': 'Covers using standard libraries in python for specific tasks such as filtering search results and understanding the structure, syntax, and pythonic approach to perform get and post requests, including an example of sending data using urllib.parse module.', 'chapters': [{'end': 245.219, 'start': 197.265, 'title': 'Using standard libraries in python', 'summary': "Discusses using standard libraries in python to handle specific tasks and demonstrates how to use a search bar on a website to filter results based on keywords, as seen in the example of searching for 'basic' on pythonprogramming.net.", 'duration': 47.954, 'highlights': ['Demonstrating the use of standard libraries in Python to handle specific tasks. The chapter covers how to handle tasks using standard libraries in Python.', "Using a search bar on a website to filter results based on keywords, exemplified by searching for 'basic' on pythonprogramming.net. 
The example illustrates the process of using a search bar to filter results based on keywords, as seen when searching for 'basic' on pythonprogramming.net."]}, {'end': 407.019, 'start': 245.219, 'title': 'Understanding get and post requests', 'summary': 'Discusses the structure of get and post requests, highlighting the syntax and purpose of each method, and demonstrates the pythonic way to perform a post request using the urllib.parse module and a dictionary to send data to a url.', 'duration': 161.8, 'highlights': ['The chapter discusses the structure of GET and POST requests The chapter provides insights into the structure of GET and POST requests and how variables are defined within the URL.', 'Demonstrates the Pythonic way to perform a POST request using the urllib.parse module and a dictionary to send data to a URL The chapter demonstrates the Pythonic method of executing a POST request by utilizing the urllib.parse module and a dictionary to send data to a specific URL.', 'Variables are defined within the URL using specific syntax like a question mark, equal sign, and ampersand The variables in the URL are defined using a specific syntax involving a question mark, equal sign, and ampersand, with subsequent variables denoted by an ampersand.']}], 'duration': 209.754, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ197265.jpg', 'highlights': ['The chapter covers how to handle tasks using standard libraries in Python.', 'The example illustrates the process of using a search bar to filter results based on keywords.', 'The chapter provides insights into the structure of GET and POST requests and how variables are defined within the URL.', 'The chapter demonstrates the Pythonic method of executing a POST request by utilizing the urllib.parse module and a dictionary to send data to a specific URL.', 'The variables in the URL are defined using a specific syntax involving a question mark, equal sign, and 
ampersand, with subsequent variables denoted by an ampersand.']}, {'end': 778.909, 'segs': [{'end': 461.464, 'src': 'embed', 'start': 407.359, 'weight': 0, 'content': [{'end': 414.808, 'text': 'So this is going to be data from the website equals urllib.parse.urlencode.', 'start': 407.359, 'duration': 7.449}, {'end': 417.571, 'text': 'And we want to encode values.', 'start': 415.288, 'duration': 2.283}, {'end': 425.578, 'text': "So first we're just encoding simply values, and what you really encode is going to do is it's going to encode it as it? Should be in the url.", 'start': 419.152, 'duration': 6.426}, {'end': 438.288, 'text': "so, for example, if we go back to where we've been working and we go to, like google.com, and we did a search for Hey, check that out, okay?", 'start': 425.578, 'duration': 12.71}, {'end': 442.432, 'text': "You see that it's hey plus, check, plus that, plus out.", 'start': 438.749, 'duration': 3.683}, {'end': 449.228, 'text': "you could also do the query is this hey, check that out, and least, usually it's not there.", 'start': 442.432, 'duration': 6.796}, {'end': 455.336, 'text': 'it goes okay, and you can see that it has changed now to hey percent 20, check percent, 20, that percent.', 'start': 449.228, 'duration': 6.108}, {'end': 456.197, 'text': 'what was that?', 'start': 455.336, 'duration': 0.861}, {'end': 458.06, 'text': "that's? 
URL encoding percent 20.", 'start': 456.197, 'duration': 1.863}, {'end': 461.464, 'text': 'is the encode of space, okay?', 'start': 458.06, 'duration': 3.404}], 'summary': 'The transcript covers encoding values using urllib.parse.urlencode, demonstrating encoding of values in a url, such as converting spaces to percent 20.', 'duration': 54.105, 'max_score': 407.359, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ407359.jpg'}, {'end': 556.611, 'src': 'embed', 'start': 530.036, 'weight': 4, 'content': [{'end': 537.345, 'text': "So we're going to request now from this URL, pythonprogrammer.net, we're going to pass the following variables, s equals basic, submit equals search.", 'start': 530.036, 'duration': 7.309}, {'end': 548.016, 'text': "And then after rec, we're gonna say resp for response equals urllib.request.urlopen rec.", 'start': 539.008, 'duration': 9.008}, {'end': 553.37, 'text': "okay. so now we're actually URL request that you're all open.", 'start': 548.848, 'duration': 4.522}, {'end': 556.611, 'text': "we're actually visiting the URL now, like we did right up here.", 'start': 553.37, 'duration': 3.241}], 'summary': 'Using pythonprogrammer.net, sending s=basic, submit=search, and visiting the url with urllib.request.urlopen.', 'duration': 26.575, 'max_score': 530.036, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ530036.jpg'}, {'end': 755.873, 'src': 'embed', 'start': 716.035, 'weight': 3, 'content': [{'end': 720.558, 'text': "It's almost like, I don't know, some sort of filter, right? 
If you're not good enough, you can't use it.", 'start': 716.035, 'duration': 4.523}, {'end': 724.1, 'text': "But if you're good enough, I guess you can use their services still with your program.", 'start': 720.598, 'duration': 3.502}, {'end': 732.267, 'text': 'That said, usually websites that block your access, they do it because they offer an API and they want you to use their API.', 'start': 725.905, 'duration': 6.362}, {'end': 733.787, 'text': 'Google offers an API.', 'start': 732.367, 'duration': 1.42}, {'end': 736.748, 'text': "So try to use Google's API before you cheat Google.", 'start': 734.268, 'duration': 2.48}, {'end': 742.41, 'text': "Try to use Wikipedia's API before you start cheating Wikipedia and just programming your way around it.", 'start': 737.088, 'duration': 5.322}, {'end': 750.592, 'text': "Because the API is gonna make it easier on Wikipedia and it's gonna make it easier on YouTube because they don't need to send All of the HTML data.", 'start': 743.25, 'duration': 7.342}, {'end': 752.972, 'text': "They don't need to send serve advertisements right?", 'start': 750.612, 'duration': 2.36}, {'end': 755.873, 'text': "Because your program isn't gonna read it, That kind of stuff.", 'start': 752.992, 'duration': 2.881}], 'summary': "Apis like google and wikipedia's can make accessing data easier and more efficient for developers, reducing the need to bypass website filters.", 'duration': 39.838, 'max_score': 716.035, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ716035.jpg'}], 'start': 407.359, 'title': 'Url encoding and web scraping with python', 'summary': 'Covers url encoding with urllib.parse.urlencode, demonstrating encoding of spaces as percent 20 in urls, and provides a practical example. 
additionally, it discusses web scraping, making requests, dealing with access restrictions, and emphasizes the importance of using apis for web scraping.', 'chapters': [{'end': 461.464, 'start': 407.359, 'title': 'Url encoding with urllib.parse.urlencode', 'summary': 'Explains the process of encoding values using urllib.parse.urlencode, demonstrating how it encodes spaces as percent 20 in urls, as well as providing a practical example of the encoding.', 'duration': 54.105, 'highlights': ["It encodes spaces as percent 20 in URLs, as demonstrated by the example where 'hey, check that out' becomes 'hey%20check%20that%20out'.", 'The process of encoding values using urllib.parse.urlencode is explained, illustrating its function in preparing values to be included in a URL.', "A practical example is given where a search query is transformed from 'hey, check that out' to 'hey%20check%20that%20out' through URL encoding."]}, {'end': 778.909, 'start': 463.065, 'title': 'Web scraping with python', 'summary': 'Discusses the process of encoding data, making a request to visit a website, reading the response, dealing with website access restrictions, and the importance of using apis for web scraping.', 'duration': 315.844, 'highlights': ["The chapter explains the process of encoding data as UTF-8 before making a request to visit pythonprogrammer.net and passing variables like 's=basic' and 'submit=search'.", "It discusses the challenge of website access restrictions and the ease of overcoming basic systems, while highlighting the importance of using APIs like Google's and Wikipedia's for web scraping.", "The chapter emphasizes the significance of using APIs, such as Google's and Wikipedia's, before resorting to bypassing website access restrictions for easier data retrieval and avoiding unnecessary HTML data and advertisements."]}], 'duration': 371.55, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ407359.jpg', 
'highlights': ['Covers url encoding with urllib.parse.urlencode, demonstrating encoding of spaces as percent 20 in urls.', "A practical example is given where a search query is transformed from 'hey, check that out' to 'hey%20check%20that%20out' through URL encoding.", 'The process of encoding values using urllib.parse.urlencode is explained, illustrating its function in preparing values to be included in a URL.', "The chapter emphasizes the significance of using APIs, such as Google's and Wikipedia's, before resorting to bypassing website access restrictions for easier data retrieval and avoiding unnecessary HTML data and advertisements.", "The chapter explains the process of encoding data as UTF-8 before making a request to visit pythonprogrammer.net and passing variables like 's=basic' and 'submit=search'.", "It discusses the challenge of website access restrictions and the ease of overcoming basic systems, while highlighting the importance of using APIs like Google's and Wikipedia's for web scraping."]}, {'end': 1076.091, 'segs': [{'end': 812.781, 'src': 'embed', 'start': 778.949, 'weight': 0, 'content': [{'end': 784.831, 'text': "We're gonna say try x equals urllib.request.urlopen.", 'start': 778.949, 'duration': 5.882}, {'end': 798.935, 'text': "And the URL we're gonna tend to open is https://www.google.com slash search and then question mark.", 'start': 786.231, 'duration': 12.704}, {'end': 800.396, 'text': "So we're defining a variable here.", 'start': 798.995, 'duration': 1.401}, {'end': 804.838, 'text': "uh, we're going to say q equals test.", 'start': 801.876, 'duration': 2.962}, {'end': 807.719, 'text': 'so q stands for query for google.', 'start': 804.838, 'duration': 2.881}, {'end': 809.56, 'text': "so we're going to attempt to visit this url.", 'start': 807.719, 'duration': 1.841}, {'end': 812.781, 'text': 'so this is a search request for the string text.', 'start': 809.56, 'duration': 3.221}], 'summary': 'Using urllib.request.urlopen to visit 
https://www.google.com/search?q=test for search request.', 'duration': 33.832, 'max_score': 778.949, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ778949.jpg'}, {'end': 886.093, 'src': 'embed', 'start': 858.037, 'weight': 1, 'content': [{'end': 862.841, 'text': "otherwise we're going to throw the exception as e and then we'll print the string version of that exception.", 'start': 858.037, 'duration': 4.804}, {'end': 865.163, 'text': "so let's go ahead and run that and you see what happens to us.", 'start': 862.841, 'duration': 2.322}, {'end': 872.29, 'text': 'So we run that and we get HTTP error 403 forbidden.', 'start': 868.749, 'duration': 3.541}, {'end': 877.191, 'text': "We're forbidden because Google says, hey, you're a program and we're gonna go ahead and say no.", 'start': 872.77, 'duration': 4.421}, {'end': 882.772, 'text': "Okay, so if you happen to find yourself in this situation, here's how you get around it.", 'start': 877.871, 'duration': 4.901}, {'end': 886.093, 'text': 'So we try to accept that, we fail.', 'start': 884.192, 'duration': 1.901}], 'summary': 'Http error 403 forbidden when running a program accessing google.', 'duration': 28.056, 'max_score': 858.037, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ858037.jpg'}, {'end': 978.046, 'src': 'embed', 'start': 932.143, 'weight': 2, 'content': [{'end': 933.584, 'text': 'It just sends in a bunch of information on you.', 'start': 932.143, 'duration': 1.441}, {'end': 940.545, 'text': 'and so within your headers there is a data piece of data that is called user agent.', 'start': 934.464, 'duration': 6.081}, {'end': 941.966, 'text': "so now that we've got headers defined,", 'start': 940.545, 'duration': 1.421}, {'end': 949.927, 'text': "let's make some more space and we'll say headers and then square brackets to define a piece of data in this dictionary,", 'start': 941.966, 'duration': 
7.961}, {'end': 953.528, 'text': "and we're going to call this piece of data user dash agent.", 'start': 949.927, 'duration': 3.601}, {'end': 958.749, 'text': 'so user agent is the type of browser, basically, that you are using.', 'start': 953.528, 'duration': 5.221}, {'end': 966.603, 'text': 'so in our case, what, uh, python does is it says python dash, url, lib slash and then your python version.', 'start': 959.329, 'duration': 7.274}, {'end': 967.444, 'text': 'so for me it would be 3.4.', 'start': 966.603, 'duration': 0.841}, {'end': 977.025, 'text': "so Within almost an instant, when you visit a website with Python using the methods that we've shown so far, that website knows exactly what you are.", 'start': 967.444, 'duration': 9.581}, {'end': 978.046, 'text': "They know you're a program.", 'start': 977.045, 'duration': 1.001}], 'summary': 'The user agent in headers identifies the browser and version, allowing websites to recognize and categorize python users.', 'duration': 45.903, 'max_score': 932.143, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ932143.jpg'}, {'end': 1058.577, 'src': 'heatmap', 'start': 1018.828, 'weight': 0.701, 'content': [{'end': 1029.615, 'text': "basically, uh, this tells um, well, this acts like we're using mozilla and then it gives all this other information and all this compatibility stuff.", 'start': 1018.828, 'duration': 10.787}, {'end': 1036.873, 'text': "um, And so, basically it just changes that we're no longer are we announcing ourself as Python.", 'start': 1029.615, 'duration': 7.258}, {'end': 1046.575, 'text': 'Sorry about that.', 'start': 1046.075, 'duration': 0.5}, {'end': 1048.115, 'text': 'I have no idea where I left off.', 'start': 1046.694, 'duration': 1.421}, {'end': 1050.316, 'text': "I'll just kind of start at this point here.", 'start': 1048.155, 'duration': 2.161}, {'end': 1058.577, 'text': "Turns out my dog knows how to open a sliding glass door, so he 
was running around in here when he shouldn't have been.", 'start': 1051.076, 'duration': 7.501}], 'summary': 'Transcript discusses using mozilla and compatibility, with a dog interruption.', 'duration': 39.749, 'max_score': 1018.828, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ1018828.jpg'}], 'start': 778.949, 'title': 'Python http request, exception handling, and web scraping', 'summary': 'Demonstrates making an http request using urllib in python, handling http error 403 forbidden, and discusses user agent manipulation for web scraping to avoid detection and potential restriction.', 'chapters': [{'end': 877.191, 'start': 778.949, 'title': 'Python http request and exception handling', 'summary': 'Demonstrates making an http request using urllib in python to visit a search url, reading the source code of the results, and handling the http error 403 forbidden, encountered due to being identified as a program by google.', 'duration': 98.242, 'highlights': ['Making an HTTP request using urllib to visit a search URL and reading the source code of the results.', 'Handling the HTTP error 403 forbidden encountered due to being identified as a program by Google.', 'Defining variables for the URL and query, attempting to visit the URL, and reading the results.']}, {'end': 1076.091, 'start': 877.871, 'title': 'Python web scraping: user agent manipulation', 'summary': 'Discusses the process of changing the user agent in python to avoid detection by websites, enabling users to access web data without detection and potential restriction.', 'duration': 198.22, 'highlights': ['By changing the user agent in Python, users can avoid detection by websites and access web data without being shut out.', 'User agents contain information on the user, including the type of browser and Python version, making it easier for websites to identify and restrict automated access.', 'The chapter outlines the process of defining and 
modifying headers in Python to manipulate the user agent, thereby enabling users to masquerade as a different browser and access web data undetected.', 'The user agent manipulation aims to fool Google and other websites, allowing for undetected web scraping and data retrieval.']}], 'duration': 297.142, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ778949.jpg', 'highlights': ['Demonstrates making an HTTP request using urllib to visit a search URL and reading the source code of the results.', 'Handling the HTTP error 403 forbidden encountered due to being identified as a program by Google.', 'By changing the user agent in Python, users can avoid detection by websites and access web data without being shut out.', 'The chapter outlines the process of defining and modifying headers in Python to manipulate the user agent, thereby enabling users to masquerade as a different browser and access web data undetected.']}, {'end': 1262.104, 'segs': [{'end': 1142.966, 'src': 'embed', 'start': 1101.295, 'weight': 0, 'content': [{'end': 1105.579, 'text': "Under normal circumstances you would maybe say like if we're making a post request, we could do that.", 'start': 1101.295, 'duration': 4.284}, {'end': 1109.503, 'text': 'And then we would add in the whole search data or the values and make the post.', 'start': 1106.039, 'duration': 3.464}, {'end': 1111.264, 'text': "But instead, we're just going to hard code this for now.", 'start': 1109.523, 'duration': 1.741}, {'end': 1113.627, 'text': 'Feel free to mix them on your own time.', 'start': 1112.185, 'duration': 1.442}, {'end': 1114.628, 'text': 'Homework assignment.', 'start': 1113.867, 'duration': 0.761}, {'end': 1120.058, 'text': "URL and then we're going to say headers equals headers.", 'start': 1116.035, 'duration': 4.023}, {'end': 1129.225, 'text': "okay. 
so we're telling Python now to visit this URL and instead of setting our normal headers, the default parameter headers,", 'start': 1120.058, 'duration': 9.167}, {'end': 1132.027, 'text': "We're going to change these up and call the headers.", 'start': 1129.225, 'duration': 2.802}, {'end': 1134.919, 'text': 'this now my opinion.', 'start': 1132.027, 'duration': 2.892}, {'end': 1142.966, 'text': 'it just kind of makes a little bit of sense to eventually go into urllib.request.request.', 'start': 1134.919, 'duration': 8.047}], 'summary': 'Hard coding the request for now, considering changing headers to urllib.request.request', 'duration': 41.671, 'max_score': 1101.295, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ1101295.jpg'}, {'end': 1248.443, 'src': 'embed', 'start': 1216.904, 'weight': 2, 'content': [{'end': 1217.704, 'text': "And then we're going to write.", 'start': 1216.904, 'duration': 0.8}, {'end': 1224.566, 'text': "We have to write the string version of REST data because right now the response data isn't in string format.", 'start': 1218.304, 'duration': 6.262}, {'end': 1230.629, 'text': "So that's also kind of new-ish if you're coming from Python 2.7.", 'start': 1225.027, 'duration': 5.602}, {'end': 1238.716, 'text': 'And then of course we need to do save file dot close Now the other thing we have not done is we did a try and we have no except yet.', 'start': 1230.629, 'duration': 8.087}, {'end': 1248.443, 'text': "So we're gonna say except exception as E and then we're gonna go ahead and print String e just in case we throw an exception.", 'start': 1238.816, 'duration': 9.627}], 'summary': 'Updating response data to string format, saving file, and adding exception handling.', 'duration': 31.539, 'max_score': 1216.904, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ1216904.jpg'}], 'start': 1076.091, 'title': 'Python url handling and 
response data', 'summary': "Covers making requests in Python with urllib's Request class, modifying urllib headers, and writing response data to a file, offering practical examples for URL handling and data manipulation.", 'chapters': [{'end': 1120.058, 'start': 1076.091, 'title': 'Python request and url handling', 'summary': "The chapter introduces making a request to a URL in Python using urllib's Request class, without passing any data and hard coding the URL and headers.", 'duration': 43.967, 'highlights': ["The chapter introduces making a request to a URL in Python using urllib's Request class.", 'The example demonstrates making a request without passing any data and hard coding the URL and headers.']}, {'end': 1194.553, 'start': 1120.058, 'title': 'Using urllib in python', 'summary': "Explores using Python's urllib to change default headers and make a request to a URL, with a focus on editing the urllib function to set default headers, and extracting a large amount of data from the response.", 'duration': 74.495, 'highlights': ['By changing default headers in urllib, Python can make a request to a URL with customized headers, offering flexibility and control over the request process.', 'The suggestion to edit the urllib function to set default headers enables a more streamlined and efficient approach, potentially improving the usability of the urllib library.', 'When making a request using urllib, the response can contain a large amount of data, such as a complete search result page and its accompanying HTML, highlighting the capability of urllib to handle substantial data retrieval.']}, {'end': 1262.104, 'start': 1194.573, 'title': 'Writing response data to a file', 'summary': 'Explains the process of writing response data to a file in Python, including opening a file, writing the data, handling exceptions, and running the code.', 'duration': 67.531, 'highlights': ['The process involves opening a file with the intention to write, writing the response data in string format, 
and closing the file.', 'The code also includes exception handling, specifically printing the exception string in case an exception is thrown.', 'Running the code without proper handling would cause console lag due to printing the response data directly to the console.']}], 'duration': 186.013, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ1076091.jpg', 'highlights': ['The example demonstrates making a request without passing any data and hard coding the URL and headers.', 'By changing default headers in urllib, Python can make a request to a URL with customized headers, offering flexibility and control over the request process.', 'The process involves opening a file with the intention to write, writing the response data in string format, and closing the file.']}, {'end': 1438.229, 'segs': [{'end': 1366.381, 'src': 'embed', 'start': 1339.87, 'weight': 2, 'content': [{'end': 1345.195, 'text': 'But luckily, regular expressions, being their own programming language basically, are transferable.', 'start': 1339.87, 'duration': 5.325}, {'end': 1348.616, 'text': 'Pretty much anywhere you go, the rules of regular expressions will remain.', 'start': 1345.535, 'duration': 3.081}, {'end': 1352.777, 'text': 'So once you understand the logic of regular expressions, you can take it to any language.', 'start': 1348.676, 'duration': 4.101}, {'end': 1354.478, 'text': "It's a lot like SQL, right?", 'start': 1352.817, 'duration': 1.661}, {'end': 1364.1, 'text': "If you learn SQL, or, as the cool kids say it, SQL, it's its own programming language and you can take it anywhere, to any other programming language,", 'start': 1354.898, 'duration': 9.202}, {'end': 1366.381, 'text': 'and work with SQL or SQL, whatever you want to call it.', 'start': 1364.1, 'duration': 2.281}], 'summary': 'Regular expressions are transferable like sql, applicable in various languages.', 'duration': 26.511, 'max_score': 1339.87, 'thumbnail': 
'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ1339870.jpg'}, {'end': 1401.871, 'src': 'embed', 'start': 1373.72, 'weight': 0, 'content': [{'end': 1376.903, 'text': "And then after we cover it, we'll mesh regular expressions with urllibs.", 'start': 1373.72, 'duration': 3.183}, {'end': 1385.411, 'text': 'So a lot like your basic programs are just a combination of very, or your complex programs rather, are just a combination of very basic tools.', 'start': 1376.943, 'duration': 8.468}, {'end': 1392.36, 'text': 'Even some of these really complex tasks are a lot of times just a combination of really basic modules and tools that you already have.', 'start': 1386.172, 'duration': 6.188}, {'end': 1400.149, 'text': 'Maybe not if statements and all that, but URL lib plus regular expressions equals a pretty darn good website parser already.', 'start': 1392.72, 'duration': 7.429}, {'end': 1401.871, 'text': 'You could also use something like Beautiful Soup.', 'start': 1400.189, 'duration': 1.682}], 'summary': 'Combining urllibs with regular expressions simplifies complex tasks. 
Beautiful Soup is also useful.', 'duration': 28.151, 'max_score': 1373.72, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ1373720.jpg'}], 'start': 1263.244, 'title': 'Introduction to urllib and regular expressions', 'summary': 'Introduces the basics of urllib and regular expressions, emphasizing their versatility and the potential for parsing through messy data obtained from Google search results.', 'chapters': [{'end': 1438.229, 'start': 1263.244, 'title': 'Introduction to urllib and regular expressions', 'summary': 'Introduces the basics of urllib and regular expressions, emphasizing their versatility and the potential for parsing through messy data obtained from Google search results.', 'duration': 174.985, 'highlights': ['The basics of urllib and regular expressions are introduced, highlighting their potential for parsing through messy data obtained from Google search results.', 'Regular expressions are discussed as a separate programming language that can be transferred to different languages, similar to SQL, emphasizing their versatility and applicability.', 'The speaker emphasizes that complex tasks can often be achieved through a combination of basic modules and tools, citing the combination of urllib and regular expressions as a powerful website parser.']}], 'duration': 174.985, 'thumbnail': 'https://coursnap.oss-ap-southeast-1.aliyuncs.com/video-capture/5GzVNi0oTxQ/pics/5GzVNi0oTxQ1263244.jpg', 'highlights': ['The basics of urllib and regular expressions are introduced, highlighting their potential for parsing through messy data obtained from Google search results.', 'The speaker emphasizes that complex tasks can often be achieved through a combination of basic modules and tools, citing the combination of urllib and regular expressions as a powerful website parser.', 'Regular expressions are discussed as a separate programming language that can be transferred to different languages, similar to 
SQL, emphasizing their versatility and applicability.']}], 'highlights': ['The chapter introduces the Python 3 urllib module, explaining how to use it to access the internet and retrieve data from a specified URL.', 'The process of reading the request and obtaining the source code of a website using Python is explained.', 'The importance of importing urllib.request is emphasized, and an example of visiting a website using urllib is showcased.', 'The example illustrates the process of using a search bar to filter results based on keywords.', 'The chapter provides insights into the structure of GET and POST requests and how variables are defined within the URL.', 'The chapter demonstrates the Pythonic method of executing a POST request by utilizing the urllib.parse module and a dictionary to send data to a specific URL.', 'The variables in the URL are defined using a specific syntax involving a question mark, equal sign, and ampersand, with subsequent variables denoted by an ampersand.', 'Covers URL encoding with urllib.parse.urlencode, demonstrating encoding of spaces as %20 in URLs.', "A practical example is given where a search query is transformed from 'hey, check that out' to 'hey%20check%20that%20out' through URL encoding.", 'The process of encoding values using urllib.parse.urlencode is explained, illustrating its function in preparing values to be included in a URL.', "The chapter emphasizes the significance of using APIs, such as Google's and Wikipedia's, before resorting to bypassing website access restrictions for easier data retrieval and avoiding unnecessary HTML data and advertisements.", "The chapter explains the process of encoding data as UTF-8 before making a request to visit pythonprogramming.net and passing variables like 's=basic' and 'submit=search'.", "It discusses the challenge of website access restrictions and the ease of overcoming basic systems, while highlighting the importance of using APIs like Google's and Wikipedia's for web 
scraping.", 'Demonstrates making an HTTP request using urllib to visit a search URL and reading the source code of the results.', 'Handling the HTTP 403 Forbidden error encountered due to being identified as a program by Google.', 'By changing the user agent in Python, users can avoid detection by websites and access web data without being shut out.', 'The chapter outlines the process of defining and modifying headers in Python to manipulate the user agent, thereby enabling users to masquerade as a different browser and access web data undetected.', 'The example demonstrates making a request without passing any data and hard coding the URL and headers.', 'By changing default headers in urllib, Python can make a request to a URL with customized headers, offering flexibility and control over the request process.', 'The process involves opening a file with the intention to write, writing the response data in string format, and closing the file.', 'The basics of urllib and regular expressions are introduced, highlighting their potential for parsing through messy data obtained from Google search results.', 'The speaker emphasizes that complex tasks can often be achieved through a combination of basic modules and tools, citing the combination of urllib and regular expressions as a powerful website parser.', 'Regular expressions are discussed as a separate programming language that can be transferred to different languages, similar to SQL, emphasizing their versatility and applicability.']}
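The opening chapter's steps — import urllib.request, visit a URL, read the source — can be sketched as follows. The URL is taken from the sample-code link above and stands in for any page you want to fetch; this is a minimal sketch, not the video's exact code.

```python
import urllib.request

# Build a request for the page we want to visit; with no extra
# arguments this behaves like typing the URL into a browser (a GET).
url = 'http://pythonprogramming.net'
req = urllib.request.Request(url)

def fetch(request):
    """Open the request and return the raw page source as bytes."""
    with urllib.request.urlopen(request) as resp:
        return resp.read()  # the site's source code, ready for parsing
```

`fetch(req)` returns bytes; call `.decode('utf-8')` on the result to work with the source as text.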
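The POST and URL-encoding highlights boil down to urllib.parse. A sketch, using the 's=basic' and 'submit=search' form fields mentioned in the summary (the exact field values in the video may differ):

```python
import urllib.parse

# urlencode turns a dict of form fields into a query string,
# percent-encoding anything unsafe for a URL.
values = {'s': 'basic', 'submit': 'search'}
query = urllib.parse.urlencode(values)   # -> 's=basic&submit=search'

# POST bodies must be bytes, hence the UTF-8 encoding step from the video.
data = query.encode('utf-8')

# quote() shows the space-to-%20 encoding discussed above.
encoded = urllib.parse.quote('hey check that out')  # -> 'hey%20check%20that%20out'
```

Passing `data` to `urllib.request.urlopen(url, data)` turns the request into a POST; omitting it keeps a GET.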
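The 403/user-agent chapter amounts to passing a headers dict when constructing the Request. A sketch — the User-Agent string below is illustrative only, and the search URL is an assumption modeled on the Google search described above:

```python
import urllib.parse
import urllib.request

# Sites that refuse the default Python user agent (HTTP 403 Forbidden)
# will often serve the same request when the User-Agent header looks
# like an ordinary browser. Illustrative UA string, not a recommendation.
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36'}

query = urllib.parse.urlencode({'q': 'python urllib'})
url = 'https://www.google.com/search?' + query

# headers=headers replaces the default headers, as the transcript describes.
req = urllib.request.Request(url, headers=headers)
```

Opening `req` with `urllib.request.urlopen` then proceeds exactly as in the plain GET case.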
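The "writing response data to a file" chapter — open a file, write the stringified response, close it, wrap the whole thing in try/except — can be sketched like this. The helper name `save_page` and the decode step are my own choices (the video writes `str(respData)`, which keeps the `b'...'` wrapper):

```python
import urllib.request

def save_page(url, path):
    """Fetch url and write its source to path; print errors instead of crashing."""
    try:
        req = urllib.request.Request(url)
        with urllib.request.urlopen(req) as resp:
            resp_data = resp.read()
        # Decode the bytes so the file holds readable text.
        with open(path, 'w', encoding='utf-8') as save_file:
            save_file.write(resp_data.decode('utf-8', errors='replace'))
    except Exception as e:
        print(str(e))  # e.g. 'HTTP Error 403: Forbidden'
```

Printing the page to the console instead would lag the interpreter, as noted above; writing to a file avoids that.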
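Finally, the closing chapter pairs urllib with regular expressions to parse the fetched source. A sketch using the `re` module — the sample HTML string is a stand-in for a real page's source so the idea is visible without a network call:

```python
import re

# Pretend this came back from urlopen(...).read().decode('utf-8').
sample_source = '<p>Fetched with urllib.</p><p>Parsed with re.</p>'

# findall with a non-greedy capture group pulls the text out of each
# <p>...</p> pair -- the "website parser" combination described above.
paragraphs = re.findall(r'<p>(.*?)</p>', sample_source)
```

For anything beyond quick extractions, an HTML parser such as Beautiful Soup (also mentioned in the transcript) is more robust than regex alone.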