Web Scraping with Python
Now that you are familiar with the main aspects of HTML, let's learn the first way to work with it in Python.
One of the modules you can use to handle HTML pages in Python is urllib.request. Import its urlopen function, which opens web pages, and pass the URL of the page you want to open as its argument.
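A minimal sketch of that call is shown here. So that it runs without internet access, it starts a small local server first; PAGE, DemoHandler and the 127.0.0.1 address are assumptions made only for this illustration. In practice you would pass the address of a real page to urlopen.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Hypothetical page content served by the local demo server.
PAGE = b"<html><body><h1>Hello</h1></body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    """Answers every GET request with the same small HTML page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo output free of request logs


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

# The part this lesson is about: pass the URL to urlopen.
response = urlopen(url)
response_type = type(response)
print(response_type)

# Clean up the connection and the demo server.
response.close()
server.shutdown()
server.server_close()
```

Only the import of urlopen and the urlopen(url) call belong to the technique itself; everything else exists to make the example self-contained.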
Calling urlopen returns an http.client.HTTPResponse object, which differs from what we wanted. To get the HTML itself, apply the .read() and .decode("utf-8") methods to that object: .read() returns the page content as bytes, and .decode("utf-8") converts those bytes into a string.
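The two steps might look like the sketch below. To stay runnable offline, it packs a small hypothetical page (html_source) into a data: URL, which urlopen also accepts. The response object for a data: URL is of a different class than for an http address, but its .read() method returns bytes in the same way.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical page, packed into a data: URL so nothing is downloaded.
html_source = "<html>\n  <body>\n    <h1>Hello</h1>\n  </body>\n</html>"
url = "data:text/html;charset=utf-8," + quote(html_source)

with urlopen(url) as response:
    raw = response.read()       # the page content as a bytes object

html = raw.decode("utf-8")      # the same content as a str

print(type(raw))
print(type(html))
print(html)
```

The encoding name passed to .decode() should match the encoding the page was written in; utf-8 is assumed here.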
After applying the .read() and .decode() methods, you get a string as the result. This string stores the HTML structure in a readable format, with its line breaks preserved, so you can read it easily and apply string methods to it.
If the .decode() method had not been applied, you would receive a bytes object. When printed, it shows the whole HTML page on a single line, with escape sequences such as \n in place of line breaks and other special characters. Feel free to try!
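The difference can be tried without opening any page. In this small sketch, the page variable is a made-up example that stands in for downloaded content.

```python
# Hypothetical page text; encoding it imitates what .read() would return.
page = "<html>\n  <body>\n    <p>Café</p>\n  </body>\n</html>"
raw = page.encode("utf-8")

# Printing bytes shows one long line with escapes such as \n and \xc3\xa9.
print(raw)

# Decoding restores the line breaks and the accented character.
text = raw.decode("utf-8")
print(text)
```

Once decoded, ordinary string methods such as .find() or .splitlines() can be applied to text.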