Summary  
This chapter demonstrates how to use Python’s urllib.request.urlopen to fetch an HTTPResponse object from a URL and then apply .read() and .decode("utf-8") to convert the raw bytes into a human-readable HTML string.

General domain of usage  
Web content retrieval (web scraping)

Now that you're acquainted with the fundamentals of __HTML__, let's explore a first way of working with it in __Python__.


One of the modules you can use to load __HTML__ pages in __Python__ is `urllib.request`. Import its `urlopen` __function__ and pass the __URL__ of the page you want to open as an __argument__.

# Importing the module
from urllib.request import urlopen

# Opening web page
url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/mother.html"
page = urlopen(url)
print(page)

As the output of the example above shows, `urlopen` returns an `http.client.HTTPResponse` object rather than the page's __HTML__. To get the __HTML__ itself, call `.read()` on the response to obtain the raw bytes, then `.decode("utf-8")` to convert those bytes into a string.

# Importing the module
from urllib.request import urlopen

# Opening web page
url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/mother.html"
page = urlopen(url)

# Reading and decoding
web_page = page.read().decode("utf-8")
print(type(web_page))
print(web_page)

As a result of applying the `.read()` and `.decode()` methods, you obtain a string. When printed, this string shows the __HTML__ with its original line breaks and indentation, so it is easy to read, and you can apply string methods to it.
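To illustrate what applying string methods to the decoded page can look like, here is a minimal sketch. It does not fetch anything: the `web_page` value below is a small hypothetical HTML string standing in for the decoded page, and the tag names it searches for are assumptions made for the example.

```python
# Hypothetical HTML string standing in for the decoded web_page
web_page = (
    "<html>\n"
    "<head><title>Sample Page</title></head>\n"
    "<body><p>Hello</p></body>\n"
    "</html>"
)

# Locate the text between <title> and </title> with str.find and slicing
start = web_page.find("<title>") + len("<title>")
end = web_page.find("</title>")
title = web_page[start:end]
print(title)

# Count how many paragraph tags the page contains
paragraph_count = web_page.count("<p>")
print(paragraph_count)

# Check how the document starts, ignoring letter case
starts_with_html = web_page.lower().startswith("<html>")
print(starts_with_html)
```

Plain string methods are enough for quick checks like these, but they become fragile on real pages, which is why a dedicated parser is usually preferred for anything more involved.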

If the `.decode()` method weren't applied, you would get a __bytes__ object instead. Printing it shows the entire __HTML__ page on a single line, with line breaks and non-ASCII characters displayed as escape sequences such as `\n`. Feel free to experiment with it!
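The difference between the two types can be seen without a network connection. In the sketch below, `raw` is a hypothetical bytes literal standing in for what `page.read()` would return; its content is invented for the example.

```python
# Hypothetical bytes standing in for the result of page.read()
raw = b"<html>\n  <body>\n    <h1>Caf\xc3\xa9</h1>\n  </body>\n</html>"

print(type(raw))
# Printed on one line; \n and \xc3\xa9 appear as escape sequences
print(raw)

# Decoding turns the UTF-8 bytes into a regular string
text = raw.decode("utf-8")

print(type(text))
# Line breaks and the accented character are now rendered normally
print(text)
```

Note that `"utf-8"` must match the encoding the page was actually saved in; decoding with the wrong encoding can raise a `UnicodeDecodeError` or produce garbled characters.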

Opening HTML File