Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Learn Working with Specific Elements | Decoding HTML with Beautiful Soup
Web Scraping with Python

bookWorking with Specific Elements

Navigating an HTML document using Python attributes retrieves only the first occurrence of an element. If you want to find the first instance of an element without knowing its full path, use the .find() method and pass the tag name as a string (without < > brackets). For example, locate the first <div> element in the HTML document.

123456789101112
# Importing libraries from bs4 import BeautifulSoup from urllib.request import urlopen # Reading web page url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html" page = urlopen(url) html = page.read().decode("utf-8") # Reading HTML with BeautifulSoup soup = BeautifulSoup(html, "html.parser") print(soup.find("div"))
copy

You can also retrieve all instances of a specific element using the .find_all() method. It returns a list of all matches. For example, find all <p> tags in the HTML document.

123456789101112
# Importing libraries from bs4 import BeautifulSoup from urllib.request import urlopen # Reading web page url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html" page = urlopen(url) html = page.read().decode("utf-8") # Reading HTML with BeautifulSoup soup = BeautifulSoup(html, "html.parser") print(soup.find_all("p"))
copy

You can use the .find_all() method to locate multiple tags by passing a list of tag names. For example, collect all <div> and <title> elements.

12345678910111213
# Importing libraries from bs4 import BeautifulSoup from urllib.request import urlopen # Reading web page url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/page.html" page = urlopen(url) html = page.read().decode("utf-8") # Reading HTML with BeautifulSoup soup = BeautifulSoup(html, "html.parser") for el in soup.find_all(["div", "title"]): print(el)
copy
Everything was clear?

How can we improve it?

Thanks for your feedback!

SectionΒ 2. ChapterΒ 5

Ask AI

expand

Ask AI

ChatGPT

Ask anything or try one of the suggested questions to begin our chat

Awesome!

Completion rate improved to 4.35

bookWorking with Specific Elements

Swipe to show menu

Navigating an HTML document using Python attributes retrieves only the first occurrence of an element. If you want to find the first instance of an element without knowing its full path, use the .find() method and pass the tag name as a string (without < > brackets). For example, locate the first <div> element in the HTML document.

123456789101112
# Importing libraries from bs4 import BeautifulSoup from urllib.request import urlopen # Reading web page url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html" page = urlopen(url) html = page.read().decode("utf-8") # Reading HTML with BeautifulSoup soup = BeautifulSoup(html, "html.parser") print(soup.find("div"))
copy

You can also retrieve all instances of a specific element using the .find_all() method. It returns a list of all matches. For example, find all <p> tags in the HTML document.

123456789101112
# Importing libraries from bs4 import BeautifulSoup from urllib.request import urlopen # Reading web page url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html" page = urlopen(url) html = page.read().decode("utf-8") # Reading HTML with BeautifulSoup soup = BeautifulSoup(html, "html.parser") print(soup.find_all("p"))
copy

You can use the .find_all() method to locate multiple tags by passing a list of tag names. For example, collect all <div> and <title> elements.

12345678910111213
# Importing libraries from bs4 import BeautifulSoup from urllib.request import urlopen # Reading web page url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/page.html" page = urlopen(url) html = page.read().decode("utf-8") # Reading HTML with BeautifulSoup soup = BeautifulSoup(html, "html.parser") for el in soup.find_all(["div", "title"]): print(el)
copy
Everything was clear?

How can we improve it?

Thanks for your feedback!

SectionΒ 2. ChapterΒ 5
some-alt