Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Lære Attributes and Contents of Multiple Elements | Working with Element Attributes in Beautiful Soup
Web Scraping with Python

book
Attributes and Contents of Multiple Elements

All the methods discussed in the previous chapter can be applied to all elements with a specific tag (i.e., to the result of the .find_all() method). However, it's essential to keep in mind that the outcome of applying the .find_all() method is a list, so you must use attributes and methods for each element individually. Just as we did previously, you should employ a for loop in this context as well. For example, let's retrieve all the attributes of all <div> elements.

# Importing libraries
from bs4 import BeautifulSoup
from urllib.request import urlopen

# Reading web page
url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html"
page = urlopen(url)
html = page.read().decode("utf-8")

# Reading HTML with BeautifulSoup
soup = BeautifulSoup(html, "html.parser")
for div in soup.find_all("div"):
print(div.attrs)
12345678910111213
# Importing libraries from bs4 import BeautifulSoup from urllib.request import urlopen # Reading web page url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html" page = urlopen(url) html = page.read().decode("utf-8") # Reading HTML with BeautifulSoup soup = BeautifulSoup(html, "html.parser") for div in soup.find_all("div"): print(div.attrs)
copy

The same approach applies to extracting text. For instance, let's obtain all the text from all the <p> elements.

# Importing libraries
from bs4 import BeautifulSoup
from urllib.request import urlopen

# Reading web page
url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html"
page = urlopen(url)
html = page.read().decode("utf-8")

# Reading HTML with BeautifulSoup
soup = BeautifulSoup(html, "html.parser")
for p in soup.find_all("p"):
print(p.get_text())
12345678910111213
# Importing libraries from bs4 import BeautifulSoup from urllib.request import urlopen # Reading web page url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html" page = urlopen(url) html = page.read().decode("utf-8") # Reading HTML with BeautifulSoup soup = BeautifulSoup(html, "html.parser") for p in soup.find_all("p"): print(p.get_text())
copy

Alt var klart?

Hvordan kan vi forbedre det?

Takk for tilbakemeldingene dine!

Seksjon 3. Kapittel 3
some-alt