Learn Attributes and Contents of Element | Working with Element Attributes in Beautiful Soup


The methods covered earlier return specific parts of the HTML code. BeautifulSoup also lets you access the attributes and contents of individual elements. To get an element’s attributes, use the .attrs attribute. For example, the code below retrieves the attributes of the first <div> element.


# Importing libraries
from bs4 import BeautifulSoup
from urllib.request import urlopen

# Reading web page
url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html"
page = urlopen(url)
html = page.read().decode("utf-8")

# Reading HTML with BeautifulSoup
soup = BeautifulSoup(html, "html.parser")
print(soup.find("div").attrs)
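
The example above needs network access, so here is a minimal offline sketch of the same idea. The inline HTML string and its attribute names (id, class, data-role) are hypothetical stand-ins and are not taken from the lesson's page. The sketch shows that .attrs behaves like a regular dictionary, and that a single attribute can be read directly from the tag.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration (not the lesson's page)
html = '<div id="main" class="box highlighted" data-role="container"><p>Hello</p></div>'

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div")

# .attrs is a dictionary of attribute names and values
attrs = div.attrs
print(attrs)

# A tag supports dictionary-style access to a single attribute
div_id = div["id"]

# Multi-valued attributes such as class are returned as a list
classes = div["class"]

# .get() returns None for a missing attribute instead of raising KeyError
missing = div.get("title")

print(div_id, classes, missing)
```

Using .get() is usually the safer choice when an attribute may be absent, because indexing with square brackets raises KeyError for a missing name.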

The .attrs attribute returns a dictionary whose keys are attribute names and whose values are the corresponding attribute values. To get the content inside a tag, use the .contents attribute, which returns a list of the tag's direct children. For example, the code below checks the contents of the first <div> element.


# Importing libraries
from bs4 import BeautifulSoup
from urllib.request import urlopen

# Reading web page
url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html"
page = urlopen(url)
html = page.read().decode("utf-8")

# Reading HTML with BeautifulSoup
soup = BeautifulSoup(html, "html.parser")
print(soup.find("div").contents)
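
To see what .contents returns without fetching the page, the sketch below parses a small, hypothetical HTML string (an assumption for illustration, not the lesson's page). It shows that the line breaks between tags become separate string items in the list, and one possible way to keep only the child tags.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical HTML used only for illustration (not the lesson's page)
html = """<div>
<h1>Title</h1>
<p>First paragraph</p>
</div>"""

soup = BeautifulSoup(html, "html.parser")
children = soup.find("div").contents

# The list holds five items: three newline strings and two tags
print(children)

# Keep only the child tags, dropping the text nodes between them
tags_only = [child for child in children if isinstance(child, Tag)]
print(tags_only)
```

Filtering with isinstance is only one option. It discards every text node that is a direct child, so it suits cases where just the nested tags matter.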

As the example above shows, .contents returns a list of the element's children, and the newline characters between tags appear in it as separate items, which is rarely the most convenient representation. To extract only the text within a specific element, use the .get_text() method. Compare the result of the example below with the one obtained earlier.


# Importing libraries
from bs4 import BeautifulSoup
from urllib.request import urlopen

# Reading web page
url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html"
page = urlopen(url)
html = page.read().decode("utf-8")

# Reading HTML with BeautifulSoup
soup = BeautifulSoup(html, "html.parser")
print(soup.find("div").get_text())
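
By default, .get_text() still keeps the whitespace found between tags. The sketch below, which again uses a hypothetical inline HTML string instead of the lesson's page, compares the default call with the separator and strip arguments that .get_text() accepts.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration (not the lesson's page)
html = """<div>
<h1>Title</h1>
<p>First paragraph</p>
</div>"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div")

# Default call: all text is concatenated, newlines included
raw_text = div.get_text()
print(repr(raw_text))  # '\nTitle\nFirst paragraph\n'

# strip=True trims each piece of text and skips empty ones;
# separator is placed between the remaining pieces
clean_text = div.get_text(separator=" ", strip=True)
print(clean_text)  # Title First paragraph
```

The strip=True form is often more convenient when the text goes into a table or a CSV file, since it removes the layout whitespace of the HTML source.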
