Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
学ぶ Attributes and Contents of Element | Working with Element Attributes in Beautiful Soup
Web Scraping with Python

bookAttributes and Contents of Element

メニューを表示するにはスワイプしてください

The methods covered earlier return specific parts of the HTML code. BeautifulSoup also allows you to access the attributes and contents of particular elements. To get an element’s attributes, use the .attrs attribute. For example, retrieve the attributes of the first <div> element.

123456789101112
# Importing libraries from bs4 import BeautifulSoup from urllib.request import urlopen # Reading web page url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html" page = urlopen(url) html = page.read().decode("utf-8") # Reading HTML with BeautifulSoup soup = BeautifulSoup(html, "html.parser") print(soup.find("div").attrs)
copy

The result of using the .attrs attribute is a dictionary where the keys are attribute names and the values are their corresponding values. To get the content inside a tag, use the .contents attribute. For example, check the contents of the first <div> element.

123456789101112
# Importing libraries from bs4 import BeautifulSoup from urllib.request import urlopen # Reading web page url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html" page = urlopen(url) html = page.read().decode("utf-8") # Reading HTML with BeautifulSoup soup = BeautifulSoup(html, "html.parser") print(soup.find("div").contents)
copy

As observed above, all the newline characters were included in a list of elements, which may not be the most desirable representation of content. If you want to extract only the text within a specific element, utilize the .get_text() method. Compare the results from the example below with the one obtained earlier.

123456789101112
# Importing libraries from bs4 import BeautifulSoup from urllib.request import urlopen # Reading web page url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/jesus.html" page = urlopen(url) html = page.read().decode("utf-8") # Reading HTML with BeautifulSoup soup = BeautifulSoup(html, "html.parser") print(soup.find("div").get_text())
copy
すべて明確でしたか?

どのように改善できますか?

フィードバックありがとうございます!

セクション 3.  1

AIに質問する

expand

AIに質問する

ChatGPT

何でも質問するか、提案された質問の1つを試してチャットを始めてください

セクション 3.  1
some-alt