Advanced Search
Some HTML tags rely on particular attributes: an anchor (<a>) tag normally carries an href attribute, and the <img> tag requires a src attribute. A tag's attributes are stored in its .attrs dictionary, so you can read a specific one by calling .get() on .attrs; it returns None if the attribute is absent. For example, retrieve the src attribute of every <img> element:
    # Importing libraries
    from bs4 import BeautifulSoup
    from urllib.request import urlopen

    # Reading web page
    url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/page.html"
    page = urlopen(url)
    html = page.read().decode("utf-8")

    # Reading HTML with BeautifulSoup
    soup = BeautifulSoup(html, "html.parser")

    for img in soup.find_all("img"):
        print(img.attrs.get("src"))
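The behaviour of .get() on a missing attribute is easier to see on a small inline document. The sketch below uses a made-up HTML string (not the lesson page) in which the second <img> has no src attribute. It also shows that calling .get() directly on the tag is a shortcut for .attrs.get().

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML, used only for illustration
html = '<img src="a.png" alt="A"><img alt="no source">'
soup = BeautifulSoup(html, "html.parser")

# .attrs is a plain dictionary, so .get() returns None for a missing key
sources = [img.attrs.get("src") for img in soup.find_all("img")]
print(sources)  # ['a.png', None]

# Calling .get() on the tag itself is equivalent to .attrs.get()
first, second = soup.find_all("img")
shortcut = first.get("src")
# A default value can be supplied for absent attributes
fallback = second.get("src", "missing")
print(shortcut, fallback)  # a.png missing
```

Indexing with img["src"] would raise a KeyError for the second image, which is why .get() is the safer choice when an attribute may be absent.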
You may also come across the id attribute, which identifies a single element among others with the same tag. To search for elements with specific attribute values, pass a dictionary of the form {attr_name: attr_value} to the .find_all() method as the second argument, right after the tag name. For example, find all <div> elements whose class attribute is "box" and the <p> element whose id attribute is "id2":
    # Importing libraries
    from bs4 import BeautifulSoup
    from urllib.request import urlopen

    # Reading web page
    url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/page.html"
    page = urlopen(url)
    html = page.read().decode("utf-8")

    # Reading HTML with BeautifulSoup
    soup = BeautifulSoup(html, "html.parser")

    for div in soup.find_all("div", {"class": "box"}):
        print(div)

    # Filtering by id attribute value
    print(soup.find("p", {"id": "id2"}))
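The dictionary filter can also be written with keyword arguments. The sketch below runs on a made-up HTML string (the element contents are assumptions, not taken from the lesson page). It compares the positional dictionary, the attrs= keyword, and the class_ keyword, which has a trailing underscore because class is a reserved word in Python.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML, used only for illustration
html = """
<div class="box">first</div>
<div class="note">second</div>
<div class="box">third</div>
<p id="id1">one</p>
<p id="id2">two</p>
"""
soup = BeautifulSoup(html, "html.parser")

# Three equivalent ways to filter <div> elements by class
by_dict = soup.find_all("div", {"class": "box"})
by_attrs = soup.find_all("div", attrs={"class": "box"})
by_keyword = soup.find_all("div", class_="box")

texts = [div.get_text() for div in by_dict]
same = by_dict == by_attrs == by_keyword
print(texts, same)  # ['first', 'third'] True

# Filtering by id with a keyword argument
paragraph = soup.find("p", id="id2")
print(paragraph.get_text())  # two

# .find() returns None when nothing matches
missing = soup.find("p", {"id": "no-such-id"})
print(missing)  # None
```

The dictionary form is handy when an attribute name is not a valid Python identifier (for example, data-* attributes), while the keyword form is shorter for common cases.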
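The .find() method is used instead of .find_all() to get an element by its id because an id is meant to be unique within a page, so at most one match is expected; .find() returns the first matching element, or None if there is none. To confirm that only the intended <div> elements were retrieved, print the class attribute of every <div> on the page and compare the result with the filtered output: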
    # Importing libraries
    from bs4 import BeautifulSoup
    from urllib.request import urlopen

    # Reading web page
    url = "https://codefinity-content-media.s3.eu-west-1.amazonaws.com/18a4e428-1a0f-44c2-a8ad-244cd9c7985e/page.html"
    page = urlopen(url)
    html = page.read().decode("utf-8")

    # Reading HTML with BeautifulSoup
    soup = BeautifulSoup(html, "html.parser")

    for div in soup.find_all("div"):
        print(div.attrs.get("class"))
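When reading the output of this check, note that BeautifulSoup treats class as a multi-valued attribute: it comes back as a list of class names, not a single string, and a <div> with no class yields None. The sketch below, again on a made-up HTML string, also shows that filtering on "box" matches any element that has "box" among its classes.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML, used only for illustration
html = '<div class="box big">a</div><div class="box">b</div><div>c</div>'
soup = BeautifulSoup(html, "html.parser")

# class is returned as a list of names; a missing class gives None
classes = [div.get("class") for div in soup.find_all("div")]
# classes holds ['box', 'big'], then ['box'], then None

# Filtering on one class name also matches elements with extra classes
matched = [div.get_text() for div in soup.find_all("div", {"class": "box"})]
print(matched)  # ['a', 'b']
```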