Web Scraping with Python
Several HTML tags usually carry particular attributes: the anchor tag <a> needs the href attribute to act as a link, and <img> requires the src attribute. If you are interested in only one specific attribute, call the .get() method on .attrs. For instance, let's get the src attribute of every <img> element.
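A minimal sketch of this step, assuming the BeautifulSoup library (bs4) is installed. The HTML string and the file names in it are hypothetical and stand in for a real downloaded page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used only for illustration
html = """
<html><body>
  <img src="cat.png" alt="A cat">
  <img src="dog.png" alt="A dog">
  <img alt="No source here">
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# .attrs is a plain dict, so .get() returns None for a missing key
# instead of raising KeyError
img_sources = [img.attrs.get("src") for img in soup.find_all("img")]
print(img_sources)  # ['cat.png', 'dog.png', None]
```

Using .get() rather than square brackets keeps the loop from failing on an <img> element that has no src attribute.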
You will also often meet the id attribute, which is widely used to tell apart elements that share the same tag. If you are interested in specific attribute values, you can pass them as a dictionary (in the format {attr_name: attr_value}) as the second argument of .find_all(), right after the tag you are looking for. For instance, we may want only the <div> elements whose class attribute is 'box', or the <p> element whose id attribute is 'id2'.
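Both lookups could be sketched as follows, assuming BeautifulSoup (bs4) is installed. The HTML string is a hypothetical sample page, not the one used in the course.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used only for illustration
html = """
<div class="box"><p id="id1">First</p></div>
<div class="box"><p id="id2">Second</p></div>
<div class="footer"><p id="id3">Third</p></div>
"""

soup = BeautifulSoup(html, "html.parser")

# All <div> elements whose class attribute contains 'box'
boxes = soup.find_all("div", {"class": "box"})
print(len(boxes))  # 2

# The single <p> element whose id attribute equals 'id2'
second = soup.find("p", {"id": "id2"})
print(second.get_text())  # Second
```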
We used the .find() method (instead of .find_all()) to get the element with a specific id because an id is meant to be a unique identifier, so a valid page contains at most one element with a given value. To make sure that we got only the <div> elements we wanted, let's see what classes these <div> elements have.
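One way to run this check, again assuming BeautifulSoup (bs4) and a hypothetical sample page. Note that BeautifulSoup treats class as a multi-valued attribute, so each value comes back as a list of class names.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used only for illustration
html = """
<div class="box">One</div>
<div class="box highlighted">Two</div>
<div class="footer">Three</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Collect the class list of every <div> matched by the 'box' filter
box_classes = [div.attrs.get("class")
               for div in soup.find_all("div", {"class": "box"})]
print(box_classes)  # [['box'], ['box', 'highlighted']]
```

The second <div> is matched as well because the filter checks whether 'box' is one of the element's classes, not whether it is the only one.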