Web Scraping with Python

Advanced Search

Some HTML tags rely on particular attributes: an anchor tag (<a>) needs the href attribute to work as a link, and <img> requires the src attribute. If you are interested in only one specific attribute, apply the .get() method after .attrs. For instance, let's get the src attribute of every <img> element.
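A minimal sketch of this step, assuming the course works with BeautifulSoup (the bs4 package) and Python's built-in "html.parser". The HTML snippet and the variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical page used only for illustration.
html = """
<html><body>
  <img src="cat.png" alt="A cat">
  <img src="dog.jpg">
  <img alt="No source here">
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# .attrs is a plain dict, so .get() returns None instead of raising
# KeyError when an <img> has no src attribute.
sources = [img.attrs.get("src") for img in soup.find_all("img")]
print(sources)  # ['cat.png', 'dog.jpg', None]
```

Using .get() rather than img.attrs["src"] keeps the loop from failing on a malformed <img> that lacks the attribute.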

You will also often meet the id attribute, which is widely used to tell apart elements that share the same tag. To search for specific attribute values, pass them to .find_all() as a dictionary (in the format attr_name: attr_value) right after the tag you are looking for. For instance, we may want only the <div> elements whose class attribute is 'box', or the <p> element whose "id" attribute has the value "id2".
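The two searches described above might look like the sketch below. It assumes BeautifulSoup (bs4) with the built-in "html.parser"; the sample HTML and the names boxes and target are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical page used only for illustration.
html = """
<div class="box"><p id="id1">First</p></div>
<div class="box wide"><p id="id2">Second</p></div>
<div class="footer"><p>Third</p></div>
"""

soup = BeautifulSoup(html, "html.parser")

# The attribute dictionary goes right after the tag name.
boxes = soup.find_all("div", {"class": "box"})

# An id should identify a single element, so .find() is enough here.
target = soup.find("p", {"id": "id2"})

print(len(boxes))         # 2
print(target.get_text())  # Second
```

Note that BeautifulSoup matches class values one by one, so the <div> with class="box wide" is also returned by the search for 'box'.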

We used the .find() method (instead of .find_all()) to get the element with a specific id because an id is meant to be a unique identifier, so a page should contain no more than one element with a given value. To make sure that we got only the <div> elements we wanted, let's see what classes the <div> elements have.
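One way to run this check is to compare the classes of every <div> with the classes of the filtered ones. This is a sketch under the same assumptions (BeautifulSoup with "html.parser"); the sample HTML and variable names are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical page used only for illustration.
html = """
<div class="box"><p id="id1">First</p></div>
<div class="box wide"><p id="id2">Second</p></div>
<div class="footer"><p>Third</p></div>
"""

soup = BeautifulSoup(html, "html.parser")

# class can hold several values, so BeautifulSoup returns it as a list.
all_classes = [div.get("class") for div in soup.find_all("div")]
box_classes = [div.get("class") for div in soup.find_all("div", {"class": "box"})]

print(all_classes)  # [['box'], ['box', 'wide'], ['footer']]
print(box_classes)  # [['box'], ['box', 'wide']]
```

The 'footer' <div> is absent from the second list, which confirms that the filter kept only elements carrying the 'box' class.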

Section 3.

Chapter 5