Web Scraping with Python

Attributes & Contents of Element

The methods discussed in the previous sections return specific parts of the HTML code. BeautifulSoup also lets us retrieve the attributes and contents of particular elements. To access the attributes of an element, use its .attrs attribute. For instance, we can retrieve the attributes of the first <div> element.
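A minimal sketch of reading .attrs from the first <div> element. The HTML string and the variable names (html, first_div, div_attrs) are made up for illustration and are not taken from the course page.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration.
html = """
<html>
  <body>
    <div id="main" class="container wide">
      <p>First paragraph</p>
    </div>
    <div id="footer">Footer text</div>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching element.
first_div = soup.find("div")

# .attrs is a dictionary of the tag's attributes.
div_attrs = first_div.attrs
print(div_attrs)  # {'id': 'main', 'class': ['container', 'wide']}

# A single attribute can be read like a dictionary key.
print(first_div["id"])  # main

# .get() returns None instead of raising KeyError for a missing attribute.
print(first_div.get("title"))  # None
```

Note that class comes back as a list, because BeautifulSoup treats it as a multi-valued attribute.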

Note that .attrs returns a dictionary whose keys are attribute names and whose values are the corresponding attribute values. To obtain the content stored within a tag, use the .contents attribute, which returns a list of the tag's direct children. For example, we can examine the contents of the first <div> element.
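The following sketch shows what .contents might return for a small <div>. The HTML string and variable names are assumptions chosen for illustration.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical HTML used only for illustration; note the newlines between tags.
html = (
    "<div>\n"
    "<h1>Title</h1>\n"
    "<p>Some text</p>\n"
    "</div>"
)

soup = BeautifulSoup(html, "html.parser")
first_div = soup.find("div")

# .contents is a list of the direct children: tags and text strings alike.
children = first_div.contents
print(children)  # ['\n', <h1>Title</h1>, '\n', <p>Some text</p>, '\n']

# Keep only the child tags, dropping the newline strings between them.
tag_children = [child for child in children if isinstance(child, Tag)]
print([child.name for child in tag_children])  # ['h1', 'p']
```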

The list returned by .contents includes every direct child of the tag, so the newline characters between nested tags appear as separate items, which is rarely the most convenient representation of the content. To extract only the text within a specific element, use the .get_text() method, which returns the text of the element and all of its descendants as a single string.
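As a rough comparison with .contents, the sketch below calls .get_text() on the same kind of <div>. The HTML string and variable names are again hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration.
html = (
    "<div>\n"
    "<h1>Title</h1>\n"
    "<p>Some text</p>\n"
    "</div>"
)

soup = BeautifulSoup(html, "html.parser")
first_div = soup.find("div")

# Without arguments, get_text() concatenates all text, newlines included.
raw_text = first_div.get_text()
print(repr(raw_text))  # '\nTitle\nSome text\n'

# separator and strip give a cleaner, single-line result.
clean_text = first_div.get_text(separator=" ", strip=True)
print(clean_text)  # Title Some text
```

The separator and strip arguments are optional. They are shown here because the default output still carries the surrounding newlines.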
