Automating Data Collection from Web Sources

Store Scraped Data Into a Pandas DataFrame

Storing scraped data in a pandas DataFrame makes the data easy to manipulate and analyze. pandas is a Python library that provides easy-to-use data structures and data analysis tools.

A DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it as a spreadsheet, SQL table, or a dictionary of Series objects. It is generally the most commonly used pandas object.
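
The "dictionary of Series" view can be made concrete with a minimal sketch. The country names and population figures below are hypothetical sample values chosen only for illustration, and the column label "Population" is an assumption that does not appear in the lesson.

```python
import pandas as pd

# Hypothetical sample values, used only to illustrate the structure.
names = pd.Series(["Andorra", "Austria"])
populations = pd.Series([84000, 8205000])

# Building a DataFrame from a dictionary of Series:
# each key becomes a column label, each Series becomes a column.
df = pd.DataFrame({"Country": names, "Population": populations})

print(df.dtypes)  # each column keeps its own data type
print(df.shape)   # (number of rows, number of columns)
```

Each column can hold a different type (text in one, integers in the other), which is what sets a DataFrame apart from a plain 2-dimensional array.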

Task

  1. Import pandas and initialize an empty DataFrame;
  2. Scrape the country name (find all instances on the web page);
  3. Scrape the capital city (find all instances on the web page);
  4. Append the scraped values (country_name, capital) to the DataFrame.

Solution

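import pandas as pd

# `soup` is assumed to be the BeautifulSoup object for the page, parsed beforehand.
col_names = ["Country", "Capital City"]
countries = pd.DataFrame(columns=col_names)

# Each country sits in its own <div class="col-md-4 country"> block.
for item in soup.find_all("div", {"class": "col-md-4 country"}):
    country_name = item.find_all("h3", {"class": "country-name"})[0].text.strip()
    capital = item.find_all("span", {"class": "country-capital"})[0].text
    # Append the pair as a new row at the next integer index.
    countries.loc[len(countries)] = country_name, capital

countries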
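
The solution relies on a `soup` object created elsewhere. The sketch below is a self-contained variant that can be run on its own. The HTML fragment is a hypothetical stand-in that only mimics the class names used above; it is not the actual page. The sketch also swaps in a different technique: it collects the rows in a list and builds the DataFrame once at the end, rather than appending with `.loc` inside the loop.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical HTML fragment that mimics the class names used in the solution.
html = """
<div class="col-md-4 country">
  <h3 class="country-name"> Andorra </h3>
  <span class="country-capital">Andorra la Vella</span>
</div>
<div class="col-md-4 country">
  <h3 class="country-name"> Austria </h3>
  <span class="country-capital">Vienna</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Collect one dictionary per country block.
rows = []
for item in soup.find_all("div", {"class": "col-md-4 country"}):
    name = item.find("h3", {"class": "country-name"}).text.strip()
    capital = item.find("span", {"class": "country-capital"}).text.strip()
    rows.append({"Country": name, "Capital City": capital})

# Build the DataFrame in a single step from the collected rows.
countries = pd.DataFrame(rows, columns=["Country", "Capital City"])
print(countries)
```

Building the DataFrame once from a list of rows avoids resizing it on every iteration, which tends to matter as the number of scraped items grows. For a handful of rows, both approaches give the same table.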

Section 1. Chapter 5