Store Scraped Data Into a Pandas DataFrame
Storing scraped data in a pandas DataFrame is a convenient way to manipulate and work with the data. pandas is a powerful Python library that provides easy-to-use data structures and data analysis tools.
A DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it as a spreadsheet, SQL table, or a dictionary of Series objects. It is generally the most commonly used pandas object.
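To make the "dictionary of Series objects" picture concrete, here is a minimal sketch. The column names and values are illustrative only and do not come from the scraped page.

```python
import pandas as pd

# Each dictionary key becomes a column label; each Series holds one column.
# The values below are made up for illustration.
df = pd.DataFrame({
    "Country": pd.Series(["Andorra", "Austria"]),
    "Visited": pd.Series([True, False]),
})

print(df.shape)   # number of rows and columns
print(df.dtypes)  # each column keeps its own type
```

Selecting a single column, for example df["Country"], gives back a Series, which is why the dictionary analogy is useful.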
Task
- Import pandas and initialize an empty DataFrame;
- Scrape the country name (find all instances on the web page);
- Scrape the capital city (find all instances on the web page);
- Append the scraped values (country_name, capital) to the DataFrame.
Solution
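import pandas as pd

# `soup` is assumed to be the BeautifulSoup object built from the page beforehand.
col_names = ["Country", "Capital City"]
countries = pd.DataFrame(columns=col_names)

for item in soup.find_all("div", {"class": "col-md-4 country"}):
    # Take the first matching tag in each country block and trim whitespace.
    country_name = item.find_all("h3", {"class": "country-name"})[0].text.strip()
    capital = item.find_all("span", {"class": "country-capital"})[0].text
    # Add the pair as a new row at the next integer index.
    countries.loc[len(countries)] = [country_name, capital]

countries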
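The solution relies on a soup object that is not defined in this chapter. The sketch below is a self-contained variant that can be run on its own: the HTML fragment is a hypothetical stand-in that only imitates the class names used in the solution, and the rows are collected in a list and turned into a DataFrame in one step rather than added one at a time. This is an alternative pattern, not the course's reference solution.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical HTML fragment imitating the structure the solution expects.
html = """
<div class="col-md-4 country">
  <h3 class="country-name"> Andorra </h3>
  <span class="country-capital">Andorra la Vella</span>
</div>
<div class="col-md-4 country">
  <h3 class="country-name"> Austria </h3>
  <span class="country-capital">Vienna</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

rows = []
for item in soup.find_all("div", class_="country"):
    # get_text(strip=True) returns the tag's text without surrounding whitespace.
    name = item.find("h3", class_="country-name").get_text(strip=True)
    capital = item.find("span", class_="country-capital").get_text(strip=True)
    rows.append({"Country": name, "Capital City": capital})

# Build the DataFrame once from the collected rows.
countries = pd.DataFrame(rows, columns=["Country", "Capital City"])
print(countries)
```

Collecting rows first and constructing the DataFrame at the end tends to scale better than growing it row by row with .loc, although for a page with a few hundred entries either approach should be fine.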