Learn DataFrames for Economic Data | Economic Data Analysis with Python


Economic data usually arrives as large tables of numbers, such as unemployment rates, GDP, or inflation figures, organized by country and year. Managing and analyzing this kind of tabular data efficiently is essential for economists. The pandas library in Python provides the DataFrame, a two-dimensional labeled table designed for exactly this kind of structured data. With a DataFrame you can organize economic datasets, perform calculations, and extract insights in a few lines of code.


import pandas as pd

# Create a DataFrame with unemployment rates for several countries over multiple years
data = {
    "Year": [2018, 2019, 2020, 2021],
    "United States": [3.9, 3.7, 8.1, 5.4],
    "Germany": [3.4, 3.2, 4.2, 3.6],
    "Japan": [2.4, 2.4, 2.8, 2.8],
    "Brazil": [12.3, 11.9, 13.5, 13.2]
}

unemployment_df = pd.DataFrame(data)
unemployment_df.set_index("Year", inplace=True)
print(unemployment_df)

In the example above, you use the pandas library to create a DataFrame called unemployment_df. The data is organized so that each row represents a year, and each column (after "Year") represents a country. By setting the "Year" column as the index, you make it easier to select data for specific years or perform time-series analysis.

To access a specific column, such as the unemployment rates for Germany, you can use unemployment_df["Germany"]. This returns a Series containing Germany's unemployment rates for each year. You can also perform operations directly on these columns, such as calculating the average unemployment rate.

Selecting a country therefore means selecting a column by its name. To select all data for a specific year, use the .loc[] accessor with the year as the index label, for example unemployment_df.loc[2020], which returns a Series with one value per country.
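As a minimal sketch of the two selection styles described above, the snippet below rebuilds the same sample table so that it runs on its own. The variable names rates_2020 and brazil_2019 are illustrative and do not come from the lesson.

```python
import pandas as pd

# Same sample figures as the lesson's unemployment_df,
# rebuilt here so the snippet is self-contained
unemployment_df = pd.DataFrame(
    {
        "Year": [2018, 2019, 2020, 2021],
        "United States": [3.9, 3.7, 8.1, 5.4],
        "Germany": [3.4, 3.2, 4.2, 3.6],
        "Japan": [2.4, 2.4, 2.8, 2.8],
        "Brazil": [12.3, 11.9, 13.5, 13.2],
    }
).set_index("Year")

# Column selection: one country across all years (Series indexed by Year)
germany = unemployment_df["Germany"]

# Row selection: all countries for one year (Series indexed by country name)
rates_2020 = unemployment_df.loc[2020]

# Single value: row label first, then column label
brazil_2019 = unemployment_df.loc[2019, "Brazil"]

print(rates_2020)
print("Brazil, 2019:", brazil_2019)
```

Because "Year" is the index, .loc[2020] looks up the label 2020 rather than a row position. A position-based lookup would use .iloc[] instead.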

DataFrames also support a variety of useful methods for basic analysis:

Use .mean() to compute the average unemployment rate for a country across all years;
Use .max() and .min() to find the highest and lowest rates.

This structure makes it easy to compare economic indicators across countries and years, perform calculations, and prepare your data for further analysis or visualization.
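The methods listed above also work on the whole DataFrame at once, not only on a single column. The following is a hedged sketch using the same sample figures. The names mean_by_country, peak_year, and mean_by_year are illustrative, and the use of .idxmax() and axis=1 goes slightly beyond what the lesson itself covers.

```python
import pandas as pd

# Same sample figures as the lesson's unemployment_df,
# rebuilt here so the snippet is self-contained
unemployment_df = pd.DataFrame(
    {
        "Year": [2018, 2019, 2020, 2021],
        "United States": [3.9, 3.7, 8.1, 5.4],
        "Germany": [3.4, 3.2, 4.2, 3.6],
        "Japan": [2.4, 2.4, 2.8, 2.8],
        "Brazil": [12.3, 11.9, 13.5, 13.2],
    }
).set_index("Year")

# Column-wise summaries (the default): one value per country
mean_by_country = unemployment_df.mean()
max_by_country = unemployment_df.max()
min_by_country = unemployment_df.min()

# Index label (year) at which each country's rate was highest;
# ties resolve to the first occurrence
peak_year = unemployment_df.idxmax()

# Row-wise summary: average across the four countries within each year
mean_by_year = unemployment_df.mean(axis=1)

print(mean_by_country.round(2))
print(peak_year)
print(mean_by_year.round(2))
```

Note that mean_by_year is a simple unweighted average of the four rates. It is not a combined unemployment rate, which would require weighting by each country's labor force.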


# Select unemployment data for Germany
germany_unemployment = unemployment_df["Germany"]
print("Germany's unemployment rates:\n", germany_unemployment)

# Calculate the mean unemployment rate for Germany
mean_germany = germany_unemployment.mean()
print("Average unemployment rate in Germany (2018-2021):", round(mean_germany, 2))

1. What is a pandas DataFrame and why is it useful for economic data?

2. Fill in the blank: To select the Germany column from the DataFrame above, you would use ____.

3. Which method would you use to calculate the average value of a column in a DataFrame?


Section 1. Chapter 2
