Learn DataFrames for Economic Data | Economic Data Analysis with Python
Python for Economists

DataFrames for Economic Data

When dealing with economic data, you will often encounter large tables of numbers—such as unemployment rates, GDP, or inflation figures—organized by country and year. Managing and analyzing this kind of tabular data efficiently is essential for economists. The pandas library in Python provides a powerful tool called a DataFrame, which is specifically designed for working with structured data like this. A DataFrame allows you to organize economic datasets, perform calculations, and extract insights with just a few lines of code.

    import pandas as pd

    # Create a DataFrame with unemployment rates for several countries over multiple years
    data = {
        "Year": [2018, 2019, 2020, 2021],
        "United States": [3.9, 3.7, 8.1, 5.4],
        "Germany": [3.4, 3.2, 4.2, 3.6],
        "Japan": [2.4, 2.4, 2.8, 2.8],
        "Brazil": [12.3, 11.9, 13.5, 13.2]
    }

    unemployment_df = pd.DataFrame(data)
    unemployment_df.set_index("Year", inplace=True)

    print(unemployment_df)

In the example above, you use the pandas library to create a DataFrame called unemployment_df. The data is organized so that each row represents a year and each remaining column represents a country. Setting the "Year" column as the index turns the years into row labels rather than an ordinary data column, which makes it easier to select data for specific years or perform time-series analysis.
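To see what setting the index changes, it can help to inspect the DataFrame's labels directly. The sketch below rebuilds the same table and prints its index, columns, and shape. It assigns the result of set_index() instead of using inplace=True; that is a stylistic choice made here, not something the lesson prescribes.

```python
import pandas as pd

# Same illustrative figures as the lesson's example
data = {
    "Year": [2018, 2019, 2020, 2021],
    "United States": [3.9, 3.7, 8.1, 5.4],
    "Germany": [3.4, 3.2, 4.2, 3.6],
    "Japan": [2.4, 2.4, 2.8, 2.8],
    "Brazil": [12.3, 11.9, 13.5, 13.2],
}

# set_index() returns a new DataFrame by default, so the result is assigned
unemployment_df = pd.DataFrame(data).set_index("Year")

# The years are now row labels, not a data column
print(unemployment_df.index.tolist())    # [2018, 2019, 2020, 2021]
print(unemployment_df.columns.tolist())  # ['United States', 'Germany', 'Japan', 'Brazil']
print(unemployment_df.shape)             # (4, 4)
```

After this step "Year" no longer appears among the columns, so column-wise calculations such as averages only touch the country data.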

To access a specific column, such as the unemployment rates for Germany, you can use unemployment_df["Germany"]. This returns a Series containing Germany's unemployment rates for each year. You can also perform operations directly on these columns, such as calculating the average unemployment rate.

In short, columns hold countries and rows hold years. To select data for a particular country, access the column by its name. To select all data for a specific year, use the .loc[] accessor with the year as the row label, for example unemployment_df.loc[2020].
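As a minimal sketch of row selection with .loc[], the block below rebuilds the lesson's table and pulls out one year, one single value, and a small sub-table. The variable names (rates_2020, germany_2020, recent) are illustrative only.

```python
import pandas as pd

# Same illustrative figures as the lesson's example
unemployment_df = pd.DataFrame(
    {
        "Year": [2018, 2019, 2020, 2021],
        "United States": [3.9, 3.7, 8.1, 5.4],
        "Germany": [3.4, 3.2, 4.2, 3.6],
        "Japan": [2.4, 2.4, 2.8, 2.8],
        "Brazil": [12.3, 11.9, 13.5, 13.2],
    }
).set_index("Year")

# One row: every country's rate in 2020 (a Series indexed by country name)
rates_2020 = unemployment_df.loc[2020]
print(rates_2020)

# One cell: row label first, then column label
germany_2020 = unemployment_df.loc[2020, "Germany"]
print(germany_2020)  # 4.2

# A sub-table: label-based slices include both endpoints
recent = unemployment_df.loc[2020:2021, ["Germany", "Japan"]]
print(recent)
```

Note that .loc[] works with labels, so 2020 here means the year 2020, not the position 2020.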

DataFrames also support a variety of useful methods for basic analysis:

  • Use .mean() to compute the average unemployment rate for a country across all years;
  • Use .max() and .min() to find the highest and lowest rates.
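The methods listed above can be sketched as follows for a single country. The calls to .idxmax() and .idxmin(), which report the year in which each extreme occurs, and the .agg() summary are additions for illustration and are not part of the lesson's list.

```python
import pandas as pd

# Same illustrative figures as the lesson's example
unemployment_df = pd.DataFrame(
    {
        "Year": [2018, 2019, 2020, 2021],
        "United States": [3.9, 3.7, 8.1, 5.4],
        "Germany": [3.4, 3.2, 4.2, 3.6],
        "Japan": [2.4, 2.4, 2.8, 2.8],
        "Brazil": [12.3, 11.9, 13.5, 13.2],
    }
).set_index("Year")

germany = unemployment_df["Germany"]

highest = germany.max()        # highest rate in the period
lowest = germany.min()         # lowest rate in the period
peak_year = germany.idxmax()   # index label (year) of the highest rate
trough_year = germany.idxmin() # index label (year) of the lowest rate

print(f"Germany: high {highest} in {peak_year}, low {lowest} in {trough_year}")

# Several statistics for every country at once (rows: mean, min, max)
summary = unemployment_df.agg(["mean", "min", "max"])
print(summary.round(2))
```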

This structure makes it easy to compare economic indicators across countries and years, perform calculations, and prepare your data for further analysis or visualization.
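One way to compare across countries and years, sketched under the assumption that simple unweighted averages are acceptable for illustration: calling .mean() on the whole DataFrame averages each country over the years, while axis=1 averages each year across countries. The Brazil-versus-Germany gap is a hypothetical comparison chosen only as an example.

```python
import pandas as pd

# Same illustrative figures as the lesson's example
unemployment_df = pd.DataFrame(
    {
        "Year": [2018, 2019, 2020, 2021],
        "United States": [3.9, 3.7, 8.1, 5.4],
        "Germany": [3.4, 3.2, 4.2, 3.6],
        "Japan": [2.4, 2.4, 2.8, 2.8],
        "Brazil": [12.3, 11.9, 13.5, 13.2],
    }
).set_index("Year")

# Average per country (one value per column)
country_means = unemployment_df.mean()
print(country_means.round(2))

# Unweighted cross-country average per year (one value per row)
yearly_means = unemployment_df.mean(axis=1)
print(yearly_means.round(2))

# Column arithmetic aligns on the Year index: gap in percentage points
gap = unemployment_df["Brazil"] - unemployment_df["Germany"]
print(gap.round(1))
```

An unweighted average treats every country equally regardless of labor-force size, so it is a descriptive convenience rather than an aggregate unemployment rate.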

    # Select unemployment data for Germany
    germany_unemployment = unemployment_df["Germany"]
    print("Germany's unemployment rates:\n", germany_unemployment)

    # Calculate the mean unemployment rate for Germany
    mean_germany = germany_unemployment.mean()
    print("Average unemployment rate in Germany (2018-2021):", round(mean_germany, 2))

1. What is a pandas DataFrame and why is it useful for economic data?

2. Fill in the blank: To select the Germany column from the DataFrame above, producing the output shown below, you would use ____.

    Year
    2018    3.4
    2019    3.2
    2020    4.2
    2021    3.6
    Name: Germany, dtype: float64

3. Which method would you use to calculate the average value of a column in a DataFrame?

Section 1. Chapter 2

