Challenge: Unemployment Rate Summary | Economic Data Analysis with Python

Challenge: Unemployment Rate Summary

You are now ready to apply your knowledge of pandas and descriptive statistics to a practical economic dataset. Imagine you have unemployment rate data for several countries, collected over a five-year period. Your goal is to summarize this data by calculating key statistics for each country and identifying important trends.

To begin, you will work with a hardcoded pandas DataFrame that contains unemployment rates for countries such as the United States, Germany, Japan, and Brazil from 2018 to 2022. For each country, you need to calculate the mean, median, and standard deviation of the unemployment rates across these years. In addition, you should find out which year had the highest unemployment rate for each country. This summary will help you quickly compare the labor market situation across countries and spot years of particular economic difficulty.

import pandas as pd

# Hardcoded unemployment rate data
data = {
    "Country": [
        "United States", "United States", "United States", "United States", "United States",
        "Germany", "Germany", "Germany", "Germany", "Germany",
        "Japan", "Japan", "Japan", "Japan", "Japan",
        "Brazil", "Brazil", "Brazil", "Brazil", "Brazil"
    ],
    "Year": [2018, 2019, 2020, 2021, 2022]*4,
    "Unemployment Rate": [
        3.9, 3.7, 8.1, 5.4, 3.6,
        3.4, 3.2, 4.0, 3.6, 3.0,
        2.4, 2.4, 2.8, 2.8, 2.6,
        12.3, 11.9, 13.5, 13.2, 9.3
    ]
}
df = pd.DataFrame(data)

def unemployment_summary(df):
    # Group by country and calculate statistics
    grouped = df.groupby("Country")["Unemployment Rate"]
    summary = grouped.agg(["mean", "median", "std"]).reset_index()
    # Find the year with the highest unemployment rate for each country
    idx = df.groupby("Country")["Unemployment Rate"].idxmax()
    max_years = df.loc[idx, ["Country", "Year"]].set_index("Country")
    # Merge the summary with the year of highest unemployment
    summary = summary.merge(max_years, left_on="Country", right_index=True)
    summary = summary.rename(columns={"Year": "Year of Max Unemployment"})
    return summary

summary_df = unemployment_summary(df)
print(summary_df)

This code creates a function called unemployment_summary that takes a DataFrame with unemployment data and returns a summary DataFrame. For each country, you will see the mean, median, and standard deviation of unemployment rates, along with the year when unemployment was at its highest. This kind of summary is valuable for economists who want to quickly understand labor market trends and identify years of significant change.
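The idxmax step in the function is the least obvious part, so here is a minimal sketch that isolates it on Japan's rows from the sample data. The variable names japan, peak_label, and peak_year are illustrative and not part of the lesson. Note that Japan's peak rate of 2.8 appears in both 2020 and 2021; idxmax returns the label of the first occurrence, so the earlier year is reported.

```python
import pandas as pd

# Japan's rows from the sample data; 2.8 occurs in both 2020 and 2021.
japan = pd.DataFrame({
    "Year": [2018, 2019, 2020, 2021, 2022],
    "Unemployment Rate": [2.4, 2.4, 2.8, 2.8, 2.6],
})

# idxmax returns the index label of the first occurrence of the maximum.
peak_label = japan["Unemployment Rate"].idxmax()

# Use that label to look up the matching year.
peak_year = japan.loc[peak_label, "Year"]
print(peak_label, peak_year)  # 2 2020
```

If ties should be reported differently (for example, listing every year that reaches the peak), the function would need a different approach, such as filtering rows equal to the group maximum.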

Now, it's your turn to practice and reinforce these concepts by writing your own function to generate this summary.

Task

Write a function called unemployment_summary that takes a pandas DataFrame with columns "Country", "Year", and "Unemployment Rate". The function should:

  • Calculate the mean, median, and standard deviation of unemployment rates for each country.
  • Identify the year with the highest unemployment rate for each country.
  • Return a DataFrame with columns: Country, mean, median, std, Year of Max Unemployment.
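The task does not say which standard deviation is expected. The lesson's example uses the "std" aggregation, which in pandas is the sample standard deviation (ddof=1, dividing by n - 1). The sketch below, assuming Germany's five rates from the sample data, checks that by hand; manual_std, pandas_std, and population_std are illustrative names.

```python
import math
import pandas as pd

# Germany's rates from the sample data.
rates = pd.Series([3.4, 3.2, 4.0, 3.6, 3.0])

# Sample standard deviation by hand: divide by n - 1, not n.
mean = rates.sum() / len(rates)
squared_deviations = ((rates - mean) ** 2).sum()
manual_std = math.sqrt(squared_deviations / (len(rates) - 1))

# pandas uses ddof=1 by default, so the two values should agree.
pandas_std = rates.std()

# For comparison, the population version divides by n and is smaller.
population_std = rates.std(ddof=0)

print(round(manual_std, 4), round(pandas_std, 4), round(population_std, 4))
```

With only five observations per country the two versions differ noticeably, so it is worth knowing which one a summary reports.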

The result should be sorted alphabetically by country name.
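By default, groupby sorts its group keys, so a summary built from a grouped aggregation usually comes out in alphabetical order already. Sorting explicitly makes the requirement visible and protects against later changes to the function. A minimal sketch, using a small hand-built frame in place of a real summary:

```python
import pandas as pd

# A small, deliberately unsorted summary-like frame based on the sample data.
summary = pd.DataFrame({
    "Country": ["Japan", "Brazil", "Germany"],
    "Year of Max Unemployment": [2020, 2020, 2020],
})

# Sort by country name and rebuild a clean 0..n-1 index.
summary_sorted = summary.sort_values("Country").reset_index(drop=True)
print(summary_sorted)
```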

The input DataFrame will look like this:

Country        Year  Unemployment Rate
United States  2018  3.9
United States  2019  3.7
...            ...   ...

Your function will be tested with similar data.
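Since the exact test data is not shown, one way to gain confidence is to run the function on a small made-up dataset whose answers are easy to work out by hand. The sketch below is an assumption about what such a check might look like, not the platform's actual test. The function body is one possible implementation in the spirit of the lesson's example, and the countries "Alpha" and "Beta" and their rates are invented for illustration.

```python
import pandas as pd

def unemployment_summary(df):
    # One possible implementation, mirroring the lesson's approach.
    rates = df.groupby("Country")["Unemployment Rate"]
    summary = rates.agg(["mean", "median", "std"]).reset_index()
    peak_rows = df.loc[rates.idxmax(), ["Country", "Year"]]
    peak_rows = peak_rows.rename(columns={"Year": "Year of Max Unemployment"})
    summary = summary.merge(peak_rows, on="Country")
    return summary.sort_values("Country").reset_index(drop=True)

# Made-up test data (not real statistics), with rows out of order.
test_df = pd.DataFrame({
    "Country": ["Beta", "Alpha", "Beta", "Alpha", "Beta", "Alpha"],
    "Year": [2020, 2020, 2021, 2021, 2022, 2022],
    "Unemployment Rate": [5.0, 4.0, 7.0, 6.0, 6.0, 5.0],
})

result = unemployment_summary(test_df)

# Check the column contract, the sort order, and the peak years.
expected_columns = ["Country", "mean", "median", "std", "Year of Max Unemployment"]
assert list(result.columns) == expected_columns
assert list(result["Country"]) == ["Alpha", "Beta"]
assert list(result["Year of Max Unemployment"]) == [2021, 2021]
print(result)
```

Feeding the rows in a shuffled order is deliberate: it confirms that the result does not depend on how the input happens to be arranged.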


Section 1. Chapter 5
