Challenge: Unemployment Rate Summary | Economic Data Analysis with Python
Python for Economists

Challenge: Unemployment Rate Summary

You are now ready to apply your knowledge of pandas and descriptive statistics to a practical economic dataset. Imagine you have unemployment rate data for several countries, collected over a five-year period. Your goal is to summarize this data by calculating key statistics for each country and identifying important trends.

To begin, you will work with a hardcoded pandas DataFrame that contains unemployment rates for countries such as the United States, Germany, Japan, and Brazil from 2018 to 2022. For each country, you need to calculate the mean, median, and standard deviation of the unemployment rates across these years. In addition, you should find out which year had the highest unemployment rate for each country. This summary will help you quickly compare the labor market situation across countries and spot years of particular economic difficulty.

    import pandas as pd

    # Hardcoded unemployment rate data
    data = {
        "Country": ["United States", "United States", "United States", "United States", "United States",
                    "Germany", "Germany", "Germany", "Germany", "Germany",
                    "Japan", "Japan", "Japan", "Japan", "Japan",
                    "Brazil", "Brazil", "Brazil", "Brazil", "Brazil"],
        "Year": [2018, 2019, 2020, 2021, 2022] * 4,
        "Unemployment Rate": [3.9, 3.7, 8.1, 5.4, 3.6,
                              3.4, 3.2, 4.0, 3.6, 3.0,
                              2.4, 2.4, 2.8, 2.8, 2.6,
                              12.3, 11.9, 13.5, 13.2, 9.3]
    }

    df = pd.DataFrame(data)

    def unemployment_summary(df):
        # Group by country and calculate statistics
        grouped = df.groupby("Country")["Unemployment Rate"]
        summary = grouped.agg(["mean", "median", "std"]).reset_index()

        # Find the year with the highest unemployment rate for each country
        idx = df.groupby("Country")["Unemployment Rate"].idxmax()
        max_years = df.loc[idx, ["Country", "Year"]].set_index("Country")

        # Merge the summary with the year of highest unemployment
        summary = summary.merge(max_years, left_on="Country", right_index=True)
        summary = summary.rename(columns={"Year": "Year of Max Unemployment"})

        return summary

    summary_df = unemployment_summary(df)
    print(summary_df)

This code creates a function called unemployment_summary that takes a DataFrame with unemployment data and returns a summary DataFrame. For each country, you will see the mean, median, and standard deviation of unemployment rates, along with the year when unemployment was at its highest. This kind of summary is valuable for economists who want to quickly understand labor market trends and identify years of significant change.
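To make the statistics concrete, the sketch below recomputes them by hand for a single country, using the United States rates from the dataset above. It is an illustrative check, not part of the course solution. The variable names are chosen here for illustration. Note that pandas computes the sample standard deviation (`ddof=1`) by default, which is why it agrees with `statistics.stdev` from the standard library.

```python
import statistics

import pandas as pd

# United States rates for 2018-2022, copied from the dataset above
years = [2018, 2019, 2020, 2021, 2022]
us_rates = pd.Series([3.9, 3.7, 8.1, 5.4, 3.6], index=years)

mean_rate = us_rates.mean()
median_rate = us_rates.median()
std_rate = us_rates.std()       # sample standard deviation (ddof=1 by default)
peak_year = us_rates.idxmax()   # index label (year) of the highest rate

# The standard library's sample standard deviation should agree with pandas
assert abs(std_rate - statistics.stdev(us_rates.tolist())) < 1e-9

print(f"mean={mean_rate:.2f}, median={median_rate:.1f}, "
      f"std={std_rate:.2f}, peak year={peak_year}")
```

The mean (about 4.94) sits well above the median (3.9) because the 2020 spike to 8.1 pulls the average up, which is one reason to report both.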

Now, it's your turn to practice and reinforce these concepts by writing your own function to generate this summary.

Task

Write a function called unemployment_summary that takes a pandas DataFrame with columns "Country", "Year", and "Unemployment Rate". The function should:

  • Calculate the mean, median, and standard deviation of unemployment rates for each country.
  • Identify the year with the highest unemployment rate for each country.
  • Return a DataFrame with columns: Country, mean, median, std, Year of Max Unemployment.

The result should be sorted alphabetically by country name.
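One possible way to meet these requirements is sketched below. It is an alternative layout under stated assumptions, not the official solution: the function name `unemployment_summary_sketch` is hypothetical, and the sample rows reuse the Japan and Brazil figures from the dataset above, deliberately entered out of alphabetical order. The explicit `sort_values` call makes the ordering requirement visible instead of relying on `groupby`'s default sorting.

```python
import pandas as pd

def unemployment_summary_sketch(df):
    """Hypothetical variant: per-country stats plus the year of the peak rate."""
    rates = df.groupby("Country")["Unemployment Rate"]
    summary = rates.agg(["mean", "median", "std"])

    # idxmax returns the row label of the FIRST maximum in each group,
    # so a tie is resolved in favour of the earliest row.
    peak_rows = rates.idxmax()

    # peak_rows and summary come from the same groupby, so they share
    # the same country order and can be aligned positionally.
    summary["Year of Max Unemployment"] = df.loc[peak_rows, "Year"].to_numpy()

    # Sort explicitly so the alphabetical ordering does not depend on defaults
    return summary.reset_index().sort_values("Country").reset_index(drop=True)

# Japan and Brazil rows from the dataset above, with Japan listed first
sample = pd.DataFrame({
    "Country": ["Japan"] * 5 + ["Brazil"] * 5,
    "Year": [2018, 2019, 2020, 2021, 2022] * 2,
    "Unemployment Rate": [2.4, 2.4, 2.8, 2.8, 2.6,
                          12.3, 11.9, 13.5, 13.2, 9.3],
})

result = unemployment_summary_sketch(sample)
print(result)
```

Japan's rate is 2.8 in both 2020 and 2021. Because `idxmax` keeps the first occurrence, the sketch reports 2020. If a different tie-breaking rule is wanted, that choice has to be coded explicitly.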

The input DataFrame will look like this:

Country          Year    Unemployment Rate
United States    2018    3.9
United States    2019    3.7
...              ...     ...

Your function will be tested with similar data.


Section 1. Chapter 5