## Analyzing and Visualizing Real-World Data

2. Preprocessing Data: Part II

# The Most Profitable Holiday

Let's find out which holiday was the most profitable in terms of sales. To do this, we need to group the data by holiday.
However, each holiday appears under several different dates in the data, so we first replace the dates with the corresponding holiday labels and then group by the new values.
The dates and their respective holidays are collected in the `holidays_dates` dictionary (keys are dates, and values are the names of the holidays).
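To make the replacement step concrete, here is a minimal sketch of how a date-to-holiday dictionary can relabel a column. The dictionary entries, the date format, and the toy rows are invented for illustration; the actual `holidays_dates` contents from the course are not shown here.

```python
import pandas as pd

# Hypothetical mapping (assumption): dates as keys, holiday names as values.
holidays_dates = {
    "12-02-2010": "Super Bowl",
    "11-02-2011": "Super Bowl",
    "26-11-2010": "Thanksgiving",
}

# Toy data standing in for the course DataFrame (values are invented).
df = pd.DataFrame({
    "Date": ["12-02-2010", "11-02-2011", "26-11-2010", "05-03-2010"],
    "Weekly_Sales": [100.0, 120.0, 150.0, 90.0],
})

# Dates that match a dictionary key become holiday names;
# dates that are not in the dictionary are left unchanged.
df["Date"] = df["Date"].replace(holidays_dates)

print(df)
```

After the replacement, the two Super Bowl dates share one label, which is what makes grouping by holiday possible.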

Task

- Replace the values in the `'Date'` column using the `holidays_dates` dictionary. Overwrite the original DataFrame.
- Filter the `df` DataFrame to keep the rows that satisfy either of the following conditions: the row is holiday data (the `'Holiday_Flag'` column is equal to `1`) **or** the `'Date'` column contains the value `'Pre-Christmas'`. Save the resulting data in the `data` variable.
- Group the values in the `data` DataFrame by the `'Date'` column, select the `'Weekly_Sales'` column, and calculate the mean and median of the chosen column.
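The filtering and grouping steps could be sketched as follows. This is one possible approach, not necessarily the course's reference solution. The toy rows are invented, and the sketch assumes the `'Date'` column already holds holiday labels and that the `'Pre-Christmas'` row is not flagged as a holiday.

```python
import pandas as pd

# Toy data (invented) with dates already replaced by holiday labels.
df = pd.DataFrame({
    "Date": ["Super Bowl", "Super Bowl", "Thanksgiving", "Pre-Christmas", "05-03-2010"],
    "Holiday_Flag": [1, 1, 1, 0, 0],  # assumption: Pre-Christmas is not flagged
    "Weekly_Sales": [100.0, 120.0, 150.0, 200.0, 90.0],
})

# Keep rows that are holidays OR labeled 'Pre-Christmas'.
# Each condition needs its own parentheses because | binds tighter than ==.
data = df[(df["Holiday_Flag"] == 1) | (df["Date"] == "Pre-Christmas")]

# Group by holiday label and compute both statistics for weekly sales.
summary = data.groupby("Date")["Weekly_Sales"].agg(["mean", "median"])

print(summary)
```

Comparing the mean with the median per holiday helps show whether a few unusually large weeks are driving the average.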
