Filling In the Missing Values
Deleting missing values is not the only way to handle them. You can also replace every NaN with a defined value, for instance the mean of the column or zero. This is useful when dropping rows would discard too much data. You will learn more about this in the course Learning Statistics with Python.
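As a minimal sketch of both options, the snippet below uses a small hypothetical DataFrame (the column name 'Score' and its values are made up for illustration and are not part of the lesson's dataset). It fills the NaNs first with zeros and then with the column mean:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not taken from the lesson's dataset
toy = pd.DataFrame({'Score': [10.0, np.nan, 20.0, np.nan]})

# Option 1: replace every NaN with a constant (zero)
filled_with_zero = toy['Score'].fillna(0)

# Option 2: replace every NaN with the mean of the non-missing values
# (mean() skips NaN by default, so the mean here is 15.0)
filled_with_mean = toy['Score'].fillna(toy['Score'].mean())

print(filled_with_zero.tolist())  # [10.0, 0.0, 20.0, 0.0]
print(filled_with_mean.tolist())  # [10.0, 15.0, 20.0, 15.0]
```

Neither call changes toy itself, because .fillna() returns a new Series unless told otherwise.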
Look at the example of filling missing values in the column 'Age' with the median value of this column:
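import pandas as pd
# Load the dataset, using the first column as the index
data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/4bf24830-59ba-4418-969b-aaf8117d522e/titanic_2', index_col=0)
# Replace the missing values in 'Age' with the median of the column
data['Age'].fillna(value=data['Age'].median(), inplace=True)
# Count the missing values that remain in 'Age'
print(data['Age'].isna().sum())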
Explanation:
.fillna(value=data['Age'].median(), inplace=True)
- value=data['Age'].median() - using the argument value, we tell the .fillna() method what to put in place of the NaN values. In this case, we applied the .fillna() method to the column 'Age' and replaced all missing values with the median of the column;
- inplace=True - with this argument, the changes are applied to the existing object instead of being returned as a new one.
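In recent pandas versions, calling .fillna(..., inplace=True) on a column selected with data['Age'] can raise a warning, and under copy-on-write behavior it may leave data unchanged. A minimal sketch of the alternative is to assign the filled column back. The small DataFrame here is a hypothetical stand-in for the lesson's dataset, with made-up values:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the lesson's dataset (values are made up)
data = pd.DataFrame({'Age': [22.0, np.nan, 30.0, 40.0, np.nan]})

# Median of the non-missing ages (NaN is skipped by default)
median_age = data['Age'].median()

# Assign the filled column back instead of relying on inplace=True
data['Age'] = data['Age'].fillna(median_age)

print(median_age)                # 30.0
print(data['Age'].isna().sum())  # 0
```

The result is the same as in the lesson's example, but the update no longer depends on whether the column selection is a view or a copy.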
Task:
One of the most common ways to fill missing values is to replace them with the mean value of the column. Your task is to replace the NaN values in the column 'Age' with the mean value of that column (using the inplace=True argument). Then output the number of missing values remaining in the column 'Age'.