Data Standardization
Another way to scale data is to standardize it: transform the values so that their mean equals 0 and their standard deviation (std) equals 1. The formula is:
x' = (x - μ) / σ
Here μ is the mean of x and σ is its standard deviation. Subtracting μ shifts every value so that the new mean is 0, and dividing by σ rescales the values so that the new std is 1.
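To make the formula concrete, here is a minimal sketch on a tiny series. The values are hypothetical and chosen only for illustration; the calls themselves (`Series.mean`, `Series.std`) are standard pandas.

```python
import pandas as pd

# Hypothetical toy values, used only to illustrate the formula
x = pd.Series([0, 1, 1, 3, 5], dtype=float)

mu = x.mean()     # μ: mean of the original values
sigma = x.std()   # σ: standard deviation (pandas uses the sample std, ddof=1, by default)

# Apply x' = (x - μ) / σ element-wise
z = (x - mu) / sigma

print(z.round(3).tolist())
# The mean should be 0 and the std 1, up to floating-point rounding
print(round(z.mean(), 10), round(z.std(), 10))
```

The mean of the result may print as a tiny nonzero number or as -0.0 because of floating-point arithmetic, which is why the check rounds the values.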
Task
Standardize the data in the SibSp column: calculate its mean and std, then apply the formula above. Print a sample of the transformed data and check that the new mean and std are equal to 0 and 1, respectively.
Solution
import pandas as pd
data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/10db3746-c8ff-4c55-9ac3-4affa0b65c16/titanic.csv')
sibsp = data['SibSp']
mean, std = sibsp.mean(), sibsp.std()
sibsp = (sibsp - mean) / std
print(sibsp.sample(10))
new_mean, new_std = sibsp.mean(), sibsp.std()
print(round(new_mean, 2), round(new_std, 2))
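One detail behind the check in the solution above: pandas computes the sample standard deviation (ddof=1) by default, while NumPy's `np.std` computes the population standard deviation (ddof=0) by default. The sketch below, again on hypothetical toy values, suggests that the standardized std comes out as exactly 1 only when it is measured with the same convention that was used for scaling.

```python
import numpy as np
import pandas as pd

# Hypothetical toy values, used only for illustration
values = pd.Series([0, 1, 1, 3, 5], dtype=float)

sample_std = values.std()              # pandas default: ddof=1 (sample std)
population_std = values.std(ddof=0)    # population std, requested explicitly
numpy_std = np.std(values.to_numpy())  # NumPy default: ddof=0

# Standardize with each convention
z_sample = (values - values.mean()) / sample_std
z_population = (values - values.mean()) / population_std

# Each result has std 1 when checked with the matching ddof
print(z_sample.std(), z_population.std(ddof=0))
# Mixing conventions gives a value slightly different from 1
print(z_population.std())
```

For a column with many rows, such as SibSp, the two conventions differ only slightly, so rounding to two decimals hides the difference.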
import pandas as pd
data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/10db3746-c8ff-4c55-9ac3-4affa0b65c16/titanic.csv')
sibsp = data['SibSp']
# standardize the sibsp
print(sibsp.sample(10))
# check the new values of mean and std