Challenge
Task
The task is to preprocess the dataset and create a principal component analysis (PCA) model with 3 components.
- Load the train.csv (from web) dataset.
- Drop the 'Id' column.
- Drop columns that contain NaN values.
- Standardize the dataset.
- Create a PCA model with 3 components for the dataset.
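The preprocessing steps listed above can be tried on a tiny frame before touching the real file. The sketch below is only an illustration: the `toy` DataFrame and its column names `a`, `b`, `c` are made up and stand in for train.csv, whose actual columns are not shown here.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for train.csv (not the real dataset)
toy = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [10.0, np.nan, 30.0, 40.0],  # contains a NaN, so this column gets dropped
    "c": [5.0, 7.0, 9.0, 11.0],
})

toy = toy.drop("Id", axis=1)   # remove the identifier column
toy = toy.dropna(axis=1)       # axis=1 drops columns (not rows) that contain NaN

# Standardize: each remaining column gets mean 0 and unit variance
toy_scaled = StandardScaler().fit_transform(toy)

print(list(toy.columns))        # ['a', 'c']
print(toy_scaled.mean(axis=0))  # approximately 0 for every column
print(toy_scaled.std(axis=0))   # approximately 1 for every column
```

Note that `dropna(axis=1)` removes a whole column even if only one value is missing, so it can discard a lot of data on a real dataset.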
Solution
# Importing libraries
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
# Read the data
df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/7b22c447-77ad-48ae-a2d2-4e6714f7a4a6/train.csv')
# Drop the 'Id' column
df = df.drop('Id', axis = 1)
# Drop columns containing NaN values
df = df.dropna(axis = 1)
# Standardize data
df_scaled = StandardScaler().fit_transform(df)
# Create PCA model
pca_model = PCA(n_components = 3)
# Fit and transform data
df_reduced = pca_model.fit_transform(df_scaled)
print(df_reduced)
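After fitting, it can be useful to check how much of the variance the 3 components actually keep. The sketch below does this on synthetic data, since the contents of train.csv are not shown here; the array `X`, the random seed, and the correlated column are assumptions made purely for illustration. It relies on the `explained_variance_ratio_` attribute of scikit-learn's fitted PCA object.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset: 200 rows, 6 numeric features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
# Make one column nearly a copy of another so PCA has redundancy to exploit
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=200)

X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=3)
X_reduced = pca.fit_transform(X_scaled)

print(X_reduced.shape)                      # (200, 3)
print(pca.explained_variance_ratio_)        # share of variance per component
print(pca.explained_variance_ratio_.sum())  # total share kept by 3 components
```

The ratios are ordered from the most to the least informative component, and their sum shows how much of the original variance the reduced data retains.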
Section 3. Chapter 4
# Importing libraries
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
# Read the data
df = pd.___('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/7b22c447-77ad-48ae-a2d2-4e6714f7a4a6/train.csv')
# Drop the 'Id' column
df = df.___(___)
# Drop columns containing NaN values
df = df.___(___)
# Standardize data
df_scaled = ___().___(df)
# Create PCA model
pca_model = ___(n_components = ___)
# Fit and transform data
df_reduced = ___.____(___)
print(df_reduced)