LabelEncoder
The OrdinalEncoder
and OneHotEncoder
are typically used to encode features (the X
variable). However, the target variable (y
) can also be categorical.
9
1
2
3
4
5
6
7
8
9
import pandas as pd
# Load the data and assign X, y variables
df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a65bbc96-309e-4df9-a790-a1eb8c815a1c/adult_edu.csv')
y = df['income'] # Income is a target in this dataset
X = df.drop('income', axis=1)
print(y)
print('All values: ', y.unique())
123456789import pandas as pd # Load the data and assign X, y variables df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a65bbc96-309e-4df9-a790-a1eb8c815a1c/adult_edu.csv') y = df['income'] # Income is a target in this dataset X = df.drop('income', axis=1) print(y) print('All values: ', y.unique())
The LabelEncoder
is used to encode the target, regardless of whether it is nominal or ordinal.
ML models do not consider the order of the target, allowing it to be encoded as any numerical values.
LabelEncoder
encodes the target to numbers 0, 1, ... .
99
1
2
3
4
5
6
7
8
9
10
11
12
13
14
import pandas as pd
from sklearn.preprocessing import LabelEncoder
# Load the data and assign X, y variables
df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a65bbc96-309e-4df9-a790-a1eb8c815a1c/adult_edu.csv')
y = df['income'] # Income is a target in this dataset
X = df.drop('income', axis=1)
# Initialize a LabelEncoder object and encode the y variable
label_enc = LabelEncoder()
y = label_enc.fit_transform(y)
print(y)
# Decode the y variable back
y_decoded = label_enc.inverse_transform(y)
print(y_decoded)
1234567891011121314import pandas as pd from sklearn.preprocessing import LabelEncoder # Load the data and assign X, y variables df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a65bbc96-309e-4df9-a790-a1eb8c815a1c/adult_edu.csv') y = df['income'] # Income is a target in this dataset X = df.drop('income', axis=1) # Initialize a LabelEncoder object and encode the y variable label_enc = LabelEncoder() y = label_enc.fit_transform(y) print(y) # Decode the y variable back y_decoded = label_enc.inverse_transform(y) print(y_decoded)
The code above encodes the target using LabelEncoder
and then uses the .inverse_transform()
method to convert it back to the original representation.
Var alt klart?
Tak for dine kommentarer!
Sektion 2. Kapitel 7
Spørg AI
Spørg om hvad som helst eller prøv et af de foreslåede spørgsmål for at starte vores chat