Learn ColumnTransformer


When calling .fit_transform(X) on a Pipeline, each transformer is applied to all columns, which is not always desirable. Some columns may require different encoders: for example, OrdinalEncoder for ordinal features and OneHotEncoder for nominal ones. ColumnTransformer solves this by letting you assign different transformers to specific columns. The helper function make_column_transformer builds a ColumnTransformer for you.

make_column_transformer accepts tuples of (transformer, [columns]). For example, applying OrdinalEncoder to 'education' and OneHotEncoder to 'gender':

ct = make_column_transformer(
   (OrdinalEncoder(), ['education']),
   (OneHotEncoder(), ['gender']),
   remainder='passthrough'
)

Note

remainder controls what happens to unspecified columns. Default: 'drop'. To keep all other columns unchanged, set remainder='passthrough'.
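The effect of remainder is easiest to see on a small example. The sketch below is only an illustration: the toy DataFrame, its column names, and the category order are made up and are not taken from exams.csv.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy data (an assumption for illustration only)
toy = pd.DataFrame({
    'education': ['low', 'high', 'medium'],
    'score': [55, 90, 72],
})
order = ['low', 'medium', 'high']

# Default remainder='drop': only the encoded 'education' column is returned
ct_drop = make_column_transformer(
    (OrdinalEncoder(categories=[order]), ['education'])
)
dropped = ct_drop.fit_transform(toy)

# remainder='passthrough': 'score' is kept unchanged, after the encoded column
ct_keep = make_column_transformer(
    (OrdinalEncoder(categories=[order]), ['education']),
    remainder='passthrough'
)
kept = ct_keep.fit_transform(toy)

print(dropped.shape)  # (3, 1)
print(kept.shape)     # (3, 2)
```

Note that the passed-through columns are placed after the transformed ones, so the column order of the result can differ from the input.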

For example, consider the exams.csv file. It contains several nominal columns ('gender', 'race/ethnicity', 'lunch', 'test preparation course') and one ordinal column, 'parental level of education'.


import pandas as pd

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a65bbc96-309e-4df9-a790-a1eb8c815a1c/exams.csv')

print(df.head())

Using ColumnTransformer, nominal data can be transformed with OneHotEncoder and ordinal data with OrdinalEncoder in a single step.


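import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/a65bbc96-309e-4df9-a790-a1eb8c815a1c/exams.csv')

# Categories listed from the lowest to the highest level of education
edu_categories = ['some high school', 'high school', 'some college', "associate's degree",
                  "bachelor's degree", "master's degree"]

ct = make_column_transformer(
    # Ordinal column: encoded using the explicit category order above
    (OrdinalEncoder(categories=[edu_categories]), ['parental level of education']),
    # Nominal columns: one-hot encoded
    (OneHotEncoder(), ['gender', 'race/ethnicity', 'lunch', 'test preparation course']),
    # All other columns are kept unchanged
    remainder='passthrough'
)

print(ct.fit_transform(df))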

The ColumnTransformer is itself a transformer, so it provides the standard methods .fit(), .fit_transform(), and .transform().
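Because of this, a ColumnTransformer can be fitted on one dataset and then applied to another. The following is a minimal sketch under assumed toy data; the DataFrames train and new are hypothetical and only illustrate the .fit() / .transform() split.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data (an assumption for illustration only)
train = pd.DataFrame({'gender': ['female', 'male', 'female'], 'score': [70, 65, 88]})
new = pd.DataFrame({'gender': ['male', 'female'], 'score': [91, 54]})

ct = make_column_transformer(
    (OneHotEncoder(), ['gender']),
    remainder='passthrough'
)

ct.fit(train)                    # learns the categories from the training data only
new_encoded = ct.transform(new)  # reuses the learned categories on unseen rows

print(new_encoded)
```

Calling .fit_transform(train) is equivalent to calling .fit(train) followed by .transform(train).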
