Mastering scikit-learn API and Workflows

Transformers: fit, transform, and fit_transform

A transformer in scikit-learn is an object that implements the fit and transform methods, and usually fit_transform as well. Transformers let you preprocess your data in a modular and consistent way. The fit method learns parameters from the data, such as per-feature means or variances, while transform applies the learned transformation to any data, including data that was not seen during fitting. The fit_transform method combines both steps for convenience: it fits on the data and returns the transformed version of that same data in a single call.
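To make the interface concrete, here is a minimal sketch of a custom transformer. The class name MeanCenterer is hypothetical and not part of scikit-learn; it assumes only that inheriting from TransformerMixin supplies a default fit_transform built on the fit and transform methods you define.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class MeanCenterer(TransformerMixin, BaseEstimator):
    """Hypothetical transformer: subtracts the per-feature mean seen in fit."""

    def fit(self, X, y=None):
        # Learn one parameter per feature: the column mean.
        self.mean_ = np.asarray(X, dtype=float).mean(axis=0)
        return self  # returning self allows chained calls

    def transform(self, X):
        # Apply the stored means; nothing is re-estimated here.
        return np.asarray(X, dtype=float) - self.mean_


X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])

centerer = MeanCenterer()
X_centered = centerer.fit_transform(X)  # inherited from TransformerMixin

print("Learned means:", centerer.mean_)
print("Centered data:\n", X_centered)
```

The trailing underscore in mean_ follows the scikit-learn convention for attributes that are estimated during fit.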

import numpy as np
from sklearn.preprocessing import StandardScaler

# Example training and test data
X_train = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
X_test = np.array([[4.0, 8.0]])

# Create the transformer
scaler = StandardScaler()

# Fit the scaler on training data
scaler.fit(X_train)

# Transform the training data
X_train_scaled = scaler.transform(X_train)

# Transform the test data using the same scaler
X_test_scaled = scaler.transform(X_test)

print("Scaled training data:\n", X_train_scaled)
print("Scaled test data:\n", X_test_scaled)

In the StandardScaler example, the fit method examines the training data and computes the mean and standard deviation of each feature. The transform method then uses these stored statistics to scale both the training and the test data, so both sets go through exactly the same transformation. The fit_transform method is a shortcut that performs both steps in sequence on the same data, and it is typically called on the training set. Keeping fit and transform separate helps prevent data leakage: you fit only on the training data, so only training information influences the learned parameters, and you then call transform (not fit or fit_transform) on validation or test data.
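The shortcut and the learned statistics can be checked directly. The sketch below reuses the same toy arrays and assumes the default StandardScaler settings; mean_ and scale_ are the fitted attributes that hold the per-feature mean and standard deviation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
X_test = np.array([[4.0, 8.0]])

# Two-step version: fit, then transform.
two_step = StandardScaler().fit(X_train)
X_train_two_step = two_step.transform(X_train)

# One-step version on the same training data.
one_step = StandardScaler()
X_train_one_step = one_step.fit_transform(X_train)

same_result = np.allclose(X_train_two_step, X_train_one_step)
print("fit + transform equals fit_transform:", same_result)

# Statistics learned from the training data only.
print("mean_: ", one_step.mean_)
print("scale_:", one_step.scale_)

# The test row is scaled with the training statistics; the scaler is not refit.
X_test_scaled = one_step.transform(X_test)
print("Scaled test data:", X_test_scaled)
```

Calling fit_transform on X_test instead would recompute the statistics from the test row, which is exactly the leakage the separation is meant to avoid.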


Which statements about the fit, transform, and fit_transform methods in scikit-learn transformers are correct?


Section 2. Chapter 1
