Transformers: fit, transform, and fit_transform
A transformer in scikit-learn is any object that implements the fit, transform, and fit_transform methods. Transformers let you preprocess your data in a modular and consistent way. The fit method learns parameters from the data, such as per-feature means or variances, while transform applies the learned transformation to data, including data that was not seen during fitting. The fit_transform method combines both steps for convenience, fitting and then transforming the same data in a single call.
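To make the three methods concrete, here is a minimal sketch of a custom transformer. The class name MeanCenterer and its mean_ attribute are hypothetical and chosen only for illustration; the sketch assumes the common pattern of inheriting from scikit-learn's TransformerMixin and BaseEstimator, where TransformerMixin supplies fit_transform once fit and transform are defined.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class MeanCenterer(TransformerMixin, BaseEstimator):
    """Hypothetical transformer that subtracts each column's mean."""

    def fit(self, X, y=None):
        # Learn one parameter per feature from the data passed to fit
        self.mean_ = np.asarray(X, dtype=float).mean(axis=0)
        return self  # returning self allows fit(...).transform(...) chaining

    def transform(self, X):
        # Apply the learned means to any data with the same columns
        return np.asarray(X, dtype=float) - self.mean_


X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])

# Two explicit steps
centerer = MeanCenterer()
two_step = centerer.fit(X).transform(X)

# One call; fit_transform is inherited from TransformerMixin
one_step = MeanCenterer().fit_transform(X)

print(np.allclose(two_step, one_step))
```

Both paths should give the same centered array, since the inherited fit_transform calls fit and then transform on the data it receives.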
    import numpy as np
    from sklearn.preprocessing import StandardScaler

    # Example training and test data
    X_train = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
    X_test = np.array([[4.0, 8.0]])

    # Create the transformer
    scaler = StandardScaler()

    # Fit the scaler on training data
    scaler.fit(X_train)

    # Transform the training data
    X_train_scaled = scaler.transform(X_train)

    # Transform the test data using the same scaler
    X_test_scaled = scaler.transform(X_test)

    print("Scaled training data:\n", X_train_scaled)
    print("Scaled test data:\n", X_test_scaled)
In the StandardScaler example, the fit method examines the training data and computes the mean and standard deviation of each feature. The transform method then uses these statistics to scale both the training and test data, so the two sets are transformed consistently. The fit_transform method is simply a shortcut that performs both steps in sequence on the same data, and it is typically used on the training set to streamline the workflow. Keeping fit and transform separate helps prevent data leakage: as long as fit is called only on the training data, only that data influences the learned parameters, yet the same transformation can still be applied to any dataset.
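The sketch below revisits the same data to show what fit stores and how fit_transform relates to the two-step version. It assumes the fitted attributes mean_ and scale_ of StandardScaler, which hold the per-feature mean and standard deviation learned from the training data. Variable names such as manual_test and two_step are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
X_test = np.array([[4.0, 8.0]])

scaler = StandardScaler()

# Training data: fit and transform in one call
X_train_scaled = scaler.fit_transform(X_train)

# Test data: transform only, so the test set never influences the statistics
X_test_scaled = scaler.transform(X_test)

# Statistics learned from the training data alone
learned_mean = scaler.mean_    # per-feature mean
learned_scale = scaler.scale_  # per-feature standard deviation

# Reproduce the test-set scaling by hand from the training statistics
manual_test = (X_test - learned_mean) / learned_scale

# fit followed by transform gives the same result as fit_transform
two_step = StandardScaler().fit(X_train).transform(X_train)

print("Learned mean:", learned_mean)
print("Learned scale:", learned_scale)
print("Manual matches transform:", np.allclose(manual_test, X_test_scaled))
print("Two-step matches fit_transform:", np.allclose(two_step, X_train_scaled))
```

Calling fit or fit_transform on the test data instead would recompute the statistics from that data, which is the leakage the separate methods are meant to avoid.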