Understanding the Estimator API | Core scikit-learn API Patterns

When you use scikit-learn, you interact with a consistent and powerful interface known as the Estimator API. An estimator in scikit-learn is any object that learns from data; this includes models such as classifiers and regressors, as well as preprocessors and transformers. The API is built around a small set of core methods: fit, predict, and transform.

You use the fit method to train the estimator on data; this is where the estimator "learns" from the training set. For predictive models like classifiers and regressors, the predict method is then used to make predictions on new, unseen data. For data transformation tasks, such as scaling or encoding, the transform method applies the learned transformation to any dataset with the same features. This unified interface lets you chain steps into a single workflow and swap one component for another without changing the surrounding code.
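To make the transform side of the interface concrete, here is a minimal sketch. It assumes StandardScaler as the example transformer and make_pipeline as the chaining helper; neither is named in the lesson, and they are only one possible choice.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# A transformer follows the same pattern, with transform in place of predict
scaler = StandardScaler()
scaler.fit(X)                    # learns the per-feature mean and standard deviation
X_scaled = scaler.transform(X)   # applies the learned scaling

print("Shape is unchanged by scaling:", X_scaled.shape)

# Because both objects share the interface, they can be chained into one estimator
# (StandardScaler + LogisticRegression is an illustrative combination, not the only one)
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
pipe.fit(X, y)
pipeline_predictions = pipe.predict(X[:5])
print("First 5 pipeline predictions:", pipeline_predictions)
```

The pipeline itself exposes fit and predict, so it can be used anywhere a single estimator would be.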


from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

# Load a simple dataset
iris = load_iris()
X = iris.data
y = iris.target

# Create a LogisticRegression estimator
clf = LogisticRegression(max_iter=200)

# Fit the model to the data
clf.fit(X, y)

# Use the trained model to make predictions
predictions = clf.predict(X)

print("First 5 predictions:", predictions[:5])

Let's break down what happens in the code above. First, you import the LogisticRegression estimator and load a sample dataset with load_iris. The feature matrix X and target vector y are extracted from the dataset. Next, you create a LogisticRegression estimator instance called clf, specifying max_iter=200 to raise the solver's iteration limit above the default of 100 so that it has room to converge.

The critical step is calling clf.fit(X, y). This trains the estimator on the data, allowing it to learn the relationship between features and target labels. After fitting, the learned parameters are stored on the estimator as attributes whose names end in an underscore, such as coef_ and intercept_.
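As a small sketch of what fit leaves behind, the snippet below refits the same model and inspects a few of its fitted attributes. The attributes shown are a selection for illustration, not a complete list.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# fit returns the estimator itself, so creation and training can be chained
clf = LogisticRegression(max_iter=200).fit(X, y)

# Attributes ending in an underscore exist only after fit has been called
print("Classes seen during fit:", clf.classes_)
print("Coefficient matrix shape:", clf.coef_.shape)    # (n_classes, n_features)
print("Intercept shape:", clf.intercept_.shape)        # (n_classes,)
```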

To make predictions, you call clf.predict(X). This method applies the learned model to the data and returns the predicted class labels as a NumPy array, which you can inspect or use in further analysis. Here the predictions are made on the same data the model was trained on, which is enough to demonstrate the API but says little about performance on unseen data. This pattern of instantiate, fit, predict is the core workflow for scikit-learn's predictive estimators; transformers follow the same shape, with transform in place of predict.
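The same three steps carry over unchanged to other estimators and to held-out data. The sketch below is one way to show this; the choice of DecisionTreeClassifier as the second model, the 25% test split, and random_state=0 are all assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples so predict runs on data the model has not seen
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Swapping the estimator does not change the surrounding code
results = {}
for model in (LogisticRegression(max_iter=200), DecisionTreeClassifier(random_state=0)):
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    accuracy = float((preds == y_test).mean())
    results[type(model).__name__] = accuracy
    print(type(model).__name__, "held-out accuracy:", round(accuracy, 3))
```

Only the line that constructs the model differs between the two runs, which is the practical benefit of the shared interface.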

Section 1. Chapter 1