Understanding the Estimator API
When you use scikit-learn, you interact with a consistent and powerful interface known as the Estimator API. An estimator in scikit-learn is any object that learns from data; this includes models such as classifiers and regressors, as well as preprocessors and transformers. The API is built around a small set of core methods: fit, predict, and transform.
You use the fit method to train the estimator on data; this is where the estimator "learns" from the training set. For predictive models like classifiers and regressors, the predict method is then used to make predictions on new, unseen data. For data transformation tasks, such as scaling or encoding, the transform method applies the learned transformation to any dataset with the same features. Because every estimator exposes the same methods, you can chain steps together in a workflow and swap one component for another without rewriting the surrounding code.
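To make the transform side of the interface concrete, here is a minimal sketch using StandardScaler. The tiny arrays X_train and X_new are made-up illustration data, not part of the lesson's dataset.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data for illustration: 3 training samples, 2 features
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_new = np.array([[2.0, 40.0]])

scaler = StandardScaler()

# fit learns the per-feature mean and standard deviation from X_train
scaler.fit(X_train)

# transform applies those learned statistics to any compatible dataset
X_train_scaled = scaler.transform(X_train)
X_new_scaled = scaler.transform(X_new)  # scaled with the training statistics

print("Learned means:", scaler.mean_)
print("Scaled new sample:", X_new_scaled)
```

Note that X_new is scaled with the statistics learned from X_train, which is why fit and transform are separate steps.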
```python
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

# Load a simple dataset
iris = load_iris()
X = iris.data
y = iris.target

# Create a LogisticRegression estimator
clf = LogisticRegression(max_iter=200)

# Fit the model to the data
clf.fit(X, y)

# Use the trained model to make predictions
predictions = clf.predict(X)
print("First 5 predictions:", predictions[:5])
```
Let's break down what happens in the code above. First, you import the LogisticRegression estimator and load a sample dataset with load_iris. The feature matrix X and target vector y are extracted from the dataset. Next, you create a LogisticRegression estimator instance called clf, specifying max_iter=200. This raises the solver's iteration limit above the default of 100, giving it more room to converge.
The critical step is calling clf.fit(X, y). This trains the estimator on the data, allowing it to learn the relationship between features and target labels. After fitting, the estimator stores what it learned as attributes on the object, such as coef_ and intercept_ for this model.
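As a rough illustration of what fitting leaves behind, the sketch below refits the same model and prints a few of its learned attributes. By scikit-learn convention, attributes set during fit end with a trailing underscore.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

clf = LogisticRegression(max_iter=200)
clf.fit(X, y)

# Attributes ending in "_" exist only after fit has been called
print("Classes seen during fit:", clf.classes_)
print("Coefficient matrix shape:", clf.coef_.shape)  # (n_classes, n_features)
print("Intercept shape:", clf.intercept_.shape)
print("Solver iterations used:", clf.n_iter_)
```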
To make predictions, you use clf.predict(X). This method applies the learned model to the data and returns predicted class labels as a NumPy array, which you can inspect or use in further analysis. Here the model predicts on the same X it was trained on, which is fine for demonstrating the call but not for judging accuracy. This pattern of instantiate, fit, predict is the core workflow for predictive estimators in scikit-learn; transformers follow the same pattern with transform in place of predict.
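The sketch below shows one way the same pattern might look when steps are chained and predictions are made on data the model has not seen. The 25% test split, random_state=0, and the choice of StandardScaler as a preprocessing step are assumptions made for illustration, not part of the original example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Hold out part of the data so predictions are made on unseen samples
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A pipeline is itself an estimator: fit runs fit/transform on the scaler,
# then fit on the classifier; predict transforms and then predicts
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

test_predictions = model.predict(X_test)
print("First 5 test predictions:", test_predictions[:5])
print("Test accuracy:", model.score(X_test, y_test))
```

Because the pipeline exposes the same fit and predict methods, replacing LogisticRegression with another classifier would leave the rest of the code unchanged.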