Learn Challenge: Solving Task Using Bagging Classifier | Commonly Used Bagging Models
Ensemble Learning

Challenge: Solving Task Using Bagging Classifier

Task


The breast cancer dataset, loaded with scikit-learn's built-in load_breast_cancer function, is commonly used for binary classification. Its features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. The aim is to predict whether a given mass is malignant (cancerous) or benign (non-cancerous).
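Before modelling, it can help to look at what the loader returns. The snippet below is a minimal sketch for orientation only; the variable names (n_samples, class_names, and so on) are illustrative and not part of the challenge.

```python
from sklearn.datasets import load_breast_cancer

# Load the dataset as a Bunch object with data, target and metadata
data = load_breast_cancer()
X, y = data.data, data.target

# Basic shape of the feature matrix
n_samples, n_features = X.shape

# Class names in label order: index 0 is label 0, index 1 is label 1
class_names = [str(name) for name in data.target_names]

print(f'Samples: {n_samples}, features: {n_features}')
print(f'Classes: {class_names}')
```

Note the label encoding: 0 stands for malignant and 1 for benign, which matters when interpreting metrics later.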

Your task is to solve this classification problem using BaggingClassifier on the load_breast_cancer dataset:

  1. Create an instance of the BaggingClassifier class: specify an SVC (Support Vector Classifier) as the base model and set the number of base estimators to 10.
  2. Fit the ensemble model.
  3. Get the final result using soft voting: for each sample in the test dataset, get the class probabilities and pick the class with the maximum probability.
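The soft voting in step 3 can be sketched by hand with NumPy. The probabilities below are made-up values for three hypothetical base estimators, chosen so that soft and hard voting disagree on the second sample; they do not come from the breast cancer data.

```python
import numpy as np

# Hypothetical predict_proba outputs: shape (n_estimators, n_samples, n_classes).
# Three estimators, two samples, two classes; each row sums to 1.
probas = np.array([
    [[0.9, 0.1], [0.45, 0.55]],
    [[0.6, 0.4], [0.40, 0.60]],
    [[0.3, 0.7], [0.90, 0.10]],
])

# Soft voting: average the probabilities over estimators, then take the argmax
mean_proba = probas.mean(axis=0)
soft_pred = mean_proba.argmax(axis=1)

# Hard voting, for comparison: each estimator votes for its own argmax
votes = probas.argmax(axis=2)  # shape (n_estimators, n_samples)
hard_pred = np.array([np.bincount(v, minlength=2).argmax() for v in votes.T])

print('Soft voting:', soft_pred)
print('Hard voting:', hard_pred)
```

For the second sample, two estimators lean weakly towards class 1 while one is strongly confident in class 0, so the averaged probability favours class 0 even though the majority vote picks class 1.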

Solution

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import BaggingClassifier
from sklearn.svm import SVC
from sklearn.metrics import f1_score
import numpy as np

# Load the Breast Cancer dataset
data = load_breast_cancer()
X, y = data.data, data.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

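# Create a base model (SVM classifier)
# probability=True exposes predict_proba, so the ensemble averages real
# probabilities instead of falling back to vote proportions
base_model = SVC(kernel='rbf', probability=True)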

# Create the Bagging Classifier
bagging_model = BaggingClassifier(base_model, n_estimators=10, n_jobs=-1)

# Train the Bagging Classifier
bagging_model.fit(X_train, y_train)

# Make probability predictions on the test data
prob_predictions = bagging_model.predict_proba(X_test)

# Get final predictions by taking the class with maximum probability
predictions = [np.argmax(prob) for prob in prob_predictions]

# Calculate F1 score
f1 = f1_score(y_test, predictions)
print(f'F1 score: {f1:.4f}')
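One detail about the metric: by default, f1_score treats label 1 as the positive class, and in this dataset label 1 means benign. The sketch below uses a small set of made-up labels (not model output) to show how the score changes when the malignant class is scored instead via the pos_label parameter.

```python
from sklearn.metrics import f1_score

# Toy labels using the dataset's encoding: 0 = malignant, 1 = benign
y_true = [0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 1, 1, 1, 1]  # one malignant case is missed

# Default: positive class is label 1 (benign)
f1_benign = f1_score(y_true, y_pred)

# Score the malignant class instead
f1_malignant = f1_score(y_true, y_pred, pos_label=0)

print(f'F1 (benign as positive):    {f1_benign:.4f}')
print(f'F1 (malignant as positive): {f1_malignant:.4f}')
```

Which class to treat as positive depends on the question being asked; if missed malignant cases are the main concern, scoring label 0 may be the more informative choice.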

Section 2. Chapter 2
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import BaggingClassifier
from sklearn.svm import SVC
from sklearn.metrics import f1_score
import numpy as np

# Load the Breast Cancer dataset
data = load_breast_cancer()
X, y = data.data, data.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

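# Create a base model (SVM classifier)
# probability=True exposes predict_proba, so the ensemble averages real
# probabilities instead of falling back to vote proportions
base_model = SVC(kernel='rbf', probability=True)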

# Create the Bagging Classifier
bagging_model = ___(___, n_estimators=___, n_jobs=-1)

# Train the Bagging Classifier
bagging_model.___(X_train, y_train)

# Make probability predictions on the test data
prob_predictions = bagging_model.___(X_test)

# Get final predictions by taking the class with maximum probability
predictions = [np.___(prob) for prob in prob_predictions]

# Calculate F1 score
f1 = f1_score(y_test, predictions)
print(f'F1 score: {f1:.4f}')