Learn Challenge 2: Basic Model Creation | Scikit-learn
Data Science Interview Challenge

Challenge 2: Basic Model Creation

Machine learning models broadly fall into two categories: supervised and unsupervised learning.

In supervised learning, a model is trained on labeled data: the algorithm is given input-output pairs and learns to map inputs to the desired outputs. Examples include regression, where we predict a continuous value, and classification, where we assign each input to one of a set of predefined categories.
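
To make the distinction concrete, here is a minimal sketch of both supervised settings. The toy arrays and the choice of `LinearRegression` and `LogisticRegression` are assumptions made for illustration; they are not part of the original lesson.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Toy labeled data (invented for illustration): one feature, six samples.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])

# Regression: the target is a continuous value (here exactly y = 2x + 1).
y_continuous = 2.0 * X.ravel() + 1.0
reg = LinearRegression().fit(X, y_continuous)
reg_pred = reg.predict([[7.0]])

# Classification: the target is one of a fixed set of categories (0 or 1).
y_class = np.array([0, 0, 0, 1, 1, 1])
clf = LogisticRegression().fit(X, y_class)
clf_pred = clf.predict([[1.5], [5.5]])

print("Regression prediction for x=7:", reg_pred[0])
print("Predicted classes for x=1.5 and x=5.5:", clf_pred)
```

In both cases the model sees the correct outputs during `fit`; only the type of target (a number versus a category) differs.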

Unsupervised learning, by contrast, operates without labeled data and aims to identify patterns or structure within the data itself. The algorithm isn't told the "correct" answer; it tries to extract insights on its own. Classic examples are clustering, where data points are grouped by inherent similarity, and dimensionality reduction, where the number of features is reduced by removing or combining redundant or less informative ones.
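
As a rough sketch of these two techniques, the snippet below applies `KMeans` clustering and `PCA` to the Wine features while ignoring the labels. The choice of three clusters, two components, and feature scaling are assumptions made for illustration, not settings from the lesson.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Load only the features; the labels are deliberately not used.
X = load_wine().data

# Scale the features so no single chemical property dominates the distances.
X_scaled = StandardScaler().fit_transform(X)

# Clustering: group samples by similarity (3 clusters is an assumed choice).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X_scaled)

# Dimensionality reduction: project the 13 features onto 2 new components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

print("Cluster sizes:", [int((cluster_ids == k).sum()) for k in range(3)])
print("Shape before and after PCA:", X_scaled.shape, X_2d.shape)
```

Note that PCA combines the original features into new components rather than dropping individual columns, which is one of several ways to reduce dimensionality.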

Both approaches are fundamental in data science, and each offers tools suited to different kinds of problems. This challenge focuses on supervised learning: training and evaluating a classifier.

Task

Train a Random Forest classifier to predict wine types from their chemical properties, then evaluate the model's performance.

  1. Split the data into training and test sets.
  2. Train a Random Forest classifier on the training set. Set the number of trees in the forest to 20 and the maximum depth of each tree to 4.
  3. Evaluate the model's performance using a classification report.

Solution

from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# Load Wine dataset
wine = load_wine()
X = wine.data
y = wine.target

# 1. Splitting data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# 2. Train a Random Forest classifier (20 trees, max depth 4)
clf = RandomForestClassifier(n_estimators=20, max_depth=4, random_state=0)
clf.fit(X_train, y_train)

# 3. Predict and evaluate model
y_pred = clf.predict(X_test)
report = classification_report(y_test, y_pred)
print("\nClassification Report:\n", report)
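
The classification report covers precision, recall, and F1 per class. As a possible complement, the sketch below computes overall accuracy and a ROC AUC score for the same split and model as the solution above. The one-vs-rest averaging (`multi_class="ovr"`) is an assumed choice for this three-class problem, not something the task specifies.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Recreate the same split and model as in the solution.
wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.3, random_state=42
)
clf = RandomForestClassifier(n_estimators=20, max_depth=4, random_state=0)
clf.fit(X_train, y_train)

# Accuracy: fraction of test samples whose predicted class is correct.
accuracy = accuracy_score(y_test, clf.predict(X_test))

# ROC AUC needs class probabilities rather than hard predictions.
# With three classes, one-vs-rest averaging is assumed here.
y_proba = clf.predict_proba(X_test)
auc = roc_auc_score(y_test, y_proba, multi_class="ovr")

print(f"Accuracy: {accuracy:.3f}")
print(f"ROC AUC (one-vs-rest): {auc:.3f}")
```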

Section 7. Chapter 2
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# Load Wine dataset
wine = load_wine()
X = wine.data
y = wine.target

# 1. Splitting data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(___, test_size=0.3, random_state=42)

# 2. Train a Random Forest classifier (20 trees, max depth 4)
clf = RandomForestClassifier(___=20, ___=4, random_state=0)
clf.fit(___)

# 3. Predict and evaluate model
y_pred = clf.predict(X_test)
report = ___(___)
print("\nClassification Report:\n", report)
