Challenge: Solving Task Using XGBoost
Task
The "Credit Scoring" dataset is commonly used for credit risk analysis and binary classification tasks. It contains information about customers and their credit applications, with the goal of predicting whether a customer's credit application will result in a good or bad credit outcome.
Your task is to solve a classification task on the "Credit Scoring" dataset:
- Create DMatrix objects from the training and test data. Specify the enable_categorical argument so that categorical features can be used.
- Train the XGBoost model using the training DMatrix object.
- Set the decision threshold to 0.5 to convert the predicted probabilities into class labels.
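The thresholding step in the last item can be sketched in plain Python, independent of XGBoost. The probability values below are made up for illustration; in the task they would come from the model's predictions on the test set.

```python
# Hypothetical predicted probabilities for the positive class ("good" credit).
# These numbers are illustrative only, not real model output.
probabilities = [0.91, 0.35, 0.50, 0.72, 0.08]

THRESHOLD = 0.5

# A prediction counts as class 1 only when it is strictly above the threshold,
# so a probability of exactly 0.5 maps to class 0.
labels = [int(p > THRESHOLD) for p in probabilities]

print(labels)  # [1, 0, 0, 1, 0]
```

Raising the threshold makes the model more reluctant to predict the positive class, and lowering it has the opposite effect; 0.5 is simply the value this task asks for.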
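Note
The 'objective': 'binary:logistic' parameter means that logistic loss (also known as binary cross-entropy loss) is used as the objective function when training the XGBoost model. With this objective, the model's predictions are probabilities of the positive class, which is why a threshold is needed to turn them into class labels.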
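To make the logistic loss concrete, here is a minimal sketch of the per-example computation using only the standard library. The helper names sigmoid and log_loss are chosen for this illustration and are not part of the XGBoost API; XGBoost performs the equivalent computation internally.

```python
import math

def sigmoid(margin):
    """Map a raw model score (margin) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-margin))

def log_loss(y_true, prob):
    """Binary cross-entropy for a single example; y_true is 0 or 1."""
    return -(y_true * math.log(prob) + (1 - y_true) * math.log(1.0 - prob))

# A raw score of 0 corresponds to a probability of exactly 0.5.
p = sigmoid(0.0)
print(p)

# At probability 0.5 the loss equals ln(2), whatever the true label is.
print(log_loss(1, p))

# A confident correct prediction costs less than a hesitant one.
print(log_loss(1, 0.9) < log_loss(1, 0.6))
```

The loss grows without bound as the predicted probability of the true class approaches zero, so confident mistakes are penalized heavily.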
Solution
import numpy as np
import xgboost as xgb
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
# Load the Credit Scoring dataset
data = fetch_openml(name="credit-g", version=1, parser='auto')
X = data.data
y = data.target
# Convert target to binary (1: Good, 0: Bad)
y = (y == 'good').astype(int)
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create DMatrix objects for XGBoost with categorical features enabled
dtrain = xgb.DMatrix(X_train, label=y_train, enable_categorical=True)
dtest = xgb.DMatrix(X_test, label=y_test, enable_categorical=True)
# Set hyperparameters
params = {
    'objective': 'binary:logistic',
}
# Train the XGBoost classifier
model = xgb.train(params, dtrain)
# Make predictions
y_pred = model.predict(dtest)
y_pred_binary = (y_pred > 0.5).astype(int)
# Calculate F1-score
f1 = f1_score(y_test, y_pred_binary)
print(f'F1-score: {f1:.4f}')
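For readers unfamiliar with the metric reported above, the F1-score is the harmonic mean of precision and recall. The following is a hand-rolled sketch on made-up labels, not the scikit-learn implementation; the function name f1_from_labels is hypothetical, and returning 0.0 when there are no true positives is a convention assumed here.

```python
def f1_from_labels(y_true, y_pred):
    """Compute the F1-score for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # Assumed convention: report 0.0 when precision or recall is undefined or zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative labels only: 3 true positives, 1 false positive, 1 false negative.
y_true_demo = [1, 1, 1, 0, 0, 1]
y_pred_demo = [1, 0, 1, 1, 0, 1]

f1_demo = f1_from_labels(y_true_demo, y_pred_demo)
print(f'F1-score: {f1_demo:.4f}')  # F1-score: 0.7500
```

Because F1 ignores true negatives, it stays informative when one class dominates, as is common in credit data.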
Starter code (fill in the blanks marked ___)
import numpy as np
import xgboost as xgb
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
# Load the Credit Scoring dataset
data = fetch_openml(name="credit-g", version=1, parser='auto')
X = data.data
y = data.target
# Convert target to binary (1: Good, 0: Bad)
y = (y == 'good').astype(int)
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create DMatrix objects for XGBoost with categorical features enabled
dtrain = xgb.___(___, label=y_train, enable_categorical=True)
dtest = xgb.DMatrix(X_test, label=___, enable_categorical=___)
# Set hyperparameters
params = {
    'objective': 'binary:logistic',
}
# Train the XGBoost classifier
model = xgb.train(params, ___)
# Make predictions
y_pred = model.predict(dtest)
y_pred_binary = (y_pred > ___).astype(int)
# Calculate F1-score
f1 = f1_score(y_test, y_pred_binary)
print(f'F1-score: {f1:.4f}')