Challenge: Comparing Models
Now we will compare the models we have learned on a single dataset: a breast cancer dataset whose target is the 'diagnosis' column (1 – malignant, 0 – benign).
We will apply GridSearchCV to each model to find its best parameters. In this task we use recall as the scoring metric, because a False Negative (a malignant tumor classified as benign) is the error we most want to avoid. GridSearchCV selects the parameters by recall if you set scoring='recall'.
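To make the choice of metric concrete, here is a minimal sketch of how recall is computed. The labels are made up for illustration and do not come from the course dataset; recall is the fraction of actual positives (malignant cases) that the model catches, TP / (TP + FN).

```python
from sklearn.metrics import recall_score

# Hypothetical labels for illustration only: 1 = malignant, 0 = benign
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

# 3 malignant cases are caught (TP = 3) and 1 is missed (FN = 1),
# so recall = 3 / (3 + 1). The single false positive does not affect recall.
recall = recall_score(y_true, y_pred)
print(recall)  # 0.75
```

Note that the false positive in the example leaves recall unchanged, which is why recall suits a task where missed malignant cases matter most.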
The task is to build all the models we have learned and to print the best parameters and the best recall score for each one. You will need to fill in the parameter names in the param_grid dictionaries.
- For the k-NN model, find the best n_neighbors value out of [3, 5, 7, 12].
- For Logistic Regression, run through the values [0.1, 1, 10] of C.
- For a Decision Tree, configure two parameters, max_depth and min_samples_leaf. Run through the values [2, 4, 6, 10] for max_depth and [1, 2, 4, 7] for min_samples_leaf.
- For a Random Forest, find the best max_depth (maximum depth of each tree) out of [2, 4, 6] and the best number of trees (n_estimators) out of [20, 50, 100].
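The steps above can be sketched roughly as follows. This is not the course's own solution: it assumes scikit-learn's built-in load_breast_cancer data as a stand-in for the course dataset, flips the labels so that 1 means malignant (scikit-learn encodes malignant as 0), scales the features once up front for simplicity, and uses 5-fold cross-validation. The names searches and results are hypothetical.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumption: the built-in dataset stands in for the course's CSV file.
X, y_raw = load_breast_cancer(return_X_y=True)
# scikit-learn encodes 0 = malignant, 1 = benign; flip to match the lesson.
y = 1 - y_raw
# Simplification: scale once up front (a Pipeline would scale inside each fold).
X = StandardScaler().fit_transform(X)

# Each entry pairs a model with the parameter grid described in the task.
searches = {
    "k-NN": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7, 12]}),
    "Logistic Regression": (
        LogisticRegression(max_iter=1000),
        {"C": [0.1, 1, 10]},
    ),
    "Decision Tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [2, 4, 6, 10], "min_samples_leaf": [1, 2, 4, 7]},
    ),
    "Random Forest": (
        RandomForestClassifier(random_state=0),
        {"max_depth": [2, 4, 6], "n_estimators": [20, 50, 100]},
    ),
}

results = {}
for name, (model, param_grid) in searches.items():
    # scoring="recall" makes the search pick the parameters with the best recall.
    grid = GridSearchCV(model, param_grid, scoring="recall", cv=5)
    grid.fit(X, y)
    results[name] = (grid.best_params_, grid.best_score_)
    print(f"{name}: best params = {grid.best_params_}, "
          f"best recall = {grid.best_score_:.3f}")
```

The printed scores depend on the data and the cross-validation split, so they may differ from what the course environment reports.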
Note
The code takes some time to run (less than a minute).