Challenge: Comparing Models
Now we will compare the models we have learned on a single dataset, a breast cancer dataset. The target is the 'diagnosis' column (1 – malignant, 0 – benign).
We will apply GridSearchCV to each model to find the best parameters. In this task, we will use recall as the scoring metric, since we want to avoid false negatives (malignant cases predicted as benign). GridSearchCV selects the parameters based on recall if you set scoring='recall'.
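To make the choice of metric concrete, here is a minimal sketch of how recall relates to false negatives. The labels below are made up for illustration and are not taken from the course dataset; 1 stands for malignant and 0 for benign, as in the task.

```python
from sklearn.metrics import confusion_matrix, recall_score

# Hypothetical labels for illustration only: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]  # one malignant case is missed (a false negative)

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Recall = TP / (TP + FN): the share of malignant cases the model actually caught.
manual_recall = tp / (tp + fn)
recall = recall_score(y_true, y_pred)

print("False negatives:", fn)
print("Recall (manual):", manual_recall)
print("Recall (recall_score):", recall)
```

Every missed malignant case increases FN and lowers recall, which is why optimizing for recall pushes the search toward parameters that miss fewer malignant cases.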
The task is to build all the models we learned and to print the best parameters along with the best recall score of each model. You will need to fill in the parameter names in the param_grid dictionaries.
- For the k-NN model, find the best n_neighbors value out of [3, 5, 7, 12].
- For the Logistic Regression, run through [0.1, 1, 10] values of C.
- For a Decision Tree, we want to configure two parameters, max_depth and min_samples_leaf. Run through values [2, 4, 6, 10] for max_depth and [1, 2, 4, 7] for min_samples_leaf.
- For a Random Forest, find the best max_depth (maximum depth of each tree) value out of [2, 4, 6] and the best number of trees (n_estimators). Try values [20, 50, 100] for the number of trees.
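The grids above can be sketched roughly as follows. This is one possible setup, not the course's reference solution, and it rests on several assumptions:

- scikit-learn's built-in breast cancer data stands in for the course's dataset, with labels flipped so that 1 means malignant.
- max_iter=10000 is added so that Logistic Regression converges on unscaled features.
- random_state=0 is added for reproducibility.
- GridSearchCV's default cross-validation is used.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumption: the built-in dataset is a stand-in for the course's file with a 'diagnosis' column.
data = load_breast_cancer()
X = data.data
# scikit-learn encodes 0 = malignant, 1 = benign; flip it so 1 = malignant, as in the task.
y = 1 - data.target

# Each entry pairs an estimator with the param_grid described in the task.
models = {
    "k-NN": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7, 12]}),
    "Logistic Regression": (
        LogisticRegression(max_iter=10000),  # assumption: raised so the solver converges
        {"C": [0.1, 1, 10]},
    ),
    "Decision Tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [2, 4, 6, 10], "min_samples_leaf": [1, 2, 4, 7]},
    ),
    "Random Forest": (
        RandomForestClassifier(random_state=0),
        {"max_depth": [2, 4, 6], "n_estimators": [20, 50, 100]},
    ),
}

results = {}
for name, (model, param_grid) in models.items():
    # scoring='recall' ranks parameter combinations by recall on the positive class (label 1).
    search = GridSearchCV(model, param_grid, scoring="recall")
    search.fit(X, y)
    results[name] = (search.best_params_, search.best_score_)
    print(name, search.best_params_, round(search.best_score_, 3))
```

The exact best parameters and scores depend on the dataset and the cross-validation splits, so they may differ from what the course's own data produces.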
Note
The code takes some time to run (less than a minute).