Predictive Modeling with Tidymodels in R
Section 1. Chapter 6

Building Classification Models

In many real-world scenarios, you want to classify observations into distinct groups rather than predict a numeric value. A classification problem involves predicting a categorical outcome, such as whether an email is spam or whether a patient has a particular disease, from a set of input features. Two classification models you will encounter often are logistic regression and decision trees. Logistic regression is well suited to binary classification, while decision trees handle both binary and multiclass problems and produce interpretable decision rules. Both can be implemented in R with the Tidymodels suite, which provides a consistent interface for model specification, training, and evaluation.

    options(crayon.enabled = FALSE)
    library(tidymodels)

    # Load example data
    data(iris)

    # Convert Species to a binary outcome for demonstration
    iris_binary <- iris %>%
      filter(Species != "setosa") %>%
      mutate(Species = factor(Species))

    # Split the data
    set.seed(123)
    iris_split <- initial_split(iris_binary, prop = 0.8)
    iris_train <- training(iris_split)
    iris_test <- testing(iris_split)

    # Specify a logistic regression model
    log_reg_spec <- logistic_reg() %>%
      set_engine("glm") %>%
      set_mode("classification")

    # Fit the model
    log_reg_fit <- log_reg_spec %>%
      fit(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
          data = iris_train)

    # View the model summary
    summary(log_reg_fit$fit)

To build a classification model in Tidymodels, you start by specifying the type of model you want to use, such as logistic_reg() for logistic regression. The model specification defines the algorithm and its settings, including the computational engine (like "glm" for generalized linear models) and the mode ("classification" or "regression"). Once specified, you fit the model to your training data using the fit() function, providing the formula and data. The fitted logistic regression model includes an estimated coefficient for each predictor, which represents the change in the log-odds of the outcome for a one-unit increase in that predictor, holding the other predictors constant. With the "glm" engine, the log-odds refer to the second level of the outcome factor (here virginica, with versicolor as the reference level). By examining the summary of the fitted model, you can read the direction of each effect from the sign of its coefficient and judge the strength of the evidence for it from its standard error and p-value. Because a coefficient's size depends on the units and spread of its predictor, the raw magnitude alone is not a reliable measure of influence. This interpretability is one of the strengths of logistic regression in classification tasks.
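The interpretation described above can be made concrete with a short sketch. It repeats the lesson's data preparation so that it runs on its own, then converts the log-odds coefficients to odds ratios and checks the model on the held-out rows. The object names coef_tbl, odds_tbl, test_preds, and test_acc are illustrative choices, not part of the lesson, and the use of tidy() and accuracy() is one reasonable way to do this rather than the course's prescribed method.

```r
library(tidymodels)

# Same data preparation as the lesson: keep two species, drop the unused level
data(iris)
iris_binary <- iris %>%
  filter(Species != "setosa") %>%
  mutate(Species = factor(Species))

set.seed(123)
iris_split <- initial_split(iris_binary, prop = 0.8)
iris_train <- training(iris_split)
iris_test  <- testing(iris_split)

# Specify and fit the logistic regression in one pipeline
log_reg_fit <- logistic_reg() %>%
  set_engine("glm") %>%
  set_mode("classification") %>%
  fit(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
      data = iris_train)

# Coefficients on the log-odds scale, returned as a tibble
coef_tbl <- tidy(log_reg_fit)

# exp() turns a log-odds coefficient into an odds ratio:
# the multiplicative change in the odds for a one-unit increase
odds_tbl <- coef_tbl %>%
  mutate(odds_ratio = exp(estimate)) %>%
  select(term, estimate, odds_ratio, p.value)
print(odds_tbl)

# Predict classes for the held-out rows and compute accuracy
test_preds <- predict(log_reg_fit, new_data = iris_test) %>%
  bind_cols(iris_test %>% select(Species))
test_acc <- accuracy(test_preds, truth = Species, estimate = .pred_class)
print(test_acc)
```

An odds ratio above 1 means the predictor raises the odds of the second factor level, and a value below 1 means it lowers them. With only 80 training rows the estimates can be unstable, so the numbers are best read as an illustration of the workflow.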

Task

Build and fit a decision tree classifier on the training data using Tidymodels.

  • Load the tidymodels package.
  • Define a decision tree model specification with the decision_tree() function.
  • Set the model's engine to "rpart" with the set_engine() function.
  • Set the mode to "classification" with the set_mode() function.
  • Fit the model to the provided training data with the fit() function. Use Species as the outcome and Sepal.Length, Sepal.Width, Petal.Length, and Petal.Width as predictors.
  • Return the fitted model object.
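
The steps above can be sketched as follows. The exercise's training data is not shown on this page, so the sketch assumes a stratified 80/20 split of the full three-class iris data as a stand-in. The names tree_spec and tree_fit are illustrative.

```r
library(tidymodels)

# Assumption: stand-in for the "provided training data" (not shown in the lesson)
data(iris)
set.seed(123)
iris_split <- initial_split(iris, prop = 0.8, strata = Species)
iris_train <- training(iris_split)

# Decision tree specification with the rpart engine in classification mode
tree_spec <- decision_tree() %>%
  set_engine("rpart") %>%
  set_mode("classification")

# Fit the tree with Species as the outcome and the four measurements as predictors
tree_fit <- tree_spec %>%
  fit(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
      data = iris_train)

# Printing the fitted object shows the split rules found by rpart
tree_fit
```

Unlike the logistic regression example, this fit keeps all three species, since a decision tree handles multiclass outcomes directly.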

