Predictive Modeling with Tidymodels in R
Section 1. Chapter 6
Building Classification Models


In many real-world scenarios, you want to do more than predict numeric values: you need to make decisions or assign observations to distinct groups. This is a classification problem, which involves predicting a categorical outcome (such as whether an email is spam, or whether a patient has a particular disease) from a set of input features. Two widely used classification models are logistic regression and decision trees. Logistic regression is especially useful for binary classification tasks, while decision trees can handle both binary and multiclass problems and produce interpretable decision rules. Both can be implemented in R with the Tidymodels suite, which provides a consistent interface for model specification, training, and evaluation.

    options(crayon.enabled = FALSE)
    library(tidymodels)

    # Load example data
    data(iris)

    # Convert Species to a binary outcome for demonstration
    iris_binary <- iris %>%
      filter(Species != "setosa") %>%
      mutate(Species = factor(Species))

    # Split the data
    set.seed(123)
    iris_split <- initial_split(iris_binary, prop = 0.8)
    iris_train <- training(iris_split)
    iris_test <- testing(iris_split)

    # Specify a logistic regression model
    log_reg_spec <- logistic_reg() %>%
      set_engine("glm") %>%
      set_mode("classification")

    # Fit the model
    log_reg_fit <- log_reg_spec %>%
      fit(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
          data = iris_train)

    # View the model summary
    summary(log_reg_fit$fit)

To build a classification model in Tidymodels, you start by specifying the type of model you want to use, such as logistic_reg() for logistic regression. The model specification defines the algorithm and its settings, including the computational engine (such as "glm" for generalized linear models) and the mode ("classification" or "regression"). Once the model is specified, you fit it to your training data with the fit() function, providing a formula and the data. A fitted logistic regression model includes an estimated coefficient for each predictor, which represents the change in the log-odds of the outcome for a one-unit increase in that predictor, holding the other variables constant. With the "glm" engine, the outcome being modeled is the second level of the factor, which is virginica in the example above. By examining the summary of the fitted model, you can see the direction of each predictor's effect and judge which predictors matter most. Because a coefficient's size depends on the units of its predictor, read the estimates together with their standard errors and p-values rather than by raw magnitude alone. This interpretability is one of the strengths of logistic regression in classification tasks.
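The interpretation described above can be made concrete with a short sketch. It rebuilds the same fit so that it runs on its own, then pulls the coefficients into a data frame with tidy(), converts them to odds ratios, and checks predictions on the held-out test set. The object names coefs, odds_ratios, and test_preds are illustrative choices, not part of the lesson, and this is one possible way to inspect the model rather than the only one.

```r
library(tidymodels)

# Rebuild the two-class iris data and the fit from the example above
iris_binary <- iris %>%
  filter(Species != "setosa") %>%
  mutate(Species = factor(Species))

set.seed(123)
iris_split <- initial_split(iris_binary, prop = 0.8)
iris_train <- training(iris_split)
iris_test  <- testing(iris_split)

log_reg_fit <- logistic_reg() %>%
  set_engine("glm") %>%
  set_mode("classification") %>%
  fit(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
      data = iris_train)

# Coefficients on the log-odds scale, one row per term
# (log-odds of the second factor level, virginica)
coefs <- tidy(log_reg_fit)

# The same coefficients expressed as odds ratios (exp of each estimate)
odds_ratios <- tidy(log_reg_fit, exponentiate = TRUE)

# Predicted classes and class probabilities for the held-out test set
test_preds <- bind_cols(
  iris_test,
  predict(log_reg_fit, new_data = iris_test),
  predict(log_reg_fit, new_data = iris_test, type = "prob")
)

# Share of test rows classified correctly
accuracy(test_preds, truth = Species, estimate = .pred_class)
```

An odds ratio above 1 means that a one-unit increase in the predictor raises the odds of virginica, and a value below 1 means it lowers them. With only 20 test rows, the accuracy figure is a rough check rather than a stable estimate.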

Task


Build and fit a decision tree classifier on the training data using Tidymodels.

  • Load the tidymodels package.
  • Define a decision tree model specification using the decision_tree() function.
  • Set the model's engine to "rpart" using the set_engine() function.
  • Set the mode to "classification" using the set_mode() function.
  • Fit the model to the provided training data using the fit() function. Use Species as the outcome and Sepal.Length, Sepal.Width, Petal.Length, and Petal.Width as predictors.
  • Return the fitted model object.

Solution
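The page does not include the reference solution, so the following is one possible sketch of the steps listed in the task, not the course's official answer. It assumes the "provided training data" is built the same way as iris_train in the logistic regression example, and the names tree_spec and tree_fit are illustrative.

```r
library(tidymodels)

# Assumption: the exercise's training data is not shown on the page, so this
# sketch rebuilds iris_train as in the logistic regression example.
iris_binary <- iris %>%
  filter(Species != "setosa") %>%
  mutate(Species = factor(Species))

set.seed(123)
iris_split <- initial_split(iris_binary, prop = 0.8)
iris_train <- training(iris_split)

# Specify a decision tree with the rpart engine in classification mode
tree_spec <- decision_tree() %>%
  set_engine("rpart") %>%
  set_mode("classification")

# Fit the tree with Species as the outcome and the four measurements as predictors
tree_fit <- tree_spec %>%
  fit(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
      data = iris_train)

# Return (print) the fitted model object
tree_fit
```

Printing tree_fit shows the splits chosen by rpart, which can be read directly as decision rules. The specification follows the same pattern as the logistic regression one; only the model function and the engine change.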
