Predictive Modeling with Tidymodels in R

Data Splitting and Resampling

When building predictive models, you must ensure that your model generalizes well to new, unseen data. This is where data splitting becomes crucial. By dividing your dataset into separate training and testing sets, you can train your model on one portion of the data and evaluate its performance on another. This helps you detect overfitting, where a model learns the training data too well and fails to perform on new data. Because the test set plays no part in training, performance on it gives a more realistic estimate of how your model will behave in real-world scenarios.

```r
options(crayon.enabled = FALSE)
library(tidymodels)

# Load example dataset
data(ames, package = "modeldata")

# Split the data: 80% for training, 20% for testing
set.seed(123)
data_split <- initial_split(ames, prop = 0.8)

# Extract training and testing sets
train_data <- training(data_split)
test_data <- testing(data_split)

# Check the number of rows in each set
nrow(train_data)
nrow(test_data)
```

After splitting your data, you often want to further validate your model by using resampling methods. Tidymodels provides tools for techniques like cross-validation and bootstrapping.

  • Cross-validation divides your training data into several folds;
  • The model is trained on all folds but one and validated on the held-out fold;
  • This process is repeated so that every fold serves as the validation set exactly once.
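
The cross-validation steps above can be sketched with `vfold_cv()` from rsample and `fit_resamples()` from tune, both loaded by tidymodels. This is a minimal sketch, not the course's own solution: the choice of 5 folds, the seed values, and the linear model with the predictors `Gr_Liv_Area` and `Year_Built` are assumptions made purely for illustration.

```r
library(tidymodels)

# Recreate the training set from the earlier split
data(ames, package = "modeldata")
set.seed(123)
data_split <- initial_split(ames, prop = 0.8)
train_data <- training(data_split)

# Create 5 cross-validation folds from the training data only
# (v = 5 is an illustrative choice)
set.seed(456)
cv_folds <- vfold_cv(train_data, v = 5)

# Each fold has an analysis (training) part and an assessment (validation) part
first_fold <- cv_folds$splits[[1]]
nrow(analysis(first_fold))
nrow(assessment(first_fold))

# Illustrative model: linear regression on two predictors (an assumption)
lm_spec <- linear_reg() %>% set_engine("lm")

# Fit on each fold's analysis part and evaluate on its assessment part
cv_results <- fit_resamples(
  lm_spec,
  Sale_Price ~ Gr_Liv_Area + Year_Built,
  resamples = cv_folds
)

# Performance metrics averaged over the folds
cv_metrics <- collect_metrics(cv_results)
cv_metrics
```

Note that the folds are built from `train_data` only, so the test set stays untouched until the final evaluation.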

Bootstrapping, on the other hand, generates multiple samples from the training data (with replacement) to estimate the variability in your model's performance. Both methods help you assess model stability and ensure your results are not due to a particular split of the data.
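
Bootstrapping can be sketched in the same way with `bootstraps()` from rsample. This is a hedged illustration under stated assumptions: the 25 resamples, the seed values, and the simple linear model are hypothetical choices, not something prescribed by the lesson.

```r
library(tidymodels)

# Recreate the training set from the earlier split
data(ames, package = "modeldata")
set.seed(123)
data_split <- initial_split(ames, prop = 0.8)
train_data <- training(data_split)

# Draw 25 bootstrap samples (with replacement) from the training data
# (times = 25 is an illustrative choice)
set.seed(789)
boot_samples <- bootstraps(train_data, times = 25)

# The analysis part has as many rows as the training data;
# the assessment part holds the rows that were not drawn
first_boot <- boot_samples$splits[[1]]
nrow(analysis(first_boot))
nrow(assessment(first_boot))

# Illustrative model: linear regression on two predictors (an assumption)
lm_spec <- linear_reg() %>% set_engine("lm")

boot_results <- fit_resamples(
  lm_spec,
  Sale_Price ~ Gr_Liv_Area + Year_Built,
  resamples = boot_samples
)

# The std_err column summarizes how much each metric varies across resamples
boot_metrics <- collect_metrics(boot_results)
boot_metrics
```

A small `std_err` relative to `mean` suggests the performance estimate is stable across resamples; a large one suggests it depends heavily on which rows were sampled.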

Why is it important to split your data and use resampling techniques when building predictive models?

Section 1. Chapter 1
