Data Wrangling with Tidyverse in R

Data Cleaning and Transformation

When working with real-world datasets, you will often encounter messy or inconsistent data that must be cleaned and transformed before analysis. Common data cleaning tasks include renaming columns to more meaningful names; handling missing values to ensure accurate calculations; and recoding variables to standardize categories or create new ones. These steps are essential for making your data tidy and analysis-ready.

```r
# Load required libraries
library(dplyr)

# Example data frame
df <- data.frame(
  id = 1:4,
  score = c(90, NA, 75, 88),
  group = c("A", "B", "A", "B")
)

# Use mutate to create a new variable and replace NA values in 'score'
df_clean <- df %>%
  mutate(
    score_clean = ifelse(is.na(score), 0, score),  # Replace NA with 0
    passed = score_clean >= 80                     # Create new logical variable
  )

print(df_clean)
```
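The example above covers missing values. The other two tasks mentioned, renaming and recoding, could be sketched along the following lines. This is only an illustration: the new column names (`student_id`, `group_label`, `grade`) and the labels `"control"` and `"treatment"` are assumptions made up for the example, not part of the lesson's data.

```r
library(dplyr)

# Same example data frame as above
df <- data.frame(
  id = 1:4,
  score = c(90, NA, 75, 88),
  group = c("A", "B", "A", "B")
)

df_recoded <- df %>%
  # Rename a column: new_name = old_name
  rename(student_id = id) %>%
  mutate(
    # Recode category codes into descriptive labels (hypothetical labels)
    group_label = recode(group, "A" = "control", "B" = "treatment"),
    # Derive a new categorical variable, treating NA as its own category
    grade = case_when(
      is.na(score) ~ "missing",
      score >= 80 ~ "pass",
      TRUE ~ "fail"
    )
  )

print(df_recoded)
```

Keeping missing scores as an explicit `"missing"` category, rather than replacing them with 0, avoids counting an absent score as a failing one. Which approach is appropriate depends on what an `NA` means in your data.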

To reshape your data for different analysis needs, the tidyr package provides two complementary functions. pivot_longer transforms data from a wide format, where values are spread across several columns, to a long format, where each row holds a single observation-variable pair. Conversely, pivot_wider converts long-format data back to wide format, spreading name-value pairs across multiple columns. Together, these functions let you tidy your data and prepare it for further analysis.
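As a minimal sketch of the round trip, the snippet below reshapes a small wide table to long format and back. The data frame `scores_wide` and its columns (`math`, `reading`) are hypothetical and serve only to illustrate the `cols`, `names_to`, `values_to`, `names_from`, and `values_from` arguments.

```r
library(tidyr)

# Hypothetical wide data: one row per id, one column per subject
scores_wide <- data.frame(
  id = 1:3,
  math = c(90, 75, 88),
  reading = c(85, 80, 92)
)

# Wide -> long: the column names go into 'subject', the cell values into 'score'
scores_long <- pivot_longer(
  scores_wide,
  cols = c(math, reading),
  names_to = "subject",
  values_to = "score"
)

# Long -> wide: each distinct 'subject' becomes a column again
scores_back <- pivot_wider(
  scores_long,
  names_from = subject,
  values_from = score
)

print(scores_long)
print(scores_back)
```

Here `scores_long` has one row per id-subject combination (six rows), and `scores_back` has the same columns as the original table, returned as a tibble.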

Which statement best describes the difference between pivot_longer and pivot_wider in the tidyr package?

Section 1. Chapter 4
