Reshaping Data for Analysis

One of the most powerful aspects of data wrangling with the Tidyverse is the ability to reshape data frames to suit your analysis needs. In R, the tidyr package provides two essential functions for this purpose: pivot_longer and pivot_wider. They supersede the older gather and spread functions, which still work but are no longer actively developed, and offer a more consistent and intuitive interface. pivot_longer transforms data from wide to long format by collapsing several columns into name/value pairs, so the result has more rows and fewer columns. pivot_wider does the opposite, spreading name/value pairs back out into separate columns.


library(tidyr)
library(dplyr)
options(crayon.enabled = FALSE)

scores <- tibble(
  student = c("Alice", "Bob", "Carol"),
  math = c(90, 85, 88),
  science = c(92, 80, 91)
)

# Convert the data from wide to long format using pivot_longer:
long_scores <- scores %>%
  pivot_longer(
    cols = c(math, science),
    names_to = "subject",
    values_to = "score"
  )

print(long_scores)
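
The reverse direction can be sketched in the same way. The example below is a minimal illustration, not part of the original lesson: it rebuilds the long table by hand (the name wide_scores is chosen here for illustration) and uses pivot_wider with its names_from and values_from arguments to return to one column per subject.

```r
library(tidyr)
library(dplyr)

# Long-format data: one row per student/subject combination
long_scores <- tibble(
  student = c("Alice", "Alice", "Bob", "Bob", "Carol", "Carol"),
  subject = c("math", "science", "math", "science", "math", "science"),
  score = c(90, 92, 85, 80, 88, 91)
)

# Convert the data from long to wide format using pivot_wider:
# new column names come from `subject`, cell values from `score`
wide_scores <- long_scores %>%
  pivot_wider(
    names_from = subject,
    values_from = score
  )

print(wide_scores)
```

The result has one row per student again, with math and science as separate columns, which matches the layout of the original scores table.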

Reshaping is often necessary because different analysis tasks expect data in different layouts. In wide data, the values of one variable are spread across several columns (here, one column per subject), whereas many statistical functions and visualizations work best with long data, where each row holds a single observation (one student's score in one subject). Using pivot_longer standardizes your data, making it easier to filter, group, and summarize by categories such as subject or measurement type. pivot_wider, on the other hand, is useful when you want to compare values side by side or prepare data for certain reporting formats. Which function to choose depends on your analysis goals and on the input format required by the functions or models you plan to use.
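
To illustrate why the long format makes grouping and summarizing easier, here is a small sketch that reuses the scores data from above. The summary column names (mean_score, top_score) and the result name subject_summary are illustrative choices, not part of the original lesson.

```r
library(tidyr)
library(dplyr)

scores <- tibble(
  student = c("Alice", "Bob", "Carol"),
  math = c(90, 85, 88),
  science = c(92, 80, 91)
)

# Reshape to long format, then summarize once per subject.
# In wide format this would require a separate expression for each column.
subject_summary <- scores %>%
  pivot_longer(
    cols = c(math, science),
    names_to = "subject",
    values_to = "score"
  ) %>%
  group_by(subject) %>%
  summarise(
    mean_score = mean(score),
    top_score = max(score),
    .groups = "drop"
  )

print(subject_summary)
```

Because subject is now an ordinary column, adding a third subject to the data would not require any change to the summarizing code beyond the cols selection.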

Section 1. Chapter 7