Exploring Data with dplyr

When you work with data frames in R, the dplyr package gives you a powerful set of tools for exploring and manipulating your data. The most important dplyr verbs are select, filter, arrange, mutate, and summarize. Each verb performs a specific type of operation:

select: choose specific columns from your data;
filter: keep only rows that meet certain conditions;
arrange: reorder rows based on column values;
mutate: add new columns or transform existing ones;
summarize: reduce your data to summary statistics.

These verbs allow you to quickly inspect and explore your data frames, making it easier to focus on the information that matters most.


library(dplyr)
# Turn off ANSI color codes so the tibble prints as plain text
options(crayon.enabled = FALSE)

# Creating a sample tibble
df <- tibble::tibble(
  name = c("Alice", "Bob", "Carol", "David"),
  age = c(25, 30, 22, 35),
  score = c(88, 92, 95, 85)
)

# Using select and filter to subset the tibble
result <- df %>%
  select(name, score) %>%
  filter(score > 90)

print(result)
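
The example above uses only select and filter. As a minimal sketch of the other three verbs, the code below applies arrange, mutate, and summarize to the same sample data. The column name above_90, the 90-point threshold, and the summary names mean_score and max_age are illustrative choices, not part of the lesson.

```r
library(dplyr)

# Same sample data as in the example above
df <- tibble::tibble(
  name = c("Alice", "Bob", "Carol", "David"),
  age = c(25, 30, 22, 35),
  score = c(88, 92, 95, 85)
)

# arrange: sort rows by score, highest first
ranked <- df %>%
  arrange(desc(score))

# mutate: add a logical column derived from an existing one
# (the 90-point threshold is an arbitrary choice for illustration)
flagged <- df %>%
  mutate(above_90 = score > 90)

# summarize: collapse all rows into a single row of statistics
stats <- df %>%
  summarize(mean_score = mean(score), max_age = max(age))

print(ranked)
print(flagged)
print(stats)
```

Note that arrange and mutate return a tibble with the same number of rows as the input, while summarize without grouping returns a single row.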

A key feature of dplyr is the pipe operator %>%, which lets you chain together multiple operations in a clear, readable sequence. Instead of nesting functions inside each other, you pass the result of one operation directly into the next. This approach makes your code easier to read and understand, especially as your data wrangling tasks become more complex.
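
To make the contrast concrete, here is a small sketch that writes the same select-then-filter step both ways, assuming the sample tibble from the earlier example. The variable names nested and piped are only for illustration.

```r
library(dplyr)

# Same sample data as in the example above
df <- tibble::tibble(
  name = c("Alice", "Bob", "Carol", "David"),
  age = c(25, 30, 22, 35),
  score = c(88, 92, 95, 85)
)

# Nested form: the innermost call runs first, so you read inside out
nested <- filter(select(df, name, score), score > 90)

# Piped form: each step sits on its own line, in the order it runs
piped <- df %>%
  select(name, score) %>%
  filter(score > 90)

# Check that both forms give the same result
print(all.equal(nested, piped))
```

With only two steps the nested form is still manageable, but each extra verb adds another layer of parentheses, whereas the piped form just gains one more line.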

Task

Use dplyr to create a new tibble called high_scores that contains only the name and score columns, for rows where score is greater than 80.

Use the select function to choose only the name and score columns.
Use the filter function to keep only rows where the score column is greater than 80.
Assign the result to a new variable named high_scores.
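
One possible way to approach the task is sketched below. The task does not say which data to use, so this assumes the df tibble from the earlier example; with other data the result will differ.

```r
library(dplyr)

# Assumption: the task operates on the sample tibble from the lesson example
df <- tibble::tibble(
  name = c("Alice", "Bob", "Carol", "David"),
  age = c(25, 30, 22, 35),
  score = c(88, 92, 95, 85)
)

# Keep the name and score columns, then keep rows with score above 80
high_scores <- df %>%
  select(name, score) %>%
  filter(score > 80)

print(high_scores)
```

Every score in this sample data is above 80, so high_scores keeps all four rows here and only the age column is dropped.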
