Practical Data Preparation in R with Tidyverse

Filtering and Arranging Data

Filtering and arranging data are essential tasks in data wrangling, allowing you to focus your analysis on relevant subsets and to organize your data for easier interpretation. In the Tidyverse, the dplyr package provides two powerful functions for these purposes: filter and arrange.

Definition

The filter function lets you select rows from a data frame that meet specific logical conditions.

For example, you might want to extract only those observations where a variable exceeds a certain value or matches a particular category.
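As a minimal sketch of both cases, the snippet below filters a small data frame by a numeric threshold and by a category. The `orders` data frame and its column names are invented here purely for illustration.

```r
library(dplyr)

# Hypothetical sample data, invented for illustration
orders <- data.frame(
  id       = 1:5,
  category = c("books", "games", "books", "toys", "games"),
  amount   = c(12.5, 60, 35, 8, 45)
)

# Keep rows where a variable exceeds a certain value
large_orders <- orders %>% filter(amount > 30)

# Keep rows that match a particular category
book_orders <- orders %>% filter(category == "books")

# Conditions separated by commas are combined with AND
large_book_orders <- orders %>% filter(category == "books", amount > 30)

print(large_book_orders)
```

Rows for which a condition evaluates to NA are dropped by filter, which is worth keeping in mind when a column contains missing values.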

Definition

The arrange function reorders the rows of your data frame based on the values of one or more columns, either in ascending or descending order.

This is useful for ranking, prioritizing, or simply making large datasets easier to read.
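The following sketch shows ascending and descending sorts side by side. The `products` data frame is a made-up example, not part of the lesson's data.

```r
library(dplyr)

# Hypothetical sample data, invented for illustration
products <- data.frame(
  product = c("lamp", "desk", "chair", "shelf"),
  price   = c(40, 150, 85, 85)
)

# Ascending order is the default
cheapest_first <- products %>% arrange(price)

# Wrap a column in desc() for descending order;
# ties on price are broken by product name, ascending
priciest_first <- products %>% arrange(desc(price), product)

print(priciest_first)
```

Listing a second column only matters when the first one has ties, as with the two products priced at 85 here.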

    library(dplyr)

    # Sample data frame
    df <- data.frame(
      name = c("Alice", "Bob", "Charlie", "Diana"),
      age = c(25, 30, 22, 28),
      score = c(88, 95, 78, 90)
    )

    # Filter rows where age is greater than 24,
    # then arrange by score descending, then by name ascending
    filtered_arranged <- df %>%
      filter(age > 24) %>%
      arrange(desc(score), name)

    print(filtered_arranged)

You can combine filter and arrange in a single pipeline: first narrow the dataset to the rows that matter, then sort the result to highlight patterns or outliers. For instance, you might keep all records above a certain threshold and then arrange them by a category column followed by a value column, so that rows in the same category sit together with their highest or lowest values first. Note that arrange only reorders rows; it does not group or summarize them.
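To illustrate sorting by a category and then by a value, here is a small sketch. The `sales` data frame, its columns, and the threshold of 100 are assumptions made up for this example.

```r
library(dplyr)

# Hypothetical sample data, invented for illustration
sales <- data.frame(
  region = c("West", "East", "West", "East", "West"),
  seller = c("Ana", "Ben", "Cy", "Dee", "Eli"),
  total  = c(120, 300, 250, 90, 310)
)

# Keep sales at or above the threshold, then sort by region (A to Z)
# and, within each region, by total from highest to lowest
top_sales <- sales %>%
  filter(total >= 100) %>%
  arrange(region, desc(total))

print(top_sales)
```

Filtering before arranging means the sort only has to handle the rows that survive the filter; the result would be the same in the other order, just with more rows to sort.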

Which statement best describes the difference between filter and arrange when using dplyr for data manipulation?

Task

Use the provided data frame df containing information about several US cities.

  • Filter the rows to include only cities with a population greater than 2,000,000.
  • Arrange the filtered data by avg_temp in descending order.
  • Assign the resulting data frame to the variable filtered_sorted_cities.
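One possible way to approach these steps is sketched below. The exercise's actual df is not shown on this page, so the data frame here is a stand-in with placeholder city names and numbers, and the column name `population` is an assumption (only `avg_temp` is named in the task).

```r
library(dplyr)

# Stand-in for the provided df: city names and all values are placeholders,
# and the `population` column name is assumed
df <- data.frame(
  city       = c("City A", "City B", "City C", "City D"),
  population = c(8500000, 1500000, 2700000, 3900000),
  avg_temp   = c(55, 70, 52, 66)
)

# Keep cities with more than 2,000,000 inhabitants,
# then sort by average temperature from highest to lowest
filtered_sorted_cities <- df %>%
  filter(population > 2000000) %>%
  arrange(desc(avg_temp))

print(filtered_sorted_cities)
```

If the real df uses different column names, only the names inside filter and arrange need to change.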
