Learn Data Selection - Advanced Techniques | Data Manipulation and Cleaning
Data Analysis with R

Data Selection - Advanced Techniques

You already know how to select single rows and columns using basic indexing. Now, it's time to go a step further and explore how to select multiple rows and columns using both base R and the dplyr package. These techniques are essential when you want to focus on specific parts of a dataset or prepare your data for further analysis.

Selecting Multiple Columns

Base R

You can select multiple columns by combining their positions or names with the c() function. The result is a smaller data frame containing only the specified columns.

Using column positions:

selected_data_base <- df[, c(1, 2, 3)]

Using column names:

selected_data_base <- df[, c("name", "selling_price", "transmission")]
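As a quick check, the sketch below builds a small made-up data frame and compares the two approaches. The values are invented; only the column names follow the examples above, and the column order is an assumption.

```r
# Hypothetical toy data; the values are invented for illustration only.
df <- data.frame(
  name          = c("Car A", "Car B", "Car C"),
  selling_price = c(60000, 135000, 600000),
  km_driven     = c(70000, 50000, 100000),
  fuel          = c("Petrol", "Petrol", "Diesel"),
  transmission  = c("Manual", "Manual", "Automatic")
)

# Select by position: columns 1, 2, and 5 in this toy layout.
by_position <- df[, c(1, 2, 5)]

# Select the same columns by name.
by_name <- df[, c("name", "selling_price", "transmission")]

# Both approaches return the same three-column data frame.
identical(by_position, by_name)             # TRUE

# Caveat: selecting a single column this way returns a vector,
# unless drop = FALSE is supplied.
is.data.frame(df[, "fuel"])                 # FALSE
is.data.frame(df[, "fuel", drop = FALSE])   # TRUE
```

Selecting by name is usually the safer choice, because the code keeps working if the column order changes.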

dplyr

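Use the select() function and pass the column names directly, without quotes. The %>% pipe passes df to select() as its first argument. Load dplyr first so that both select() and the pipe are available.

library(dplyr)  # provides select() and the %>% pipe

selected_data_dplyr <- df %>%
  select(km_driven, fuel, transmission)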
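The following is a minimal sketch of select() on a made-up data frame, assuming the dplyr package is installed. The values are invented. It also shows two other selection forms that select() accepts: a range of adjacent columns and exclusion with a minus sign.

```r
library(dplyr)

# Hypothetical toy data; the values are invented for illustration only.
df <- data.frame(
  name          = c("Car A", "Car B", "Car C"),
  selling_price = c(60000, 135000, 600000),
  km_driven     = c(70000, 50000, 100000),
  fuel          = c("Petrol", "Petrol", "Diesel"),
  transmission  = c("Manual", "Manual", "Automatic")
)

# Keep three columns, in the order they are listed.
selected <- df %>%
  select(km_driven, fuel, transmission)
names(selected)   # "km_driven" "fuel" "transmission"

# Keep a range of adjacent columns: name through km_driven.
first_three <- df %>% select(name:km_driven)

# Keep every column except fuel.
without_fuel <- df %>% select(-fuel)

# Unlike df[, "fuel"], selecting one column still returns a data frame.
is.data.frame(df %>% select(fuel))   # TRUE
```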

Indexing Single Values

To access a single value, provide the row index followed by the column index inside the brackets. This is useful when checking or debugging individual data points.

df[1, 2]  # accesses the value in row 1, column 2
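As an illustration, the sketch below reads the same cell in three equivalent ways from a made-up data frame. The values are invented, and the assumption is that selling_price is the second column.

```r
# Hypothetical toy data; the values are invented for illustration only.
df <- data.frame(
  name          = c("Car A", "Car B", "Car C"),
  selling_price = c(60000, 135000, 600000),
  fuel          = c("Petrol", "Petrol", "Diesel")
)

# Row 1, column 2, both by position.
df[1, 2]                  # 60000

# The same cell, addressing the column by name instead of position.
df[1, "selling_price"]    # 60000

# Extract the column as a vector first, then take its first element.
df$selling_price[1]       # 60000
```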

Slicing Rows

Sometimes you only want to work with the first few rows, or specific rows by position.

Base R

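To select a range of consecutive rows, write the first and last row index separated by a colon (:). The comma after the range is required: it leaves the column position empty, which keeps all columns.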

first_5_rows_base <- df[1:5, ]
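The sketch below applies the same idea to a made-up six-row data frame (the values are invented) and adds two related cases: picking non-adjacent rows with c(), and what happens when the range runs past the last row.

```r
# Hypothetical toy data with six rows; the values are invented.
df <- data.frame(
  name      = paste("Car", LETTERS[1:6]),
  km_driven = c(70000, 50000, 100000, 46000, 141000, 125000)
)

# Rows 1 through 5, all columns.
first_5 <- df[1:5, ]
nrow(first_5)             # 5

# Specific rows by position, combined with c().
picked <- df[c(1, 3, 6), ]
picked$name               # "Car A" "Car C" "Car F"

# Caveat: indexes past the last row produce rows filled with NA.
past_end <- df[5:8, ]
nrow(past_end)            # 4
is.na(past_end$name)      # FALSE FALSE TRUE TRUE

# head() stops at the last available row instead.
nrow(head(df, 8))         # 6
```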

dplyr

You can use the slice() function and pass it the range of rows you want to take.

first_5_rows_dplyr <- df %>%
  slice(1:5)
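A minimal sketch of slice() on a made-up six-row data frame follows, assuming dplyr version 1.0.0 or later is installed (needed for slice_head()). The values are invented.

```r
library(dplyr)

# Hypothetical toy data with six rows; the values are invented.
df <- data.frame(
  name      = paste("Car", LETTERS[1:6]),
  km_driven = c(70000, 50000, 100000, 46000, 141000, 125000)
)

# Rows 1 through 5 by position.
first_5 <- df %>%
  slice(1:5)
nrow(first_5)             # 5

# Specific, non-adjacent rows.
picked <- df %>%
  slice(c(1, 3, 6))
picked$name               # "Car A" "Car C" "Car F"

# slice_head() is a shortcut for taking the first n rows.
first_5_head <- df %>%
  slice_head(n = 5)
nrow(first_5_head)        # 5
```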