Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Lära Grouping Numeric Data | Factors
R Introduction: Part I

Svep för att visa menyn

book
Grouping Numeric Data

To categorize numeric data into groups, you can use the cut() function in R, which assigns each number to a category based on specified intervals. For instance, if you have a continuous variable like height, you can categorize individuals as 'tall', 'medium', or 'short' based on height ranges.

Here's how you can use it:

r

Among the parameters listed, these are crucial for categorizing data:

  • x is the numeric vector to be categorized;

  • breaks can be an integer specifying the number of intervals or a vector of cut points;

  • labels provide names for the categories;

  • right indicates if the intervals should be closed on the right;

  • ordered_result determines if the resulting factors should have an order.

To create three categories, set breaks to 3 or provide a vector with four cut points to form three intervals, for instance (a,b], (b,c], (c,d].

1234567
# Vector of heights heights <- c(170, 165, 195, 172, 189, 156, 178, 198, 157, 182, 171, 184, 163, 176, 169, 153) # Convert into factor by cutting into intervals heights_f <- cut(heights, breaks = c(0, 160, 190, 250), labels = c('small', 'medium', 'tall'), ordered_result = T) heights_f # Output the factor variable
copy

For our example of categorizing height, we choose c(0, 160, 190, 250) for breaks to divide the data into three groups: (0, 160], (160, 190], and (190, 250]. We also set ordered_result to TRUE to define a logical order among categories (e.g., short < medium < tall).

Uppgift

Swipe to start coding

  1. Given a vector of numerical grades, here's how to categorize them as factor levels:

    • [0, 60) - F;
    • [60, 75) - D;
    • [75, 85) - C;
    • [85, 95) - B;
    • [95, 100) - A.
  2. Create a variable grades_f that stores the factor levels with the specified breaks and labels, considering the ordering, and use right = FALSE to include the left boundary of the intervals;

    • breaks - c(0, 60, 75, 85, 95, 100);
    • labels - c('F', 'D', 'C', 'B', 'A');
    • ordered_result - TRUE (to order the factor values);
    • right - FALSE (to include the left boundary of an interval, not the right).
  3. Output the contents of grades_f.

Lösning

Switch to desktopByt till skrivbordet för praktisk övningFortsätt där du är med ett av alternativen nedan
Var allt tydligt?

Hur kan vi förbättra det?

Tack för dina kommentarer!

Avsnitt 3. Kapitel 5
Vi beklagar att något gick fel. Vad hände?

Fråga AI

expand
ChatGPT

Fråga vad du vill eller prova någon av de föreslagna frågorna för att starta vårt samtal

book
Grouping Numeric Data

To categorize numeric data into groups, you can use the cut() function in R, which assigns each number to a category based on specified intervals. For instance, if you have a continuous variable like height, you can categorize individuals as 'tall', 'medium', or 'short' based on height ranges.

Here's how you can use it:

r

Among the parameters listed, these are crucial for categorizing data:

  • x is the numeric vector to be categorized;

  • breaks can be an integer specifying the number of intervals or a vector of cut points;

  • labels provide names for the categories;

  • right indicates if the intervals should be closed on the right;

  • ordered_result determines if the resulting factors should have an order.

To create three categories, set breaks to 3 or provide a vector with four cut points to form three intervals, for instance (a,b], (b,c], (c,d].

1234567
# Vector of heights heights <- c(170, 165, 195, 172, 189, 156, 178, 198, 157, 182, 171, 184, 163, 176, 169, 153) # Convert into factor by cutting into intervals heights_f <- cut(heights, breaks = c(0, 160, 190, 250), labels = c('small', 'medium', 'tall'), ordered_result = T) heights_f # Output the factor variable
copy

For our example of categorizing height, we choose c(0, 160, 190, 250) for breaks to divide the data into three groups: (0, 160], (160, 190], and (190, 250]. We also set ordered_result to TRUE to define a logical order among categories (e.g., short < medium < tall).

Uppgift

Swipe to start coding

  1. Given a vector of numerical grades, here's how to categorize them as factor levels:

    • [0, 60) - F;
    • [60, 75) - D;
    • [75, 85) - C;
    • [85, 95) - B;
    • [95, 100) - A.
  2. Create a variable grades_f that stores the factor levels with the specified breaks and labels, considering the ordering, and use right = FALSE to include the left boundary of the intervals;

    • breaks - c(0, 60, 75, 85, 95, 100);
    • labels - c('F', 'D', 'C', 'B', 'A');
    • ordered_result - TRUE (to order the factor values);
    • right - FALSE (to include the left boundary of an interval, not the right).
  3. Output the contents of grades_f.

Lösning

Switch to desktopByt till skrivbordet för praktisk övningFortsätt där du är med ett av alternativen nedan
Var allt tydligt?

Hur kan vi förbättra det?

Tack för dina kommentarer!

Avsnitt 3. Kapitel 5
Switch to desktopByt till skrivbordet för praktisk övningFortsätt där du är med ett av alternativen nedan
Vi beklagar att något gick fel. Vad hände?
some-alt