Create column with grouped values based on another column

Tags: r, if-statement, dplyr

I'm sure this has been asked before, but I don't know what to search for, so I apologise in advance.

Let's say that I have the following data frame:

grades <- data.frame(a = 1:40, b = sample(45:100, 40))

Using dplyr, I want to create a new variable that indicates the grade the student received, based on the following criteria: 90-100 = excellent, 80-90 = very good, etc.

I thought I could get that result by nesting ifelse() inside mutate():

grades %>%
  mutate(ifelse(b >= 90, "excellent"),
         ifelse(b >= 80 & b < 90, "very_good"),
         ifelse(b >= 70 & b < 80, "fair"),
         ifelse(b >= 60 & b < 70, "poor", "fail"))

This doesn't work: I get the error message "argument "no" is missing, with no default". I thought the "no" would be the "fail" at the end, but obviously I'm getting the syntax wrong.

I can get this to work if I first filter the original data individually, and then call ifelse, as follows:

a <- grades %>%
     filter( b >= 90) %>%
     mutate(final = ifelse(b >= 90, "excellent"))

and then rbind a, b, c, etc. Obviously, this isn't how I want to do it, but I wanted to understand the syntax of ifelse(). I'm guessing the latter works because there aren't any values that fail the criteria, but I still can't figure out how to get it to work when there is more than one ifelse().

asked Jan 12 '15 13:01

JoeF

1 Answer

Define vectors with the levels and labels and then use cut on the b column:

library(dplyr)
# interval breaks and one label per interval
levels <- c(-Inf, 60, 70, 80, 90, Inf)
labels <- c("Fail", "Poor", "fair", "very good", "excellent")
grades %>% mutate(x = cut(b, levels, labels = labels))
    a   b         x
1   1  66      Poor
2   2  78      fair
3   3  97 excellent
4   4  46      Fail
5   5  89 very good
6   6  57      Fail
7   7  80      fair
8   8  98 excellent
9   9 100 excellent
10 10  93 excellent
11 11  59      Fail
12 12  51      Fail
13 13  69      Poor
14 14  75      fair
15 15  72      fair
16 16  48      Fail
17 17  74      fair
18 18  54      Fail
19 19  62      Poor
20 20  64      Poor
21 21  88 very good
22 22  70      Poor
23 23  85 very good
24 24  58      Fail
25 25  95 excellent
26 26  56      Fail
27 27  65      Poor
28 28  68      Poor
29 29  91 excellent
30 30  76      fair
31 31  82 very good
32 32  55      Fail
33 33  96 excellent
34 34  83 very good
35 35  61      Poor
36 36  60      Fail
37 37  77      fair
38 38  47      Fail
39 39  73      fair
40 40  71      fair

Or using data.table:

library(data.table)
setDT(grades)[, x := cut(b, levels, labels)]

Or simply in base R:

grades$x <- cut(grades$b, levels, labels)
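
How these breaks treat scores that sit exactly on a boundary can be checked on a few hand-picked values. The following is only a minimal base R sketch; `scores` and `default_groups` are made-up names for illustration, and `cut` is called with its default settings as in the calls above:

```r
# Same breaks and labels as in the answer above.
levels <- c(-Inf, 60, 70, 80, 90, Inf)
labels <- c("Fail", "Poor", "fair", "very good", "excellent")

# Hypothetical scores chosen to sit on the interval boundaries.
scores <- c(59, 60, 70, 80, 90, 100)

# With the default right = TRUE, each interval includes its upper bound,
# so a boundary score is grouped with the interval below it.
default_groups <- cut(scores, levels, labels = labels)

print(data.frame(scores, default_groups))
# 60 -> Fail, 70 -> Poor, 80 -> fair, 90 -> very good
```

This matches the printed result above, where 60 is labelled "Fail", 70 "Poor" and 80 "fair".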

Note

After taking another close look at your initial approach, I noticed that you would need to include right = FALSE in the cut call, because, for example, 90 points should be "excellent", not just "very good". The right argument controls on which side each interval is closed; the default, right = TRUE, closes intervals on the right, so 90 falls into (80, 90], which differs slightly from the OP's initial approach. In dplyr, it would then be:

grades %>% mutate(x = cut(b, levels, labels, right = FALSE))

and accordingly in the other options.
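
As for the ifelse() syntax the question asks about: every ifelse() call takes three arguments (test, yes, no), and nesting means placing the next ifelse() in the `no` position of the previous one. In the question's attempt, the calls were instead passed to mutate() as separate arguments, so only the last one had a `no` value. The single-condition version after filter() ran because `no` is only evaluated when at least one test is FALSE. The following is a rough sketch under stated assumptions, not the only way to write it: `scores`, `nested` and `left_closed` are made-up names, and the labels follow this answer's spelling rather than the question's.

```r
# Same breaks and labels as in the answer above.
levels <- c(-Inf, 60, 70, 80, 90, Inf)
labels <- c("Fail", "Poor", "fair", "very good", "excellent")

# Hypothetical scores chosen to sit on the interval boundaries.
scores <- c(59, 60, 70, 80, 90, 100)

# Each ifelse() gets test, yes and no; the next ifelse() is the `no` branch.
# Conditions are checked from the top grade down, so no upper bounds are needed.
nested <- ifelse(scores >= 90, "excellent",
          ifelse(scores >= 80, "very good",
          ifelse(scores >= 70, "fair",
          ifelse(scores >= 60, "Poor", "Fail"))))

# The left-closed cut() call from the note, for comparison.
left_closed <- cut(scores, levels, labels = labels, right = FALSE)

print(data.frame(scores, nested, left_closed))

# Both approaches assign the same group to every score.
stopifnot(identical(nested, as.character(left_closed)))
```

Inside mutate(), the same nested expression would be written with the column b in place of `scores` and assigned to a named column, e.g. final = ifelse(...). Note that ifelse() returns a character vector here, whereas cut() returns a factor.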

answered Oct 21 '22 03:10

talat
