Ranking values of a column by group in R, conditional on another variable

Tags:

r

I have a data frame (df) like this:

group col1 col2  
x      a    22    
x      a    23  
x      b    16  
x      b    18  
y      a    11  
y      a    12  
y      a    16  
y      a    45  
y      b    24

Desired output is:

group col1 col2 rank 
x      a    22  1  
x      a    23  2
x      b    16  0
x      b    18  0
y      a    11  1
y      a    12  2
y      a    16  3
y      a    45  4
y      b    24  0

Namely,

order col2 within each combination of group and col1
when col1 is "b", the rank is 0
otherwise, rank the values of col2 from smallest to largest

How can I do this in R? I would be very glad for any help. Thanks a lot.

asked Mar 15 '15 12:03

oercim

2 Answers

You could try dplyr:

library(dplyr)

df %>%
    group_by(group, col1) %>%
    # rank col2 within each group/col1 combination; rows with col1 == 'b' get 0
    mutate(rank = replace(min_rank(col2), col1 == 'b', 0))
#    group col1 col2 rank
#1     x    a   22    1
#2     x    a   23    2
#3     x    b   16    0
#4     x    b   18    0
#5     y    a   11    1
#6     y    a   12    2
#7     y    a   16    3
#8     y    a   45    4
#9     y    b   24    0

If you don't want gaps between ranks when there are ties, replace min_rank with dense_rank.
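
To see the difference between the two, here is a small sketch on a made-up vector containing a tie (the values are purely illustrative and are not taken from the question's data):

```r
library(dplyr)

# hypothetical values with one tie (20 appears twice)
v <- c(10, 20, 20, 30)

min_r   <- min_rank(v)    # tied values share the lowest rank; the next rank is skipped
dense_r <- dense_rank(v)  # tied values share a rank; no rank is skipped

min_r
# [1] 1 2 2 4
dense_r
# [1] 1 2 2 3
```

With the question's data there are no ties within any group/col1 combination, so both functions give the same result there.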

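Or, instead of replace, multiply the rank by a logical condition (FALSE counts as 0, so rows with col1 equal to 'b' get rank 0):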

res <- df %>%
    group_by(group, col1) %>%
    mutate(rank = (col1 != 'b') * min_rank(col2))

as.data.frame(res)  # convert the grouped result back to a plain data.frame
 #    group col1 col2 rank
 #1     x    a   22    1
 #2     x    a   23    2
 #3     x    b   16    0
 #4     x    b   18    0
 #5     y    a   11    1
 #6     y    a   12    2
 #7     y    a   16    3
 #8     y    a   45    4
 #9     y    b   24    0
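
If dplyr is not available, the same idea can probably be expressed in base R with ave(). This is only a sketch and not part of the approach above: the data frame is rebuilt from the question (stringsAsFactors = FALSE is my assumption), and ties.method = "min" is chosen to mirror min_rank.

```r
# rebuild the question's data (column types are an assumption)
df <- data.frame(
    group = c("x", "x", "x", "x", "y", "y", "y", "y", "y"),
    col1  = c("a", "a", "b", "b", "a", "a", "a", "a", "b"),
    col2  = c(22, 23, 16, 18, 11, 12, 16, 45, 24),
    stringsAsFactors = FALSE
)

# rank col2 within each group/col1 combination
df$rank <- ave(df$col2, df$group, df$col1,
               FUN = function(v) rank(v, ties.method = "min"))

# rows with col1 == "b" always get rank 0
df$rank[df$col1 == "b"] <- 0

df
```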

answered Oct 19 '22 18:10

akrun

Or using data.table v >= 1.9.5 (which provides frank):

library(data.table)
setDT(df)[, rank := frank(col2, ties.method = "dense"),
             by = .(group, col1)][col1 == "b", rank := 0L][]

#    group col1 col2 rank
# 1:     x    a   22    1
# 2:     x    a   23    2
# 3:     x    b   16    0
# 4:     x    b   18    0
# 5:     y    a   11    1
# 6:     y    a   12    2
# 7:     y    a   16    3
# 8:     y    a   45    4
# 9:     y    b   24    0

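Or, as @Arun suggested, you can skip one grouping step by setting rank to zero for all rows first and then ranking only the rows where col1 is not "b". Note that this ranks by group alone, so it matches the result above only when the non-"b" rows of each group share a single col1 value, as they do here.

setDT(df)[, rank := 0L][col1 != "b", rank := frank(col2, ties.method = "dense"), by = group][]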
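
One thing to watch with the single-grouping shortcut: it ranks all non-"b" rows of a group together. A minimal sketch of where the two variants would diverge, using a made-up table with a third col1 value "c" (hypothetical, not in the question's data):

```r
library(data.table)

# hypothetical data: one group, three distinct col1 values
dt <- data.table(group = "x",
                 col1  = c("a", "a", "c", "b"),
                 col2  = c(22, 23, 5, 16))

# shortcut: zero everywhere, then rank non-"b" rows by group only
by_group <- copy(dt)[, rank := 0L][col1 != "b",
    rank := frank(col2, ties.method = "dense"), by = group]

# full version: rank by group and col1, then zero out "b"
by_both <- copy(dt)[, rank := frank(col2, ties.method = "dense"),
    by = .(group, col1)][col1 == "b", rank := 0L]

by_group$rank
# [1] 2 3 1 0
by_both$rank
# [1] 1 2 1 0
```

So if col1 can take values other than "a" and "b", grouping by both columns is the safer choice.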

answered Oct 19 '22 17:10

David Arenburg
