
Create a ranking variable with dplyr?

Tags: r, dplyr

Suppose I have the following data

df = data.frame(name=c("A", "B", "C", "D"), score = c(10, 10, 9, 8)) 

I want to add a new column with the ranking. This is what I'm doing:

df %>% mutate(ranking = rank(score, ties.method = 'first'))
#   name score ranking
# 1    A    10       3
# 2    B    10       4
# 3    C     9       2
# 4    D     8       1

However, my desired result is:

#   name score ranking
# 1    A    10       1
# 2    B    10       1
# 3    C     9       2
# 4    D     8       3

Clearly rank does not do what I have in mind. What function should I be using?

Ignacio asked Sep 29 '14 18:09


2 Answers

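It sounds like you're looking for dense_rank from "dplyr", applied in descending order rather than the ascending order that rank uses by default. Unlike rank, dense_rank gives tied values the same rank and leaves no gaps in the sequence.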

Try this:

df %>% mutate(rank = dense_rank(desc(score)))
#   name score rank
# 1    A    10    1
# 2    B    10    1
# 3    C     9    2
# 4    D     8    3
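To see why no ties.method of base rank produces the desired numbering, here is a minimal base R sketch comparing the tie-handling options on the same scores. The variable names (rank_first, rank_min, rank_dense) are only illustrative, and the match/sort/unique line is one possible base R equivalent of a dense rank, not necessarily how dplyr implements it.

```r
score <- c(10, 10, 9, 8)

# Negate the scores so that the highest score gets rank 1.
# "first" breaks ties by position, so the two 10s get different ranks.
rank_first <- rank(-score, ties.method = "first")

# "min" gives tied values the same rank but leaves a gap after them.
rank_min <- rank(-score, ties.method = "min")

# Dense rank: position of each score among the distinct scores,
# sorted from highest to lowest, so ties share a rank and there are no gaps.
rank_dense <- match(score, sort(unique(score), decreasing = TRUE))

print(data.frame(score, rank_first, rank_min, rank_dense))
#   score rank_first rank_min rank_dense
# 1    10          1        1          1
# 2    10          2        1          1
# 3     9          3        3          2
# 4     8          4        4          3
```

Only the last column matches the result asked for in the question, which is what dense_rank(desc(score)) returns.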
A5C1D2H2I1M1N2O1R2T1 answered Sep 24 '22 01:09


Another solution, for when you need to rank every variable (not just one):

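library(dplyr)

df = data.frame(name = c("A", "B", "C", "D"),
                score = c(10, 10, 9, 8), score2 = c(5, 1, 9, 2))

# Drop the non-numeric column, then dense-rank every remaining column (highest value = rank 1).
# across() needs dplyr >= 1.0.0; it replaces mutate_all(funs(...)), since funs() is deprecated as of dplyr 0.8.0.
select(df, -name) %>% mutate(across(everything(), ~ dense_rank(desc(.x))))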
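If you would rather keep the name column and the original scores next to their ranks, something along these lines may work. This is a sketch assuming dplyr 1.0.0 or later; the "_rank" suffix and the variable name ranked are just illustrative choices.

```r
library(dplyr)

df <- data.frame(name = c("A", "B", "C", "D"),
                 score = c(10, 10, 9, 8),
                 score2 = c(5, 1, 9, 2))

# Rank only the numeric columns and write each result to a new
# "<column>_rank" column, leaving name and the raw scores untouched.
ranked <- df %>%
  mutate(across(where(is.numeric),
                ~ dense_rank(desc(.x)),
                .names = "{.col}_rank"))

print(ranked)
#   name score score2 score_rank score2_rank
# 1    A    10      5          1           2
# 2    B    10      1          1           4
# 3    C     9      9          2           1
# 4    D     8      2          3           3
```

Selecting with where(is.numeric) avoids having to drop name by hand, at the cost of also ranking any other numeric column that happens to be in the data frame.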
Pablo Casas answered Sep 23 '22 01:09