
How to get ranks with no gaps when there are ties among values?

Tags: r

When there are ties in the original data, is there a way to create a ranking without gaps in the ranks (consecutive, integer rank values)? Suppose:

x <-  c(10, 10, 10, 5, 5, 20, 20)
rank(x)
# [1] 4.0 4.0 4.0 1.5 1.5 6.5 6.5

In this case the desired result would be:

my_rank(x)
# [1] 2 2 2 1 1 3 3

I've tried the options for the ties.method argument (average, max, min, random), but none of them is designed to give the desired result.

Is it possible to achieve this with the rank() function?

Brandon Bertelsen asked Feb 06 '11 19:02



2 Answers

A modified version of crayola's solution, using match instead of merge:

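x_unique <- unique(x)          # distinct values, in order of first appearance
x_ranks <- rank(x_unique)      # ranks of the distinct values (no ties, so 1..k)
x_ranks[match(x, x_unique)]    # look up each element's rank via its distinct value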

Edit: or as a one-liner, per @hadley's comment:

match(x, sort(unique(x)))
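
To connect the one-liner back to the question, here is a minimal sketch that wraps it in a function and applies it to the question's vector. The name my_rank is the hypothetical one used in the question, not a base R function.

```r
# my_rank is the hypothetical name from the question, not part of base R
my_rank <- function(x) {
  # sort(unique(x)) lists the distinct values in increasing order;
  # match() returns each element's position in that list
  match(x, sort(unique(x)))
}

x <- c(10, 10, 10, 5, 5, 20, 20)
my_rank(x)
# [1] 2 2 2 1 1 3 3
```

Because sort() drops NA by default, any NA in x should come back as NA rather than receive a rank.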
Marek answered Oct 19 '22 03:10


The "loopless" way to do it is to simply treat the vector as an ordered factor, then convert it to numeric:

> as.numeric( ordered( c( 10,10,10,10, 5,5,5, 10, 10 ) ) )
[1] 2 2 2 2 1 1 1 2 2
> as.numeric( ordered( c(0.5,0.56,0.76,0.23,0.33,0.4) ))
[1] 4 5 6 1 2 3
> as.numeric( ordered( c(1,1,2,3,4,5,8,8) ))
[1] 1 1 2 3 4 5 6 6
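
As a small sketch of why this works, applied to the vector from the question: the levels of the ordered factor are the sorted distinct values, and the integer codes are positions within those levels. The variable name f is only illustrative.

```r
x <- c(10, 10, 10, 5, 5, 20, 20)

# ordered() builds its levels from the sorted distinct values of x
f <- ordered(x)
levels(f)
# [1] "5"  "10" "20"

# the underlying integer codes index into those levels,
# which gives consecutive ranks with no gaps
as.integer(f)
# [1] 2 2 2 1 1 3 3
```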

Update: Another way, which seems faster, is to use findInterval with sort(unique(x)):

> x <- c( 10, 10, 10, 10, 5,5,5, 10, 10)
> findInterval( x, sort(unique(x)))
[1] 2 2 2 2 1 1 1 2 2

> x <- round( abs( rnorm(1000000)*10))
> system.time( z <- as.numeric( ordered( x )))
   user  system elapsed 
  0.996   0.025   1.021 
> system.time( z <- findInterval( x, sort(unique(x))))
   user  system elapsed 
  0.077   0.003   0.080 
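
Timings like these will vary by machine and R version, so it may be worth checking that the approaches agree before choosing one on speed alone. A minimal sketch, assuming a smaller random vector and an arbitrary seed (both chosen here only for illustration):

```r
set.seed(1)  # arbitrary seed, only so the check is repeatable
x <- round(abs(rnorm(1000) * 10))

r_match    <- match(x, sort(unique(x)))          # position among sorted distinct values
r_ordered  <- as.integer(ordered(x))             # integer codes of an ordered factor
r_interval <- findInterval(x, sort(unique(x)))   # interval index among sorted distinct values

all(r_match == r_ordered, r_match == r_interval)
# [1] TRUE
```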
Prasad Chalasani answered Oct 19 '22 03:10