
Adding a column to a data frame in R based on the rank of another column

Tags: dataframe, r

Here is a reproducible example of my data. For the following data frame:

df <- data.frame(Subject = c('John', 'John', 'John', 'John','Mary', 'Mary', 'Mary', 'Mary'),
                 SNR = c(-4,-4,0,4,0,4,4,8))

I would like to add a column 'Rank' that ranks SNR within each Subject, with tied values sharing the same rank and no gaps between ranks, so that it would look like this:

Subject   SNR   Rank
John      -4    1
John      -4    1
John       0    2
John       4    3
Mary       0    1
Mary       4    2
Mary       4    2
Mary       8    3

I have tried using:

dfNew <- transform(df, Rank = ave(SNR, Subject, FUN = function(x) rank(x, ties.method = "first")))

But I get the following:

Subject   SNR   Rank
John      -4    1
John      -4    2
John       0    3
John       4    4
Mary       0    1
Mary       4    2
Mary       4    3
Mary       8    4   

I have also tried using the different ties.method options, but none give me what I am looking for (i.e., ranking only from 1-3).
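For illustration, here is a quick sketch of what the tie-handling options return for John's four values (x and ranks are just throwaway names for this example). In every case the largest rank is still 4, because rank() either breaks the tie or leaves a gap after it:

```r
# John's SNR values from the example data
x <- c(-4, -4, 0, 4)

# One column of ranks per ties.method
ranks <- sapply(c("average", "first", "min", "max"),
                function(m) rank(x, ties.method = m))
print(ranks)

# The largest rank under each method, compared with the number of distinct values
print(apply(ranks, 2, max))
print(length(unique(x)))
```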

Any help would be much appreciated!

Rmg asked Oct 12 '16 19:10



3 Answers

Using aggregate and factor in base R:

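# Convert each subject's SNR values to factor codes (1 = smallest distinct value)
ag <- aggregate(SNR~Subject, df, function(x) as.numeric(factor(x)))
# Flatten the per-subject codes row by row back into a single column
df$rank <- c(t(ag[,-1]))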

  Subject SNR rank
1    John  -4    1
2    John  -4    1
3    John   0    2
4    John   4    3
5    Mary   0    1
6    Mary   4    2
7    Mary   4    2
8    Mary   8    3
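The flattening step above appears to rely on every subject having the same number of rows and on the rows already being grouped by Subject in alphabetical order. As a hedged alternative, the same factor trick can be sketched with ave(), which writes each result back to its original row; the data frame name df2 and its unequal group sizes are made up for this example:

```r
# Hypothetical data: unequal group sizes, rows not grouped by Subject
df2 <- data.frame(Subject = c('Mary', 'John', 'John', 'Mary', 'John'),
                  SNR = c(4, 0, -4, 0, -4))

# Within each Subject, factor() orders the distinct SNR values,
# so the factor codes are ranks without gaps
df2$rank <- ave(df2$SNR, df2$Subject,
                FUN = function(x) as.numeric(factor(x)))
print(df2)
```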
989 answered Nov 15 '22 11:11


Another base R method:

transform(df, Rank = ave(SNR, Subject, FUN = function(x) cumsum(c(TRUE, head(x, -1) != tail(x, -1)))))

gives:

  Subject SNR Rank
1    John  -4    1
2    John  -4    1
3    John   0    2
4    John   4    3
5    Mary   0    1
6    Mary   4    2
7    Mary   4    2
8    Mary   8    3

This method compares each value with the one before it, so it only works when SNR is sorted within each Subject. If your data frame is not ordered yet, you should order it first with df <- df[order(df$Subject, df$SNR), ] for this method to give the correct result.
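If reordering the data frame is not desirable, one order-independent variant is to look up each value's position among the sorted distinct values of its group. This is only a sketch and not the method above; the name df_shuffled and its row order are assumptions made for the example:

```r
# Hypothetical shuffled version of the example data
df_shuffled <- data.frame(
  Subject = c('Mary', 'John', 'Mary', 'John', 'John', 'Mary', 'John', 'Mary'),
  SNR     = c(4, 0, 8, -4, 4, 0, -4, 4)
)

# match() returns each value's position in the sorted distinct values
# of its group, so ties share a rank and no rank is skipped
df_shuffled$Rank <- ave(df_shuffled$SNR, df_shuffled$Subject,
                        FUN = function(x) match(x, sort(unique(x))))
print(df_shuffled)
```

The rows stay in their original order, so no sorting step is needed before or after.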

Jaap answered Nov 15 '22 12:11


library(dplyr)
df %>%
  arrange(Subject, SNR) %>%
  group_by(Subject) %>%
  mutate(rank = dense_rank(SNR))

Credit to @rich-scriven for mentioning dense_rank().

infominer answered Nov 15 '22 10:11