Here is a reproducible example of my data. For the following data frame:
df <- data.frame(Subject = c('John', 'John', 'John', 'John','Mary', 'Mary', 'Mary', 'Mary'),
SNR = c(-4,-4,0,4,0,4,4,8))
I would like to add a column 'rank' that provides a ranking for SNR by Subject, so that it would look like this:
Subject SNR Rank
John -4 1
John -4 1
John 0 2
John 4 3
Mary 0 1
Mary 4 2
Mary 4 2
Mary 8 3
I have tried using:
dfNew <- transform(df, Rank = ave(SNR, Subject, FUN = function(x) rank(x, ties.method = "first")))
But I get the following:
Subject SNR Rank
John -4 1
John -4 2
John 0 3
John 4 4
Mary 0 1
Mary 4 2
Mary 4 3
Mary 8 4
I have also tried using the different ties.method options, but none give me what I am looking for (i.e., ranking only from 1-3).
Any help would be much appreciated!
The ranking of a variable in an R data frame can be done by using rank function. For example, if we have a data frame df that contains column x then rank of values in x can be found as rank(df$x).
rank() function in R Language is used to return the sample ranks of the values of a vector. Equal values and missing values are handled in multiple ways.
If we have a matrix with dimensions R x C, having R number of rows and C number of columns, and if R is less than C then the rank of the matrix would be R. To find the rank of a matrix in R, we can use rankMatrix function in Matrix package.
Using aggregate
and factor
in base R:
ag <- aggregate(SNR~Subject, df, function(x) as.numeric(factor(x)))
df$rank <- c(t(ag[,-1]))
Subject SNR rank
1 John -4 1
2 John -4 1
3 John 0 2
4 John 4 3
5 Mary 0 1
6 Mary 4 2
7 Mary 4 2
8 Mary 8 3
Another base R method:
transform(df1, Rank = ave(SNR, Subject, FUN = function(x) cumsum(c(TRUE, head(x, -1) != tail(x, -1)))))
gives:
Subject SNR Rank
1 John -4 1
2 John -4 1
3 John 0 2
4 John 4 3
5 Mary 0 1
6 Mary 4 2
7 Mary 4 2
8 Mary 8 3
If your dataframe is not ordered yet, you should order it first with df1 <- df1[order(df1$SNR),]
for this method to give the correct result.
library(dplyr)
df %>%
arrange(Subject, SNR) %>%
group_by(Subject) %>%
mutate(rank=dense_rank(SNR))
of course credit to @rich-scriven for mentioning dense_rank()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With