I'm wondering if I'm missing something trivial here:
When ranking a vector like this one, which contains NAs, there are four options for dealing with the NAs:
x <- c(5, NA, 3, NA, 6, 9, 10, NA, 5, 7, 12)
rank(x, na.last = TRUE)
# [1] 2.5 9.0 1.0 10.0 4.0 6.0 7.0 11.0 2.5 5.0 8.0
rank(x, na.last = FALSE)
# [1] 5.5 1.0 4.0 2.0 7.0 9.0 10.0 3.0 5.5 8.0 11.0
rank(x, na.last = NA)
# [1] 2.5 1.0 4.0 6.0 7.0 2.5 5.0 8.0
rank(x, na.last = "keep")
# [1] 2.5 NA 1.0 NA 4.0 6.0 7.0 NA 2.5 5.0 8.0
I want to keep the NAs and rank them. For my purposes they should all be tied with each other and ranked last. The default ties.method, "average", is fine here. I'm looking for this result:
# [1] 2.5 10.0 1.0 10.0 4.0 6.0 7.0 10.0 2.5 5.0 8.0
From the ?rank help: "NA values are never considered to be equal: for na.last = TRUE and na.last = FALSE they are given distinct ranks in the order in which they occur in x."
So it looks like what I want, i.e. treating the NAs as equal and giving them an averaged last rank, is not possible with rank alone. Is this true? Is there no simple way of getting this done via rank? Do I have to rely on a second line of code to re-insert the rank of the NAs after doing rank(x, na.last="keep")?
I'm not sure if this is the most elegant solution, but you could replace the NA values so that they are always last, like so:
rank(replace(x, is.na(x), max(x, na.rm = TRUE) + 1))
# [1] 2.5 10.0 1.0 10.0 4.0 6.0 7.0 10.0 2.5 5.0 8.0
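If you would rather not substitute a sentinel value (the max + 1 trick needs a numeric vector with at least one non-NA value), the "second line of code" route from the question can be wrapped in a small helper. This is only a sketch: the function name rank_na_tied_last is made up for illustration and is not part of base R. It relies on the fact that the NAs would occupy the last n_na ranks, so their shared average rank is n_ok + (n_na + 1) / 2.

```r
# Hypothetical helper (not part of base R): rank x so that all NAs
# are tied with each other and placed last.
rank_na_tied_last <- function(x) {
  nas <- is.na(x)
  # Rank only the non-missing values; NA positions stay NA for now.
  r <- rank(x, na.last = "keep", ties.method = "average")
  n_ok <- sum(!nas)  # number of non-missing values
  n_na <- sum(nas)   # number of missing values
  # The NAs would take ranks n_ok + 1, ..., n_ok + n_na; assign their mean.
  r[nas] <- n_ok + (n_na + 1) / 2
  r
}

x <- c(5, NA, 3, NA, 6, 9, 10, NA, 5, 7, 12)
rank_na_tied_last(x)
# [1] 2.5 10.0 1.0 10.0 4.0 6.0 7.0 10.0 2.5 5.0 8.0
```

For the example vector there are 8 non-missing values and 3 NAs, so the NAs each get rank 8 + (3 + 1) / 2 = 10, the same as the one-liner above. Because no replacement value is computed, the same sketch should also work for character vectors.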