Here is a small data.frame:
e = data.frame(A=c(letters[1:5], 1:5))
I am a little bit confused regarding what's happening when I execute the following command:
unclass(e$A) %>% as.numeric()
I am getting the following output:
[1] 6 7 8 9 10 1 2 3 4 5
Why is a:e treated as 6:10?
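data.frame converts the character column to a factor (this is the default in R versions before 4.0.0, where stringsAsFactors = TRUE). You can see this with str(e):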
'data.frame': 10 obs. of 1 variable:
 $ A: Factor w/ 10 levels "1","2","3","4",..: 6 7 8 9 10 1 2 3 4 5
The levels of this factor are its unique values in sorted order, and in that order the digits come before the letters. See levels(e$A):
[1] "1" "2" "3" "4" "5" "a" "b" "c" "d" "e"
as.numeric converts a factor to the indices of its levels: the first level gets the value 1 (so "1" stays 1) and the sixth level gets the value 6 (so "a" becomes 6).
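A minimal sketch of that index relationship, using a standalone factor f (a name introduced here for illustration) instead of the data.frame column. Indexing the levels with the integer codes gives back the original labels:

```r
# Build the factor directly so the result does not depend on data.frame defaults.
f <- factor(c(letters[1:5], 1:5))

# Each code is the position of that element's label within levels(f).
idx <- as.integer(f)

# Looking the codes up in the levels restores the labels.
labels_back <- levels(f)[idx]
identical(labels_back, as.character(f))
```

This is why "a" maps to 6 here: it is the sixth entry of levels(f), not a numeric reading of the letter itself.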
In this case unclass() already performs that conversion: it strips the factor class and leaves the underlying integer codes, which are the numbers you see. as.numeric() then only converts those codes to doubles and drops the levels attribute.
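The two steps can be inspected separately. This is a sketch that passes stringsAsFactors = TRUE explicitly, an assumption made so the column is a factor regardless of the R version; the variable name codes is introduced only for illustration:

```r
# Force factor conversion so the example behaves the same in R >= 4.0.0.
e <- data.frame(A = c(letters[1:5], 1:5), stringsAsFactors = TRUE)

# unclass() removes the "factor" class but keeps the codes and the levels.
codes <- unclass(e$A)
typeof(codes)             # "integer"
names(attributes(codes))  # "levels"

# as.numeric() returns the same codes as a plain double vector without attributes.
as.numeric(codes)

# Calling as.numeric() on the factor directly gives the same result, no pipe needed.
identical(as.numeric(e$A), as.numeric(unclass(e$A)))
```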
?Comparison tells us that any comparison between character vectors (such as sorting them) is based on the collating sequence of the current locale.
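To check which collation is in effect and how it orders mixed digits and letters, something like the following can be used. The vector x is made up for illustration; method = "radix" is shown as a locale-independent comparison, since it sorts character vectors in C-locale order:

```r
x <- c("b", "a", "2", "1")

# The collation currently used for character comparisons.
Sys.getlocale("LC_COLLATE")

# Default sort: the order follows the current locale's collating sequence.
sort(x)

# Radix sort ignores the locale and uses C-locale ordering, where digits precede letters.
sort(x, method = "radix")
```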
Note: this is independent of the %>%.