How can I reorder the columns of a data.frame based on the number of unique values per column? As an example:
var1 var2 var3
1 1 1
0 2 2
1 3 3
0 4 1
1 5 2
Is there a way to reorder this as var2, var3, var1 automatically (because the numbers of unique values are 5, 3, and 2 respectively), or in the opposite order (2, 3, 5)?
In this case it is not difficult to do by hand, but my real data has many columns. Is there a way to do this type of sorting automatically?
Also, I'd prefer a solution that works on a matrix (in addition to a data.frame), regardless of whether there are column names.
Something like this?
df[names(sort(sapply(df, function(x) length(unique(x))), decreasing = TRUE))]
# var2 var3 var1
# 1 1 1 1
# 2 2 2 0
# 3 3 3 1
# 4 4 1 0
# 5 5 2 1
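To make the one-liner above easier to follow, here is a step-by-step sketch. It assumes `df` holds the example from the question; the construction of `df` and the intermediate names `counts` and `ordered_names` are mine, not part of the original post.

```r
# Assumed reconstruction of the example data from the question.
df <- data.frame(var1 = c(1, 0, 1, 0, 1),
                 var2 = c(1, 2, 3, 4, 5),
                 var3 = c(1, 2, 3, 1, 2))

# Step 1: number of distinct values in each column (a named integer vector).
counts <- sapply(df, function(x) length(unique(x)))
counts
# var1 var2 var3
#    2    5    3

# Step 2: sort the counts; the column names travel with the values.
ordered_names <- names(sort(counts, decreasing = TRUE))
ordered_names
# [1] "var2" "var3" "var1"

# Step 3: index the data.frame by the sorted names.
df[ordered_names]
```

Set `decreasing = FALSE` in step 2 to get the opposite order (var1, var3, var2).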
If your input is a matrix, then:
m[, names(sort(apply(m, 2, function(x) length(unique(x))), decreasing = TRUE))]
should work.
# var2 var3 var1
# [1,] 1 1 1
# [2,] 2 2 0
# [3,] 3 3 1
# [4,] 4 1 0
# [5,] 5 2 1
Edit: the example in your post has column names, but the one you gave in your comments doesn't. Please make sure your example reflects your actual data.
X <- cbind(1, rnorm(10), 1:10)
Since you can't rely on column names, you'll have to work with column indices instead. Try this (it works whether or not you have column names):
X[, sort(apply(X, 2, function(x) length(unique(x))), decreasing = TRUE, index.return = TRUE)$ix]
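If you need a single function for both cases, the same idea can be wrapped up with `order()`, which returns column positions directly. This is only a sketch: `reorder_by_unique` is a hypothetical helper name, and the sample data is made up for illustration (a deterministic matrix is used instead of `rnorm()` so the result is reproducible).

```r
# Hypothetical helper (not from the original answer): reorder columns by
# their number of distinct values, for a data.frame or a matrix.
reorder_by_unique <- function(x, decreasing = TRUE) {
  # as.data.frame() lets one code path count distinct values for both inputs.
  n_unique <- vapply(as.data.frame(x),
                     function(col) length(unique(col)),
                     integer(1))
  # order() gives positions, so missing column names are not a problem.
  x[, order(n_unique, decreasing = decreasing), drop = FALSE]
}

df <- data.frame(var1 = c(1, 0, 1, 0, 1),
                 var2 = 1:5,
                 var3 = c(1, 2, 3, 1, 2))
names(reorder_by_unique(df))
# [1] "var2" "var3" "var1"

# Matrix without column names: 1, 2 and 6 distinct values per column.
X <- cbind(7, c(1, 2, 1, 2, 1, 2), 11:16)
reorder_by_unique(X)[1, ]
# [1] 11  1  7
```

`drop = FALSE` keeps the result two-dimensional even when the input has a single column.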