I currently have the following dataframe:
datnotformeanfill <- data.frame(
  b8 = c(1, 2, 2, 2, 1, 1), b7 = rep(1, 6),
  b6 = c(6, 2, 3, 3, 6, 3), b5 = c(6, 3, 3, 3, 4, 3),
  b4 = c(rep(6, 5), 1),     b3 = rep(0, 6),
  b2 = rep(1, 6),           b1 = c(2, 2, 2, 2, 1, 1)
)
> datnotformeanfill
  b8 b7 b6 b5 b4 b3 b2 b1
1  1  1  6  6  6  0  1  2
2  2  1  2  3  6  0  1  2
3  2  1  3  3  6  0  1  2
4  2  1  3  3  6  0  1  2
5  1  1  6  4  6  0  1  1
6  1  1  3  3  1  0  1  1
I am trying to use a combination of the which() and unique() functions to return only the columns that have more than one unique value, but I am not certain how to use these (or perhaps some other functions) to do so. Any help would be appreciated. Thank you!
We can use Filter() with f = var. This computes the variance of each column. If a column has only a single unique value, its variance is 0. Filter() coerces these values to logical (0 becomes FALSE, any nonzero value becomes TRUE) and keeps only the columns marked TRUE. Note that this relies on the columns being numeric.
Filter(var, datnotformeanfill)
#   b8 b6 b5 b4 b1
# 1  1  6  6  6  2
# 2  2  2  3  6  2
# 3  2  3  3  6  2
# 4  2  3  3  6  2
# 5  1  6  4  6  1
# 6  1  3  3  1  1
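To make the coercion step visible, the same selection can be spelled out by hand. This is only an illustrative sketch of what Filter() does with the variances, not a replacement for it; the names v and keep are introduced here for illustration.

```r
# Same data as in the question
datnotformeanfill <- data.frame(
  b8 = c(1, 2, 2, 2, 1, 1), b7 = rep(1, 6),
  b6 = c(6, 2, 3, 3, 6, 3), b5 = c(6, 3, 3, 3, 4, 3),
  b4 = c(rep(6, 5), 1),     b3 = rep(0, 6),
  b2 = rep(1, 6),           b1 = c(2, 2, 2, 2, 1, 1)
)

# Per-column variance: 0 for the constant columns b7, b3 and b2
v <- sapply(datnotformeanfill, var)

# Coerce to logical: 0 -> FALSE, nonzero -> TRUE
keep <- as.logical(v)

# Names of the columns that survive the filter
names(datnotformeanfill)[keep]
# "b8" "b6" "b5" "b4" "b1"
```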
Another option is to loop over the columns with sapply() and test whether the number of unique elements in each column is greater than 1. This returns a logical vector (TRUE/FALSE per column) that can be used to subset the data frame as well.
datnotformeanfill[sapply(datnotformeanfill, function(x) length(unique(x))>1)]
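If this check is needed in more than one place, it could be wrapped in a small helper. The sketch below is one possible way to do it: the function name drop_constant_cols is hypothetical, and vapply() is swapped in for sapply() only so that the result is guaranteed to be a logical vector. Unlike the variance approach, this also works for non-numeric columns.

```r
# Same data as in the question
datnotformeanfill <- data.frame(
  b8 = c(1, 2, 2, 2, 1, 1), b7 = rep(1, 6),
  b6 = c(6, 2, 3, 3, 6, 3), b5 = c(6, 3, 3, 3, 4, 3),
  b4 = c(rep(6, 5), 1),     b3 = rep(0, 6),
  b2 = rep(1, 6),           b1 = c(2, 2, 2, 2, 1, 1)
)

# Hypothetical helper: keep only columns with more than one unique value.
# Note that unique() counts NA as a value of its own.
drop_constant_cols <- function(df) {
  keep <- vapply(df, function(x) length(unique(x)) > 1, logical(1))
  df[keep]  # list-style indexing always returns a data frame
}

result <- drop_constant_cols(datnotformeanfill)
names(result)
# "b8" "b6" "b5" "b4" "b1"
```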