I currently have the following dataframe:
datnotformeanfill<-
data.frame(b8=c(1,2,2,2,1,1),b7=rep(1,6),
b6=c(6,2,3,3,6,3),b5=c(6,3,3,3,4,3),
b4=c(rep(6,5),1),b3=rep(0,6),
b2=rep(1,6),b1=c(2,2,2,2,1,1))
> datnotformeanfill
b8 b7 b6 b5 b4 b3 b2 b1
1 1 1 6 6 6 0 1 2
2 2 1 2 3 6 0 1 2
3 2 1 3 3 6 0 1 2
4 2 1 3 3 6 0 1 2
5 1 1 6 4 6 0 1 1
6 1 1 3 3 1 0 1 1
I am trying to combine the which and unique functions to return only the columns that have more than one unique value, but I am not sure how to use these (or perhaps some other functions) to do so. Any help would be appreciated. Thank you!
We can use Filter with f = var, which computes the variance of each column. If a column has only a single unique value, its variance is 0. Filter coerces these results to logical (0 becomes FALSE, any nonzero value becomes TRUE) and keeps only the columns that evaluate to TRUE. Note that this relies on the columns being numeric.
Filter(var, datnotformeanfill)
# b8 b6 b5 b4 b1
# 1 1 6 6 6 2
# 2 2 2 3 6 2
# 3 2 3 3 6 2
# 4 2 3 3 6 2
# 5 1 6 4 6 1
# 6 1 3 3 1 1
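To make the coercion step explicit, here is a minimal sketch of roughly what Filter does with var, written out by hand on a three-column subset of the data. The names df, col_var, keep and result are only illustrative and do not come from the original post.

```r
# A small subset of the example data (b7 is constant)
df <- data.frame(b8 = c(1, 2, 2, 2, 1, 1),
                 b7 = rep(1, 6),
                 b4 = c(rep(6, 5), 1))

# Variance of each column; a constant column gives exactly 0
col_var <- sapply(df, var)

# 0 is coerced to FALSE, any nonzero number to TRUE
keep <- as.logical(col_var)

# Subset the columns with the logical vector
result <- df[keep]
names(result)
# "b8" "b4"
```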
Another option is to loop through the columns with sapply and check whether the number of unique elements in each column is greater than 1. This returns a logical 'TRUE/FALSE' vector that can be used for subsetting as well. Unlike var, this check does not depend on the columns being numeric.
datnotformeanfill[sapply(datnotformeanfill, function(x) length(unique(x))>1)]
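If this is needed in more than one place, the same check could be wrapped in a small helper. The sketch below is one possible way to do it; drop_constant_cols and the mixed data frame are hypothetical names made up for illustration. It uses vapply instead of sapply so that the result is guaranteed to be a logical vector, and it includes character columns to show that the unique-based test is not limited to numeric data.

```r
# Hypothetical helper: keep only columns with more than one unique value
drop_constant_cols <- function(df) {
  keep <- vapply(df, function(x) length(unique(x)) > 1, logical(1))
  df[keep]
}

# Made-up example with both character and numeric columns
mixed <- data.frame(id    = c("a", "b", "c"),
                    grp   = rep("x", 3),
                    val   = c(1, 1, 2),
                    const = rep(0, 3))

out <- drop_constant_cols(mixed)
names(out)
# "id" "val"
```

Because the subsetting uses single-bracket list-style indexing (df[keep]), the result stays a data frame even when only one column survives.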