
Select columns that have more than 1 unique value [duplicate]

Tags:

r

I currently have the following dataframe:

datnotformeanfill<-
  data.frame(b8=c(1,2,2,2,1,1),b7=rep(1,6),
             b6=c(6,2,3,3,6,3),b5=c(6,3,3,3,4,3),
             b4=c(rep(6,5),1),b3=rep(0,6),
             b2=rep(1,6),b1=c(2,2,2,2,1,1))

> datnotformeanfill
  b8 b7 b6 b5 b4 b3 b2 b1
1  1  1  6  6  6  0  1  2
2  2  1  2  3  6  0  1  2
3  2  1  3  3  6  0  1  2
4  2  1  3  3  6  0  1  2
5  1  1  6  4  6  0  1  1
6  1  1  3  3  1  0  1  1

I am trying to use a combination of the which and unique functions to return only the columns which have more than 1 unique value, but am not completely certain how to use these (or perhaps some other function(s)) to return the columns. Any help would be appreciated. Thank you!

costebk08 asked Jul 14 '15 14:07



1 Answer

We can use Filter with f = var, which computes the variance of each column. If a column has only a single unique value, its variance is 0. Filter coerces each result to logical (0 becomes FALSE, any nonzero value becomes TRUE) and keeps only the columns that map to TRUE.

 Filter(var, datnotformeanfill)
 #    b8 b6 b5 b4 b1
 #  1  1  6  6  6  2
 #  2  2  2  3  6  2
 #  3  2  3  3  6  2
 #  4  2  3  3  6  2
 #  5  1  6  4  6  1
 #  6  1  3  3  1  1
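
To make the coercion step visible, here is a rough sketch of what Filter(var, ...) does, written out by hand. The names col_var, keep and result are only illustrative, and the sketch assumes every column is numeric and free of NA, as in the example data.

```r
# Same data frame as in the question
datnotformeanfill <- data.frame(
  b8 = c(1, 2, 2, 2, 1, 1), b7 = rep(1, 6),
  b6 = c(6, 2, 3, 3, 6, 3), b5 = c(6, 3, 3, 3, 4, 3),
  b4 = c(rep(6, 5), 1),     b3 = rep(0, 6),
  b2 = rep(1, 6),           b1 = c(2, 2, 2, 2, 1, 1)
)

# Step 1: variance of every column (a named numeric vector)
col_var <- sapply(datnotformeanfill, var)

# Step 2: coerce to logical; 0 becomes FALSE, any other number becomes TRUE
keep <- as.logical(col_var)

# Step 3: keep only the columns flagged TRUE
result <- datnotformeanfill[which(keep)]
names(result)
# [1] "b8" "b6" "b5" "b4" "b1"
```

Because var is meant for numeric data, this route is probably best kept to all-numeric data frames; for mixed column types the unique-based check is the safer choice.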

Another option is to loop through the columns with sapply and check whether the number of unique elements is greater than 1. This returns a logical TRUE/FALSE vector that can be used for subsetting as well.

datnotformeanfill[sapply(datnotformeanfill, function(x) length(unique(x))>1)]
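
Since the question mentions which and unique specifically, the same idea can be spelled out with both, as a sketch. The names n_unique, varying_cols, result and keep_no_na are made up for illustration. Note that unique treats NA as a value of its own, so the last line shows one possible way to ignore NA if that is not what you want.

```r
# Same data frame as in the question
datnotformeanfill <- data.frame(
  b8 = c(1, 2, 2, 2, 1, 1), b7 = rep(1, 6),
  b6 = c(6, 2, 3, 3, 6, 3), b5 = c(6, 3, 3, 3, 4, 3),
  b4 = c(rep(6, 5), 1),     b3 = rep(0, 6),
  b2 = rep(1, 6),           b1 = c(2, 2, 2, 2, 1, 1)
)

# Number of distinct values in each column
n_unique <- sapply(datnotformeanfill, function(x) length(unique(x)))

# which() turns the condition into the positions of the varying columns
varying_cols <- which(n_unique > 1)

result <- datnotformeanfill[varying_cols]
names(result)
# [1] "b8" "b6" "b5" "b4" "b1"

# Optional variant (an assumption about what you want): drop NA before counting
keep_no_na <- sapply(datnotformeanfill,
                     function(x) length(unique(x[!is.na(x)])) > 1)
```

The logical vector from sapply can be used directly for subsetting, so which is not strictly needed; it is mainly handy when you want the column positions or names themselves.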
akrun answered Sep 18 '22 22:09
