Removal of constant columns in R

Tags: r, constants

I was using the prcomp function when I received this error:

Error in prcomp.default(x, ...) :  cannot rescale a constant/zero column to unit variance 

I know I can scan my data manually, but is there any function or command in R that can help me remove these constant variables? I know this is a very simple task, but I have never come across a function that does this.

Thanks,

asked Feb 25 '13 14:02 by Error404


1 Answer

The problem here is that at least one of your columns has a variance of zero, so it cannot be rescaled to unit variance. You can check which columns of a data frame are constant this way, for example:

    df <- data.frame(x=1:5, y=rep(1,5))
    df
    #   x y
    # 1 1 1
    # 2 2 1
    # 3 3 1
    # 4 4 1
    # 5 5 1

    # Names of the columns that have zero variance
    # (index names(df) directly: df[, cond] drops a single column to a bare vector, which has no names)
    names(df)[sapply(df, function(v) var(v, na.rm=TRUE)==0)]
    # [1] "y"
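To see how a zero-variance column leads to the error in the question, here is a minimal reproduction on the same toy data frame. It assumes the original call used `scale. = TRUE`, which is the option that divides each column by its standard deviation; the question does not show the actual call.

```r
# Same toy data as above: y is constant, so its variance is 0.
df <- data.frame(x = 1:5, y = rep(1, 5))

# Assumption: scaling to unit variance was requested via scale. = TRUE.
# tryCatch captures the error message instead of stopping the script.
msg <- tryCatch(
  {
    prcomp(df, scale. = TRUE)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

The captured message should match the one quoted in the question, since dividing by a standard deviation of zero is not defined.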

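So if you want to exclude these columns, you can use the following (drop=FALSE keeps the result a data frame even when only one column remains):

    df[, sapply(df, function(v) var(v, na.rm=TRUE)!=0), drop=FALSE]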

EDIT: In fact it is simpler to use apply instead, provided all columns are numeric (apply converts the data frame to a matrix first). Something like this:

    df[, apply(df, 2, var, na.rm=TRUE) != 0, drop=FALSE]
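The variance-based checks above can misbehave on non-numeric columns or on columns that are entirely NA. As a rough sketch of a more defensive variant, the helper below counts distinct non-missing values instead of computing a variance. The name `remove_constant_cols` is hypothetical and does not come from any package, and treating an all-NA column as constant is an assumption you may want to change.

```r
# Hypothetical helper (illustrative name, not from any package):
# drops columns that have at most one distinct non-missing value.
remove_constant_cols <- function(data) {
  is_constant <- vapply(
    data,
    function(v) {
      v <- v[!is.na(v)]
      # Assumption: all-NA columns (length 0 here) are treated as constant.
      length(unique(v)) <= 1
    },
    logical(1)
  )
  # drop = FALSE keeps a data frame even if a single column survives.
  data[, !is_constant, drop = FALSE]
}

df <- data.frame(
  x = 1:5,                        # varies
  y = rep(1, 5),                  # constant numeric
  g = rep("a", 5),                # constant character
  z = c(2.5, NA, 3.1, 4.0, 1.2)   # varies, contains an NA
)

cleaned <- remove_constant_cols(df)
names(cleaned)
```

Note that this only removes constant columns. Remaining missing values, such as the NA in `z`, still need to be handled separately before calling prcomp.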
answered Sep 23 '22 17:09 by juba