
How to standardize a data frame which contains both numeric and factor variables

My data frame, my.data, contains both numeric and factor variables. I want to standardise just the numeric variables in this data frame.

> mydata2=data.frame(scale(my.data, center=T, scale=T))
Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric

Would the following work instead? I want to standardise only columns 8 to 12, but I think my code is wrong.

mydata=data.frame(scale(flowdis3[,c(8,9,10,11,12)], center=T, scale=T,))

Thanks in advance

asked Apr 18 '16 14:04 by wilga



1 Answer

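Here is one option: standardize only the numeric columns and leave the factor columns unchanged.

 # scale() returns a one-column matrix, so as.numeric() turns it back into a plain vector
 mydata[] <- lapply(mydata, function(x) {
   if (is.numeric(x)) as.numeric(scale(x, center = TRUE, scale = TRUE)) else x
 })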
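
The same idea can be applied to columns chosen by position, which is what the question asks for. The following is a minimal sketch under stated assumptions: the toy data frame `df`, its column names, and the index vector `cols` are hypothetical stand-ins (the question would use `flowdis3` and `8:12`).

```r
# Hypothetical toy data standing in for the asker's data frame
df <- data.frame(
  site  = factor(c("a", "b", "c", "d")),
  flow  = c(10, 20, 30, 40),
  depth = c(1.5, 2.5, 3.5, 4.5)
)

# Positions of the columns to standardize (8:12 in the question; 2:3 here)
cols <- 2:3

# scale() works on the numeric subset and returns a matrix;
# assigning it back into df[cols] replaces only those columns
df[cols] <- scale(df[cols], center = TRUE, scale = TRUE)

# Each standardized column should now have mean ~0 and standard deviation 1
colMeans(df[cols])
sapply(df[cols], sd)
```

Assigning into `df[cols]` keeps the factor column and the column order intact, whereas wrapping the result in a new `data.frame()` would keep only the scaled columns.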
answered Oct 03 '22 07:10 by akrun