I have a data.frame called mydata and a vector ids containing the indices of the columns in the data.frame that I would like to convert to factors. The following code solves the problem:
for(i in ids) mydata[, i]<-as.factor(mydata[, i])
I wanted to clean this code up by using apply instead of an explicit for loop:
mydata[, ids]<-apply(mydata[, ids], 2, as.factor)
However, the last statement gives me a data.frame where the types are character instead of factors. I fail to see the distinction between these two lines of code. Why do they not produce the same result?
Kind regards, Michael
The result of apply is a vector, array, or list of values (see ?apply).
For your problem, you should use lapply instead:
data(iris)
iris[, 2:3] <- lapply(iris[, 2:3], as.factor)
str(iris)
'data.frame': 150 obs. of 5 variables:
$ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
$ Sepal.Width : Factor w/ 23 levels "2","2.2","2.3",..: 15 10 12 11 16 19 14 14 9 11 ...
$ Petal.Length: Factor w/ 43 levels "1","1.1","1.2",..: 5 5 4 6 5 8 5 6 5 6 ...
$ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
$ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
Notice that this is one place where lapply will be much faster than a for loop. In general a loop and lapply have similar performance, but the [<-.data.frame operation is very slow. By using lapply, one avoids calling [<-.data.frame in each iteration and replaces it with a single assignment. This is much faster.
That is because apply() works completely differently. It first converts its input to a matrix, calls as.factor on each column in a local environment, collects the results, and then tries to combine them into an array rather than a data frame. In your case this array is a matrix. A matrix cannot hold factors, so R has no other way to combine them than to coerce them to character first. That character matrix is then used to fill your data frame.
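The coercion described above can be seen directly on a small example. This is a minimal sketch; the toy data frame and its column names are invented for illustration.

```r
# Hypothetical toy data: two character columns and one integer column.
mydata <- data.frame(a = c("x", "y", "x"),
                     b = c("u", "u", "v"),
                     n = 1:3,
                     stringsAsFactors = FALSE)
ids <- 1:2

# apply() returns a matrix, so the factor results are coerced to character.
via_apply <- apply(mydata[, ids], 2, as.factor)
print(is.matrix(via_apply))
print(is.character(via_apply))

# lapply() returns a list with one factor per column; nothing is coerced.
via_lapply <- lapply(mydata[, ids], as.factor)
print(sapply(via_lapply, is.factor))
```

Assigning via_apply back into mydata[, ids] therefore stores character columns, while assigning via_lapply stores factors.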
You can use lapply for that (see Andrie's answer) or colwise from the plyr package.
require(plyr)
Df[,ids] <- colwise(as.factor)(Df[,ids])