
How to assign within apply family?

Tags: dataframe, r, apply

I have a data.frame that contains several factors, and I want to rename the factor levels for all of them. E.g.:

mydf <- data.frame(col1 = as.factor(c("A","A",NA,NA)),col2 = as.factor(c("A",NA,NA,"A")))
mydf <- as.data.frame(lapply(mydf,addNA))

Note that the real-life example has many more than two columns. Hence I would like to use apply to assign new level names to all of these columns, just as in:

levels(mydf$col1) <- c("1","0") 

I tried the following, but it did not work:

 apply(mydf,1,function(x) levels(x) <- c("1","0"))

I am not really surprised it doesn't work but I have no better ideas right now. Should I use with maybe?

EDIT: I realized I made a mistake here by oversimplifying things. I used addNA because the NAs should no longer be handled as NAs, so I also want to relabel them. This doesn't work with Andrie's suggestion, which returns the following error message:

 labels = c("1",  : invalid labels; length 2 should be 1 or 1  

Note that I updated my example df.

Matt Bannert asked Feb 27 '12 11:02


1 Answer

You can change levels by reference using setattr() from either the bit or the data.table package. This avoids copying the whole dataset, which matters since you said you have a lot of columns.

library(data.table)   # or library(bit); either package provides setattr()
setattr(mydf[[1]], "levels", c("1", "0"))
setattr(mydf[[2]], "levels", c("1", "0"))

This can be done in a simple for loop, which is very fast. It is your responsibility to ensure that the replacement levels vector has the same length as the original; otherwise the factor may no longer be valid. You also have to replace the whole levels vector with this method. data.table has an internal way to replace particular level names by reference, but there is probably no need to go that far.
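A minimal sketch of that loop, assuming data.table is installed and every column of mydf is a factor with exactly two levels, as in the question's example. The name new_levels and the stopifnot() guard are illustrative additions, not part of the original answer:

```r
library(data.table)  # provides setattr(); bit::setattr() is the alternative named above

# Example data from the question: two factor columns with NA as an explicit level
mydf <- data.frame(col1 = as.factor(c("A", "A", NA, NA)),
                   col2 = as.factor(c("A", NA, NA, "A")))
mydf <- as.data.frame(lapply(mydf, addNA))

new_levels <- c("1", "0")  # hypothetical name; replaces c("A", NA) in that order

for (j in seq_along(mydf)) {
  # Guard: only touch factors whose level count matches the replacement
  stopifnot(is.factor(mydf[[j]]), nlevels(mydf[[j]]) == length(new_levels))
  setattr(mydf[[j]], "levels", new_levels)  # changes the column in place
}
```

The guard stops the loop before a column with a different number of levels could be turned into an invalid factor. If by-reference semantics are not needed, plain base R such as mydf[] <- lapply(mydf, function(x) { levels(x) <- new_levels; x }) should give the same levels, at the cost of copying the columns.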

Matt Dowle answered Sep 22 '22 15:09