
R: Assign (or copy) column classes from one data frame to another

I produced a large data frame (1700+ obs, 159 variables) with a function that collects info from a website. Usually the function finds numeric values for some columns, so those columns are numeric. Sometimes, however, it finds some text and converts the whole column to text.

I have one df whose column classes are correct, and I would like to "paste" those classes to a new, incorrect df.

Say, for example:

dfCorrect <- data.frame(x = c(1, 2, 3, 4), y = as.factor(c("a", "b", "c", "d")), z = c("bar", "foo", "dat", "dot"), stringsAsFactors = FALSE)
str(dfCorrect)
'data.frame':   4 obs. of  3 variables:
 $ x: num  1 2 3 4
 $ y: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
 $ z: chr  "bar" "foo" "dat" "dot"

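## now I have my "wrong" data frame
## (stringsAsFactors = TRUE is spelled out because it is no longer the default in R >= 4.0)
dfWrong <- as.data.frame(sapply(dfCorrect, paste, sep = ""), stringsAsFactors = TRUE)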
str(dfWrong)
'data.frame':   4 obs. of  3 variables:
 $ x: Factor w/ 4 levels "1","2","3","4": 1 2 3 4
 $ y: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
 $ z: Factor w/ 4 levels "bar","dat","dot",..: 1 4 2 3

I wanted to copy the classes of each column of dfCorrect into dfWrong, but haven't found how to do it properly. I've tested:

dfWrong1<-dfWrong
dfWrong1[0,]<-dfCorrect[0,]
str(dfWrong1) ## bad result
'data.frame':   4 obs. of  3 variables:
 $ x: Factor w/ 4 levels "1","2","3","4": 1 2 3 4
 $ y: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
 $ z: Factor w/ 4 levels "bar","dat","dot",..: 1 4 2 3

dfWrong1<-dfWrong
str(dfWrong1)<-str(dfCorrect)
'data.frame':   4 obs. of  3 variables:
 $ x: num  1 2 3 4
 $ y: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
 $ z: chr  "bar" "foo" "dat" "dot"
Error in str(dfWrong1) <- str(dfCorrect) : 
  could not find function "str<-"

With this small data frame I could do it by hand, but what about larger ones? Is there a way to "copy" the classes from one df to another without having to know the individual classes (and indexes) of each column?

Expected final result (after properly "pasting" classes):

all.equal(sapply(dfCorrect,class),sapply(dfWrong,class))
[1] TRUE
PavoDive asked Dec 08 '14 15:12


1 Answer

You could try this:

dfWrong[] <- mapply(FUN = as, dfWrong, sapply(dfCorrect, class), SIMPLIFY = FALSE)

...although my first instinct is to agree with Oliver that if it were me I'd try to ensure the correct class at the point you're reading the data.
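
One caveat when a column has ended up as a factor: as.numeric() on a factor returns the internal integer codes rather than the printed labels, and the two only happen to coincide for the x column in this example. Below is a minimal sketch of a more defensive variant that goes through character first. `copy_classes` is a hypothetical helper name (not part of base R or of the answer above), and it assumes both data frames have the same column names in the same order:

```r
# Hypothetical helper: coerce each column of `target` to the (first) class
# of the matching column in `template`.
copy_classes <- function(target, template) {
  # Assumption: same columns, same order
  stopifnot(identical(names(target), names(template)))
  target[] <- Map(function(col, ref) {
    # Go through character so factors yield their labels, not integer codes
    txt <- as.character(col)
    if (is.factor(ref)) {
      # Reuse the template's levels; labels outside them become NA
      factor(txt, levels = levels(ref))
    } else {
      # Look up as.numeric, as.character, as.integer, ... by class name
      match.fun(paste0("as.", class(ref)[1]))(txt)
    }
  }, target, template)
  target
}

dfCorrect <- data.frame(x = c(1, 2, 3, 4),
                        y = factor(c("a", "b", "c", "d")),
                        z = c("bar", "foo", "dat", "dot"),
                        stringsAsFactors = FALSE)
dfWrong <- as.data.frame(sapply(dfCorrect, paste, sep = ""),
                         stringsAsFactors = TRUE)

dfFixed <- copy_classes(dfWrong, dfCorrect)
str(dfFixed)
```

Values that can't be parsed in the target class (say, stray text in a column that should be numeric) come back as NA with a warning, so it's worth checking for new NAs afterwards.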

joran answered Nov 03 '22 13:11