
Back and forth to dummy variables in R

So, I've been using R on and off for two years now and have been trying to get the hang of vectorization. Since I deal a lot with dummy variables from multiple-response sets in surveys, I thought this would be an interesting case to learn with.

The idea is to go from multiple responses to dummy variables (and back). For example: "Of these 8 different chocolates, which are your favorite ones (choose up to 3)?"

Sometimes we code this as dummy variables (1 if the person likes "Cote d'Or", 0 if not), with one variable per option, and sometimes as categorical (1 if the person likes "Cote d'Or", 2 if the person likes "Lindt", and so on), with 3 variables for the 3 choices.

So, basically, I can end up with a matrix whose rows look like

1,0,0,1,0,0,1,0

Or a matrix whose rows look like

1,4,7

And the idea, as mentioned, is to go from one to the other. So far I have a loop solution for each case and a vectorized solution for going from dummy to categorical. I would appreciate any further insight into this matter, and a vectorized solution for the categorical-to-dummy step.

DUMMY TO NOT DUMMY

vecOrig<-matrix(0,nrow=18,ncol=8)  # From this one
vecDest<-matrix(0,nrow=18,ncol=3)  # To this one

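# Populating the original matrix.
# I'm pretty sure this could have been done in the definition of the matrix,
# but I kept getting repeated numbers.
# How would you vectorize this?
# Note: 'vec' was not defined in the original post; three 1s and five 0s
# is an assumption that matches the 3-column destination matrix.
vec<-c(1,1,1,0,0,0,0,0)
for (i in 1:length(vecOrig[,1])){
  vecOrig[i,]<-sample(vec)
}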

# Now, how would you vectorize this following step... 
for(i in 1:length(vecOrig[,1])){            
  vecDest[i,]<-grep(1,vecOrig[i,])
}

# Vectorized solution, I had to transpose it for some reason.
vecDest2<-t(apply(vecOrig,1,function(x) grep(1,x)))   

NOT DUMMY TO DUMMY

matOrig<-matrix(0,nrow=18,ncol=3)  # From this one
matDest<-matrix(0,nrow=18,ncol=8)  # To this one.

# We populate the origin matrix. Same thing as the other case. 
for (i in 1:length(matOrig[,1])){         
  matOrig[i,]<-sample(1:8,3,FALSE)
}

# This works, but how can it be vectorized?
for(i in 1:length(matOrig[,1])){          
  for(j in matOrig[i,]){
    matDest[i,j]<-1
  }
}

# Not a clue of how to vectorize this one. 
# The 'model.matrix' solution doesn't look neat.
asked Oct 06 '22 15:10 by fioghual


1 Answer

Vectorized solutions:

Dummy to not dummy

vecDest <- t(apply(vecOrig == 1, 1, which))
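
On the transpose: apply() over rows returns each row's result as a column, so the raw result is 3 x 18 and needs t() to get one row per respondent. Here is a minimal sketch of that, assuming every row contains exactly three 1s (with a varying number, apply() would return a list instead). The template row, the seed, and the names raw and vecDestAlt are illustrative assumptions, not part of the question. It also shows one possible loop-free way to populate the test matrix, and an alternative that skips apply() entirely.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Assumed template row: three 1s and five 0s (not given in the question).
template <- c(1, 1, 1, 0, 0, 0, 0, 0)

# replicate() returns one shuffled row per column, hence the t().
vecOrig <- t(replicate(18, sample(template)))

raw <- apply(vecOrig == 1, 1, which)  # 3 x 18: one column per input row
vecDest <- t(raw)                     # 18 x 3: one row per respondent

# Alternative without apply(): which() scans the transposed matrix column by
# column, i.e. the original matrix row by row, so the indices of the 1s come
# out grouped by respondent.
idx <- which(t(vecOrig) == 1, arr.ind = TRUE)
vecDestAlt <- matrix(idx[, "row"], ncol = 3, byrow = TRUE)
```

Both vecDest and vecDestAlt list the chosen options in ascending column order, since a dummy matrix does not record the order in which the choices were made.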

Not dummy to dummy (back to the original)

nCol <- 8

vecOrig <- t(apply(vecDest, 1, replace, x = rep(0, nCol), values = 1))
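
Another option that avoids apply() altogether is to index the destination with a two-column matrix of (row, column) pairs. This is a sketch under the assumption that vecDest holds valid column numbers between 1 and nCol with no NA values. The example input, the seed, and the name dummy are made up for illustration.

```r
set.seed(2)  # arbitrary seed, only for reproducibility
nCol <- 8

# Example input (assumed): 18 respondents, 3 distinct choices each out of 8.
vecDest <- t(replicate(18, sample(nCol, 3)))

dummy <- matrix(0, nrow = nrow(vecDest), ncol = nCol)

# row(vecDest) gives the row number of every cell; pairing it with the cell
# values yields one (row, column) pair per choice, and all pairs are set at once.
dummy[cbind(c(row(vecDest)), c(vecDest))] <- 1
```

Converting dummy back with t(apply(dummy == 1, 1, which)) gives the choices sorted within each row, so the round trip matches vecDest only up to the order of the choices.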
answered Oct 10 '22 04:10 by Sven Hohenstein