
Apply CASE WHEN in sqldf statement for manipulating multiple columns

I have a dataframe datwe with 37 columns. I want to convert the integer values (1, 2, 99) in columns 23 to 35 to the character values ('Yes', 'No', 'NA').

datwe$COL23 <- sqldf("SELECT CASE COL23 WHEN 1 THEN 'Yes'
                                        WHEN 2 THEN 'No'
                                        WHEN 99 THEN 'NA'
                                   ELSE 'Name ittt' 
                              END as newCol
                              FROM datwe")$newCol

I have been using the sqldf statement above to convert each column separately. Is there a smarter way to do this, perhaps with the apply functions?

If you require any reproducible data for building dataframe datwe, I will add it here. Thanks.

Edit: Example datwe

set.seed(12)
datwe <- data.frame(replicate(37, sample(c(1, 2, 99), 10, replace = TRUE)))
Prradep asked Jun 09 '15 07:06



1 Answer

I'm not sure why you need sqldf here; nested ifelse() calls in base R can recode all the columns at once. See this example:

# dummy data
set.seed(12)
datwe <- data.frame(replicate(37, sample(c(1, 2, 99), 10, replace = TRUE)))

# recode columns 23 to 37: 1 -> "Yes", 2 -> "No", 99 -> NA (a real missing value)
res <- as.data.frame(
  sapply(datwe[, 23:37], function(i)
    ifelse(i == 1, "Yes",
           ifelse(i == 2, "No",
                  ifelse(i == 99, NA, "Name ittt")))))

# update dataframe: keep columns 1 to 22 and append the recoded columns
datwe <- cbind(datwe[, 1:22], res)

# output, showing only the first 2 recoded columns
datwe[,23:24]
#     X23  X24
# 1    No  Yes
# 2   Yes  Yes
# 3   Yes   No
# 4    No   No
# 5   Yes   No
# 6   Yes  Yes
# 7  <NA>   No
# 8    No   No
# 9   Yes <NA>
#10    No <NA>
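
As a possible alternative to the nested ifelse() calls, the same recoding can be sketched with a lookup via match(), applied column by column with lapply(). This is only an illustration, not part of the original answer: recode_yes_no is a hypothetical helper name, and the "Name ittt" fallback simply mirrors the placeholder used above.

```r
# dummy data, same as above
set.seed(12)
datwe <- data.frame(replicate(37, sample(c(1, 2, 99), 10, replace = TRUE)))

# hypothetical helper (an assumption for illustration): map codes to labels
recode_yes_no <- function(x) {
  codes  <- c(1, 2, 99)
  labels <- c("Yes", "No", NA_character_)
  idx <- match(x, codes)             # position of each value in codes, NA if absent
  out <- labels[idx]
  out[is.na(idx)] <- "Name ittt"     # fallback for values not listed in codes
  out
}

# recode columns 23 to 37 in place; columns 1 to 22 are left untouched
cols <- 23:37
datwe[cols] <- lapply(datwe[cols], recode_yes_no)
```

Assigning into datwe[cols] keeps the column order and names, so no cbind() step is needed, and the recoded columns come back as plain character vectors.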

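EDIT: Using sqldf within a for loop, with the column name supplied by an external variable. The fn$ prefix (from gsubfn, which is loaded with sqldf) substitutes $myCol into the query string: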

library(sqldf)

# dummy data
set.seed(12)
datwe <- data.frame(replicate(37, sample(c(1, 2, 99), 10, replace = TRUE)))

# sqldf within a loop; here 99 becomes the string 'NA', not a missing value
for (myCol in paste0("X", 23:37)) {
  datwe[, myCol] <-
    fn$sqldf("SELECT CASE $myCol
                     WHEN 1 THEN 'Yes'
                     WHEN 2 THEN 'No'
                     WHEN 99 THEN 'NA'
                     ELSE 'Name ittt'
                     END AS newCol
              FROM datwe")$newCol
}

#check output, showing only 2 columns
datwe[,23:24]
#    X23 X24
# 1   No Yes
# 2  Yes Yes
# 3  Yes  No
# 4   No  No
# 5  Yes  No
# 6  Yes Yes
# 7   NA  No
# 8   No  No
# 9  Yes  NA
# 10  No  NA
zx8754 answered Nov 08 '22 15:11
