I have a sparse data table that looks like this:
 data = data.table(
    var1 = c("a","",""),
    var2 = c("","","c"),
    var3 = c("a","b",""),
    var4 = c("","b","")
)
      var1 var2 var3 var4
    1:    a         a     
    2:              b    b
    3:         c     
I would like to add a column that contains a string of zeros and ones indicating which variables are present in any row, like this:
  var1 var2 var3 var4  concat
1:    a         a      1|0|1|0
2:              b    b 0|0|1|1
3:         c           0|1|0|0
I can get to this with the following command:
data[, concat := paste(
           as.integer(var1 != ""),
           as.integer(var2 != ""),
           as.integer(var3 != ""),
           as.integer(var4 != ""),
           sep = "|")]
However, if I have hundreds of variables, I would rather use some sort of computation to get to the desired expression. Perhaps something based on paste0("var",1:4), or at least a vector of column names. Any suggestions?
Go to the Data tab > Data Tools group, click the What-If Analysis button, and then click Data Table… In the Data Table dialog window, click in the Column Input cell box (because our Investment values are in a column), and select the variable cell referenced in your formula.
To pick out single or multiple columns use the select() function. The select() function expects a dataframe as it's first input ('argument', in R language), followed by the names of the columns you want to extract with a comma between each name.
By using the same cbin() function you can add multiple columns to the DataFrame in R.
Same basic approach as above:
data[ , concat := apply(.SD, 1, function(x) paste(+(x == ""), collapse = "|"))][]
#    var1 var2 var3 var4  concat
# 1:    a         a      0|1|0|1
# 2:              b    b 1|1|0|0
# 3:         c           1|0|1|1
                        Variation not requiring any grouping by each row or apply-ing on each row.
data[, concat := do.call(paste, c(lapply(.SD, function(x) (x!="")+0 ), sep="|")) ]
#   var1 var2 var3 var4  concat
#1:    a         a      1|0|1|0
#2:              b    b 0|0|1|1
#3:         c           0|1|0|0
                         data$concat <- apply(apply(data, 2, function(x) ifelse(x == "", 0, 1)), 1, function(x) paste(x, collapse="|"))
Breakdown:
1) For each column in data, check if element is empty, if so return 0, else 1
apply(data, 2, function(x) ifelse(x == "", 0, 1))
Let's call the return from (1) the variable concat. For each row of concat, paste everything together with a pipe (|) separating them. Set the new column of data to equal this. 
apply(concat, 1, function(x) paste(x, collapse="|"))
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With