Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Subtract every column from each other column in a R data.table

Tags:

r

data.table

Let's say I have a data.table

set.seed(1) # to make the example reproducible
ex<-data.table(AAA=runif(100000),
               BBB=runif(100000),
               CCC=runif(100000),
               DDD=runif(100000),
               FLAG=c(rep(c("a","b","c","d","e"),200000)))

I want to subtract from column AAA every other column, then from BBB every remaining column (except FLAG) and so on so that the output will look like...

ex[,list(AAA_BBB=AAA-BBB,
         AAA_CCC=AAA-CCC,
         AAA_DDD=AAA-DDD,
         BBB_CCC=BBB-CCC,
         BBB_DDD=BBB-DDD,
         CCC_DDD=CCC-DDD)]

Is there a data.table syntax that can do this cleanly without knowing how many columns there are or what their names are?

like image 913
Dean MacGregor Avatar asked Jun 04 '13 14:06

Dean MacGregor


People also ask

How do you subtract columns in R?

To minus one column from another in an R matrix, we first need to read the matrix as a data frame using as. data. frame then find minus the columns using minus sign and accessing the column of the data frame.

How do you subtract data in R?

To do this, we simply need to use minus sign. For example, if we have data-frames df1 and df2 then the subtraction can be found as df1-df2.


2 Answers

Looping over the combinations within data.table:

comblist <- combn(names(ex)[-5],2,FUN=list)
res2 <- ex[,lapply(comblist,function(x) get(x[1])-get(x[2]))]

setnames(res2,names(res2),sapply(comblist,paste,collapse="_"))
like image 58
Roland Avatar answered Oct 25 '22 05:10

Roland


A solution with combn and apply:

cc <- combn(colnames(ex)[1:4], 2)
apply(cc, 2, function(x)ex[[x[1]]]-ex[[x[2]]])

gives for the first 5 rows:

             [,1]         [,2]       [,3]        [,4]        [,5]         [,6]
 [1,] -0.43500930 -0.520148152  0.1602265 -0.08513885  0.59523580  0.680374655
 [2,] -0.32964090 -0.153303302 -0.3807295  0.17633760 -0.05108855 -0.227426149
 [3,]  0.25991705 -0.079679566  0.2040904 -0.33959662 -0.05582670  0.283769917
 [4,]  0.35585252  0.153083047  0.2382553 -0.20276948 -0.11759719  0.085172292
 [5,] -0.67081018 -0.116543468 -0.3413471  0.55426671  0.32946305 -0.224803663

Edit

As Arun suggested, combn can take a function argument, so a better solution is

res <- combn(colnames(ex)[1:4], 2, function(x) ex[[x[1]]] - ex[[x[2]]])
colnames(res) <- combn(colnames(ex)[1:4], 2, paste, collapse="_")
as.data.table(res)

            AAA_BBB     AAA_CCC     AAA_DDD     BBB_CCC     BBB_DDD     CCC_DDD
      1: -0.4350093 -0.52014815  0.16022650 -0.08513885  0.59523580  0.68037465
      2: -0.3296409 -0.15330330 -0.38072945  0.17633760 -0.05108855 -0.22742615
      3:  0.2599171 -0.07967957  0.20409035 -0.33959662 -0.05582670  0.28376992
      4:  0.3558525  0.15308305  0.23825534 -0.20276948 -0.11759719  0.08517229
      5: -0.6708102 -0.11654347 -0.34134713  0.55426671  0.32946305 -0.22480366
     ---                                                                       
 999996: -0.8450458 -0.47951267 -0.30333929  0.36553310  0.54170648  0.17617338
 999997: -0.5778393 -0.01784418 -0.24353237  0.55999516  0.33430697 -0.22568819
 999998:  0.7127352  0.82554276  0.01258673  0.11280758 -0.70014846 -0.81295604
 999999: -0.6693544 -0.42335069 -0.81080852  0.24600375 -0.14145408 -0.38745783
1000000: -0.8511655 -0.23341818 -0.15830584  0.61774732  0.69285966  0.07511234
like image 38
user1981275 Avatar answered Oct 25 '22 05:10

user1981275