Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Lookup without looping

Tags:

r

I have two tables, and I'm trying to look up values from one to add to values in the other. Currently I am using two for loops but they run slowly. I am new to R and know I should avoid looping to speed things up but I can't figure out how.

Table1 (several thousand rows, 37 columns):

type cat1 cat2 cat3 ... cat36
   1    2    3    2         7
   3    6    2    1         9
   2    4    6    7         4
   3    5    7    8         2
   5    2    2    9         1
   4    3    1    2         3
   1    8    1    4         4
...

Table2 (36 rows, 5 columns):

      type1 type2 type3 type4 type5
cat1      2     3     4     3     8
cat2      8     5     5     2     6
cat3      7     5     1     3     5
...
cat36     4     7     2     8     9

I want to modify each value in Table1 by adding the appropriate value (matching among the 5 types and 36 categories) from Table2. Here are the desired results:

type cat1 cat2 cat3 ... cat36
   1    4   11    9        11
   3   10    7    2        11
   2    7   11   12        11
   3    9   12    9         4
   5   10    8   14        10
   4    6    3    5        11
   1   10    9   11         8
...

Here is my current (slow) code:

for (i in 1:36) {
    for (j in 1:nrow(Table1)) {
        Table1[j,i+1] = Table1[j,i+1] + Table2[i,Table1[j,1]]
    }
}
like image 360
Jimmy Purnell Avatar asked Jan 27 '12 17:01

Jimmy Purnell


2 Answers

The type column in Table1 indicates which column in Table2 to add to the rows in Table1. So use the "type" column as an index to the Table2 rows, and then transpose the resulting matrix so you can add rows to rows:

Table3 <- cbind(Table1[ , "type"], 
                t(Table2[ , Table1[ , "type"] ]) + Table1[ , -1])

(I'm assuming that Table1 and Table2 are matrices. If they are data frames you can use Table1$type instead of Table1[,"type"]).

like image 97
Tyler Avatar answered Nov 17 '22 04:11

Tyler


I would transform both tables to be "long" instead of two differing "wide" formats. After making both tables long you could perform a merge (R data.frame analog to an SQL join) on the two tables and then trivially sum the values.

Here's a similar example:

## creating some synthetic data
df1 <- data.frame(type=sample(1:4, 100, replace=TRUE), cat1=sample(1:4, 100, replace=TRUE), cat2=sample(1:4, 100, replace=TRUE),cat3=sample(1:4, 100, replace=TRUE),cat4=sample(1:4, 100, replace=TRUE) )
df2 <- data.frame(cat=1:4, type1=sample(1:4,4), type2=sample(1:4,4), type3=sample(1:4,4), type4=sample(1:4,4) )

require(reshape)

## rearrange df1
m1 <- melt(df1, id.vars="type")
m1$cat <- substr(m1$variable, 4,4)
m1$variable <- NULL

## rearrange df2
m2 <- melt(df2, id.vars="cat")
m2$type <- substr(m2$variable, 5, 5)
m2$value2 <- m2$value
m2$variable <- NULL
m2$value <- NULL

## now that they are laid out the same, they can be merged
df3 <- merge(m1, m2)
df3$newSum <- df3$value + df3$value2
like image 26
JD Long Avatar answered Nov 17 '22 06:11

JD Long