Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Utilizing several reference values in a lookup table

Tags:

r

data.table

Using data.table in R, I have two data tables. One is my actual dataset and the other is a lookup table.

Dataset:

dt <- data.table(Name = c("John", "Bob", "Sue", "Trish"), Ref = c("a", "A0", "BA", "c"))

**Name**    **Ref**  
John        a  
Bob         A0  
Sue         BA  
Trish       c

Lookup table:

lookup <- data.table(Ref1 = c("a", "b", "c"), Ref2 = c("A0", "AA", "BA"), Value = 1:3)

**Ref1** **Ref2** **Value**
a        A0       1
b        aa       2
c        BA       3

As you can see there are two different references for each value in the lookup table. I would like to either translate the Ref2 values to Ref1 and then do the lookup or do all lookups directly.

My approach so far has been to split dt so each split contains each style of formatting, then lookup the values and finally rbind.

dt1 <- dt[dt[, nchar(Ref) == 2,],]
dt2 <- dt[dt[, nchar(Ref) == 1,],]

setnames(lookup, c("Ref1", "Ref", "Value"))
setkey(lookup, Ref)

dt1Merged <- merge(dt1, lookup, by = "Ref")[,c(1:2, 4), with = F]

setnames(lookup, c("Ref", "Ref2", "Value"))
setkey(lookup, Ref)

dt2Merged <- merge(dt2, lookup, by = "Ref")[,c(1:2, 4), with = F]

result <- rbind(dt1Merged, dt2Merged)

This approach requires a lot of code and is not very efficient. Is there a more efficient method to do this? As this is only a small piece in a larger set of operations it would be preferable if the answer either uses data.table or is quickly converted to this format.

like image 885
Christian Ravn Avatar asked Mar 17 '23 05:03

Christian Ravn


1 Answers

Maybe you can first melt your lookup table to have all the references in one column and the corresponding values next to them, and then merge the data.tables

newlookup=melt(lookup,id.vars="Value")
setnames(newlookup, c("Value", "oldRef", "Ref"))
merge(dt,newlookup,by="Ref")
like image 55
NicE Avatar answered Apr 02 '23 07:04

NicE