Using data.table in R, I have two data tables. One is my actual dataset and the other is a lookup table.
Dataset:
dt <- data.table(Name = c("John", "Bob", "Sue", "Trish"), Ref = c("a", "A0", "BA", "c"))
**Name**    **Ref**  
John        a  
Bob         A0  
Sue         BA  
Trish       c
Lookup table:
lookup <- data.table(Ref1 = c("a", "b", "c"), Ref2 = c("A0", "AA", "BA"), Value = 1:3)
**Ref1** **Ref2** **Value**
a        A0       1
b        aa       2
c        BA       3
As you can see there are two different references for each value in the lookup table. I would like to either translate the Ref2 values to Ref1 and then do the lookup or do all lookups directly.
My approach so far has been to split dt so each split contains each style of formatting, then lookup the values and finally rbind.
dt1 <- dt[dt[, nchar(Ref) == 2,],]
dt2 <- dt[dt[, nchar(Ref) == 1,],]
setnames(lookup, c("Ref1", "Ref", "Value"))
setkey(lookup, Ref)
dt1Merged <- merge(dt1, lookup, by = "Ref")[,c(1:2, 4), with = F]
setnames(lookup, c("Ref", "Ref2", "Value"))
setkey(lookup, Ref)
dt2Merged <- merge(dt2, lookup, by = "Ref")[,c(1:2, 4), with = F]
result <- rbind(dt1Merged, dt2Merged)
This approach requires a lot of code and is not very efficient. Is there a more efficient method to do this? As this is only a small piece in a larger set of operations it would be preferable if the answer either uses data.table or is quickly converted to this format.
Maybe you can first melt your lookup table to have all the references in one column and the corresponding values next to them, and then merge the data.tables
newlookup=melt(lookup,id.vars="Value")
setnames(newlookup, c("Value", "oldRef", "Ref"))
merge(dt,newlookup,by="Ref")
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With