
Mapping values between data frames in R

Tags:

dataframe

r

Let's create example data:

df <- data.frame(date=c("2017-01-01","2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"), X1=c("A", "B", "C", "D", "F"),
                 X2=c("B", "A", "D", "F", "C"))
df2 <- data.frame(date=c("2017-01-01","2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"), 
                  A=c("3", "4", "2", "1", "5"),
                  B=c("6", "2", "5", "1", "1"),
                  C=c("1", "4", "5", "2", "3"),
                  D=c("67", "67", "63", "61", "62"),
                  F=c("31", "33", "35", "31", "38"))

I have two data frames, and I want to look up values from df2 for each row of df, matching on date and on the letters in X1 and X2, and store them in new variables. What makes this tricky for me is that the keys in X1 and X2 correspond to column names in df2. The end result should look like this:

> result
        date X1 X2 Var1 Var2
1 2017-01-01  A  B    3    6
2 2017-01-02  B  A    2    4
3 2017-01-03  C  D    5   63
4 2017-01-04  D  F   61   31
5 2017-01-05  F  C   38    3

result <- data.frame(date=c("2017-01-01","2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"), 
                     X1=c("A", "B", "C", "D", "F"),
                     X2=c("B", "A", "D", "F", "C"),
                     Var1=c("3", "2", "5", "61", "38"),
                     Var2=c("6", "4", "63", "31", "3"))

I wanted to use mapvalues, but couldn't figure it out. My second thought was to convert df2 to long format (melt) and match from there, but that failed as well.

OK, here is my best attempt. It feels like there should be a more efficient way, especially if you have to create many (>50) new variables in the data frame.

library(reshape2)  # melt() is assumed here to come from reshape2

df2.long <- melt(df2, id.vars = c("date"))

df$Var1 <- na.omit(merge(df, df2.long, by.x = c("date", "X1"), by.y = c("date", "variable"), all.x = FALSE, all.y = TRUE))[,4]
df$Var2 <- na.omit(merge(df, df2.long, by.x = c("date", "X2"), by.y = c("date", "variable"), all.x = FALSE, all.y = TRUE))[,5]

Hakki asked Feb 16 '17 14:02

1 Answer

Using dplyr and tidyr:

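library(dplyr)
library(tidyr)

# Reshape df2 to long format: one row per (date, column name) pair
df2_m <- gather(df2, 'X1', 'var', -date)

# Look up the value for X1, then for X2, and name the new columns
left_join(df, df2_m, by = c('date', 'X1')) %>% 
    left_join(df2_m, by = c('date', 'X2' = 'X1')) %>%
    rename(Var1 = var.x, Var2 = var.y) -> result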
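
If many key columns are involved (the question mentions more than 50), chaining one join per column gets long. One possible alternative, shown here only as a sketch in base R, is matrix indexing: find the row of df2 by date and the column of df2 by name, then pull out all the values in one step. The helper name lookup_by_name and the vectors key_cols and new_cols are made up for this example, and the data is rebuilt with stringsAsFactors = FALSE so that every column is plain character.

```r
# Example data from the question, kept as character columns
df <- data.frame(date = c("2017-01-01", "2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"),
                 X1 = c("A", "B", "C", "D", "F"),
                 X2 = c("B", "A", "D", "F", "C"),
                 stringsAsFactors = FALSE)
df2 <- data.frame(date = c("2017-01-01", "2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"),
                  A = c("3", "4", "2", "1", "5"),
                  B = c("6", "2", "5", "1", "1"),
                  C = c("1", "4", "5", "2", "3"),
                  D = c("67", "67", "63", "61", "62"),
                  F = c("31", "33", "35", "31", "38"),
                  stringsAsFactors = FALSE)

# Values of df2 as a character matrix, without the date column
vals <- as.matrix(df2[, -1])

# Row of df2 that corresponds to each row of df
row_idx <- match(df$date, df2$date)

# Hypothetical helper: for a vector of column names, return the df2 value
# found at (matching date row, named column); unmatched keys give NA
lookup_by_name <- function(keys) {
  col_idx <- match(as.character(keys), colnames(vals))
  vals[cbind(row_idx, col_idx)]
}

# Apply the lookup to every key column at once
key_cols <- c("X1", "X2")
new_cols <- paste0("Var", seq_along(key_cols))
df[new_cols] <- lapply(df[key_cols], lookup_by_name)
```

To handle more key columns, only key_cols needs to be extended. The looked-up values stay character, as in the example data.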

GGamba answered Oct 07 '22 09:10