Looking up values without loop in R

Tags:

r

I need to look up values in one data frame based on multiple criteria taken from another data frame. Example:

A=
Country Year Number
USA     1994 455
Canada  1997 342
Canada  1998 987

A should get an added column named "Rate", with values taken from

B=
Year   USA   Canada
1993   21    654
1994   41    321
1995   56    789
1996   85    123
1997   65    456
1998   1     999

So that the final data frame is

C=
Country Year Number  Rate
USA     1994 455     41
Canada  1997 342     456
Canada  1998 987     999

In other words: look up each Year and Country pair from A in B; the result is C. I would like to do this without a loop, and I would like a general approach that also works when the lookup is based on more than two criteria.
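
For reproducibility, the two example tables could be built roughly as follows. This is only a sketch: the column types (integer values, Country stored as character rather than factor) are an assumption, since the tables above do not show them.

```r
# Example data rebuilt from the tables above (types are assumed)
A <- data.frame(Country = c("USA", "Canada", "Canada"),
                Year    = c(1994L, 1997L, 1998L),
                Number  = c(455L, 342L, 987L),
                stringsAsFactors = FALSE)

B <- data.frame(Year   = 1993:1998,
                USA    = c(21L, 41L, 56L, 85L, 65L, 1L),
                Canada = c(654L, 321L, 789L, 123L, 456L, 999L))
```

Every Country value in A is also a column name in B, and every Year in A appears in B, so each row of A has exactly one matching cell in B.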

asked Jun 25 '26 06:06 by tkoz_dk


2 Answers

Here's another way using data.table that doesn't require converting the 2nd data table to long form:

require(data.table) # 1.9.6+
A[B, Rate := get(Country), by=.EACHI, on="Year"]
#    Country Year Number Rate
# 1:     USA 1994    455   41
# 2:  Canada 1997    342  456
# 3:  Canada 1998    987  999

where A and B are data.tables, and Country is of character type (get() expects the column name as a string, so a factor Country would need converting first).

answered Jun 28 '26 05:06 by Arun


We can melt the second dataset from 'wide' to 'long' format and then merge it with the first dataset to get the expected output.

library(reshape2)
res <- merge(A, melt(B, id.vars='Year'),
        by.x=c('Country', 'Year'), by.y=c('variable', 'Year'))
names(res)[4] <- 'Rate'
res
#   Country Year Number Rate
#1  Canada 1997    342   456
#2  Canada 1998    987   999
#3     USA 1994    455    41

Or we can use gather from tidyr and right_join from dplyr to get this done.

library(dplyr)
library(tidyr)
gather(B, Country, Rate, -Year) %>%
  right_join(A)
#  Year Country Rate Number
#1 1994     USA   41    455
#2 1997  Canada  456    342
#3 1998  Canada  999    987
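
In tidyr 1.0.0 and later, gather is superseded by pivot_longer. Below is a minimal sketch of the same reshape-then-join idea with the newer function. It assumes A and B are rebuilt from the tables in the question (the column types are a guess), and it joins from A's side with left_join, which is a swap from right_join above so that A's row and column order is kept.

```r
library(dplyr)
library(tidyr)

# Example data rebuilt from the question (column types are an assumption)
A <- data.frame(Country = c("USA", "Canada", "Canada"),
                Year    = c(1994L, 1997L, 1998L),
                Number  = c(455L, 342L, 987L),
                stringsAsFactors = FALSE)
B <- data.frame(Year   = 1993:1998,
                USA    = c(21L, 41L, 56L, 85L, 65L, 1L),
                Canada = c(654L, 321L, 789L, 123L, 456L, 999L))

# Reshape B to long form: one row per Year/Country pair
B_long <- pivot_longer(B, cols = -Year,
                       names_to = "Country", values_to = "Rate")

# Look up each Country/Year pair of A in the long table
res <- left_join(A, B_long, by = c("Country", "Year"))
res
```

For a lookup on more than two criteria, the extra key columns would be added to by, provided both tables contain them.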

Or, as @DavidArenburg mentioned in the comments, this can also be done with data.table. We convert the 'data.frame' to 'data.table' (setDT(A)), melt the second dataset, and join on 'Country' and 'Year'.

library(data.table) # v1.9.6+
# melt B to long form with explicit column names, then join it to A
setDT(A)[melt(setDT(B), id.vars = "Year", variable.name = "Country", value.name = "Rate"),
         on = c("Country", "Year"),
         nomatch = 0L]

#    Country Year Number Rate
# 1:     USA 1994    455   41
# 2:  Canada 1997    342  456
# 3:  Canada 1998    987  999

Or a shorter version (if we are not too picky about the column names):

setDT(A)[melt(B, 1L), on = c(Country = "variable", Year = "Year"), nomatch = 0L]
answered Jun 28 '26 05:06 by akrun


