I have a data frame with thousands of tickers for different futures contracts. Each contract has an abbreviated name (the ticker, which also appears in the second data frame below) and a long name (which I want to copy into that second data frame).
full_list <- structure(
  list(
    Ticker = c("AC", "AIC", "BBS", "BO", "C", "DF"),
    Long_Name = c("Ethanol -- CBOT", "DJ UBS Commodity Index -- CBOT", "South American Soybeans -- CBOT", "Soybean Oil -- CBT", "Corn -- CBT", "Dow Jones Industrial Average -- CBT")
  ),
  .Names = c("Ticker", "Long_Name"),
  row.names = c(NA, 6L),
  class = "data.frame"
)
This data frame holds the list that I receive daily. I have to look up each abbreviated name and match it to its long name.
replace <- structure(
  list(
    Type = c("F", "F", "F", "F", "F", "F"),
    Location = c("US", "US", "US", "US", "US", "US"),
    Symbol = c("BO", "C", "DF", "AIC", "AC", "BBS"),
    Month = c("V13", "U13", "U13", "U13", "U13", "U13")
  ),
  .Names = c("Type", "Location", "Symbol", "Month"),
  row.names = c(NA, 6L),
  class = "data.frame"
)
What I want R to do is take the replace$Symbol column, find those values in the full_list$Ticker column, and add a new column, replace$Long_Name, containing the corresponding full_list$Long_Name. I hope this makes sense; I understand the column names are difficult to follow.
This would be an easy VLOOKUP in Excel, but I have an almost-completed R script that I will run on a daily basis.
merge them:
> merge(full_list, replace, by.x="Ticker", by.y="Symbol")
  Ticker                           Long_Name Type Location Month
1     AC                     Ethanol -- CBOT    F       US   U13
2    AIC      DJ UBS Commodity Index -- CBOT    F       US   U13
3    BBS     South American Soybeans -- CBOT    F       US   U13
4     BO                  Soybean Oil -- CBT    F       US   V13
5      C                         Corn -- CBT    F       US   U13
6     DF Dow Jones Industrial Average -- CBT    F       US   U13
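One caveat with merge: by default it does an inner join and returns rows sorted on the key, so any symbol missing from full_list is silently dropped and the original row order is not preserved. A minimal sketch of keeping every daily row, assuming cut-down versions of the two data frames (the name daily and the unmatched symbol "ZZ" are made up for illustration):

```r
# Hypothetical cut-down versions of the two data frames from the question.
full_list <- data.frame(
  Ticker    = c("AC", "BO", "C"),
  Long_Name = c("Ethanol -- CBOT", "Soybean Oil -- CBT", "Corn -- CBT"),
  stringsAsFactors = FALSE
)
daily <- data.frame(
  Symbol = c("BO", "ZZ", "AC"),   # "ZZ" is a made-up symbol with no match
  Month  = c("V13", "U13", "U13"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of the first data frame (a left join);
# unmatched symbols get NA in Long_Name instead of being dropped.
merged <- merge(daily, full_list, by.x = "Symbol", by.y = "Ticker", all.x = TRUE)
merged
```

Putting the daily data frame first also keeps the Symbol column name in the result, which may be what a downstream script expects.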
You could use match, which returns, for each element of its first argument, the index of the first matching element in its second argument. For example:
arg1 <- c("red","blue")
arg2 <- c("blue","red")
> match(arg1,arg2)
[1] 2 1
Then just create a new column in your replace data frame (note: you should call it something else, because replace is also the name of a function in base R) by indexing full_list$Long_Name with the matched positions.
replace$Long_Name <- full_list$Long_Name[match(replace$Symbol,full_list$Ticker)]
> replace
  Type Location Symbol Month                           Long_Name
1    F       US     BO   V13                  Soybean Oil -- CBT
2    F       US      C   U13                         Corn -- CBT
3    F       US     DF   U13 Dow Jones Industrial Average -- CBT
4    F       US    AIC   U13      DJ UBS Commodity Index -- CBOT
5    F       US     AC   U13                     Ethanol -- CBOT
6    F       US    BBS   U13     South American Soybeans -- CBOT
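Unlike the default merge, match keeps the original row order and returns NA for anything it cannot find, which makes unmatched symbols easy to spot. A small sketch under the same idea, using made-up cut-down data (the name daily and the symbol "ZZ" are hypothetical):

```r
# Hypothetical cut-down versions of the two data frames from the question.
full_list <- data.frame(
  Ticker    = c("AC", "BO", "C"),
  Long_Name = c("Ethanol -- CBOT", "Soybean Oil -- CBT", "Corn -- CBT"),
  stringsAsFactors = FALSE
)
daily <- data.frame(
  Symbol = c("BO", "ZZ", "AC"),   # "ZZ" is a made-up symbol with no match
  stringsAsFactors = FALSE
)

# Position of each daily symbol in the ticker column; NA when not found.
idx <- match(daily$Symbol, full_list$Ticker)

# Indexing with NA yields NA, so unmatched rows get a missing long name.
daily$Long_Name <- full_list$Long_Name[idx]

# Symbols that need to be added to the master list.
missing_symbols <- daily$Symbol[is.na(idx)]
```

Checking missing_symbols after each daily run is one way to catch new contracts before they propagate as NA through the rest of a script.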
If it's a big data set you may benefit from an environment lookup:
library(qdap)
replace$Long_Name <- lookup(replace$Symbol, full_list)
## > replace
##   Type Location Symbol Month                           Long_Name
## 1    F       US     BO   V13                  Soybean Oil -- CBT
## 2    F       US      C   U13                         Corn -- CBT
## 3    F       US     DF   U13 Dow Jones Industrial Average -- CBT
## 4    F       US    AIC   U13      DJ UBS Commodity Index -- CBOT
## 5    F       US     AC   U13                     Ethanol -- CBOT
## 6    F       US    BBS   U13     South American Soybeans -- CBOT
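For readers who would rather not add a package, the general idea of an environment lookup can be sketched in base R: store each ticker as a name in a hashed environment and fetch values by name. This only illustrates the technique and is not meant to reproduce how qdap implements lookup; the helper lookup_long_name and the symbol "ZZ" are made up for the example.

```r
tickers    <- c("AC", "BO", "C")
long_names <- c("Ethanol -- CBOT", "Soybean Oil -- CBT", "Corn -- CBT")

# Build a hashed environment mapping ticker -> long name.
lookup_env <- list2env(
  as.list(setNames(long_names, tickers)),
  parent = emptyenv(),
  hash = TRUE
)

# Hypothetical helper: fetch long names by symbol, NA when a symbol is absent.
lookup_long_name <- function(symbols, env) {
  found <- mget(symbols, envir = env, ifnotfound = list(NA_character_))
  unlist(found, use.names = FALSE)
}

result <- lookup_long_name(c("BO", "ZZ", "AC"), lookup_env)
result
```

Whether this beats match on a given data set is something to measure rather than assume.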
Obligatory data.table answer:
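library(data.table)
# Key each table on its own lookup column: Ticker lives in full_list, Symbol in replace.
full_list <- data.table(full_list, key='Ticker')
replace <- data.table(replace, key='Symbol')
# Look up each row of replace in full_list; the result has one row per
# replace row, with the join column named Ticker.
full_list[replace]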
FWIW, on a data set above about 1e5 rows a keyed data.table will be significantly faster than the other approaches listed (except for the qdap version, which I haven't tried).
merge timings can be found here