I have looked pretty much everywhere and cannot find the answer to this: what is the R equivalent of Excel's VLOOKUP? VLOOKUP lets me look up a specific value in a column and return the corresponding value for each row of my data frame.
In this case I want to find the country a particular city is in (from a database) and return the name of the country in a new column.
So I have this database:
countries <- c("UK", "US", "RUS")
cities <- c("LDN", "NY", "MOSC")
db <- cbind(countries, cities)
db
countries cities
[1,] "UK" "LDN"
[2,] "US" "NY"
[3,] "RUS" "MOSC"
And I want to fill in the country for each of these cities (replacing the NA values) based on the db above:
df
countries cities
[1,] NA "LDN"
[2,] NA "NY"
[3,] NA "MOSC"
I have absolutely no idea how to go about this in R.
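You are performing a join, which in base R is done with the merge function. Join on the cities column only: df's countries column is all NA, and by default merge would use it as a second key and return no rows.
merge(db, df[, "cities", drop = FALSE], by = "cities")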
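For a lookup that fills in df directly, base R's match() is arguably the closest analogue to VLOOKUP. The sketch below is only an illustration under a few assumptions: it rebuilds db and df as data frames (the question shows them as matrices), reorders the cities, and adds a hypothetical city "PAR" that is not in db to show what happens to a non-match.

```r
# Lookup table: one row per city (rebuilt as a data frame for illustration)
db <- data.frame(
  countries = c("UK", "US", "RUS"),
  cities    = c("LDN", "NY", "MOSC"),
  stringsAsFactors = FALSE
)

# Data to fill in; "PAR" is a hypothetical city that is absent from db
df <- data.frame(
  countries = NA_character_,
  cities    = c("NY", "LDN", "PAR", "MOSC"),
  stringsAsFactors = FALSE
)

# match() gives, for each city in df, the row index of its first match in db
# (NA when there is none); indexing db$countries by that vector does the lookup
idx <- match(df$cities, db$cities)
df$countries <- db$countries[idx]

df
```

Like VLOOKUP, match() uses only the first match for each value. Unlike merge(), which sorts the result by the join key by default, this keeps the rows of df in their original order.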
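The dplyr package offers more natural verbs. Its joins work on data frames, so convert the matrices first and join on cities only:
library(dplyr)
inner_join(as.data.frame(db), as.data.frame(df[, "cities", drop = FALSE]), by = "cities")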
or, to keep every row of df even when its city has no match in db (see ?left_join for further information):
left_join(as.data.frame(df[, "cities", drop = FALSE]), as.data.frame(db), by = "cities")
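The difference between the two joins only shows up when a city is missing from the lookup table. As a rough sketch, assuming dplyr is installed and using a hypothetical data frame cities_only with an extra city "PAR" that does not appear in db:

```r
library(dplyr)

# Lookup table as a data frame (dplyr joins do not accept matrices)
db <- data.frame(
  countries = c("UK", "US", "RUS"),
  cities    = c("LDN", "NY", "MOSC"),
  stringsAsFactors = FALSE
)

# Hypothetical input: only the column to look up, including one unknown city
cities_only <- data.frame(
  cities = c("NY", "PAR", "LDN"),
  stringsAsFactors = FALSE
)

# left_join keeps every row of cities_only; unmatched cities get NA
kept_all <- left_join(cities_only, db, by = "cities")

# inner_join drops rows that have no match in db
matched_only <- inner_join(cities_only, db, by = "cities")

kept_all
matched_only
```

Putting the data to be filled on the left and the lookup table on the right mirrors how VLOOKUP is used: one result per input row, with NA playing the role of Excel's #N/A.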
Here's another approach:
library(qdapTools)
df[, 1] <- df[, 2] %l% db[, 2:1]