Advanced R discusses the idea of using character subsetting for lookup tables.
x <- c("m", "f", "u", "f", "f", "m", "m")
lookup <- c(m = "Male", f = "Female", u = NA)
lookup[x]
#>        m        f        u        f        f        m        m
#>   "Male" "Female"       NA "Female" "Female"   "Male"   "Male"
Created on 2019-03-04 by the reprex package (v0.2.1)
However, this idea does not work for numeric lookups, because names is a special attribute that is required to be a character vector.
What is the simple equivalent solution for numeric lookups that does not require a data.frame?
I want to avoid a data.frame solution, because the mapping between keys and values is based only on order, as opposed to the more transparent 3 = 'Excellent', 2 = 'Good', 1 = 'Poor'.
A solution using data.frame is suggested by the paragraph following character lookup tables.
grades <- c(1, 2, 2, 3, 1)
info <- data.frame(
  grade = 3:1,
  desc = c("Excellent", "Good", "Poor"),
  fail = c(FALSE, FALSE, TRUE)
)
info[grades, 'desc']
#> [1] Excellent Good Good Poor Excellent
#> Levels: Excellent Good Poor
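For comparison, a data.frame lookup does not have to depend on row order. The sketch below assumes base R's match() is used to find the row whose grade column equals each key; stringsAsFactors = FALSE is added here only so that the result prints as plain character strings.

```r
grades <- c(1, 2, 2, 3, 1)

info <- data.frame(
  grade = 3:1,
  desc = c("Excellent", "Good", "Poor"),
  stringsAsFactors = FALSE
)

# match() returns, for each grade, the row of info whose grade column equals it
rows <- match(grades, info$grade)
desc_by_key <- info[rows, "desc"]
desc_by_key
#> [1] "Poor"      "Good"      "Good"      "Excellent" "Poor"
```

With this form, reordering the rows of info does not change the result, because the pairing is carried by the grade column rather than by position.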
If your keys will only be positive integers, you can use the index value as suggested by Soren in their answer to this question: https://stackoverflow.com/a/54990917
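As a rough illustration of the index-based idea (a sketch written for this page, not a quotation of the linked answer), the values can be stored in an unnamed vector so that position 1 holds the value for key 1, and so on. The name lookup_pos is made up for the example.

```r
# Assumption: the keys are exactly the positive integers 1, 2, 3,
# so each key can double as a position in the vector.
grades <- c(1, 2, 2, 3, 1)
lookup_pos <- c("Poor", "Good", "Excellent")  # position 1 = "Poor", 2 = "Good", 3 = "Excellent"

result <- lookup_pos[grades]
result
#> [1] "Poor"      "Good"      "Good"      "Excellent" "Poor"
```

This breaks down for keys that are zero, negative, or non-integer, since R treats those as "select nothing", "exclude", or truncates them, respectively.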
If not, you can still use the names-based strategy you described above by storing your numbers in names(lookup) as character and then using as.character to convert a vector of numeric keys into the right form for matching:
y <- c(1, -2, 1.3, -5)
lookup_num <- c('1' = 'Cat', '-2' = 'Dog', '1.3' = 'Fish', '-5' = 'Hedgehog')
lookup_num[as.character(y)]
         1         -2        1.3         -5
     "Cat"      "Dog"     "Fish" "Hedgehog"
One possible downside of this approach is that, since the numbers are compared as strings, a name written as '0.0' won't match the key 0, nor '3.00' the key 3 (as.character(3) gives "3"), so you'd need to make sure the names are written the same way as.character formats the keys.
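To make the formatting pitfall concrete, here is a small sketch. The table lookup_raw and its "3.00" name are invented for the example. One possible fix, shown second, is to pass the names through as.numeric and back through as.character so they are formatted the same way as the keys.

```r
# Hypothetical table where one key was typed as "3.00" instead of "3"
lookup_raw <- c("3.00" = "Excellent", "2" = "Good", "1" = "Poor")
y <- c(3, 2, 1)

# as.character(3) is "3", which does not equal the name "3.00"
before <- unname(lookup_raw[as.character(y)])
before
#> [1] NA     "Good" "Poor"

# Normalise the names by round-tripping them through numeric
lookup_clean <- lookup_raw
names(lookup_clean) <- as.character(as.numeric(names(lookup_raw)))

after <- unname(lookup_clean[as.character(y)])
after
#> [1] "Excellent" "Good"      "Poor"
```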
If performance is not a huge concern, you can reverse the order of key and value, putting your numeric key as the value and the character lookup value as the name, and then use sapply to look up each key:
lookup_num <- c('Cat' = 1, 'Dog' = -2, 'Fish' = 1.3, 'Hedgehog' = -5)
keys <- c(-2, 1.3, -2, 1)
sapply(keys, function(x) which(lookup_num == x))
 Dog Fish  Dog  Cat
   2    3    2    1
This has the advantage of using numeric matching, which resists problems caused by variable numeric formatting, and gives you a lot of flexibility in how you match (for example, you could use abs(lookup_num - x) < 0.1 to add wiggle room in your numeric matching).
The downside is that it has pretty bad time complexity, since every key is compared against the whole lookup table, but if your list of keys and/or lookup table are not huge, you won't notice at all.
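The tolerance idea could be wrapped up roughly as follows. This is only a sketch: lookup_tol is a hypothetical helper name, the 0.1 tolerance is arbitrary, and returning NA for unmatched keys (and the first hit when several entries fall within tolerance) are choices made for the example. It returns the labels directly, rather than positions with the labels as names.

```r
lookup_num <- c(Cat = 1, Dog = -2, Fish = 1.3, Hedgehog = -5)
keys <- c(-2.04, 1.32, 1, 7)

# Hypothetical helper: for each key, return the name of the first table
# entry within `tol` of it, or NA if no entry is close enough.
lookup_tol <- function(keys, table, tol = 0.1) {
  vapply(keys, function(x) {
    hit <- which(abs(table - x) < tol)
    if (length(hit) == 0) NA_character_ else names(table)[hit[1]]
  }, character(1))
}

labels <- lookup_tol(keys, lookup_num)
labels
#> [1] "Dog"  "Fish" "Cat"  NA
```

Using vapply with character(1) keeps the result a plain character vector even when some keys have no match, whereas sapply would fall back to a list in that case.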
You could consider using a lookup function instead. For example, here's a simple helper function that creates a lookup function for you:
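# Returns a function that maps each element of lookup.name to the
# value stored at the position where it is found in name
create.lookup = function(name, value) {
  function(lookup.name) value[match(lookup.name, name)]
}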
An example of using this:
grades <- c(1, 2, 2, 3, 1)
lookup = create.lookup(c(3, 2, 1), c("Excellent", "Good", "Poor"))
lookup(grades)
# [1] "Poor" "Good" "Good" "Excellent" "Poor"
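To see what the returned function does internally, it may help to run the two steps by hand. The variable names below are illustrative only, and the key 4 is included to show what happens when a key is missing from the table.

```r
name  <- c(3, 2, 1)
value <- c("Excellent", "Good", "Poor")
keys  <- c(1, 2, 4)   # 4 is deliberately absent from name

# Step 1: match() gives the position of each key within name (NA if absent)
pos <- match(keys, name)
pos
#> [1]  3  2 NA

# Step 2: those positions index into value; an NA position yields NA
looked_up <- value[pos]
looked_up
#> [1] "Poor" "Good" NA
```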
It also works with negative and non-integer values:
grades <- c(2, 1.1, 2, -3, 1.1)
lookup = create.lookup(c(1.1, 2, -3), c("Excellent", "Good", "Poor"))
lookup(grades)
# [1] "Good" "Excellent" "Good" "Poor" "Excellent"
And it still works even if the numbers are written differently:
grades <- c(2.000, 1.10, 2, -3e0, 001.1)
lookup(grades)
# [1] "Good" "Excellent" "Good" "Poor" "Excellent"
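One caveat worth keeping in mind: differently written literals work because they parse to the same number, but keys that come out of arithmetic can differ from the table in the last few bits, and match() compares doubles exactly. The sketch below illustrates this and one possible workaround, rounding both sides to a fixed number of digits. The names lookup_exact and lookup_rounded and the choice of 8 digits are assumptions made for the example.

```r
keys_tbl <- c(0.3, 0.7)
vals_tbl <- c("Low", "High")

# Exact matching, as in create.lookup
lookup_exact <- function(x) vals_tbl[match(x, keys_tbl)]

computed <- 0.1 + 0.2   # not exactly equal to 0.3 in floating point
exact_result <- lookup_exact(computed)
exact_result
#> [1] NA

# Possible workaround: round keys and table to the same precision first
lookup_rounded <- function(x, digits = 8) {
  vals_tbl[match(round(x, digits), round(keys_tbl, digits))]
}

rounded_result <- lookup_rounded(computed)
rounded_result
#> [1] "Low"
```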
As an added bonus, the same method also works for character-type lookups, thus providing a single method for the various use cases:
grades <- c('p', 'g', 'g', 'e', 'p')
lookup = create.lookup(c('e', 'g', 'p'), c("Excellent", "Good", "Poor"))
lookup(grades)
# [1] "Poor" "Good" "Good" "Excellent" "Poor"