Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Change values in multiple columns of a dataframe using a lookup table

I am trying to change the value of a number of columns at once using a lookup table. They all use the same lookup table. I know how to do this for just one column -- I'd just use a merge, but am having trouble with multiple columns.

Below is an example dataframe and an example lookup table. My actual data is much larger (~10K columns with 8 rows).

example <- data.frame(a = seq(1,5), b = seq(5,1), c=c(1,4,3,2,5))

lookup <- data.frame(number = seq(1,5), letter = LETTERS[seq(1,5)])

Ideally, I would end up with a dataframe which looks like this:

example_of_ideal_output <- data.frame(a = LETTERS[seq(1,5)], b = LETTERS[seq(5,1)], c=LETTERS[c(1,4,3,2,5)])

Of course, in my actual data the dataframe is numbers, but the lookup table is a lot more complicated, so I can't just use a function like LETTERS to solve things.

Thank you in advance!

like image 524
verybadatthis Avatar asked Jul 26 '15 03:07

verybadatthis


2 Answers

Here's a solution that works on each column successively using lapply():

as.data.frame(lapply(example,function(col) lookup$letter[match(col,lookup$number)]));
##   a b c
## 1 A E A
## 2 B D D
## 3 C C C
## 4 D B B
## 5 E A E

Alternatively, if you don't mind switching over to a matrix, you can achieve a "more vectorized" solution, as a matrix will allow you to call match() and index lookup$letter just once for the entire input:

matrix(lookup$letter[match(as.matrix(example),lookup$number)],nrow(example));
##      [,1] [,2] [,3]
## [1,] "A"  "E"  "A"
## [2,] "B"  "D"  "D"
## [3,] "C"  "C"  "C"
## [4,] "D"  "B"  "B"
## [5,] "E"  "A"  "E"

(And of course you can coerce back to data.frame via as.data.frame() afterward, although you'll have to restore the column names as well if you want them, which can be done with setNames(...,names(example)). But if you really want to stick with a data.frame, my first solution is probably preferable.)

like image 143
bgoldst Avatar answered Oct 14 '22 07:10

bgoldst


Using dplyr

f <- function(x)setNames(lookup$letter, lookup$number)[x] 
library(dplyr)
example %>% 
  mutate_each(funs(f))
#  a b c
#1 A E A
#2 B D D
#3 C C C
#4 D B B
#5 E A E

Or with data.table

library(data.table)
setDT(example)[, lapply(.SD, f), ]
#   a b c
#1: A E A
#2: B D D
#3: C C C
#4: D B B
#5: E A E
like image 44
ExperimenteR Avatar answered Oct 14 '22 07:10

ExperimenteR