I am just learning dplyr (and R) and do not understand why this fails or what the correct approach is. I am looking for a general explanation rather than something specific to this contrived dataset.
Assume I have 3 file sizes with multipliers and I'd like to combine them into a single numeric column.
require(dplyr)
m <- data.frame(
  K = 1E3,
  M = 1E6,
  G = 1E9
)
s <- data.frame(
  size = 1:3,
  mult = c('K', 'M', 'G')
)
Now I want to multiply each size by its multiplier, so I tried:
mutate(s, total = size * m[[mult]])
#Error in .subset2(x, i, exact = exact) :
# recursive indexing failed at level 2
which throws an error. I also tried:
mutate(s, total = size * as.numeric(m[mult]))
#1 1 K 1e+06
#2 2 M 2e+09
#3 3 G 3e+03
which is worse than an error (wrong answer)!
I tried a lot of other permutations but could not find the answer.
Thanks in advance!
Edit:
(or should this be another question?)
akrun's answer worked great and I thought I understood it, but if I add a row with a missing multiplier
s <- rbind(s, c(4, NA))
and then update the mutate to
mutate(s, total = size *
  ifelse(is.na(mult), 1,
    unlist(m[as.character(mult)])))
it falls apart again with an "undefined columns selected" error.
The 'mult' column is of class 'factor'. Convert it to 'character' before subsetting 'm', then 'unlist' the result and multiply it by 'size':
mutate(s, new= size*unlist(m[as.character(mult)]))
# size mult new
#1 1 K 1e+03
#2 2 M 2e+06
#3 3 G 3e+09
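For the follow-up case with a missing multiplier, one possible sketch (not part of the original answer) is to keep the multipliers in a named numeric vector instead of a data frame. Indexing a named vector with NA returns NA rather than raising "undefined columns selected", so the missing value can be filled in afterwards. The names `mult_lookup`, `s2` and `multiplier` are made up for illustration, and treating a missing multiplier as 1 simply follows the attempt in the question.

```r
library(dplyr)

# Hypothetical lookup: a named numeric vector rather than a one-row data frame
mult_lookup <- c(K = 1e3, M = 1e6, G = 1e9)

# Example data with a missing multiplier; stringsAsFactors = TRUE mimics the
# factor column discussed in this thread
s2 <- data.frame(
  size = 1:4,
  mult = c("K", "M", "G", NA),
  stringsAsFactors = TRUE
)

result <- s2 %>%
  mutate(
    # Look up by label; rows where mult is NA get NA here instead of an error
    multiplier = unname(mult_lookup[as.character(mult)]),
    # Assumption: a missing multiplier means "multiply by 1"
    total = size * coalesce(multiplier, 1)
  )
```

The lookup itself never fails on NA, which avoids the problem that ifelse() evaluates the data frame subsetting for the whole column, including the NA entry.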
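To see why your second attempt gave the wrong answer, look at how a 'factor' column behaves when it is used as an index: the subsetting uses the integer codes of its 'levels', not the labels.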
m[s$mult]
# M G K
#1 1e+06 1e+09 1000
We get the same column order by using match() between names(m) and levels(s$mult):
m[match(names(m), levels(s$mult))]
# M G K
#1 1e+06 1e+09 1000
So this is likely the reason why you got a different result.
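The effect can be checked directly by comparing the factor's integer codes with its labels. This is a small sketch; `stringsAsFactors = TRUE` is set explicitly because R 4.0.0 and later no longer convert strings to factors by default in data.frame(), and `by_factor` and `by_label` are names chosen here for illustration.

```r
m <- data.frame(K = 1e3, M = 1e6, G = 1e9)
s <- data.frame(size = 1:3, mult = c("K", "M", "G"), stringsAsFactors = TRUE)

# Levels are sorted, so they are not in the order the values appear in 'mult'
lvls  <- levels(s$mult)
# Integer codes: the position of each value of 'mult' within the levels
codes <- as.integer(s$mult)

# Indexing with the factor uses the integer codes, i.e. column positions in m
by_factor <- names(m[s$mult])
# Indexing with the labels selects the columns of m by name
by_label  <- names(m[as.character(s$mult)])
```

Here `by_factor` comes out in the shuffled order shown above, while `by_label` keeps the row order of 's', which is why converting with as.character() gives the expected totals.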