In the following example
x <- data.frame(code = 7:9, food = c('banana', 'apple', 'popcorn'))
y <- data.frame(food = c('banana', 'apple', 'popcorn'),
isfruit = c('fruit', 'fruit', 'not fruit'))
I would like to do x <- merge(x, y), but the problem is that merge() reorders the columns so that the by column (food) comes first. How can I prevent this and have merge(x, y) keep the column order of x and just insert the new variable (isfruit) as the third column (i.e., "code, food, isfruit" instead of "food, code, isfruit")?
I've tried this, to no avail:
merge(x, y, sort = FALSE)
My workaround is to do this afterward
x <- x[c(2, 1, 3)]
Here's a generic version of your base workaround:
merge(x, y)[, union(names(x), names(y))]
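As a quick sanity check of that one-liner, here is a minimal sketch on the data from the question. The variable names cols and merged are only illustrative, and stringsAsFactors = FALSE is set so the result is the same across R versions.

```r
x <- data.frame(code = 7:9, food = c("banana", "apple", "popcorn"),
                stringsAsFactors = FALSE)
y <- data.frame(food = c("banana", "apple", "popcorn"),
                isfruit = c("fruit", "fruit", "not fruit"),
                stringsAsFactors = FALSE)

# union() keeps x's names first, then any names from y that x lacks
cols <- union(names(x), names(y))

# merge() still puts the by column first; indexing by cols restores x's order
merged <- merge(x, y)[, cols]
names(merged)  # "code" "food" "isfruit"
```

Note that this only fixes the column order. The rows are still sorted by the by column unless you pass sort = FALSE, and the approach assumes the key has the same name in both data frames.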
plyr makes this easy:
x <- data.frame(code = 7:9, food = c('banana', 'apple', 'popcorn'))
y <- data.frame(food = c('banana', 'apple', 'popcorn'),
isfruit = c('fruit', 'fruit', 'not fruit'))
library(plyr)
join(x,y)
#GOOD
#Joining by: food
# code food isfruit
#1 7 banana fruit
#2 8 apple fruit
#3 9 popcorn not fruit
#BAD
# merge(x,y)
# food code isfruit
#1 apple 8 fruit
#2 banana 7 fruit
#3 popcorn 9 not fruit
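You can wrap it in a custom function. For example:
# x and y are named arguments so the default for ord can refer to them;
# anything else (by, all.x, sort, ...) is passed through to merge()
merge.keep <- function(x, y, ..., ord = union(names(x), names(y))) merge(x, y, ...)[ord]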
then for example:
merge.keep(x,y)
code food isfruit
1 8 apple fruit
2 7 banana fruit
3 9 popcorn not fruit
EDIT: I used @Eddi's idea to set the default value of ord.
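If you also want to pass arguments such as all.x or by.x/by.y through to merge(), a slightly more defensive variant may help. This is only a sketch: merge_keep_order is a hypothetical name, and it reorders based on the names merge() actually returned, so it does not fail when the key columns are named differently or when suffixes are applied.

```r
# Hypothetical helper: merge, then put x's columns back in front
merge_keep_order <- function(x, y, ...) {
  merged <- merge(x, y, ...)
  # columns of x that survived the merge, in x's order
  from_x <- intersect(names(x), names(merged))
  # everything else merge() produced (new columns from y, suffixed columns)
  rest <- setdiff(names(merged), names(x))
  merged[c(from_x, rest)]
}

x <- data.frame(code = 7:10, food = c("banana", "apple", "popcorn", "kiwi"),
                stringsAsFactors = FALSE)
y <- data.frame(food = c("banana", "apple", "popcorn"),
                isfruit = c("fruit", "fruit", "not fruit"),
                stringsAsFactors = FALSE)

# all.x = TRUE keeps the unmatched "kiwi" row, with NA for isfruit
res <- merge_keep_order(x, y, all.x = TRUE)
names(res)  # "code" "food" "isfruit"
```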
If you only bring in one column and want to append it last, then merge is probably overkill and you can just do an assignment with a match-plus-[ indexing approach:
> x$isfruit <- y$isfruit[match(x$food, y$food)]
> x
code food isfruit
1 7 banana fruit
2 8 apple fruit
3 9 popcorn not fruit
(There are no switches to throw in the merge function to do what you ask.)
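The argument order in match() matters: match(x$food, y$food) returns, for each row of x, the position of its food in y. A small sketch with x in a different row order than y and one food that y does not contain (the extra "kiwi" row is made up for illustration) shows that the lookup behaves like a left join:

```r
x <- data.frame(code = 7:10, food = c("popcorn", "banana", "apple", "kiwi"),
                stringsAsFactors = FALSE)
y <- data.frame(food = c("banana", "apple", "popcorn"),
                isfruit = c("fruit", "fruit", "not fruit"),
                stringsAsFactors = FALSE)

# position in y of each food in x; NA where y has no such food
idx <- match(x$food, y$food)  # 3 1 2 NA

# look up isfruit by position; unmatched rows get NA
x$isfruit <- y$isfruit[idx]
x$isfruit  # "not fruit" "fruit" "fruit" NA
```

Both the row order and the column order of x are left untouched, and the new column lands last.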