
dplyr `left_join()` does not work with a character object as the LHS variable [duplicate]

I can join two datasets whose key variables have different names using dplyr::left_join(..., by = c("name1" = "name2")).

I want to join using character objects instead, as in left_join(..., by = c(nameOb1 = nameOb2)). Oddly, this works for by = c("name1" = nameOb2), but not for by = c(nameOb1 = "name2").

Why is this?

Replication of my issue below. Many thanks.

Generate data

    library(dplyr)

    orig <- tibble(name1 = c("a", "b", "c"),
                   n     = c(10, 20, 30))

    tojoin <- tibble(name2 = c("a", "b", "c"),
                     pc    = c(.4, .1, .2))

Works: using character strings for the by arguments

    left_join(orig, tojoin, by = c("name1" = "name2"))

    # A tibble: 3 x 3
      name1     n    pc
      <chr> <dbl> <dbl>
    1 a        10   0.4
    2 b        20   0.1
    3 c        30   0.2

Does not work: using object as the character string for the first by argument

    firstname <- "name1"

    left_join(orig, tojoin, by = c(firstname = "name2"))

    # Error: `by` can't contain join column `firstname` which is missing from LHS
    # Call `rlang::last_error()` to see a backtrace

Works: using object as the character string for the second by argument

    secondname <- "name2"

    left_join(orig, tojoin, by = c("name1" = secondname))

    # A tibble: 3 x 3
      name1     n    pc
      <chr> <dbl> <dbl>
    1 a        10   0.4
    2 b        20   0.1
    3 c        30   0.2

Packages:

dplyr 0.8.0.1

wfmackey asked Feb 22 '19 09:02


1 Answer

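Hi, the 'left_join' function needs a named character vector in the by argument: the names are the join columns of the left table and the values are the join columns of the right table. In your second try: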

    firstname <- "name1"
    left_join(orig, tojoin, by = c(firstname = "name2"))

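Inside c(), the left-hand side of = is taken literally as the element name and is not evaluated, so the vector is named firstname rather than name1, and the join looks for a column called firstname in orig. The right-hand side is an ordinary value, which is why a variable works there. To solve this, first build the named character vector and then pass it to the by argument of the join function: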

    firstname <- "name1"
    join_cols <- c("name2")
    names(join_cols) <- firstname

    dplyr::left_join(orig, tojoin, by = join_cols)
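
The difference between the two forms can be checked without dplyr. The base R sketch below inspects the names that c() actually produces. It also uses setNames() (from the stats package, which is attached by default) as a possible one-line alternative to assigning names() afterwards. The object names by_literal, by_rhs and by_built are made up for illustration.

```r
firstname  <- "name1"
secondname <- "name2"

# The left-hand side of `=` inside c() is used literally as the element name;
# the variable `firstname` is never looked up.
by_literal <- c(firstname = "name2")
names(by_literal)   # "firstname"

# The right-hand side is an ordinary value, so a variable there is evaluated.
by_rhs <- c("name1" = secondname)
names(by_rhs)       # "name1"
unname(by_rhs)      # "name2"

# setNames(object, nm) evaluates both arguments, so both sides can be variables.
by_built <- setNames(secondname, firstname)
names(by_built)     # "name1"
unname(by_built)    # "name2"
```

A vector built this way should be usable directly in the join, for example dplyr::left_join(orig, tojoin, by = setNames(secondname, firstname)).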

Freakazoid answered Oct 23 '22 07:10