
In dplyr, what are the intrinsic differences between setdiff and anti_join?

Tags: r, dplyr

I'm still working through the lessons on DataCamp for R, so please forgive me if this question seems naïve.

Consider the following (very contrived) sample:

library(dplyr)
library(tibble)

type <- c("Dog", "Cat", "Cat", "Cat")
name <- c("Ella", "Arrow", "Gabby", "Eddie")
pets <- tibble(name, type)

name <- c("Ella", "Arrow", "Dog")
type <- c("Dog", "Cat", "Calvin")
favorites <- tibble(name, type)

anti_join(favorites, pets, by = "name")
setdiff(favorites, pets, by = "name")

Both of these return exactly the same data:

> anti_join(favorites, pets, by = "name")
# A tibble: 1 × 2
   name   type
  <chr>  <chr>
1   Dog Calvin

> setdiff(favorites, pets, by = "name")
# A tibble: 1 × 2
   name   type
  <chr>  <chr>
1   Dog Calvin

The documentation for each of them seems to indicate only a subtle difference: that setdiff returns rows, but anti_join does not. From my testing, this doesn't appear to be the case.

Can someone explain to me the true differences between these two, and perhaps provide a better example that illustrates the differences more clearly? (This is an area where DataCamp hasn't been particularly helpful.)

Mike Hofer asked Oct 20 '17 17:10


1 Answer

Both return a subset of the rows of their first argument, but setdiff compares whole rows, so it requires both data frames to have the same columns:

library(dplyr)

setdiff(mtcars, mtcars[1:30, ])
#>    mpg cyl disp  hp drat   wt qsec vs am gear carb
#> 1 15.0   8  301 335 3.54 3.57 14.6  0  1    5    8
#> 2 21.4   4  121 109 4.11 2.78 18.6  1  1    4    2

setdiff(mtcars, mtcars[1:30, 1:6])
#> Error in setdiff_data_frame(x, y): not compatible: Cols in x but not y: `carb`, `gear`, `am`, `vs`, `qsec`.

whereas anti_join is a join, so it matches only on the key columns the two data frames share and doesn't need the rest:

anti_join(mtcars, mtcars[1:30, 1:3])
#> Joining, by = c("mpg", "cyl", "disp")
#>    mpg cyl disp  hp drat   wt qsec vs am gear carb
#> 1 15.0   8  301 335 3.54 3.57 14.6  0  1    5    8
#> 2 21.4   4  121 109 4.11 2.78 18.6  1  1    4    2
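
On data shaped like the question's, the two can also be made to disagree. The following is a minimal sketch, not part of the original answer: the `favorites` table here is a hypothetical variation in which Ella's `type` is deliberately wrong and an invented pet, "Rex", is listed twice. It assumes a reasonably recent dplyr, where `setdiff` takes no `by` argument and returns distinct rows.

```r
library(dplyr)

pets <- tibble(
  name = c("Ella", "Arrow", "Gabby", "Eddie"),
  type = c("Dog", "Cat", "Cat", "Cat")
)

# Hypothetical variation on the question's data (an assumption for illustration):
# Ella's type does not match `pets`, and the "Rex" row is duplicated.
favorites <- tibble(
  name = c("Ella", "Arrow", "Rex", "Rex"),
  type = c("Cat", "Cat", "Dog", "Dog")
)

# anti_join() looks only at the key column(s) given in `by`.
# "Ella" and "Arrow" both occur in pets$name, so their rows are dropped;
# both "Rex" rows survive, duplicates included.
by_name <- anti_join(favorites, pets, by = "name")

# setdiff() compares entire rows.
# ("Ella", "Cat") is not a row of `pets`, so it is kept;
# the two identical "Rex" rows collapse into one.
whole_row <- setdiff(favorites, pets)

print(by_name)
print(whole_row)
```

So `anti_join` answers "which rows of x have no match in y on these keys?", while `setdiff` answers "which distinct rows of x do not appear, column for column, in y?". In the question's original sample the only unmatched name also happened to be the only unmatched row, which is why the two outputs looked identical.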
alistaire answered Sep 28 '22 05:09
