As I understand it, setdiff() compares two vectors and gives the elements that occur in one vector but do not occur in the other. If that's so, then given these vectors...
thing1 <- c(1,2,3)
thing2 <- c(2,3,4)
thing3 <- c(1,2,3)
...here are my results.
setdiff(thing1,thing2)
> [1] 1
setdiff(thing2,thing3)
> [1] 4
setdiff(thing1,thing3)
> numeric(0)
Shouldn't the comparison of thing1 and thing2 produce the same result as comparing thing2 and thing3? How can I achieve an 'outer join' sort of result (a symmetric set difference), where we see every element that appears in only one of thing1 and thing2? I'd prefer a base R solution, but would also appreciate a data.table approach. Thanks in advance.
setdiff provides an asymmetric difference. In this case, it does what it says on the tin.
Shouldn't the comparison of thing1 and thing2 produce the same result as comparing thing2 and thing3?
Well, no. But it will produce the same result as comparing thing3 and thing2. The order matters. Consider your first two examples:
The first example asks: what is in thing1 that is not in thing2?
> setdiff(thing1, thing2)
[1] 1
You could try the reverse: what is in thing2 that is not in thing1?
> setdiff(thing2, thing1)
[1] 4
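To check the point about argument order, here is a small sketch in base R. It redefines the three vectors from the question so it runs on its own; the names same_values and swapped are only illustrative.

```r
thing1 <- c(1, 2, 3)
thing2 <- c(2, 3, 4)
thing3 <- c(1, 2, 3)

# thing1 and thing3 hold the same values, so each gives the same
# difference when compared against thing2 in the same position
same_values <- identical(setdiff(thing1, thing2), setdiff(thing3, thing2))
same_values  # TRUE

# Swapping the arguments asks a different question and gives a different answer
swapped <- identical(setdiff(thing1, thing2), setdiff(thing2, thing1))
swapped  # FALSE
```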
But it looks to me like the question you're asking is:
What elements of thing1 and thing2 are not shared?
Which is the same as:
What elements are in the union of thing1 and thing2, but not in the intersection of the two?
> setdiff(union(thing1, thing2), intersect(thing1, thing2))
[1] 1 4
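If you need this more than once, the expression can be wrapped in a small helper. This is a minimal sketch in base R; sym_diff is a hypothetical name chosen here, not a function that base R provides. It also shows that combining the two one-directional differences appears to give the same result.

```r
# sym_diff is a hypothetical helper name, not part of base R
sym_diff <- function(a, b) {
  # everything in either vector, minus what both vectors share
  setdiff(union(a, b), intersect(a, b))
}

thing1 <- c(1, 2, 3)
thing2 <- c(2, 3, 4)
thing3 <- c(1, 2, 3)

res12 <- sym_diff(thing1, thing2)  # elements found in only one of the two
res13 <- sym_diff(thing1, thing3)  # empty, since both hold the same values

# An equivalent formulation: union of the two asymmetric differences
alt12 <- union(setdiff(thing1, thing2), setdiff(thing2, thing1))
identical(res12, alt12)
```

Unlike setdiff on its own, the helper returns the same set of elements whichever way round the arguments are given, though their order in the result may differ.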