Consider the following data frame:
first_name last_name
1 Al Smith
2 Al Jones
3 Jeff Thompson
4 Scott Thompson
5 Terry Dactil
6 Pete Zah
data <- data.frame(first_name=c("Al","Al","Jeff","Scott","Terry","Pete"),
last_name=c("Smith","Jones","Thompson","Thompson","Dactil","Zah"))
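In this data frame, there are three ways that first_name is related to last_name: one to one (a first name and a last name that each appear only once), one to many (one first name paired with several last names), and many to one (several first names sharing one last name).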
I want to be able to quickly identify each of the three cases and output them to a data frame. So, the resulting data frames would be:
One to one
first_name last_name
1 Terry Dactil
2 Pete Zah
One to many
first_name last_name
1 Al Smith
2 Al Jones
Many to one
first_name last_name
1 Jeff Thompson
2 Scott Thompson
I would like to do this within the dplyr package.
In general, you can check if a value is duplicated using the duplicated function (as mentioned by @RichardScriven in a comment on your question). However, by default this function doesn't mark the first instance of an element that appears multiple times as duplicated:
duplicated(c(1, 1, 1, 2))
# [1] FALSE TRUE TRUE FALSE
Since you also want to pick up these cases, you generally would want to run duplicated on each vector twice, once forward and once backwards:
duplicated(c(1, 1, 1, 2)) | duplicated(c(1, 1, 1, 2), fromLast=TRUE)
# [1] TRUE TRUE TRUE FALSE
I find this to be a lot of typing, so I'll define a helper function that checks if an element appears more than once:
d <- function(x) duplicated(x) | duplicated(x, fromLast=TRUE)
Now the logic you want is all simple one-liners:
# One to one
data[!d(data$first_name) & !d(data$last_name),]
# first_name last_name
# 5 Terry Dactil
# 6 Pete Zah
# One to many
data[d(data$first_name) & !d(data$last_name),]
# first_name last_name
# 1 Al Smith
# 2 Al Jones
# Many to one
data[!d(data$first_name) & d(data$last_name),]
# first_name last_name
# 3 Jeff Thompson
# 4 Scott Thompson
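If you would rather label every row once instead of filtering three times, the same two logical vectors can drive a single classification column. This is only a sketch: the column name relation and the list name cases are made up for illustration, and the "many to many" label is an assumption for rows where both names repeat, a case the question does not cover.

```r
# Same example data as in the question.
data <- data.frame(
  first_name = c("Al", "Al", "Jeff", "Scott", "Terry", "Pete"),
  last_name  = c("Smith", "Jones", "Thompson", "Thompson", "Dactil", "Zah"),
  stringsAsFactors = FALSE
)

# Helper from above: TRUE for every element that appears more than once.
d <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)

dup_first <- d(data$first_name)
dup_last  <- d(data$last_name)

# `relation` is a hypothetical column name chosen for this sketch.
# Rows where both names repeat fall through to "many to many" (an assumption).
data$relation <- ifelse(!dup_first & !dup_last, "one to one",
                 ifelse(dup_first & !dup_last,  "one to many",
                 ifelse(!dup_first & dup_last,  "many to one",
                                                "many to many")))

# One data frame per case, keyed by the label.
cases <- split(data[c("first_name", "last_name")], data$relation)
```

After this, cases[["one to one"]], cases[["one to many"]] and cases[["many to one"]] hold the three data frames, with the original row names preserved.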
Note that you could also define d without the help of duplicated using the table function:
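# Index the counts by name (as.character) so that a numeric x is not treated as positions.
d <- function(x) table(x)[as.character(x)] > 1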
While this alternate definition is slightly more succinct, I also find it less readable.
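Since the question asked for dplyr and everything above is base R, here is a rough dplyr equivalent. It is a sketch rather than the one right way to do it: it assumes a dplyr version in which add_count accepts a name argument, and the column names n_first and n_last are invented for this example.

```r
library(dplyr)

# Same example data as in the question.
data <- data.frame(
  first_name = c("Al", "Al", "Jeff", "Scott", "Terry", "Pete"),
  last_name  = c("Smith", "Jones", "Thompson", "Thompson", "Dactil", "Zah"),
  stringsAsFactors = FALSE
)

# Attach to every row how often its first name and its last name occur.
# n_first and n_last are hypothetical column names.
counted <- data %>%
  add_count(first_name, name = "n_first") %>%
  add_count(last_name, name = "n_last")

# Each case is a filter on the two counts.
one_to_one <- counted %>%
  filter(n_first == 1, n_last == 1) %>%
  select(first_name, last_name)

one_to_many <- counted %>%
  filter(n_first > 1, n_last == 1) %>%
  select(first_name, last_name)

many_to_one <- counted %>%
  filter(n_first == 1, n_last > 1) %>%
  select(first_name, last_name)
```

The counts play the same role as the helper d: a count above 1 corresponds to d returning TRUE for that element.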