I'm trying to filter a data.frame with family information. It looks like this:
+--------+-------+---------+
| name   | dad   | mom     |
+--------+-------+---------+
| john   | bert  | ernie   |
| quincy | adam  | eve     |
| anna   | david | goliath |
| daniel | bert  | ernie   |
| sandra | adam  | linda   |
+--------+-------+---------+
Now I want to know whether everyone who has the same dad also has the same mom. I've been at this for an hour, trying different approaches, but I keep getting stuck. I'd also like an idiomatic R approach rather than a long sequence of functions or for-loops that technically does what I want without teaching me anything new.
My expected output:
+--------+------+-------+
| name   | dad  | mom   |
+--------+------+-------+
| quincy | adam | eve   |
| sandra | adam | linda |
+--------+------+-------+
Essentially I want to have a data.frame with dads and moms who have kids from multiple partners.
My code so far:
fraternals <- split(kinship, kinship$father)
fraternals <- fraternals[-which(lapply(fraternals, function(x) if(nrow(x) == 1) { output TRUE }))]
But that doesn't run, because R says I cannot use TRUE in that way.
One dplyr possibility could be:
library(dplyr)

# Keep rows whose dad is paired with more than one distinct mom
df %>%
  group_by(dad) %>%
  filter(n_distinct(mom) != 1)
  name   dad   mom
  <chr>  <chr> <chr>
1 quincy adam  eve
2 sandra adam  linda
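To make this reproducible, here is a minimal sketch that rebuilds the example table from the question and applies the same filter. The object names `df` and `multi_partner` are assumptions chosen for illustration, and the trailing `ungroup()` is an optional addition so the result is no longer grouped by `dad`.

```r
library(dplyr)

# Rebuild the example data from the question (the name `df` is an assumption)
df <- data.frame(
  name = c("john", "quincy", "anna", "daniel", "sandra"),
  dad  = c("bert", "adam", "david", "bert", "adam"),
  mom  = c("ernie", "eve", "goliath", "ernie", "linda"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose dad appears with more than one distinct mom
multi_partner <- df %>%
  group_by(dad) %>%
  filter(n_distinct(mom) > 1) %>%
  ungroup()

print(multi_partner)
```

Swapping the roles of the two columns, `group_by(mom)` with `n_distinct(dad)`, should give the moms who have kids with more than one dad.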
If you'd rather keep every row and flag the affected ones instead of filtering:
# Add a logical column that is TRUE when the dad has more than one distinct mom
df %>%
  group_by(dad) %>%
  mutate(cond = n_distinct(mom) != 1)
  name   dad   mom     cond
  <chr>  <chr> <chr>   <lgl>
1 john   bert  ernie   FALSE
2 quincy adam  eve     TRUE
3 anna   david goliath FALSE
4 daniel bert  ernie   FALSE
5 sandra adam  linda   TRUE
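For comparison, the split-based attempt from the question can probably be repaired in base R. This is only a sketch under assumptions: the data frame is called `df` and the column is `dad` (the question's code uses `kinship` and `father`), and the names `by_dad`, `keep` and `result` are made up for illustration. The original attempt fails because `output TRUE` is not valid R; a function simply returns the value of its last expression.

```r
# Example data from the question (the name `df` is an assumption)
df <- data.frame(
  name = c("john", "quincy", "anna", "daniel", "sandra"),
  dad  = c("bert", "adam", "david", "bert", "adam"),
  mom  = c("ernie", "eve", "goliath", "ernie", "linda"),
  stringsAsFactors = FALSE
)

# Split into one data.frame per dad
by_dad <- split(df, df$dad)

# One logical per group: TRUE when that dad has more than one distinct mom
keep <- vapply(by_dad, function(x) length(unique(x$mom)) > 1, logical(1))

# Bind the surviving groups back into a single data.frame
result <- do.call(rbind, by_dad[keep])
rownames(result) <- NULL

print(result)
```

`vapply()` is used instead of `lapply()` so that `keep` is a plain logical vector, which can index the list directly without `which()`.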