I have a data set that looks like this:
A B C
liver 5 RX
blood 9 DK
liver 7 DK
intestine 5 RX
blood 3 DX
blood 1 DX
skin 2 RX
skin 2 DX
I want to keep only the entries that are duplicated (exactly twice, not triplicated or more) based on A. That is, if a value in A appears exactly twice, the entire row should be printed.
The ideal output will look like:
A B C
liver 5 RX
liver 7 DK
skin 2 RX
skin 2 DX
I tried the following code with dplyr:
df %>% group_by(A) %>% filter(n() >= 1)
Could someone please help me here?
You can do:
df %>%
  group_by(A) %>%
  filter(n() == 2)
A B C
<chr> <int> <chr>
1 liver 5 RX
2 liver 7 DK
3 skin 2 RX
4 skin 2 DX
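For a reproducible check, here is a minimal sketch that rebuilds the question's data and contrasts the original attempt with the fix. The `data.frame()` construction and the names `all_rows` and `pairs` are assumptions for illustration; the question does not show how `df` was created.

```r
library(dplyr)

# Sample data reconstructed from the question (construction is assumed)
df <- data.frame(
  A = c("liver", "blood", "liver", "intestine", "blood", "blood", "skin", "skin"),
  B = c(5L, 9L, 7L, 5L, 3L, 1L, 2L, 2L),
  C = c("RX", "DK", "DK", "RX", "DX", "DX", "RX", "DX"),
  stringsAsFactors = FALSE
)

# n() >= 1 is TRUE for every non-empty group, so no rows are dropped
all_rows <- df %>%
  group_by(A) %>%
  filter(n() >= 1) %>%
  ungroup()

# n() == 2 keeps only the groups that have exactly two rows
pairs <- df %>%
  group_by(A) %>%
  filter(n() == 2) %>%
  ungroup()
```

The original attempt returns all rows because every group has at least one row by definition; testing for equality with 2 is what excludes both the singletons (intestine) and the triplicates (blood).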
Or a more verbose way to do the same:
df %>%
  add_count(A) %>%
  filter(n == 2) %>%
  select(-n)
Or:
df %>%
  group_by(A) %>%
  filter(max(row_number()) == 2)
If you instead want the values of A that have exactly two rows once fully identical rows have been dropped:
df %>%
  group_by(A) %>%
  distinct() %>%
  filter(n() == 2)
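To see how the `distinct()` variant differs from the plain `n() == 2` filter, here is a small sketch on made-up data. The data frame `df2` and the names `plain` and `deduped` are hypothetical and not part of the question; the data is chosen so that some rows are exact copies of each other.

```r
library(dplyr)

# Hypothetical data: liver has three rows (two identical), skin has two identical rows
df2 <- data.frame(
  A = c("liver", "liver", "liver", "skin", "skin"),
  B = c(5L, 5L, 7L, 2L, 2L),
  C = c("RX", "RX", "DK", "RX", "RX"),
  stringsAsFactors = FALSE
)

# Plain count: skin has 2 rows (kept), liver has 3 rows (dropped)
plain <- df2 %>%
  group_by(A) %>%
  filter(n() == 2) %>%
  ungroup()

# Drop identical rows first: liver now has 2 rows (kept), skin has 1 (dropped)
deduped <- df2 %>%
  group_by(A) %>%
  distinct() %>%
  filter(n() == 2) %>%
  ungroup()
```

Which version is appropriate depends on whether exact row copies should count toward the "duplicate" total; on the question's own data both give the same four rows.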