I tried using the code presented here to find ALL duplicated elements with dplyr, like this:
library(dplyr)
mtcars %>%
mutate(cyl.dup = cyl[duplicated(cyl) | duplicated(cyl, from.last = TRUE)])
How can I convert the code presented here to find ALL duplicated elements with dplyr? My code above just throws an error. Or, even better, is there another function that achieves this more succinctly than the convoluted x[duplicated(x) | duplicated(x, fromLast = TRUE)] approach?
Use the group_by, filter, and duplicated functions to remove duplicate rows by column in R. Another way to remove duplicate rows by column value is to group the data frame by that column and then filter the rows using filter and duplicated.
Remove duplicate rows in R using the dplyr distinct() function. distinct() eliminates duplicate rows based on a single variable or on multiple variables.
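As a rough sketch of the two removal approaches described above, assuming the built-in mtcars data and its carb column (the object names first_per_carb and distinct_carb are made up for illustration):

```r
library(dplyr)

# Keep the first row for each value of carb using group_by + filter + duplicated
first_per_carb <- mtcars %>%
  group_by(carb) %>%
  filter(!duplicated(carb)) %>%
  ungroup()

# Same idea with distinct(); .keep_all = TRUE retains the other columns
distinct_carb <- mtcars %>%
  distinct(carb, .keep_all = TRUE)

nrow(first_per_carb)  # one row per distinct carb value
nrow(distinct_carb)   # should match the line above
```

Note that both of these drop duplicates, which is the opposite of what the question asks for (keeping every row that has a duplicate).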
duplicated() in R: duplicated() is a base R function that determines which elements of a vector (or rows of a data frame) are duplicates of elements with smaller subscripts, and returns a logical vector indicating which elements (rows) are duplicates.
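A small base R illustration of that behaviour may help. The vector x is invented for the example; note that the argument is spelled fromLast, not from.last:

```r
x <- c("a", "b", "a", "c", "b", "a")

# Marks only the second and later occurrences of each value
duplicated(x)
# FALSE FALSE  TRUE FALSE  TRUE  TRUE

# Scanning from the end marks every occurrence except the last one
duplicated(x, fromLast = TRUE)
# TRUE  TRUE  TRUE FALSE FALSE FALSE

# Combining both directions marks ALL elements that have a duplicate
all_dups <- duplicated(x) | duplicated(x, fromLast = TRUE)
x[all_dups]
# "a" "b" "a" "b" "a"
```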
I guess you could use filter for this purpose:
mtcars %>%
group_by(carb) %>%
filter(n()>1)
Small example (note that I added summarize() to show that the filtered data set no longer contains the 'carb' values that occur only once. I used 'carb' instead of 'cyl' because 'carb' has some values that appear in a single row, whereas every 'cyl' value is repeated):
mtcars %>% group_by(carb) %>% summarize(n=n())
#Source: local data frame [6 x 2]
#
# carb n
#1 1 7
#2 2 10
#3 3 3
#4 4 10
#5 6 1
#6 8 1
mtcars %>% group_by(carb) %>% filter(n()>1) %>% summarize(n=n())
#Source: local data frame [4 x 2]
#
# carb n
#1 1 7
#2 2 10
#3 3 3
#4 4 10
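For what it's worth, the group_by/filter(n() > 1) approach should select the same rows as the duplicated/fromLast idiom from the question when that idiom is used as a row filter. A quick sketch to check this on mtcars (by_count and by_flag are just illustrative names):

```r
library(dplyr)

# Rows whose carb value appears more than once, via group sizes
by_count <- mtcars %>%
  group_by(carb) %>%
  filter(n() > 1) %>%
  ungroup()

# Same rows, via duplicated() scanned in both directions
by_flag <- mtcars %>%
  filter(duplicated(carb) | duplicated(carb, fromLast = TRUE))

nrow(by_count)
nrow(by_flag)
identical(by_count$wt, by_flag$wt)  # same rows in the same order
```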
Another solution is to use the janitor package:
library(janitor)
mtcars %>% get_dupes(wt)
We can find duplicated elements with dplyr as follows.
library(dplyr)
# Only the repeated occurrences (the first occurrence of each value is dropped)
mtcars %>%
filter(duplicated(.[["carb"]]))
# All duplicated elements
mtcars %>%
filter(carb %in% unique(.[["carb"]][duplicated(.[["carb"]])]))
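If the goal is a flag column rather than a filtered data frame, which is roughly what the mutate() attempt in the question was after, one possible sketch is to store the logical vector itself instead of subsetting with it. The column name carb.dup is made up here, following the cyl.dup naming from the question:

```r
library(dplyr)

# mutate() needs a value per row, so keep the logical vector as-is
# instead of using it to subset (which yields a shorter vector and errors)
flagged <- mtcars %>%
  mutate(carb.dup = duplicated(carb) | duplicated(carb, fromLast = TRUE))

sum(flagged$carb.dup)   # rows whose carb value occurs more than once
sum(!flagged$carb.dup)  # rows whose carb value is unique
```

The flag can then be passed to filter(carb.dup) to get the same rows as the filtering examples above.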