
Finding ALL duplicate rows, including "elements with smaller subscripts"

duplicated has a fromLast argument. The "Example" section of ?duplicated shows you how to use it. Just call duplicated twice, once with fromLast = FALSE and once with fromLast = TRUE, and take the rows where either is TRUE.


A late edit: you didn't provide a reproducible example, so here's an illustration kindly contributed by @jbaums:

vec <- c("a", "b", "c","c","c") 
vec[duplicated(vec) | duplicated(vec, fromLast=TRUE)]
## [1] "c" "c" "c"

Edit: And an example for the case of a data frame:

df <- data.frame(rbind(c("a","a"),c("b","b"),c("c","c"),c("c","c")))
df[duplicated(df) | duplicated(df, fromLast=TRUE), ]
##   X1 X2
## 3  c  c
## 4  c  c

You need to assemble the set of duplicated values, apply unique, and then test with %in%. As always, a sample problem will make this process come alive.

> vec <- c("a", "b", "c","c","c")
> vec[ duplicated(vec)]
[1] "c" "c"
> unique(vec[ duplicated(vec)])
[1] "c"
>  vec %in% unique(vec[ duplicated(vec)]) 
[1] FALSE FALSE  TRUE  TRUE  TRUE
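
Subsetting with that logical vector then returns every duplicated element, including the one with the smallest subscript:

> vec[vec %in% unique(vec[duplicated(vec)])]
[1] "c" "c" "c"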

Duplicated rows in a data frame can be obtained with dplyr:

library(tidyverse)
df <- bind_rows(iris, head(iris, 20)) # build some test data
df %>% group_by_all() %>% filter(n() > 1) %>% ungroup()

To exclude certain columns, group_by_at(vars(-var1, -var2)) could be used instead to group the data, as sketched below.
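
For example, to ignore the Species column of the iris-based test data built above (a sketch; group_by_at() still works but is superseded in dplyr >= 1.0, where across() is the equivalent):

df %>% group_by_at(vars(-Species)) %>% filter(n() > 1) %>% ungroup()
df %>% group_by(across(-Species)) %>% filter(n() > 1) %>% ungroup()  # dplyr >= 1.0 equivalent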

If you actually need the row indices and not just the data, you can add them as a column first:

df %>% rownames_to_column() %>% group_by_at(vars(-rowname)) %>% filter(n() > 1) %>% pull(rowname)

(add_rownames() is deprecated; rownames_to_column() from tibble, loaded with the tidyverse, is its replacement and uses the same default column name.)

I've had the same question, and if I'm not mistaken, this is also an answer. For a data frame df with a column col, all rows whose value in col occurs more than once are:

df[df$col %in% df$col[duplicated(df$col)], ]

I don't know which one is faster, though; the dataset I'm currently using isn't big enough for benchmarks to show meaningful differences.
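
A quick self-contained check of that pattern (df2 and its columns are invented for illustration):

df2 <- data.frame(col = c("a", "b", "c", "c", "c"), val = 1:5)
df2[df2$col %in% df2$col[duplicated(df2$col)], ]
##   col val
## 3   c   3
## 4   c   4
## 5   c   5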

