Remove all unique rows

I am trying to figure out how to remove all unique rows from a data frame while keeping every row that has a duplicate. For example, I want all rows from this data frame whose col1 value occurs more than once:

df <- data.frame(col1 = c(rep("a", 3), "b", "c", rep("d", 3)),
                 col2 = c("A", "B", "C", rep("A", 3), "B", "C"),
                 col3 = c(3, 3, 1, 4, 4, 3, 2, 1))
df
  col1 col2 col3
1    a    A    3
2    a    B    3
3    a    C    1
4    b    A    4
5    c    A    4
6    d    A    3
7    d    B    2
8    d    C    1

subset(df,duplicated(col1))
  col1 col2 col3
2    a    B    3
3    a    C    1
7    d    B    2
8    d    C    1

But I want rows 1, 2, 3, 6, 7 and 8, since each of them shares its col1 value with at least one other row. How do I get rows 1 and 6 included? Or, conversely, how do I remove the rows that have no duplicate?

246

asked Feb 21 '14 22:02

user1775563

2 Answers

Another option:

subset(df,duplicated(col1) | duplicated(col1, fromLast=TRUE))
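To see why this works, here is a minimal sketch using the question's data. duplicated() flags every occurrence of a value except the first; with fromLast=TRUE the scan runs from the end, so it flags every occurrence except the last. Combining the two with | therefore flags every row whose col1 value appears more than once. The names `fwd`, `bwd` and `res` are only illustrative and not part of the answer above.

```r
df <- data.frame(col1 = c(rep("a", 3), "b", "c", rep("d", 3)),
                 col2 = c("A", "B", "C", rep("A", 3), "B", "C"),
                 col3 = c(3, 3, 1, 4, 4, 3, 2, 1))

# TRUE for every occurrence of a value except its first one
fwd <- duplicated(df$col1)

# TRUE for every occurrence except the last one (scan runs from the end)
bwd <- duplicated(df$col1, fromLast = TRUE)

# Keep a row if either scan flags it
res <- df[fwd | bwd, ]
rownames(res)  # "1" "2" "3" "6" "7" "8"
```

Values that occur only once ("b" and "c" here) are flagged by neither scan, so they drop out.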

122

answered Sep 28 '22 02:09

Matthew Plourde

Try counting how often each col1 value occurs, then keeping the rows whose value occurs more than once:

tdf <- table(df$col1)
tdf
a b c d 
3 1 1 3 

df[df$col1 %in% names(tdf)[tdf > 1], ]
  col1 col2 col3
1    a    A    3
2    a    B    3
3    a    C    1
6    d    A    3
7    d    B    2
8    d    C    1
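A related variant, offered only as a sketch and not taken from the answer above: base R's ave() can compute the size of each col1 group per row, which avoids the separate lookup against names(tdf). The variable names `n` and `res` are hypothetical.

```r
df <- data.frame(col1 = c(rep("a", 3), "b", "c", rep("d", 3)),
                 col2 = c("A", "B", "C", rep("A", 3), "B", "C"),
                 col3 = c(3, 3, 1, 4, 4, 3, 2, 1))

# For each row, count how many rows share its col1 value
n <- ave(seq_along(df$col1), df$col1, FUN = length)

# Keep only rows whose col1 value occurs more than once
res <- df[n > 1, ]
rownames(res)  # "1" "2" "3" "6" "7" "8"
```

Because `n` holds the actual group size, the same pattern should extend to other thresholds, for example `n >= 3` to keep only values that occur at least three times.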

answered Sep 28 '22 00:09

harkmug

Tags: dataframe, r, duplicates
