Find how many times duplicated rows repeat in R data frame [duplicate]

Tags:

r

I have a data frame like the following example:

a = c(1, 1, 1, 2, 2, 3, 4, 4)
b = c(3.5, 3.5, 2.5, 2, 2, 1, 2.2, 7)
df <- data.frame(a, b)

I can remove duplicated rows from the data frame with either of the following lines, but how can I find how many times each duplicated row is repeated? I need the result as a vector.

unique(df)

df[!duplicated(df), ]


asked Aug 13 '13 05:08

rose

2 Answers

Here is a solution using the function ddply() from the plyr package:

library(plyr)
ddply(df,.(a,b),nrow)

  a   b V1
1 1 2.5  1
2 1 3.5  2
3 2 2.0  2
4 3 1.0  1
5 4 2.2  1
6 4 7.0  1
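Since the question asks for a vector, the counts can be pulled out of the ddply() result. This is a minimal sketch, assuming plyr is installed; the names counts_df and counts are only illustrative and do not come from the answer.

```r
library(plyr)

# Example data from the question
df <- data.frame(a = c(1, 1, 1, 2, 2, 3, 4, 4),
                 b = c(3.5, 3.5, 2.5, 2, 2, 1, 2.2, 7))

# One row per unique (a, b) combination; V1 holds the number of occurrences
counts_df <- ddply(df, .(a, b), nrow)

# Plain vector of counts, in the order of the rows of counts_df
counts <- counts_df$V1
```

Note that the counts follow the sorted order of the ddply() output, not the order in which the rows first appear in df.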


answered Sep 30 '22 01:09

Didzis Elferts

You could always kill two birds with one stone and get the unique rows together with their counts:

aggregate(list(numdup=rep(1,nrow(df))), df, length)
# or even:
aggregate(numdup ~., data=transform(df,numdup=1), length)
# or even:
aggregate(cbind(df[0],numdup=1), df, length)

  a   b numdup
1 3 1.0      1
2 2 2.0      2
3 4 2.2      1
4 1 2.5      1
5 1 3.5      2
6 4 7.0      1
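To get the counts as a vector, as the question requests, the numdup column can be extracted from the aggregate() result. The sketch below uses base R only. The second part, which aligns the counts with the row order of unique(df) by building a pasted key per row, is an assumption about what may be wanted and is not part of the answer above; key, counts and counts_in_order are illustrative names.

```r
# Example data from the question
df <- data.frame(a = c(1, 1, 1, 2, 2, 3, 4, 4),
                 b = c(3.5, 3.5, 2.5, 2, 2, 1, 2.2, 7))

# Counts per unique row, in the order aggregate() returns them
res <- aggregate(list(numdup = rep(1, nrow(df))), df, length)
counts <- res$numdup

# Alternative (assumption): counts in the same order as the rows of unique(df).
# Each row is collapsed into a single string key, then the keys are tabulated.
key <- do.call(paste, c(df, sep = "\r"))
counts_in_order <- as.vector(table(key)[unique(key)])
```

The pasted-key approach relies on the separator not occurring in the data, which holds for numeric columns like these.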

answered Sep 30 '22 00:09

thelatemail
