Find duplicated rows (based on 2 columns) in Data Frame in R

Tags: dataframe, r, duplicates

I have a data frame in R which looks like:

| RIC    | Date                | Open   |
|--------|---------------------|--------|
| S1A.PA | 2011-06-30 20:00:00 | 23.7   |
| ABC.PA | 2011-07-03 20:00:00 | 24.31  |
| EFG.PA | 2011-07-04 20:00:00 | 24.495 |
| S1A.PA | 2011-07-05 20:00:00 | 24.23  |

I want to know whether there are any duplicates with respect to the combination of RIC and Date. Is there a function for that in R?

665

asked Aug 08 '11 18:08

user802231

2 Answers

You can simply pass those first two columns to the function duplicated:

duplicated(dat[,1:2])

assuming your data frame is called dat. For more information, consult the help page for the duplicated function by typing ?duplicated at the console. It describes the function as follows:

Determines which elements of a vector or data frame are duplicates of elements with smaller subscripts, and returns a logical vector indicating which elements (rows) are duplicates.

So duplicated returns a logical vector, which we can then use to extract a subset of dat:

ind <- duplicated(dat[,1:2])
dat[ind,]

or you can skip the separate assignment step and simply use:

dat[duplicated(dat[,1:2]),]
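Note that duplicated only flags the later copies, not the first occurrence of each combination. Here is a minimal sketch of how to check for duplicates and list every copy. It assumes a made-up extra row (not in the question's data) that repeats the RIC/Date pair of the first row; the names key, has_dups, later_copies and all_copies are purely illustrative.

```r
# Hypothetical sample data: row 5 repeats the RIC/Date pair of row 1
dat <- data.frame(
  RIC  = c("S1A.PA", "ABC.PA", "EFG.PA", "S1A.PA", "S1A.PA"),
  Date = c("2011-06-30 20:00:00", "2011-07-03 20:00:00",
           "2011-07-04 20:00:00", "2011-07-05 20:00:00",
           "2011-06-30 20:00:00"),
  Open = c(23.7, 24.31, 24.495, 24.23, 23.9),
  stringsAsFactors = FALSE
)

# Select the key columns by name rather than by position
key <- dat[, c("RIC", "Date")]

# TRUE if at least one RIC/Date combination occurs more than once
has_dups <- anyDuplicated(key) > 0

# duplicated() flags only the later copies (here: row 5)
later_copies <- dat[duplicated(key), ]

# A forward pass combined with a backward pass flags every copy (rows 1 and 5)
all_copies <- dat[duplicated(key) | duplicated(key, fromLast = TRUE), ]
```

Selecting the columns by name instead of 1:2 keeps the check working if the column order ever changes.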

156

answered Oct 06 '22 06:10

joran

dplyr is so much nicer for this sort of thing:

library(dplyr)
yourDataFrame %>%
    distinct(RIC, Date, .keep_all = TRUE)

(The .keep_all argument is optional. Without it, distinct() returns only the two deduplicated columns; with it, distinct() returns the whole deduplicated data frame.)
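Keep in mind that distinct() removes duplicates rather than showing them. If the goal is to see the duplicated rows themselves, one possible dplyr approach is to group by the two columns and keep the groups with more than one row. The sketch below uses a made-up data frame (the last row is a hypothetical duplicate that is not in the question's data), and the names dup_rows and deduped are illustrative.

```r
library(dplyr)

# Hypothetical sample data: row 5 repeats the RIC/Date pair of row 1
dat <- data.frame(
  RIC  = c("S1A.PA", "ABC.PA", "EFG.PA", "S1A.PA", "S1A.PA"),
  Date = c("2011-06-30 20:00:00", "2011-07-03 20:00:00",
           "2011-07-04 20:00:00", "2011-07-05 20:00:00",
           "2011-06-30 20:00:00"),
  Open = c(23.7, 24.31, 24.495, 24.23, 23.9),
  stringsAsFactors = FALSE
)

# Keep every row whose RIC/Date combination appears more than once
dup_rows <- dat %>%
  group_by(RIC, Date) %>%
  filter(n() > 1) %>%
  ungroup()

# For comparison: distinct() keeps the first row of each combination
deduped <- dat %>%
  distinct(RIC, Date, .keep_all = TRUE)
```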

answered Oct 06 '22 06:10

Guy Manova
