remove duplicate values based on 2 columns

Tags:

r

duplicates

I want to remove duplicate rows from a data frame based on matches in 2 columns: v2 & v4 must both match between rows for a row to be removed.

> df

   v1  v2  v3   v4  v5
1  7   1   A  100  98 
2  7   2   A  100  97
3  8   1   C   NA  80
4  8   1   C   78  75
5  8   1   C   78  62
6  9   3   C   75  75

For a result of

> df

   v1  v2  v3   v4  v5
1  7   1   A  100  98 
2  8   1   C   NA  80
3  8   1   C   78  75
4  9   3   C   75  75

I know I want something like:

df[!duplicated(df[v2] && df[v4]),]

but this doesn't work.

This question is specifically about data frames. For those who have a data.table, see "Filtering out duplicated/non-unique rows in data.table".

838

asked Jan 20 '16 20:01

lmcshane

1 Answer

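This will give you the desired result. duplicated() flags every row whose values in columns 1 and 4 (selected by position) repeat those of an earlier row, so negating it keeps only the first occurrence of each combination:

df[!duplicated(df[c(1, 4)]), ]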
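
As a rough illustration of the one-liner above, the sketch below rebuilds the example data frame from the question and applies the same duplicated() filter, selecting the key columns by name rather than by position. The values are copied from the printed table. The names key_cols and deduped are hypothetical, introduced only for this example.

```r
# Rebuild the example data (values copied from the table in the question).
df <- data.frame(
  v1 = c(7, 7, 8, 8, 8, 9),
  v2 = c(1, 2, 1, 1, 1, 3),
  v3 = c("A", "A", "C", "C", "C", "C"),
  v4 = c(100, 100, NA, 78, 78, 75),
  v5 = c(98, 97, 80, 75, 62, 75)
)

# Key columns by name; positions c(1, 4) correspond to v1 and v4 in this table.
key_cols <- c("v1", "v4")  # hypothetical name, for illustration only

# duplicated() marks rows whose key combination already appeared in an
# earlier row. NA is compared like any other value, so the NA row is kept.
deduped <- df[!duplicated(df[key_cols]), ]

# Optional: renumber the rows 1..n, as in the desired output.
rownames(deduped) <- NULL

print(deduped)
```

Assuming the table is read as printed, the pair v1 and v4 is what reproduces the desired output (original rows 1, 3, 4 and 6). Using c("v2", "v4") as the key would remove only row 5, because rows 1 and 2 differ in v2. Naming the columns also keeps the filter working if the column order changes.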

180

answered Oct 11 '22 01:10

Wyldsoul
