Remove duplicated rows (based on 2 columns) in R

Tags: r, duplicates, data.table

I have a dataset in R which looks like this:

   x1   x2 x3
1:  A Away  2
2:  A Home  2
3:  B Away  2
4:  B Away  1
5:  B Home  2
6:  B Home  1
7:  C Away  1
8:  C Home  1

Based on the values in columns x1 and x2, I want to remove the duplicate rows. I have tried the following:

df[!duplicated(df[,c('x1', 'x2')]),]

It should remove rows 4 and 6. But unfortunately it is not working, as it returns exactly the same data, with the duplicates still present in the dataset. What do I have to use in order to remove rows 4 and 6?

514

asked Jul 28 '16 13:07

sander

1 Answer

I'd just do:

unique(df, by=c("x1", "x2")) # where df is a data.table

This'd have been quite obvious if you'd just looked at ?unique.

PS: Given the syntax in your question, I wonder whether you are aware of the basic differences between data.table and data.frame syntax. I suggest reading the data.table vignettes first.
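For completeness, here is a minimal, self-contained sketch of the above. It rebuilds the table from the question by hand, so the construction of df and the names res and res2 are my own assumptions and not the asker's actual code. It also shows the equivalent duplicated(..., by=) form. One plausible explanation for the original attempt returning every row, assuming an older data.table (before 1.9.8): df[, c('x1', 'x2')] was evaluated as an expression and returned the character vector itself, not the two columns.

```r
library(data.table)

# Rebuild the example data from the question (hand-typed, illustrative only)
df <- data.table(
  x1 = c("A", "A", "B", "B", "B", "B", "C", "C"),
  x2 = c("Away", "Home", "Away", "Away", "Home", "Home", "Away", "Home"),
  x3 = c(2, 2, 2, 1, 2, 1, 1, 1)
)

# Keep the first row of each (x1, x2) combination; later repeats are dropped
res <- unique(df, by = c("x1", "x2"))

# Same idea written with duplicated(): flag repeats by (x1, x2), then negate
res2 <- df[!duplicated(df, by = c("x1", "x2"))]

print(res)
```

Both forms keep the first occurrence of each x1/x2 pair, so rows 4 and 6 of the original table are the ones that go.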

185

answered Sep 21 '22 23:09

Arun
