Removing duplicate combinations (irrespective of order)

Tags:

combinations

I have a data frame of integers that is a subset of all of the n choose 3 combinations of 1...n. E.g., for n=5, it is something like:

      [,1] [,2] [,3]
 [1,]    1    2    3
 [2,]    1    2    4
 [3,]    1    2    5
 [4,]    1    3    4
 [5,]    1    3    5
 [6,]    1    4    5
 [7,]    2    1    3
 [8,]    2    1    4
 [9,]    2    1    5
[10,]    2    3    4
[11,]    2    3    5
[12,]    2    4    5
[13,]    3    1    2
[14,]    3    1    4
[15,]    3    1    5
[16,]    3    2    4
[17,]    3    2    5
[18,]    3    4    5
[19,]    4    1    2
[20,]    4    1    3
[21,]    4    1    5
[22,]    4    2    3
[23,]    4    2    5
[24,]    4    3    5
[25,]    5    1    2
[26,]    5    1    3
[27,]    5    1    4
[28,]    5    2    3
[29,]    5    2    4
[30,]    5    3    4

What I'd like to do is remove any rows with duplicate combinations, irrespective of ordering. E.g., the row 1 2 3 is the same as 2 1 3, which is the same as 3 1 2.

unique, duplicated, &c. don't seem to take this into account. Also, I am working with quite a large amount of data (n is ~750), so it ought to be a pretty fast operation. Are there any base functions or packages that can do this?


asked Jan 27 '12 02:01

seanimo

1 Answer

Sort the values within each row first, then use duplicated on the result:

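# example data (data.txt is assumed to hold the 30 rows from the question)
dat = matrix(scan('data.txt'), ncol = 3, byrow = TRUE)
# Read 90 items

# apply(dat, 1, sort) returns each sorted row as a column,
# so duplicated() has to compare columns (MARGIN = 2)
dat[ !duplicated(apply(dat, 1, sort), MARGIN = 2), ]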
#       [,1] [,2] [,3]
#  [1,]    1    2    3
#  [2,]    1    2    4
#  [3,]    1    2    5
#  [4,]    1    3    4
#  [5,]    1    3    5
#  [6,]    1    4    5
#  [7,]    2    3    4
#  [8,]    2    3    5
#  [9,]    2    4    5
# [10,]    3    4    5
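
Since the question mentions a large n, here is a minimal sketch of a vectorized variant of the same idea for the three-column case. It is not part of the original answer: it replaces the per-row sort with pmin/pmax, which may be faster on large inputs because sort is not called once per row, though that is untested here. The example data is rebuilt in code rather than read from data.txt, and the names lo, mid, hi, keep and res are illustrative only.

```r
# Rebuild the 30-row example from the question (assumption: each row is a
# first element followed by an increasing pair drawn from the remaining values)
n <- 5
dat <- do.call(rbind, lapply(seq_len(n), function(i) {
  rest <- t(combn(setdiff(seq_len(n), i), 2))
  unname(cbind(i, rest))
}))

# Sort each row without apply(): smallest, largest, and the one left over
lo  <- pmin(dat[, 1], dat[, 2], dat[, 3])
hi  <- pmax(dat[, 1], dat[, 2], dat[, 3])
mid <- rowSums(dat) - lo - hi

# duplicated() on a matrix compares rows by default (MARGIN = 1)
keep <- !duplicated(cbind(lo, mid, hi))
res  <- dat[keep, , drop = FALSE]
```

This keeps the first occurrence of each combination in its original order, as the apply-based version does. The mid trick relies on there being exactly three numeric columns; for wider rows the apply/sort version above is the more general choice.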

answered Sep 30 '22 15:09

John Colby
