I have a list of vectors, say:
li <- list( c(1, 2, 3),
c(1, 2, 3, 4),
c(2, 3, 4),
c(5, 6, 7, 8, 9, 10, 11, 12),
numeric(0),
c(5, 6, 7, 8, 9, 10, 11, 12, 13)
)
I would like to remove every vector that is already contained in another vector (of greater or equal length), as well as all empty vectors.
In this case, I would be left with only the list:
1 2 3 4
5 6 7 8 9 10 11 12 13
Is there any useful function for achieving this?
Thanks in advance
First, sort the list by vector length. This guarantees that in the excision loop each lower-index vector is no longer than each higher-index vector, so a one-way setdiff() is all you need.
l <- list(1:3, 1:4, 2:4, 5:12, double(), 5:13)
ls <- l[order(sapply(l, length))]  # sort by length, shortest first
i <- 1
while (i <= length(ls) - 1) {
  # drop ls[[i]] if it is empty or contained in any later vector
  if (length(ls[[i]]) == 0 || any(sapply((i+1):length(ls), function(i2) length(setdiff(ls[[i]], ls[[i2]]))) == 0)) ls[[i]] <- NULL else i <- i + 1
}
ls
## [[1]]
## [1] 1 2 3 4
##
## [[2]]
## [1] 5 6 7 8 9 10 11 12 13
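To make the containment test in the loop above easier to see, here is a minimal sketch of the one-way setdiff() check on its own. The helper name is_subset is hypothetical and not part of the answer's code.

```r
# Hypothetical helper: TRUE when every value of a also occurs in b.
# setdiff(a, b) returns the distinct values of a that are missing from b,
# so an empty result means a is contained in b.
is_subset <- function(a, b) length(setdiff(a, b)) == 0

is_subset(c(2, 3, 4), c(1, 2, 3, 4))   # TRUE
is_subset(c(1, 2, 3, 4), c(2, 3, 4))   # FALSE
is_subset(numeric(0), c(1, 2, 3))      # TRUE: an empty vector is contained in anything
is_subset(c(1, 1, 1), c(1, 2))         # TRUE: setdiff() ignores repeated values
```

The last call shows that setdiff() compares values as sets. If your vectors can contain repeated values, a longer vector may still be contained in a shorter one, so ordering by the number of distinct values may be safer than ordering by length; this is an assumption about your data, not something the example list requires.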
Here's a slight alternative that replaces the any(sapply(...)) call with a second while-loop. The advantage is that the inner loop can exit early as soon as it finds a superset in the remainder of the list.
l <- list(1:3, 1:4, 2:4, 5:12, double(), 5:13)
ls <- l[order(sapply(l, length))]  # sort by length, shortest first
i <- 1
while (i <= length(ls) - 1) {
  j <- i + 1; res <- FALSE  # res: does a later vector contain ls[[i]]?
  while (j <= length(ls)) if (length(setdiff(ls[[i]], ls[[j]])) == 0) { res <- TRUE; break } else j <- j + 1
  if (length(ls[[i]]) == 0 || res) ls[[i]] <- NULL else i <- i + 1
}
ls
## [[1]]
## [1] 1 2 3 4
##
## [[2]]
## [1] 5 6 7 8 9 10 11 12 13
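If you want this as a reusable function, the same idea could be wrapped roughly as follows. This is only a sketch: the name drop_contained is hypothetical, and it differs from the loops above in that it filters out empty vectors up front and marks vectors for removal instead of deleting them during iteration. It uses lengths(), which needs R 3.2.0 or later.

```r
# Hypothetical wrapper: drop empty vectors and vectors contained in another one.
drop_contained <- function(l) {
  l <- l[lengths(l) > 0]       # remove empty vectors first
  l <- l[order(lengths(l))]    # shortest first, so supersets can only come later
  keep <- rep(TRUE, length(l))
  for (i in seq_along(l)) {
    j <- i + 1
    while (j <= length(l)) {
      if (length(setdiff(l[[i]], l[[j]])) == 0) {
        keep[i] <- FALSE       # a later vector contains l[[i]]
        break                  # early exit, as in the second loop above
      }
      j <- j + 1
    }
  }
  l[keep]
}

li <- list(c(1, 2, 3), c(1, 2, 3, 4), c(2, 3, 4),
           c(5, 6, 7, 8, 9, 10, 11, 12), numeric(0),
           c(5, 6, 7, 8, 9, 10, 11, 12, 13))
drop_contained(li)
```

For the list from the question this returns the two vectors 1 2 3 4 and 5 6 7 8 9 10 11 12 13. When two vectors are identical, the earlier one counts as contained in the later one, so a single copy survives.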