
Cleanest way to remove common list elements across multiple lists in Python

Tags:

python

list

set

I have n lists of numbers. I want each list to contain only elements that are unique to that particular list, i.e. there are no "shared" duplicates across any of the other lists.
This is really easy to do with two lists, but a little trickier with n lists.

For example:
mylist = [
[1, 2, 3, 4],
[2, 5, 6, 7],
[4, 2, 8, 9]
]

becomes:

mylist = [
[1, 3],
[5, 6, 7],
[8, 9]
]
LittleBobbyTables asked Mar 05 '12 23:03


3 Answers

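from collections import Counter
from itertools import chain

mylist = [
    [1, 2, 3, 4],
    [2, 5, 6, 7, 7],
    [4, 2, 8, 9]
]

# Count how many sublists each value appears in. Converting each sublist
# to a set first keeps repeats within one sublist (the two 7s) from
# being counted twice.
counts = Counter(chain.from_iterable(map(set, mylist)))

# Keep only the values that occur in exactly one sublist.
result = [[i for i in sublist if counts[i] == 1] for sublist in mylist]
print(result)
# [[1, 3], [5, 6, 7, 7], [8, 9]]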
dugres answered Oct 30 '22 12:10


This runs in linear time, using two passes: one to count how many lists each value appears in, and one to filter each list in place. I'm assuming you want to preserve duplicates within a list; if not, this can be simplified a bit:

>>> import collections, itertools
>>> counts = collections.defaultdict(int)
>>> for i in itertools.chain.from_iterable(set(l) for l in mylist):
...     counts[i] += 1
... 
>>> for l in mylist:
...     l[:] = (i for i in l if counts[i] == 1)
... 
>>> mylist
[[1, 3], [5, 6, 7], [8, 9]]
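
If duplicates within a list do not need to be preserved, one possible simplification is to deduplicate each list once and then count directly. This is only a sketch: the function name unique_to_each is made up for illustration, and it relies on dicts preserving insertion order (Python 3.7+) to keep first-seen order.

```python
from collections import Counter
from itertools import chain


def unique_to_each(lists):
    """Return, for each input list, the distinct values found in no other list.

    Hypothetical helper name; not part of the original answer.
    """
    # Deduplicate each list once, keeping first-seen order.
    deduped = [list(dict.fromkeys(lst)) for lst in lists]
    # After deduplication each value contributes at most one count per list.
    counts = Counter(chain.from_iterable(deduped))
    # Keep only the values that appear in exactly one list.
    return [[x for x in lst if counts[x] == 1] for lst in deduped]


result = unique_to_each([[1, 2, 3, 4], [2, 5, 6, 7, 7], [4, 2, 8, 9]])
print(result)  # [[1, 3], [5, 6, 7], [8, 9]]
```

Unlike the in-place version above, this returns new lists and leaves the input untouched.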
senderle answered Oct 30 '22 14:10


Since you don't care about order, you can easily remove duplicates by using set subtraction and converting back to a list. Here it is in a monster one-liner:

>>> mylist = [
... [1, 2, 3, 4],
... [2, 5, 6, 7],
... [4, 2, 8, 9]
... ]
>>> mynewlist = [list(set(thislist) - set(element for sublist in mylist for element in sublist if sublist is not thislist)) for thislist in mylist]
>>> mynewlist
[[1, 3], [5, 6, 7], [8, 9]]

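Note: This is not very efficient, because the set of elements from all the other lists is rebuilt for every row, so the work grows roughly with the number of lists times the total number of elements. Whether this is a problem or not depends on your data size.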
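
If the repeated work matters, the same set-subtraction idea can be kept while computing the shared elements only once. The following is a rough sketch rather than a drop-in replacement: strip_shared is a hypothetical name, and the function returns sets, so order and within-list duplicates are lost just as in the one-liner.

```python
def strip_shared(lists):
    """Return one set per input list, minus any value found in another list.

    Hypothetical helper name; not part of the original answer.
    """
    sets = [set(lst) for lst in lists]
    seen = set()    # every value encountered so far
    shared = set()  # values that turned up in more than one list
    for s in sets:
        shared |= seen & s  # anything seen before is shared
        seen |= s
    # Subtract the single precomputed shared set from each row.
    return [s - shared for s in sets]


mynewlist = strip_shared([[1, 2, 3, 4], [2, 5, 6, 7], [4, 2, 8, 9]])
print(mynewlist)
```

Each value is handled a constant number of times, so this makes one pass to find the shared values and one pass to subtract them.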

wim answered Oct 30 '22 13:10