Ok so I have two lists:
x = [1, 2, 3, 4]
y = [1, 1, 2, 5, 6]
I compare them in such a way so I get the following output:
x = [3, 4]
y = [1, 5, 6]
The basic is idea to go through each list and compare them. If they have an element in common remove that element. But only one of that element not all of them. If they don't have an element in common leave it. Two identical lists would become x = [], y = []
Here is my rather hacked up and pretty lame solution. I hoping other can recommend a better and / or more pythonic way of doing this. 3 loops seems excessive...
done = True
while not done:
done = False
for x in xlist:
for y in ylist:
if x == y:
xlist.remove(x)
ylist.remove(y)
done = False
print xlist, ylist
Thanks as always for taking the time to read this question. XOXO
The difference between two lists (say list1 and list2) can be found using the following simple function. By Using the above function, the difference can be found using diff(temp2, temp1) or diff(temp1, temp2) . Both will give the result ['Four', 'Three'] .
Method 6: Use symmetric_difference to Find the Difference Between Two Lists in Python. The elements that are either in the first set or the second set are returned using the symmetric_difference() technique. The intersection, unlike the shared items of the two sets, is not returned by this technique.
It's possible that the data structure you are looking for is the multiset (or "bag"), and if so, a good way to implement it in Python is to use collections.Counter
:
>>> from collections import Counter
>>> x = Counter([1, 2, 3, 4])
>>> y = Counter([1, 1, 2, 5, 6])
>>> x - y
Counter({3: 1, 4: 1})
>>> y - x
Counter({1: 1, 5: 1, 6: 1})
If you want to convert the Counter
objects back to lists with multiplicity, you can use the elements
method:
>>> list((x - y).elements())
[3, 4]
>>> list((y - x).elements())
[1, 5, 6]
If you don't care about order, use collections.Counter
to do it in one line:
>>> Counter(x)-Counter(y)
Counter({3: 1, 4: 1})
>>> Counter(y)-Counter(x)
Counter({1: 1, 5: 1, 6: 1})
If you care about order, you can probably iterate through your lists grabbing elements from the above dictionaries:
def prune(seq, toPrune):
"""Prunes elements from front of seq in O(N) time"""
remainder = Counter(seq)-Counter(toPrune)
R = []
for x in reversed(seq):
if remainder.get(x):
remainder[x] -= 1
R.insert(0,x)
return R
Demo:
>>> prune(x,y)
[3, 4]
>>> prune(y,x)
[1, 1, 5, 6]
To build on Gareth's answer:
>>> a = Counter([1, 2, 3, 4])
>>> b = Counter([1, 1, 2, 5, 6])
>>> (a - b).elements()
[3, 4]
>>> (b - a).elements()
[1, 5, 6]
Benchmark code:
from collections import Counter
from collections import defaultdict
import random
# short lists
#a = [1, 2, 3, 4, 7, 8, 9]
#b = [1, 1, 2, 5, 6, 8, 8, 10]
# long lists
a = []
b = []
for i in range(0, 1000):
q = random.choice((1, 2, 3, 4))
if q == 1:
a.append(i)
elif q == 2:
b.append(i)
elif q == 3:
a.append(i)
b.append(i)
else:
a.append(i)
b.append(i)
b.append(i)
# Modifies the lists in-place! Naughty! And it doesn't actually work, to boot.
def original(xlist, ylist):
done = False
while not done:
done = True
for x in xlist:
for y in ylist:
if x == y:
xlist.remove(x)
ylist.remove(y)
done = False
return xlist, ylist # not strictly necessary, see above
def counter(xlist, ylist):
x = Counter(xlist)
y = Counter(ylist)
return ((x-y).elements(), (y-x).elements())
def nasty(xlist, ylist):
x = sum(([i]*(xlist.count(i)-ylist.count(i)) for i in set(xlist)),[])
y = sum(([i]*(ylist.count(i)-xlist.count(i)) for i in set(ylist)),[])
return x, y
def gnibbler(xlist, ylist):
d = defaultdict(int)
for i in xlist: d[i] += 1
for i in ylist: d[i] -= 1
return [k for k,v in d.items() for i in range(v)], [k for k,v in d.items() for i in range(-v)]
# substitute algorithm to test in the call
for x in range(0, 100000):
original(list(a), list(b))
Running the Insufficiently Rigorous Benchmarks[tm] (short lists are the original ones, long lists are randomly generated lists approximately 1000 entries long with a mix of matches and repeats, times given in multipliers of the Original algorithm):
100K iterations, short lists 1K iterations, long lists
Original 1.0 1.0
Counter 9.3 0.06
Nasty 2.9 1.4
Gnibbler 2.4 0.02
Note 1: The creation of the Counter object seems to overshadow the actual algorithm at small list sizes.
Note 2: Original and gnibbler are the same at list lengths of approximately 35, above which gnibbler (and Counter) are faster.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With