Finding intersection/difference between python lists

Tags: python, list, numpy

I have two python lists:

a = [('when', 3), ('why', 4), ('throw', 9), ('send', 15), ('you', 1)]

b = ['the', 'when', 'send', 'we', 'us']

I need to remove from a every tuple whose first element appears in b. In this case, I should get:

c = [('why', 4), ('throw', 9), ('you', 1)]

What is the most efficient way to do this?

khan asked Feb 23 '13 09:02




3 Answers

A list comprehension should work:

c = [item for item in a if item[0] not in b]

Or with a dictionary comprehension:

d = dict(a)
c = {key: value for key, value in d.items() if key not in b}
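
One thing to watch with the dictionary version: dict(a) keeps only the last value for a repeated key, and the result is a dict rather than a list of tuples. A minimal sketch using the question's data, with a hypothetical duplicate entry ('why', 7) added only for illustration:

```python
a = [('when', 3), ('why', 4), ('throw', 9), ('send', 15), ('you', 1)]
b = ['the', 'when', 'send', 'we', 'us']

# Hypothetical input with a repeated key, added only for illustration.
a_dup = a + [('why', 7)]

# List comprehension: keeps every pair, including both 'why' entries.
as_list = [item for item in a_dup if item[0] not in b]

# Dict comprehension: dict() keeps only the last value seen for 'why'.
as_dict = {key: value for key, value in dict(a_dup).items() if key not in b}

print(as_list)
print(as_dict)
```

If a can contain repeated words and every pair matters, the list comprehension is probably the safer choice.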
Blender answered Oct 26 '22 06:10


A list comprehension will work.

a = [('when', 3), ('why', 4), ('throw', 9), ('send', 15), ('you', 1)]
b = ['the', 'when', 'send', 'we', 'us']
filtered = [i for i in a if i[0] not in b]

>>> print(filtered)
[('why', 4), ('throw', 9), ('you', 1)]
Octipi answered Oct 26 '22 07:10


The in operator works, but you should use a set, at least for b, so that each membership test is a hash lookup instead of a scan of the list. If you have NumPy, you could also try np.in1d, but you would have to time it to see whether it is actually faster.

import numpy as np

# Same list comprehension as the other answers, but with a set for b.
b_set = set(b)
filtered = [i for i in a if i[0] not in b_set]

# With NumPy: a structured array needs a fixed maximum string length
# (10 here). An object array also works; it is slower, but safe.
# 'U10' is a unicode string dtype, which matches Python 3 str values.
a_arr = np.array(a, dtype=[('key', 'U10'), ('val', int)])
b_arr = np.asarray(b)

# The NumPy docs recommend np.isin over np.in1d for new code.
mask = ~np.in1d(a_arr['key'], b_arr)
filtered = a_arr[mask]

Sets also have methods such as difference, which are probably not too useful here but are useful in general.
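
As a rough illustration of why difference is a poor fit here: it operates on whole elements, so it has to be applied to the words in a rather than to the (word, count) pairs. A sketch using the question's data; the names excluded, kept_keys, and c are only for illustration:

```python
a = [('when', 3), ('why', 4), ('throw', 9), ('send', 15), ('you', 1)]
b = ['the', 'when', 'send', 'we', 'us']

excluded = set(b)

# Set difference on the words alone: unordered, and the counts are lost.
kept_keys = {word for word, _ in a}.difference(excluded)

# A set membership test keeps the pairs and the original order of a.
c = [(word, count) for word, count in a if word not in excluded]

print(sorted(kept_keys))  # ['throw', 'why', 'you']
print(c)                  # [('why', 4), ('throw', 9), ('you', 1)]
```

The difference call is handy when only the words matter; the comprehension is what reproduces the c from the question.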

seberg answered Oct 26 '22 06:10