I have two lists that contain many of the same items, including duplicate items. I want to check which items in the first list are not in the second list. For example, I might have one list like this:
l1 = ['a', 'b', 'c', 'b', 'c']
and one list like this:
l2 = ['a', 'b', 'c', 'b']
Comparing these two lists I would want to return a third list like this:
l3 = ['c']
I am currently using some terrible code that I wrote a while ago, shown below, which I'm fairly certain doesn't even work properly.
def list_difference(l1, l2):
    for i in range(0, len(l1)):
        for j in range(0, len(l2)):
            # Note: this compares l1 with itself; l2[j] was presumably intended.
            if l1[i] == l1[j]:
                l1[i] = 'damn'
                l2[j] = 'damn'
    l3 = []
    for item in l1:
        if item != 'damn':
            l3.append(item)
    return l3
How can I better accomplish this task?
We can combine the Python sort() method with the == operator to check whether two lists are equal. Sorting both lists first ensures that, if they contain the same elements, those elements sit at the same index positions, so == compares them pair by pair.
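As a rough illustration of the sort-then-compare idea, here is a minimal sketch. The helper name `same_items` is made up for this example, and it uses `sorted()` rather than `sort()` so the inputs are not modified. Note that it only reports whether the two lists match; it does not say which items differ.

```python
def same_items(first, second):
    """Return True if both lists hold the same items with the same counts."""
    # sorted() builds new lists, so the caller's lists keep their order.
    return sorted(first) == sorted(second)


l1 = ['a', 'b', 'c', 'b', 'c']
l2 = ['a', 'b', 'c', 'b']

print(same_items(l1, l2))                    # False: l1 has one extra 'c'
print(same_items(l2, ['b', 'a', 'b', 'c']))  # True: same items, different order
```

This assumes the items can be ordered against each other; mixing, say, strings and integers would raise a TypeError in Python 3.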
The difference between two lists (say list1 and list2) can be found using the following simple function. With that function, the difference can be found using diff(temp2, temp1) or diff(temp1, temp2).
You need to create a trie for each of your 150K lists. Then you can check whether a given word exists in the list in O(W) time, where W is the maximum length of a word. You can then loop through the list of 400 words and check whether each word is in the 150K word list.
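To make the trie suggestion concrete, here is a minimal sketch built from nested dictionaries. The class, the `big_list` and `queries` names, and the sample words are all invented for illustration; they stand in for the 150K-word list and the 400 words mentioned above.

```python
class Trie:
    """Minimal trie: nested dicts keyed by character."""

    _END = None  # marker key; cannot clash with a one-character string

    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            # Reuse the child for this character, or create it.
            node = node.setdefault(ch, {})
        node[self._END] = True  # mark that a complete word ends here

    def __contains__(self, word):
        node = self.root
        for ch in word:
            node = node.get(ch)
            if node is None:
                return False
        # A prefix of a stored word is not itself a stored word.
        return self._END in node


big_list = ['apple', 'apply', 'banana']  # stands in for the 150K-word list
queries = ['apple', 'app', 'cherry']     # stands in for the 400 words

trie = Trie(big_list)
missing = [w for w in queries if w not in trie]
print(missing)  # ['app', 'cherry']
```

Each lookup walks at most one node per character, which is where the O(W) bound comes from. For plain exact-match membership in Python, a built-in set would likely be simpler; the trie is shown only because it is the structure described above.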
You didn't specify if the order matters. If it does not, you can do this in Python 2.7 or later:
from collections import Counter

l1 = ['a', 'b', 'c', 'b', 'c']
l2 = ['a', 'b', 'c', 'b']

c1 = Counter(l1)
c2 = Counter(l2)

# Subtracting one Counter from another keeps only the positive counts.
diff = c1 - c2
print(list(diff.elements()))  # ['c']
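If the order does matter, one possible variation is to walk the first list and cancel each item against a count of the second list. This is only a sketch: the function name `ordered_difference` is made up here, and it assumes that earlier occurrences are the ones cancelled, so the leftovers are the later duplicates.

```python
from collections import Counter


def ordered_difference(first, second):
    """Items of `first` left over after cancelling each item of `second` once."""
    remaining = Counter(second)  # copies of each item still available to cancel
    result = []
    for item in first:
        if remaining[item] > 0:
            remaining[item] -= 1  # cancel this occurrence against one in `second`
        else:
            result.append(item)   # nothing left to cancel it, so keep it
    return result


l1 = ['a', 'b', 'c', 'b', 'c']
l2 = ['a', 'b', 'c', 'b']
print(ordered_difference(l1, l2))  # ['c']
```

Unlike the original nested loops, this does not modify either input list, and it makes a single pass over each list. Like Counter itself, it requires the items to be hashable.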
Create Counters for both lists, then subtract one from the other.
from collections import Counter

a = [1, 2, 3, 1, 2]
b = [1, 2, 3, 1]

c = Counter(a)
c.subtract(Counter(b))     # in place; c now maps 1 -> 0, 2 -> 1, 3 -> 0
print(list(c.elements()))  # [2]
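The two Counter approaches shown above are not quite interchangeable, which the short comparison below tries to illustrate. The sample data is invented for this example; the second list contains an item (4) that the first does not.

```python
from collections import Counter

a = Counter([1, 2, 3, 1, 2])
b = Counter([1, 2, 3, 1, 4])

# The - operator returns a new Counter and drops zero and negative counts.
diff = a - b
print(dict(diff))  # {2: 1}

# subtract() changes `a` in place and keeps zero and negative counts.
a.subtract(b)
print(dict(a))     # {1: 0, 2: 1, 3: 0, 4: -1}

# elements() skips counts below one, so both routes yield the same items.
print(list(a.elements()))  # [2]
```

So if only the leftover items are needed, either route works; if the negative counts matter (items the second list has in excess), subtract() is the one that preserves them.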