This one is causing me a headache, and I am having trouble finding a solution with a for-loop.
Basically, my data looks like this:
short_list = [ [1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12] ]
long_list = [ [1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [6, 7, 8, 9, 10], [9, 10, 11, 12, 13] ]
I would need to know how many times each number from each row in the short_list appears in each row of the long_list, and the comparison is NOT needed when both list indices are the same, because they come from the same data set.
Example: I need to know the occurrence of each number in [1, 2, 3] in the long_list rows [2, 3, 4, 5, 6], [6, 7, 8, 9, 10] and [9, 10, 11, 12, 13]. And then continue with the next data row in short_list, etc.
Use the izip() function to iterate over two lists in Python. It iterates over the lists until the shortest of them is exhausted, pairing up the elements of both lists according to their index, and returns an iterator over those pairs. Note that izip() lives in itertools and exists only in Python 2; in Python 3 the built-in zip() behaves the same way.
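As a small illustration of the stop-at-the-shortest behaviour described above, here is a sketch that assumes Python 3 and therefore uses the built-in zip() in place of izip(). The two example lists are made up for illustration.

```python
# Hypothetical example data: the second list is shorter on purpose.
numbers = [1, 2, 3, 4]
letters = ["a", "b", "c"]

# zip() returns a lazy iterator; list() consumes it so the pairs can be inspected.
pairs = list(zip(numbers, letters))
print(pairs)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```

The trailing 4 in numbers is silently dropped because letters runs out first.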
Iterating over a list can also be achieved using a while loop. The block of code inside the loop executes for as long as the condition is true. A loop variable can be used as an index to access each element.
Looping without a for loop: get an iterator from the given iterable, then repeatedly get the next item from the iterator. Execute the body of the loop each time the next item is successfully retrieved, and stop looping when a StopIteration exception is raised while getting the next item.
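The steps above can be sketched roughly as follows. The function name manual_for and the idea of passing the loop body in as a callable are assumptions made for this example only; this is not how Python implements for loops internally, just the same sequence of steps spelled out by hand.

```python
def manual_for(iterable, body):
    """Call body(item) for each item of iterable, without using a for loop."""
    iterator = iter(iterable)      # get an iterator from the iterable
    while True:
        try:
            item = next(iterator)  # repeatedly get the next item
        except StopIteration:
            break                  # the iterator is exhausted: stop looping
        body(item)                 # run the loop body for this item


collected = []
manual_for([1, 2, 3], collected.append)
print(collected)  # [1, 2, 3]
```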
Here's one way to do it. It's straight off the top of my head, so there is probably a much better way to do it.
from collections import defaultdict
short_list = [ [1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12] ]
long_list = [ [1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [6, 7, 8, 9, 10], [9, 10, 11, 12, 13] ]
occurrences = defaultdict(int)
for i, sl in enumerate(short_list):
    for j, ll in enumerate(long_list):
        if i != j:
            for n in sl:
                occurrences[n] += ll.count(n)
>>> occurrences
defaultdict(<class 'int'>, {1: 0, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 0, 8: 0, 9: 1, 10: 1, 11: 0, 12: 0})
Note that enumerate() is used to provide indices while iterating. The indices are compared to ensure that sub-lists at the same relative position are not compared.
The result is a dictionary keyed by items from the short list with the values being the total count of that item in the long list sans the sublist with the same index.
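If the counts are needed per pair of rows rather than as one overall total, a variation along the same lines might look like this. It is only a sketch: the name per_pair and the choice of keying the result by an (i, j) tuple of row indices are assumptions made for this example.

```python
short_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
long_list = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [6, 7, 8, 9, 10], [9, 10, 11, 12, 13]]

# per_pair[(i, j)] maps each number in short_list[i] to its count in long_list[j]
per_pair = {}
for i, sl in enumerate(short_list):
    for j, ll in enumerate(long_list):
        if i != j:  # skip rows that come from the same data set
            per_pair[(i, j)] = {n: ll.count(n) for n in sl}

print(per_pair[(0, 1)])  # {1: 0, 2: 1, 3: 1}
```

Summing the inner dictionaries over j for a fixed i gives the per-row totals, and summing over everything gives the same figures as the occurrences dictionary.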
This is a brute-force solution. I've amended the input data to make the results more interesting:
from collections import Counter
from toolz import concat  # third-party package: pip install toolz
short_list = [ [1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12] ]
long_list = [ [1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [6, 7, 8, 9, 10], [2, 3, 11, 12, 13] ]
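for idx, i in enumerate(short_list):
    # Flatten every long_list row except the one at position idx, keeping only values from row i
    wanted = set(i)
    long_list_filtered = (x for x in concat(long_list[:idx] + long_list[idx+1:]) if x in wanted)
    print(idx, Counter(long_list_filtered))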
# 0 Counter({2: 2, 3: 2})
# 1 Counter({4: 1, 5: 1, 6: 1})
# 2 Counter()
# 3 Counter({10: 1})
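If installing toolz is not an option, the same idea can probably be expressed with only the standard library. The sketch below assumes itertools.chain.from_iterable as a stand-in for concat; the helper name counts_excluding_same_index is invented for this example.

```python
from collections import Counter
from itertools import chain


def counts_excluding_same_index(short_list, long_list):
    """Return one Counter per short_list row, skipping the long_list row at the same index."""
    results = []
    for idx, row in enumerate(short_list):
        wanted = set(row)  # build the lookup set once per row
        others = chain.from_iterable(long_list[:idx] + long_list[idx + 1:])
        results.append(Counter(x for x in others if x in wanted))
    return results


short_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
long_list = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [6, 7, 8, 9, 10], [2, 3, 11, 12, 13]]

results = counts_excluding_same_index(short_list, long_list)
for idx, counter in enumerate(results):
    print(idx, counter)
```

Returning a list of Counter objects instead of printing inside the loop keeps the per-row results available for further processing.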