I've got a list of integers and I want to identify contiguous blocks of duplicates: that is, I want to produce an order-preserving list of 2-tuples, where each tuple contains (int_in_question, number of occurrences).
For example, if I have a list like:
[0, 0, 0, 3, 3, 2, 5, 2, 6, 6]
I want the result to be:
[(0, 3), (3, 2), (2, 1), (5, 1), (2, 1), (6, 2)]
I have a fairly simple way of doing this with a for-loop, a temp, and a counter:
result_list = []
current = source_list[0]
count = 0
for value in source_list:
    if value == current:
        count += 1
    else:
        result_list.append((current, count))
        current = value
        count = 1
result_list.append((current, count))
But I really like Python's functional programming idioms, and I'd like to be able to do this with a simple generator expression. However, I find it difficult to keep sub-counts when working with generators. I have a feeling a two-step process might get me there, but for now I'm stumped.
Is there a particularly elegant/pythonic way to do this, especially with generators?
The groupby function from itertools groups consecutive equal elements together: for each run it yields the element once, along with an iterator over the items in that run. Counting the items in each run therefore gives the (element, number of occurrences) pairs directly, with no slicing or manual bookkeeping.
A Python list can contain duplicate elements, and groupby only merges duplicates that are adjacent, so a value that reappears later in the list starts a new group.
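To make the grouping behaviour concrete, here is a small sketch that materializes each group from the example list in the question. The variable name `runs` is only illustrative; `list(group)` is used here purely so the contents of each run can be inspected.

```python
from itertools import groupby

source_list = [0, 0, 0, 3, 3, 2, 5, 2, 6, 6]

# groupby yields (key, group_iterator) pairs, one per run of equal
# adjacent values; list() materializes each run so it can be inspected.
runs = [(key, list(group)) for key, group in groupby(source_list)]
print(runs)
# [(0, [0, 0, 0]), (3, [3, 3]), (2, [2]), (5, [5]), (2, [2]), (6, [6, 6])]
```

Note that the two separate runs of 2 stay separate, which is exactly the order-preserving behaviour the question asks for.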
>>> from itertools import groupby
>>> L = [0, 0, 0, 3, 3, 2, 5, 2, 6, 6]
>>> grouped_L = [(k, sum(1 for i in g)) for k, g in groupby(L)]
>>> # Or (k, len(list(g))), but that creates an intermediate list
>>> grouped_L
[(0, 3), (3, 2), (2, 1), (5, 1), (2, 1), (6, 2)]
Batteries included, as they say.
The suggestion to use sum with a generator expression is from JBernardo; see comment.
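Since the question asks specifically about generators, the same idea can also be wrapped in a generator function so that pairs are produced lazily, one run at a time. This is a minimal sketch, not part of the original answer; the name `run_lengths` is hypothetical.

```python
from itertools import groupby


def run_lengths(iterable):
    """Lazily yield (value, run_length) for each run of equal adjacent values."""
    for key, group in groupby(iterable):
        # Count the run without building an intermediate list.
        yield key, sum(1 for _ in group)


result = list(run_lengths([0, 0, 0, 3, 3, 2, 5, 2, 6, 6]))
print(result)
# [(0, 3), (3, 2), (2, 1), (5, 1), (2, 1), (6, 2)]
```

Unlike the for-loop version in the question, which indexes source_list[0], this sketch works on any iterable and simply yields nothing for empty input.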