I have a list of lists in Python and I need to count how many times each sub-list occurs. Here is a sample:
from collections import Counter
list1 = [[ 1., 4., 2.5], [ 1., 2.66666667, 1.33333333],
[ 1., 2., 2.], [ 1., 2.66666667, 1.33333333], [ 1., 4., 2.5],
[ 1., 2.66666667, 1.33333333]]
c = Counter(x for x in iter(list1))
print c
The above code would work if the elements of the list were hashable (say int), but in this case they are lists and I get an error:
TypeError: unhashable type: 'list'
How can I count these lists so I get something like
[ 1., 2.66666667, 1.33333333], 3
[ 1., 4., 2.5], 2
[ 1., 2., 2.], 1
Just convert the lists to tuples:
>>> c = Counter(tuple(x) for x in iter(list1))
>>> c
Counter({(1.0, 2.66666667, 1.33333333): 3, (1.0, 4.0, 2.5): 2, (1.0, 2.0, 2.0): 1})
Remember to do the same for lookup:
>>> c[tuple(list1[0])]
2
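To get output shaped like the listing in the question, one possible sketch is to walk the counter with most_common() and turn each tuple key back into a list. The name report is hypothetical and used only for illustration; the formatting is an assumption about what "something like" should look like.

```python
from collections import Counter

list1 = [[1., 4., 2.5], [1., 2.66666667, 1.33333333],
         [1., 2., 2.], [1., 2.66666667, 1.33333333], [1., 4., 2.5],
         [1., 2.66666667, 1.33333333]]

# Tuples are hashable, so they can serve as Counter keys.
c = Counter(tuple(x) for x in list1)

# most_common() yields (key, count) pairs, highest count first.
# `report` is a hypothetical name; each key is converted back to a list.
report = [(list(key), count) for key, count in c.most_common()]

for sublist, count in report:
    print("%s, %d" % (sublist, count))
```

Note that the keys are compared by exact float equality, so two sub-lists only count as the same if every element matches exactly.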
Counter returns a dictionary-like object whose keys must be hashable. Since lists are not hashable, you can convert them to tuples using the map function:
>>> Counter(map(tuple, list1))
Counter({(1.0, 2.66666667, 1.33333333): 3, (1.0, 4.0, 2.5): 2, (1.0, 2.0, 2.0): 1})
Note that map is slightly faster than a generator expression here: when a generator expression is passed to Counter(), Python has to pull each value out of the generator function itself, whereas the built-in map function has less per-item overhead. The timings are:
# Use generator expression
~ $ python -m timeit --setup "list1 = [[ 1., 4., 2.5], [ 1., 2.66666667, 1.33333333],[ 1., 2., 2.], [ 1., 2.66666667, 1.33333333], [ 1., 4., 2.5],[ 1., 2.66666667, 1.33333333]] ;from collections import Counter" "Counter(tuple(x) for x in iter(list1))"
100000 loops, best of 3: 9.86 usec per loop
# Use map
~ $ python -m timeit --setup "list1 = [[ 1., 4., 2.5], [ 1., 2.66666667, 1.33333333],[ 1., 2., 2.], [ 1., 2.66666667, 1.33333333], [ 1., 4., 2.5],[ 1., 2.66666667, 1.33333333]] ;from collections import Counter" "Counter(map(tuple, list1))"
100000 loops, best of 3: 7.92 usec per loop
From PEP 0289 -- Generator Expressions:
The semantics of a generator expression are equivalent to creating an anonymous generator function and calling it. For example:
g = (x**2 for x in range(10))
print g.next()
is equivalent to:
def __gen(exp):
    for x in exp:
        yield x**2
g = __gen(iter(range(10)))
print g.next()
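The PEP snippet uses Python 2 syntax (print statement, g.next()). As a rough check of the equivalence it describes, here is a sketch that also runs on Python 3; gen_squares is a hypothetical name standing in for the anonymous generator function.

```python
# Generator expression form.
g1 = (x**2 for x in range(10))

# Explicit generator function doing the same work.
# `gen_squares` is a hypothetical name used only for this illustration.
def gen_squares(iterable):
    for x in iterable:
        yield x**2

g2 = gen_squares(iter(range(10)))

# Both produce values lazily, one per next() call.
first_pair = (next(g1), next(g2))

# The remaining values agree as well.
rest_equal = list(g1) == list(g2)
print(first_pair, rest_equal)
```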
Note that generator expressions use less memory, so if you are dealing with large data you are better off using a generator expression instead of the map function (which builds a full list in Python 2).