I think list comprehensions may give me this, but I'm not sure: any elegant solutions in Python (2.6) in general for selecting unique objects in a list and providing a count?
(I've defined an __eq__ method to define uniqueness on my object definition.)
So in RDBMS-land, something like this:
CREATE TABLE x(n NUMBER(1));
INSERT INTO x VALUES(1);
INSERT INTO x VALUES(1);
INSERT INTO x VALUES(1);
INSERT INTO x VALUES(2);
SELECT COUNT(*), n FROM x
GROUP BY n;
Which gives:
COUNT(*) n
==========
3 1
1 2
So, here's my equivalent list in Python:
[1,1,1,2]
And I want the same output as the SQL SELECT gives above.
EDIT: The example I gave here was simplified; I'm actually processing lists of user-defined object instances. Just for completeness, here is the extra code I needed to get the whole thing to work:
import hashlib

def __hash__(self):
    md5 = hashlib.md5()
    # Feed each string into the digest; a plain loop avoids building a throwaway list.
    for item in self.my_list_of_stuff:
        md5.update(item)
    return int(md5.hexdigest(), 16)
The __hash__ method was needed to get the set conversion to work. (I opted for the list-comprehension idea that works in 2.6, even though I learnt that it involves an inefficiency (see comments); my data set is small enough for that not to be an issue.) my_list_of_stuff above is a list of strings on my object definition.
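To show how __eq__ and __hash__ work together with the set-based one-liner, here is a minimal sketch. The class name Item is hypothetical, and hashing a tuple of the strings is an assumption standing in for the md5 digest above; any hash that agrees with __eq__ would do.

```python
class Item(object):
    """Hypothetical class; equality and hash are derived from the same strings."""

    def __init__(self, my_list_of_stuff):
        self.my_list_of_stuff = list(my_list_of_stuff)

    def __eq__(self, other):
        return self.my_list_of_stuff == other.my_list_of_stuff

    def __ne__(self, other):
        # Python 2 does not derive != from ==, so define it explicitly.
        return not self == other

    def __hash__(self):
        # Assumption: a tuple hash replaces the md5 digest used in the question.
        return hash(tuple(self.my_list_of_stuff))


items = [Item(["a"]), Item(["a"]), Item(["a"]), Item(["b"])]

# Same idea as the SQL GROUP BY: one (count, value) pair per distinct object.
counts = [(items.count(x), x.my_list_of_stuff) for x in set(items)]
```

Objects that compare equal must return the same hash, otherwise set() will not collapse them into one entry.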
A Python list comprehension consists of brackets containing an expression followed by a for clause; the expression is evaluated once for each element of the iterable. It provides a shorter syntax for creating a new list based on the values of an existing list.
One main benefit of using a list comprehension in Python is that it's a single tool that you can use in many different situations. In addition to standard list creation, list comprehensions can also be used for mapping and filtering. You don't have to use a different approach for each scenario.
List comprehensions are usually faster than an equivalent for loop that builds a list, largely because the explicit loop has to look up and call append for every new element.
List comprehensions are a syntactic form in Python allowing the programmer to loop over and transform an iterable in one line.
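As a small illustration of the mapping and filtering uses described above, here is a sketch using the list from the question; the variable names are only illustrative.

```python
values = [1, 1, 1, 2]

# Mapping: transform every element.
doubled = [x * 2 for x in values]

# Filtering: keep only the elements that pass a test.
ones = [x for x in values if x == 1]

# The explicit loop that the mapping comprehension replaces.
doubled_loop = []
for x in values:
    doubled_loop.append(x * 2)
```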
Lennart Regebro provided a nice one-liner that does what you want:
>>> values = [1,1,1,2]
>>> print [(x,values.count(x)) for x in set(values)]
[(1, 3), (2, 1)]
As S.Lott mentions, a defaultdict can do the same thing.
>>> from collections import Counter
>>> Counter([1,1,1,2])
Counter({1: 3, 2: 1})
Counter is only available in Python 2.7 and 3.1 or later (so not in 2.6); it inherits from dict.
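If Counter is available, the (count, value) rows from the SQL example can be rebuilt from it. This is a sketch under that assumption; the names counts and rows are illustrative.

```python
from collections import Counter

values = [1, 1, 1, 2]
counts = Counter(values)

# Rebuild the "COUNT(*), n" rows, ordered by n for a stable result.
rows = [(count, n) for n, count in sorted(counts.items())]
```

Because Counter is a dict subclass, counts[1] works as a lookup, and a value that never occurred returns 0 instead of raising KeyError.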
Not easily doable as a list comprehension.
from collections import defaultdict

def group_by(someList):
    counts = defaultdict(int)
    for value in someList:
        counts[value.aKey] += 1
    return counts
This is a very Pythonic solution. But not a list comprehension.
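A possible usage of the function above, sketched with a hypothetical Record class whose only purpose is to carry the aKey attribute that group_by reads:

```python
from collections import defaultdict


class Record(object):
    """Hypothetical stand-in for the user-defined objects; only aKey matters."""

    def __init__(self, aKey):
        self.aKey = aKey


def group_by(someList):
    counts = defaultdict(int)
    for value in someList:
        counts[value.aKey] += 1
    return counts


records = [Record(1), Record(1), Record(1), Record(2)]
counts = group_by(records)

# Turn the mapping into (count, key) rows like the SQL output, ordered by key.
rows = [(counts[key], key) for key in sorted(counts)]
```

This makes a single pass over the input, and it works in Python 2.6 because defaultdict was added in 2.5.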
You can use groupby from the itertools module:
Make an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is None, key defaults to an identity function and returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function.
>>> import itertools
>>> a = [1,1,1,2]
>>> [(len(list(v)), key) for (key, v) in itertools.groupby(sorted(a))]
[(3, 1), (1, 2)]
I would assume its runtime is worse than the dict-based solutions by SilentGhost or S.Lott since it has to sort the input sequence, but you should time that yourself. It is a list comprehension, though. It should be faster than Adam Bernier's solution, since it doesn't have to do repeated linear scans of the input sequence. If needed, the sorted call can be avoided by sorting the input sequence in place.
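To see why the input generally needs to be sorted first, here is a small sketch comparing groupby on unsorted and sorted input. The helper name run_lengths is hypothetical.

```python
import itertools


def run_lengths(seq):
    """Return (key, count) for each run of consecutive equal items."""
    return [(key, len(list(group))) for key, group in itertools.groupby(seq)]


unsorted_values = [1, 2, 1, 1]

# groupby only merges neighbours, so 1 shows up as two separate groups here.
consecutive = run_lengths(unsorted_values)

# Sorting first brings equal items together, giving one group per value.
grouped = run_lengths(sorted(unsorted_values))
```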