Can Python's list comprehensions (ideally) do the equivalent of 'count(*)...group by...' in SQL?

Tags:

python

list

count

python-2.6

I think list comprehensions may give me this, but I'm not sure: any elegant solutions in Python (2.6) in general for selecting unique objects in a list and providing a count?

(I've defined an __eq__ to define uniqueness on my object definition).

So in RDBMS-land, something like this:

CREATE TABLE x(n NUMBER(1));
INSERT INTO x VALUES(1);
INSERT INTO x VALUES(1);
INSERT INTO x VALUES(1);
INSERT INTO x VALUES(2);

SELECT COUNT(*), n FROM x
GROUP BY n;

Which gives:

COUNT(*) n
==========
3        1
1        2

So, here's my equivalent list in Python:

[1,1,1,2]

And I want the same output as the SQL SELECT gives above.

EDIT: The example I gave here was simplified; I'm actually processing lists of user-defined object instances. For completeness, here is the extra code I needed to get the whole thing to work:

import hashlib

def __hash__(self):
    # Feed every string into one MD5 digest, then use the digest as the hash.
    md5 = hashlib.md5()
    for item in self.my_list_of_stuff:
        md5.update(item)
    return int(md5.hexdigest(), 16)

The __hash__ method was needed to get the set conversion to work. (I opted for the list-comprehension idea that works in 2.6, even though I learnt that it involves an inefficiency (see comments); my data set is small enough for that not to be an issue.) my_list_of_stuff above is a list of strings on my object definition.
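For illustration, here is a minimal sketch of how `__eq__` and `__hash__` could be paired so that equal objects collapse to one entry in a `set`. The class name `Thing` is hypothetical, and hashing a tuple of the strings is a swapped-in simplification of the MD5 approach above, not the code actually used.

```python
class Thing(object):
    """Hypothetical stand-in for the user-defined object in the question."""

    def __init__(self, my_list_of_stuff):
        self.my_list_of_stuff = list(my_list_of_stuff)

    def __eq__(self, other):
        # Two instances count as "the same" when their string lists match.
        if not isinstance(other, Thing):
            return NotImplemented
        return self.my_list_of_stuff == other.my_list_of_stuff

    def __ne__(self, other):
        # Python 2 does not derive != from ==, so define it explicitly.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Equal objects must hash equally; a tuple of the strings ensures that.
        return hash(tuple(self.my_list_of_stuff))


things = [Thing(["a", "b"]), Thing(["a", "b"]), Thing(["c"])]
unique = set(things)

# (count, payload) pairs, mirroring the COUNT(*), n column order.
counts = [(things.count(t), t.my_list_of_stuff) for t in unique]
```

The key constraint is that any two objects that compare equal must return the same hash; otherwise `set` may keep both.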

asked Jan 27 '10 16:01

monojohnny

4 Answers

Lennart Regebro provided a nice one-liner that does what you want:

>>> values = [1,1,1,2]
>>> print [(x,values.count(x)) for x in set(values)]
[(1, 3), (2, 1)]

As S.Lott mentions, a defaultdict can do the same thing.

answered Oct 13 '22 01:10

mechanical_meat

>>> from collections import Counter
>>> Counter([1,1,1,2])
Counter({1: 3, 2: 1})

Counter is only available from Python 2.7 and 3.1 onward; it is a subclass of dict.
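Where `collections.Counter` is available, a short sketch of turning its result into rows shaped like the SQL output might look like this (the variable names are just for illustration):

```python
from collections import Counter

values = [1, 1, 1, 2]
counts = Counter(values)

# most_common() yields (value, count) pairs ordered by descending count;
# swap each pair to mirror the COUNT(*), n column order of the query.
rows = [(count, n) for n, count in counts.most_common()]
```

Because `Counter` is a `dict` subclass, `counts[1]` and `counts.items()` work as they would on any dictionary.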

answered Oct 13 '22 00:10

SilentGhost

Not easily doable as a list comprehension.

from collections import defaultdict
def group_by( someList ):
    counts = defaultdict(int)
    for value in someList:
        counts[value.aKey] += 1
    return counts

This is a very Pythonic solution. But not a list comprehension.
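As a usage sketch, assuming each object exposes the grouping value as an `aKey` attribute: the `Row` class below is hypothetical and only exists to make the example runnable, and the function is repeated so the block is self-contained.

```python
from collections import defaultdict


def group_by(some_list):
    # Single pass over the input, counting items by their aKey attribute.
    counts = defaultdict(int)
    for value in some_list:
        counts[value.aKey] += 1
    return counts


class Row(object):
    """Hypothetical record type; only the aKey attribute matters here."""

    def __init__(self, aKey):
        self.aKey = aKey


rows = [Row(1), Row(1), Row(1), Row(2)]
counts = group_by(rows)

# Flatten to (count, key) pairs, largest count first.
pairs = sorted(((n, key) for key, n in counts.items()), reverse=True)
```

Grouping on an attribute this way only requires the key itself to be hashable, so the objects do not need their own `__hash__`.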

answered Oct 13 '22 00:10

S.Lott

You can use groupby from the itertools module:

Make an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is None, key defaults to an identity function and returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function.

>>> import itertools
>>> a = [1,1,1,2]
>>> [(len(list(v)), key) for (key, v) in itertools.groupby(sorted(a))]
[(3, 1), (1, 2)]

I would assume its runtime is worse than the dict-based solutions by SilentGhost or S.Lott, since it has to sort the input sequence first, but you should time that yourself. It is a list comprehension, though. It should be faster than Adam Bernier's set/count one-liner, since it doesn't have to do repeated linear scans of the input sequence. If needed, the sorted call can be avoided by sorting the input sequence in place (a.sort()) beforehand.
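A sketch of the same idea with an explicit key function and an in-place sort, for the case where the items are records rather than plain numbers. The `(name, n)` tuples are made-up sample data:

```python
import itertools
from operator import itemgetter

# Hypothetical (name, n) records; group on the second field.
records = [("a", 1), ("b", 2), ("c", 1), ("d", 1)]

key = itemgetter(1)
records.sort(key=key)  # in-place sort, so no sorted() copy is created

# groupby only merges adjacent items, hence the sort on the same key first.
result = [(len(list(group)), k)
          for k, group in itertools.groupby(records, key=key)]
```

The sort and the `groupby` call need to use the same key function; otherwise equal keys may end up in separate groups.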

answered Oct 13 '22 00:10

Torsten Marek
