How to sum a column grouped by other columns in a list?

Tags: python, list, pandas, dataframe, pandas-groupby

I have a list as follows.

[['Andrew', '1', '9'], ['Peter', '1', '10'], ['Andrew', '1', '8'], ['Peter', '1', '11'], ['Sam', '4', '9'], ['Andrew', '2', '2']]

I would like to sum up the last column, grouped by the other columns. The result should look like this:

[['Andrew', '1', '17'], ['Peter', '1', '21'], ['Sam', '4', '9'], ['Andrew', '2', '2']]

which is still a list.

In practice, I will often need to sum up the last column grouped by many other columns. Is there a way to do this in Python? Much appreciated.

665

asked Mar 28 '18 13:03

Deepleeqe

4 Answers

Dynamically grouping by all columns except the last one:

In [24]: df = pd.DataFrame(data)

In [25]: df.groupby(df.columns[:-1].tolist(), as_index=False).agg(lambda x: x.astype(int).sum()).values.tolist()
Out[25]: [['Andrew', '1', 17], ['Andrew', '2', 2], ['Peter', '1', 21], ['Sam', '4', 9]]
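For reference, here is a self-contained sketch of the same idea that also keeps the groups in order of first appearance and returns the sums as strings, as in the question's desired output. The names key_cols, value_col, and summed are illustrative assumptions, and sort=False is a swapped-in choice rather than part of the answer above.

```python
import pandas as pd

data = [['Andrew', '1', '9'], ['Peter', '1', '10'], ['Andrew', '1', '8'],
        ['Peter', '1', '11'], ['Sam', '4', '9'], ['Andrew', '2', '2']]

df = pd.DataFrame(data)
key_cols = df.columns[:-1].tolist()  # every column except the last
value_col = df.columns[-1]           # the column to sum

# Convert the value column to integers once, before grouping.
df[value_col] = df[value_col].astype(int)

# sort=False keeps the groups in order of first appearance.
summed = df.groupby(key_cols, sort=False, as_index=False)[value_col].sum()

# Back to plain lists; the sums are turned into strings to match the question.
result = [[*keys, str(total)]
          for *keys, total in summed.itertuples(index=False, name=None)]
print(result)
```

Dropping sort=False gives the sorted order shown in Out[25] instead.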

193

answered Oct 08 '22 04:10

MaxU - stop WAR against UA

This solution uses collections.defaultdict and adapts to any number of keys. The aggregation pass is O(n); the final sorted call adds O(k log k) for k distinct groups.

If your desired output is a list, then this may be preferable to a solution via Pandas, which requires conversion to and from a DataFrame, a type outside the standard library.

from collections import defaultdict

lst = [['Andrew', '1', '9'], ['Peter', '1', '10'], ['Andrew', '1', '8'],
       ['Peter', '1', '11'], ['Sam', '4', '9'], ['Andrew', '2', '2']]

d = defaultdict(int)

for *keys, val in lst:
    d[tuple(keys)] += int(val)

res = [[*k, v] for k, v in sorted(d.items())]

Result

[['Andrew', '1', 17], ['Andrew', '2', 2], ['Peter', '1', 21], ['Sam', '4', 9]]

Explanation

Loop over your list of lists, unpack each row into its keys and its value, and add the value to the defaultdict(int) entry for that tuple of keys.
Use a list comprehension to convert the dictionary to the desired output.
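The result above is sorted and holds integer sums, while the output requested in the question keeps the groups in order of first appearance and keeps the sums as strings. A minimal sketch of that variant follows, assuming Python 3.7+ (where dicts preserve insertion order); the function name sum_last_column is hypothetical.

```python
def sum_last_column(rows):
    """Sum the last column of each row, grouped by all preceding columns."""
    totals = {}
    for *keys, value in rows:
        key = tuple(keys)  # lists are unhashable, so use a tuple as the dict key
        totals[key] = totals.get(key, 0) + int(value)
    # Dicts keep insertion order, so groups come out in order of first appearance.
    return [[*key, str(total)] for key, total in totals.items()]


rows = [['Andrew', '1', '9'], ['Peter', '1', '10'], ['Andrew', '1', '8'],
        ['Peter', '1', '11'], ['Sam', '4', '9'], ['Andrew', '2', '2']]

result = sum_last_column(rows)
print(result)
# [['Andrew', '1', '17'], ['Peter', '1', '21'], ['Sam', '4', '9'], ['Andrew', '2', '2']]
```

Skipping the sort also keeps the whole operation at a single pass over the input.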

answered Oct 08 '22 03:10

jpp

Op1

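You can set the key columns as the index, sum over those index levels, and call tolist to convert back to a list. Note that sum(level=...) was removed in pandas 2.0; groupby(level=...).sum() is the equivalent, and sort=False keeps the groups in order of first appearance, as in the output below:

pd.DataFrame(L).\
   set_index([0,1])[2].astype(int).groupby(level=[0,1], sort=False).sum().\
        reset_index().values.tolist()
Out[78]: [['Andrew', '1', 17], ['Peter', '1', 21], ['Sam', '4', 9], ['Andrew', '2', 2]]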

Op2

For a list of lists you can use groupby from itertools. The input must be sorted first, because groupby only merges consecutive rows with equal keys:

from itertools import groupby
[k+[sum(int(v) for _,_, v in g)] for k, g in groupby(sorted(L), key = lambda x: [x[0],x[1]])]
Out[98]: [['Andrew', '1', 17], ['Andrew', '2', 2], ['Peter', '1', 21], ['Sam', '4', 9]]
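The itertools version above is written for exactly two key columns. As a sketch, it could be generalized to any number of key columns by using row[:-1] as the key; the names row_key and group_sum_sorted are hypothetical. The last lines show what happens when the sort is skipped.

```python
from itertools import groupby


def row_key(row):
    """Everything except the last column acts as the grouping key."""
    return row[:-1]


def group_sum_sorted(rows):
    """Sort by key, then sum the last column of each run of equal keys."""
    ordered = sorted(rows, key=row_key)
    return [keys + [sum(int(row[-1]) for row in group)]
            for keys, group in groupby(ordered, key=row_key)]


rows = [['Andrew', '1', '9'], ['Peter', '1', '10'], ['Andrew', '1', '8'],
        ['Peter', '1', '11'], ['Sam', '4', '9'], ['Andrew', '2', '2']]

result = group_sum_sorted(rows)
print(result)
# [['Andrew', '1', 17], ['Andrew', '2', 2], ['Peter', '1', 21], ['Sam', '4', 9]]

# Without sorting, groupby starts a new group whenever the key changes,
# so the six unsorted rows produce six separate groups.
unsorted_keys = [keys for keys, _ in groupby(rows, key=row_key)]
print(len(unsorted_keys))  # 6
```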

answered Oct 08 '22 04:10

BENY

Create a DataFrame, convert the third column to integers, aggregate it by the first and second columns, and finally convert back to lists:

df = pd.DataFrame(L)
L = df[2].astype(int).groupby([df[0], df[1]]).sum().reset_index().values.tolist()
print (L)
[['Andrew', '1', 17], ['Andrew', '2', 2], ['Peter', '1', 21], ['Sam', '4', 9]]

And a solution with defaultdict (Python 3.x only, because of the starred unpacking):

from collections import defaultdict

d = defaultdict(int)
#https://stackoverflow.com/a/10532492
for *head, tail in L:
    d[tuple(head)] += int(tail)

d = [[*i, j] for i, j in sorted(d.items())]
print (d)
[['Andrew', '1', 17], ['Andrew', '2', 2], ['Peter', '1', 21], ['Sam', '4', 9]]

answered Oct 08 '22 02:10

jezrael
