I have a dataframe <code>df</code>: <pre class="prettyprint"><code> 2019 2020 2021 2022 A 1 10 15 15 31 2 5 4 7 9 3 0.3 0.4 0.4 0.7 4 500 600 70 90 B 1 10 15 15 31 2 5 4 7 9 3 0.3 0.4 0.4 0.7 4 500 600 70 90 C 1 10 15 15 31 2 5 4 7 9 3 0.3 0.4 0.4 0.7 4 500 600 70 90 D 1 10 15 15 31 2 5 4 7 9 3 0.3 0.4 0.4 0.7 4 500 600 70 90 </code></pre> I am trying to group by the level 1 index, <code>1, 2, 3, 4</code> and assign different aggregation functions for those <code>1, 2, 3, 4</code> indexes so that <code>1</code> is aggregated by <code>sum</code>, <code>2</code> by <code>mean</code>, and so on. So that the end result would look like this: <pre class="prettyprint"><code> 2019 2020 2021 2022 1 40 ... ... # sum 2 5 ... ... # mean 3 0.3 ... ... # mean 4 2000 ... ... # sum </code></pre> I tried: <pre class="prettyprint"><code>df.groupby(level = 1).agg({'1':'sum', '2':'mean', '3':'sum', '4':'mean'}) </code></pre> But I get that none of <code>1, 2, 3, 4</code> are in columns which they are not, so I am not sure how should I proceed with this problem.

You could use <code>apply</code> with a custom function as follows: <pre class="prettyprint"><code>import numpy as np aggs = {1: np.sum, 2: np.mean, 3: np.mean, 4: np.sum} def f(x): func = aggs.get(x.name, np.sum) return func(x) df.groupby(level=1).apply(f) </code></pre> The above code uses sum by default so <code>1</code> and <code>4</code> could be removed from <code>aggs</code> without any different results. In this way, only groups that should be handled differently from the rest need to be specified. Result: <pre class="prettyprint"><code> 2019 2020 2021 2022 1 40.0 60.0 60.0 124.0 2 5.0 4.0 7.0 9.0 3 0.3 0.4 0.4 0.7 4 2000.0 2400.0 280.0 360.0 </code></pre>

Grouping & aggregating on level 1 index & assigning different aggregation functions using pandas

Tags:

python

pandas

pandas-groupby

I have a dataframe df:

                2019            2020            2021        2022
A       1       10              15              15          31
        2       5               4               7           9
        3       0.3             0.4             0.4         0.7
        4       500             600             70          90
B       1       10              15              15          31
        2       5               4               7           9
        3       0.3             0.4             0.4         0.7
        4       500             600             70          90
C       1       10              15              15          31
        2       5               4               7           9
        3       0.3             0.4             0.4         0.7
        4       500             600             70          90
D       1       10              15              15          31
        2       5               4               7           9
        3       0.3             0.4             0.4         0.7
        4       500             600             70          90

I am trying to group by the level 1 index, 1, 2, 3, 4 and assign different aggregation functions for those 1, 2, 3, 4 indexes so that 1 is aggregated by sum, 2 by mean, and so on. So that the end result would look like this:

            2019            2020            2021        2022
1           40              ...             ...         # sum
2           5               ...             ...         # mean
3           0.3             ...             ...         # mean
4           2000            ...             ...         # sum

I tried:

df.groupby(level = 1).agg({'1':'sum', '2':'mean', '3':'sum', '4':'mean'})

But I get that none of 1, 2, 3, 4 are in columns which they are not, so I am not sure how should I proceed with this problem.

656

asked Jan 03 '22 08:01

Jonas Palačionis

2 Answers

You could use apply with a custom function as follows:

import numpy as np

aggs = {1: np.sum, 2: np.mean, 3: np.mean, 4: np.sum}
def f(x):
    func = aggs.get(x.name, np.sum)
    return func(x)
     
df.groupby(level=1).apply(f)

The above code uses sum by default so 1 and 4 could be removed from aggs without any different results. In this way, only groups that should be handled differently from the rest need to be specified.

Result:

      2019    2020   2021    2022               
1     40.0    60.0   60.0   124.0
2      5.0     4.0    7.0     9.0
3      0.3     0.4    0.4     0.7
4   2000.0  2400.0  280.0   360.0

answered Oct 23 '22 16:10

Shaido

Just in case you were after avoiding for loops. Slice and group by index and agg conditionally.

df1 = (
        df.groupby([df.index.get_level_values(level=1)]).agg(
            lambda x: x.sum() if x.index.get_level_values(level=1).isin([1,4]).any() else x.mean())
        
         
      )
df1



    2019    2020   2021   2022
1    40.0    60.0   60.0  124.0
2     5.0     4.0    7.0    9.0
3     0.3     0.4    0.4    0.7
4  2000.0  2400.0  280.0  360.0

answered Oct 23 '22 16:10

wwnde

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Grouping & aggregating on level 1 index & assigning different aggregation functions using pandas

Tags:

python

pandas

pandas-groupby

Jonas Palačionis

People also ask

2 Answers

Shaido

wwnde

Recent Activity

Donate For Us

Grouping & aggregating on level 1 index & assigning different aggregation functions using pandas

Tags:

python

pandas

pandas-groupby

Jonas Palačionis

People also ask

2 Answers

Shaido

wwnde

Related questions

Recent Activity

Donate For Us