
Python, Pandas: average every 2 rows together

Pretty basic question, but I was wondering:

What is the 'proper' way to average every 2 rows together in a pandas DataFrame, and thus end up with only half the number of rows?

Note that this is different from rolling_mean, since it reduces the number of entries.

AimForClarity asked Sep 29 '22 22:09

1 Answer

A fast way to do it:

>>> import pandas as pd
>>> s = pd.Series(range(10))
>>> s
0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
>>> ((s + s.shift(-1)) / 2)[::2]
0    0.5
2    2.5
4    4.5
6    6.5
8    8.5
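
As far as I can tell the same trick carries over to a DataFrame unchanged, since shift works row-wise there too. A minimal sketch, assuming a small made-up frame (`df` and its column names are only for illustration, they are not from the question):

```python
import pandas as pd

# Hypothetical example frame; the column names are made up for illustration.
df = pd.DataFrame({"a": range(6), "b": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]})

# Add each row to the row below it, halve the sum, then keep every second row.
pair_means = ((df + df.shift(-1)) / 2).iloc[::2]
print(pair_means)
```

One thing to watch for: with an odd number of rows the last row has nothing below it, so its "average" comes out as NaN rather than the value itself.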

The "proper way" I guess would be something like:

>>> import numpy as np
>>> a = s.index.values
>>> idx = np.array([a, a]).T.flatten()[:len(a)]
>>> idx
array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4])
>>> s.groupby(idx).mean()
0    0.5
1    2.5
2    4.5
3    6.5
4    8.5

But it is ~2x slower than the shift version, and the gap gets worse with increasing size.
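
If you go the groupby route, a shorter way to build the same key is probably integer division of the row positions by 2. This is only a sketch: it assumes you want to pair rows by position rather than by index label, and the names `key`, `pair_means`, `df` and `df_means` are just illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(range(10))

# Row positions 0..9 integer-divided by 2 give the key 0 0 1 1 2 2 3 3 4 4.
key = np.arange(len(s)) // 2
pair_means = s.groupby(key).mean()
print(pair_means)

# The same key works for a DataFrame. With an odd number of rows the
# last group holds a single row, so its mean is just that row.
df = pd.DataFrame({"a": range(5)})
df_means = df.groupby(np.arange(len(df)) // 2).mean()
print(df_means)
```

Unlike the shift version, a leftover last row is kept as its own group here instead of turning into NaN, so pick whichever behaviour you actually want.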

elyase answered Oct 03 '22 07:10