How to apply linregress in Pandas groupby

Question

I would like to apply scipy.stats.linregress within a Pandas GroupBy. I have looked through the documentation, but all I could find was how to apply something to a single column, like

grouped.agg(np.sum)

or a function like

grouped.agg({'D': lambda x: np.std(x, ddof=1)})

But how do I apply linregress, which takes TWO inputs, X and Y?

Andy Hayden · Accepted Answer

The linregress function, like many other scipy/numpy functions, accepts "array-like" X and Y; both Series and DataFrame can qualify.

For example:

import numpy as np
import pandas as pd
from scipy.stats import linregress

X = pd.Series(np.arange(10))
Y = pd.Series(np.arange(10))

In [4]: linregress(X, Y)
Out[4]: (1.0, 0.0, 1.0, 4.3749999999999517e-80, 0.0)
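The five values in that output are, in order, the slope, intercept, correlation coefficient, p-value and standard error of the slope. A minimal sketch of unpacking them by name, assuming the same X and Y as above (recent SciPy versions may display the result as a named object rather than a plain tuple, but it should still unpack the same way):

```python
import numpy as np
import pandas as pd
from scipy.stats import linregress

X = pd.Series(np.arange(10))
Y = pd.Series(np.arange(10))

# Unpack the result positionally into descriptive names.
slope, intercept, r_value, p_value, std_err = linregress(X, Y)

# The same values are also available as attributes on the result.
result = linregress(X, Y)
print(result.slope, result.intercept, result.rvalue)
```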

In fact, being able to use scipy (and numpy) functions is one of pandas' killer features!

So if you have a DataFrame you can use linregress on its columns (which are Series):

linregress(df['col_X'], df['col_Y'])

and if using a groupby you can similarly apply (to each group):

grouped.apply(lambda x: linregress(x['col_X'], x['col_Y']))
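To make the groupby case concrete, here is a small self-contained sketch. The DataFrame, the "group" column and the fit_group helper are made up for illustration; only col_X and col_Y come from the snippet above. Returning a Series from the helper is one way to get a tidy table with one row per group, rather than a column of result tuples:

```python
import numpy as np
import pandas as pd
from scipy.stats import linregress

# Hypothetical data: group "a" follows y = 2x + 1, group "b" follows y = -x + 3.
df = pd.DataFrame({
    "group": ["a"] * 5 + ["b"] * 5,
    "col_X": np.tile(np.arange(5.0), 2),
})
df["col_Y"] = np.where(
    df["group"] == "a",
    2.0 * df["col_X"] + 1.0,
    -1.0 * df["col_X"] + 3.0,
)

def fit_group(g):
    """Run linregress on one group and return selected results as a Series."""
    res = linregress(g["col_X"], g["col_Y"])
    return pd.Series(
        {"slope": res.slope, "intercept": res.intercept, "rvalue": res.rvalue}
    )

# Select the two data columns explicitly so the helper only sees X and Y.
fits = df.groupby("group")[["col_X", "col_Y"]].apply(fit_group)
print(fits)
```

The result is indexed by group, so an individual value can be looked up with something like fits.loc["a", "slope"].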

Tags:

python

pandas

user1911866
