Pandas: df.groupby(x, y).apply() across multiple columns parameter error

Basic Problem:

I have several 'past' and 'present' variables that I'd like to perform a simple 'row-wise' percent change on. For example: (exports_now - exports_past) / exports_past.

These two questions accomplish this, but when I try a similar method I get an error saying that my function deltas received an unexpected keyword argument axis.

How to apply a function to two columns of Pandas dataframe
Pandas: How to use apply function to multiple columns

Data Example :

exports_past    exports_now    imports_past    imports_now    etc. (6 other pairs)
    .23             .45            .43             .22               1.23
    .13             .21            .47             .32                .23
     0               0             .41             .42                .93
    .23             .66            .43             .22                .21
     0              .12            .47             .21               1.23

Following the answer in the first question,

My solution is to use a function like this:

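import numpy as np

def deltas(row):
    '''
    simple pct change
    '''
    # compare the values directly; int() would truncate e.g. 0.23 to 0
    if row[0] == 0 and row[1] == 0:
        return 0
    elif row[0] == 0:
        # change from zero to a nonzero value is undefined
        return np.nan
    else:
        return (row[1] - row[0]) / row[0]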

And apply the function like this:

df['exports_delta'] = df.groupby(['exports_past', 'exports_now']).apply(deltas, axis=1)

This generates the error: TypeError: deltas() got an unexpected keyword argument 'axis'

Any ideas on how to get around the axis parameter error? Or is there a more elegant way to calculate the percent change? The kicker is that I need to be able to apply this function across several different column pairs, so hard-coding the column names, as in the answer to the second question, is undesirable. Thanks!

asked Jul 31 '13 14:07

agconti

1 Answer

Consider using the pct_change Series/DataFrame method to do this.

df.pct_change()
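Note that pct_change compares consecutive rows by default, so for past/now column pairs plain column arithmetic may be simpler. The following is a minimal sketch, not a definitive implementation: the helper name add_deltas, the _past/_now/_delta suffixes, and the sample values are assumptions modeled on the question's data. It keeps the question's convention of 0 when both values are zero and NaN when only the past value is zero.

```python
import numpy as np
import pandas as pd

# Sample data modeled on the question; the column names are assumed.
df = pd.DataFrame({
    "exports_past": [0.23, 0.13, 0.0, 0.23, 0.0],
    "exports_now":  [0.45, 0.21, 0.0, 0.66, 0.12],
    "imports_past": [0.43, 0.47, 0.41, 0.43, 0.47],
    "imports_now":  [0.22, 0.32, 0.42, 0.22, 0.21],
})

def add_deltas(frame, suffix_past="_past", suffix_now="_now"):
    """Add a <name>_delta column for every <name>_past / <name>_now pair.

    Hypothetical helper: the suffix convention is an assumption.
    """
    out = frame.copy()
    for col in frame.columns:
        if not col.endswith(suffix_past):
            continue
        name = col[: -len(suffix_past)]
        past, now = frame[col], frame[name + suffix_now]
        delta = (now - past) / past                # vectorized over all rows
        delta = delta.mask(past == 0, np.nan)      # undefined when past is 0
        delta = delta.mask((past == 0) & (now == 0), 0.0)  # question's rule
        out[name + "_delta"] = delta
    return out

result = add_deltas(df)
```

Because the pairs are discovered from the column suffixes, no column names need to be hard-coded, and no row-wise apply is required.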

The confusion stems from two different (but identically named) apply methods, one on Series/DataFrame and one on groupby.

In [11]: df
Out[11]:
   0  1  2
0  1  1  1
1  2  2  2

The DataFrame apply method takes an axis argument:

In [12]: df.apply(lambda x: x[0] + x[1], axis=0)
Out[12]:
0    3
1    3
2    3
dtype: int64

In [13]: df.apply(lambda x: x[0] + x[1], axis=1)
Out[13]:
0    2
1    4
dtype: int64
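Applied to the question, this suggests dropping the groupby and calling the DataFrame apply on just the two columns of a pair. A minimal sketch, assuming the column names exports_past and exports_now and a small made-up sample; the deltas function here is a variant of the one in the question that uses positional .iloc access and direct comparisons:

```python
import numpy as np
import pandas as pd

# Made-up sample; the column names are assumed from the question.
df = pd.DataFrame({
    "exports_past": [0.23, 0.0, 0.0],
    "exports_now":  [0.45, 0.0, 0.12],
})

def deltas(row):
    """Percent change from row.iloc[0] (past) to row.iloc[1] (now)."""
    past, now = row.iloc[0], row.iloc[1]
    if past == 0 and now == 0:
        return 0.0
    if past == 0:
        return np.nan
    return (now - past) / past

# No groupby: select the pair, then apply row-wise with axis=1.
df["exports_delta"] = df[["exports_past", "exports_now"]].apply(deltas, axis=1)
```

The same call can be repeated for each pair by changing the two selected column names.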

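The groupby apply doesn't take an axis argument; any extra keyword arguments are forwarded to the function you pass, which is why axis ends up in the function call (g is a groupby object):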

In [14]: g.apply(lambda x: x[0] + x[1])
Out[14]:
0    2
1    4
dtype: int64

In [15]: g.apply(lambda x: x[0] + x[1], axis=1)
TypeError: <lambda>() got an unexpected keyword argument 'axis'
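The forwarding behaviour can be seen in a small sketch. The frame, the key/val column names, and the scaled_sum function are made up for illustration and are not from the question:

```python
import pandas as pd

# Made-up data; "key" and "val" are illustrative names.
df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1, 2, 3]})

def scaled_sum(values, factor=1):
    """Sum one group's values and multiply by factor."""
    return values.sum() * factor

g = df.groupby("key")["val"]

plain = g.apply(scaled_sum)              # factor defaults to 1
scaled = g.apply(scaled_sum, factor=10)  # factor is forwarded to scaled_sum

# axis is forwarded the same way, but scaled_sum does not accept it.
try:
    g.apply(scaled_sum, axis=1)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

A keyword the function understands (factor) goes through, while axis reaches scaled_sum and raises the same kind of TypeError as in the question.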

Note that groupby itself does have an axis argument, so you can use it there if you really want to (it is deprecated in recent pandas releases):

In [16]: g1 = df.groupby(0, axis=1)

In [17]: g1.apply(lambda x: x.iloc[0, 0] + x.iloc[1, 0])
Out[17]:
0
1    3
2    3
dtype: int64

answered Sep 21 '22 03:09

Andy Hayden

Tags:

python

pandas