Create multiple pandas DataFrame columns from applying a function with multiple returns

Tags: python, pandas

I'd like to apply a function that returns multiple values to a pandas DataFrame and put the results in separate new columns of that DataFrame.

So given something like this:

import pandas as pd

df = pd.DataFrame(data = {'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract(a, b):
  return (a + b, a - b)

The goal is a single command that calls add_subtract on a and b to create two new columns in df: sum and difference.

I thought something like this might work:

(df['sum'], df['difference']) = df.apply(
    lambda row: add_subtract(row['a'], row['b']), axis=1)

But it yields this error:

----> 9 lambda row: add_subtract(row['a'], row['b']), axis=1)

ValueError: too many values to unpack (expected 2)
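The error is consistent with apply returning one result per row: with three rows there are three items to unpack, but only two targets on the left-hand side. A minimal sketch of one way around that, assuming a recent pandas version where apply returns a Series of tuples here (the name pairs is introduced only for illustration):

```python
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract(a, b):
    return (a + b, a - b)

# One (sum, difference) tuple per row, so three items for three rows
pairs = df.apply(lambda row: add_subtract(row['a'], row['b']), axis=1)

# zip(*pairs) transposes the three pairs into two sequences of three values,
# which matches the two assignment targets
df['sum'], df['difference'] = zip(*pairs)
print(df)
```

This keeps add_subtract unchanged, at the cost of building the intermediate Series of tuples.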

EDIT: In addition to the answers below, the question "pandas apply function that returns multiple values to rows in pandas dataframe" shows that the function can be modified to return a list or a Series, i.e.:

def add_subtract_list(a, b):
  return [a + b, a - b]

df[['sum', 'difference']] = df.apply(
    lambda row: add_subtract_list(row['a'], row['b']), axis=1)

or

def add_subtract_series(a, b):
  return pd.Series((a + b, a - b))

df[['sum', 'difference']] = df.apply(
    lambda row: add_subtract_series(row['a'], row['b']), axis=1)

both work (the latter being equivalent to Wen's accepted answer).
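A caveat on the list variant: as far as I can tell, since pandas 0.23 a function that returns a list (or tuple) from apply with axis=1 produces a Series of lists rather than separate columns, so the two-column assignment may fail on newer versions. The result_type='expand' argument of DataFrame.apply asks for the expansion explicitly. A minimal sketch, assuming pandas 0.23 or later:

```python
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract(a, b):
    return (a + b, a - b)

# result_type='expand' turns each list-like result into columns,
# so apply returns a DataFrame with two columns (labelled 0 and 1)
df[['sum', 'difference']] = df.apply(
    lambda row: add_subtract(row['a'], row['b']),
    axis=1, result_type='expand')
print(df)
```

With this, the original tuple-returning add_subtract can be used unchanged.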

285

asked Dec 26 '17 20:12

Max Ghenis

1 Answer

Wrap the function's return value in pd.Series, so that apply expands each row's result into columns:

df[['sum', 'difference']] = df.apply(
    lambda row: pd.Series(add_subtract(row['a'], row['b'])), axis=1)
df

yields

   a  b  sum  difference
0  1  4    5          -3
1  2  5    7          -3
2  3  6    9          -3
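For this particular function, a row-wise apply may not be needed at all, since + and - already work element-wise on whole columns. A sketch of that alternative, which only applies when the function body is made of vectorized operations like these:

```python
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract(a, b):
    return (a + b, a - b)

# Passing whole columns returns a tuple of two Series,
# which unpacks directly into the two new columns
df['sum'], df['difference'] = add_subtract(df['a'], df['b'])
print(df)
```

On larger frames this usually avoids the per-row overhead of apply and of building a pd.Series for every row.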

142

answered Sep 30 '22 13:09

BENY
