Create multiple pandas DataFrame columns from applying a function with multiple returns

Tags: python, pandas

I'd like to apply a function with multiple returns to a pandas DataFrame and put the results in separate new columns in that DataFrame.

So given something like this:

import pandas as pd

df = pd.DataFrame(data = {'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract(a, b):
  return (a + b, a - b)

The goal is a single command that calls add_subtract on a and b to create two new columns in df: sum and difference.

I thought something like this might work:

(df['sum'], df['difference']) = df.apply(
    lambda row: add_subtract(row['a'], row['b']), axis=1)

But it yields this error:

----> 9 lambda row: add_subtract(row['a'], row['b']), axis=1)

ValueError: too many values to unpack (expected 2)
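The error seems to come from how unpacking works here: with a tuple return value, `df.apply(..., axis=1)` gives back a single Series holding one `(sum, difference)` tuple per row, so Python tries to unpack three row entries into two targets. A minimal sketch of this, assuming the `df` and `add_subtract` defined above; the name `result` and the `zip(*...)` transposition are illustrative additions, not part of the original attempt:

```python
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract(a, b):
    return (a + b, a - b)

# With axis=1 and a tuple return value, apply yields one Series of tuples
# (one entry per row) rather than two separate columns.
result = df.apply(lambda row: add_subtract(row['a'], row['b']), axis=1)
print(len(result))  # one entry per row of df, not one per returned value

# zip(*result) transposes the per-row tuples into one tuple per output,
# which then unpacks cleanly into the two new columns.
df['sum'], df['difference'] = zip(*result)
print(df)
```

This keeps `add_subtract` unchanged, at the cost of building the intermediate Series of tuples first.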

EDIT: In addition to the answers below, the related question "pandas apply function that returns multiple values to rows in pandas dataframe" shows that the function can be modified to return a list or Series, i.e.:

def add_subtract_list(a, b):
  return [a + b, a - b]

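# On pandas 0.23+, list results are not expanded into columns by default,
# so result_type='expand' is passed explicitly.
df[['sum', 'difference']] = df.apply(
    lambda row: add_subtract_list(row['a'], row['b']), axis=1,
    result_type='expand')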

or

def add_subtract_series(a, b):
  return pd.Series((a + b, a - b))

df[['sum', 'difference']] = df.apply(
    lambda row: add_subtract_series(row['a'], row['b']), axis=1)

Both work (the latter is equivalent to the accepted answer below).
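Since pandas 0.23, `DataFrame.apply` accepts a `result_type` argument, and by default list-like return values are kept as a single column of lists or tuples rather than expanded. With `result_type='expand'`, the original tuple-returning `add_subtract` can probably be used without modification. A minimal sketch, assuming pandas 0.23 or later; the name `expanded` is only for illustration:

```python
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract(a, b):
    return (a + b, a - b)

# result_type='expand' turns each list-like return value into columns,
# so the function does not need to return a list or a Series itself.
expanded = df.apply(
    lambda row: add_subtract(row['a'], row['b']),
    axis=1, result_type='expand')

# expanded is a DataFrame with integer column labels 0 and 1;
# assigning it to a list of new names creates both columns at once.
df[['sum', 'difference']] = expanded
print(df)
```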

Max Ghenis asked Dec 26 '17 20:12




1 Answer

Wrap the returned tuple in pd.Series so that apply expands it into columns:

df[['sum', 'difference']] = df.apply(
    lambda row: pd.Series(add_subtract(row['a'], row['b'])), axis=1)
df

yields

   a  b  sum  difference
0  1  4    5          -3
1  2  5    7          -3
2  3  6    9          -3
BENY answered Sep 30 '22 13:09
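A possible alternative that is not part of the answer above: because `add_subtract` uses only `+` and `-`, which pandas applies elementwise to whole columns, the row-wise `apply` may not be needed for this particular function. A sketch under that assumption (it would not carry over to functions that only work on scalars):

```python
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract(a, b):
    return (a + b, a - b)

# Passing whole columns returns a tuple of two Series,
# which unpacks directly into the two new columns.
df['sum'], df['difference'] = add_subtract(df['a'], df['b'])
print(df)
```

This avoids calling the function once per row, which tends to matter on larger DataFrames.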