Apply function to dataframe column element based on value in other column for same row?

I have a dataframe:

import pandas as pd

df = pd.DataFrame(
    {'number': [10, 20, 30, 40], 'condition': ['A', 'B', 'A', 'B']})

df = 
    number    condition
0    10         A
1    20         B
2    30         A
3    40         B

I want to apply a function to each element within the number column, as follows:

 df['number'] = df['number'].apply(lambda x: func(x))

BUT, even though I apply the function to the number column, I want the function to also make reference to the condition column i.e. in pseudo code:

func(n):
    #if the value in corresponding condition column is equal to some set of values:
        # do some stuff to n using the value in condition
        # return new value for n

For a single number, and an example function I would write:

number = 10
condition = 'A'

def func(num, condition):
    if condition == 'A':
        return num * 3
    if condition == 'B':
        return num * 4

func(number, condition)  # returns 30

How can I incorporate the same function to my apply statement written above? i.e. making reference to the value within the condition column, while acting on the value within the number column?

Note: I have read through the docs on np.where(), DataFrame.loc and DataFrame.index but I just cannot figure out how to put them into practice.

I am struggling with the syntax for referencing the other column from within the function, as I need access to both the values in the number and condition column.

As such, my expected output is:

df = 
    number    condition
0    30         A
1    80         B
2    90         A
3    160         B

UPDATE: The above was far too vague. Please see the following:

df1 = pd.DataFrame({'Entries':['man','guy','boy','girl'],'Conflict':['Yes','Yes','Yes','No']})


    Entries    Conflict
0    "man"    "Yes"
1    "guy"    "Yes"
2    "boy"    "Yes"
3    "girl"   "No"

def funcA(d):
    d = d + 'aaa'
    return d
def funcB(d):
    d = d + 'bbb'
    return d

df1['Entries'] = np.where(df1['Conflict'] == 'Yes', funcA, funcB)

Output:
{'Conflict': ['Yes', 'Yes', 'Yes', 'No'],
 'Entries': array(<function funcB at 0x7f4acbc5a500>, dtype=object)}

How can I change the np.where statement above so that it operates on a pandas Series, as mentioned in the comments, and produces the desired output shown below?

Desired Output:

    Entries    Conflict
0    "manaaa"    "Yes"
1    "guyaaa"    "Yes"
2    "boyaaa"    "Yes"
3    "girlbbb"   "No"

Since the question is about applying a function to a column element using the value of another column in the same row, the pandas apply function combined with a lambda seems the most direct fit:

import pandas as pd
df = pd.DataFrame({'number': [10, 20 , 30, 40], 'condition': ['A', 'B', 'A', 'B']})

def func(number,condition):
    multiplier = {'A': 2, 'B': 4}
    return number * multiplier[condition]

df['new_number'] = df.apply(lambda x: func(x['number'], x['condition']), axis=1)

In this example, axis=1 makes apply pass each row of df to the lambda as a Series. The lambda reads that row's 'number' and 'condition' values and passes them to func.

This returns the following result:

df
Out[10]: 
  condition  number  new_number
0         A      10          20
1         B      20          80
2         A      30          60
3         B      40         160
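
To reproduce the expected output from the question exactly (the number column overwritten, with 3 for 'A' and 4 for 'B'), the same pattern can be used with the question's own func. This is only a minimal sketch: the final fallback that returns the number unchanged for any other condition is an assumption, since the question does not say what should happen in that case.

```python
import pandas as pd

df = pd.DataFrame({'number': [10, 20, 30, 40],
                   'condition': ['A', 'B', 'A', 'B']})

def func(num, condition):
    # Multipliers taken from the example function in the question.
    if condition == 'A':
        return num * 3
    if condition == 'B':
        return num * 4
    # Assumption: leave the value unchanged for any other condition.
    return num

# axis=1 hands each row to the lambda; the result replaces the 'number' column.
df['number'] = df.apply(lambda row: func(row['number'], row['condition']), axis=1)
print(df)
```

Assigning to df['number'] instead of a new column overwrites the original values, which is what the expected output in the question shows.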

For the UPDATE case, it's also possible to use the pandas apply function:

df1 = pd.DataFrame({'Entries':['man','guy','boy','girl'],'Conflict':['Yes','Yes','Yes','No']})

def funcA(d):
    d = d + 'aaa'
    return d
def funcB(d):
    d = d + 'bbb'
    return d

df1['Entries'] = df1.apply(lambda x: funcA(x['Entries']) if x['Conflict'] == 'Yes' else funcB(x['Entries']), axis=1)

In this example, apply passes each row of df1 to the lambda, which calls funcA on the row's 'Entries' value when 'Conflict' is 'Yes' and funcB otherwise. The choice between funcA and funcB is made by the if-else expression inside the lambda.

This returns the following result:

df1
Out[12]: 
  Conflict  Entries
0      Yes   manaaa
1      Yes   guyaaa
2      Yes   boyaaa
3       No  girlbbb
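
The np.where attempt in the question fails because it passes the function objects themselves rather than their results. One possible fix, sketched here on the assumption that funcA and funcB also work on a whole Series (they do in this case, because adding a string to a Series of strings is applied element-wise), is to call both functions on the column and let np.where pick between the two results:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Entries': ['man', 'guy', 'boy', 'girl'],
                    'Conflict': ['Yes', 'Yes', 'Yes', 'No']})

def funcA(d):
    return d + 'aaa'

def funcB(d):
    return d + 'bbb'

# Both functions are evaluated on the full column; np.where then selects,
# row by row, the funcA result where Conflict is 'Yes' and funcB otherwise.
df1['Entries'] = np.where(df1['Conflict'] == 'Yes',
                          funcA(df1['Entries']),
                          funcB(df1['Entries']))
print(df1)
```

Note that both branches are computed for every row, so this only suits functions that are cheap and safe to run on all values.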

I don't know about using pandas.DataFrame.apply, but you could define a mapping from each condition to its multiplier (the multiplier dict below) and pass that into your function. Then you can use a list comprehension to calculate the new number for each row based on those conditions:

import pandas as pd
df = pd.DataFrame({'number': [10, 20 , 30, 40], 'condition': ['A', 'B', 'A', 'B']})

multiplier = {'A': 3, 'B': 4}

def func(num, condition, multiplier):
    return num * multiplier[condition]

# zip pairs the values row by row, so this works for any index, not only 0..n-1
df['new_number'] = [func(num, cond, multiplier)
                    for num, cond in zip(df['number'], df['condition'])]

Here's the result:

df
Out[24]: 
  condition  number  new_number
0         A      10          30
1         B      20          80
2         A      30          90
3         B      40         160

There is likely a vectorized, pure-pandas solution that's more "ideal." But this works, too, in a pinch.
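
One way such a vectorized version might look, as a sketch using the multipliers from the question (3 for 'A', 4 for 'B'): Series.map translates each condition into its multiplier, and the two columns are then multiplied element-wise.

```python
import pandas as pd

df = pd.DataFrame({'number': [10, 20, 30, 40],
                   'condition': ['A', 'B', 'A', 'B']})

multiplier = {'A': 3, 'B': 4}

# map() looks up each condition in the dict; the product is computed column-wise.
df['new_number'] = df['number'] * df['condition'].map(multiplier)
print(df)
```

A condition that is missing from the dict maps to NaN, so the corresponding new_number would be NaN as well rather than raising a KeyError.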

Apply function to dataframe column element based on value in other column for same row?

Tags:

python

pandas

numpy

Chuck

2 Answers

Rene B.

blacksite
