Pandas assign value to cell based on values of other cells in row

Given the following data frame:

import pandas as pd
import numpy as np
DF = pd.DataFrame({'COL1': ['a', 'b', 'b'],
                   'COL2': [0, np.nan, 1]})

DF

    COL1    COL2
0    a        0      
1    b       NaN     
2    b        1      

I want to add a new column, COL3, that has a value of 2 for every row where COL1 is 'b' and COL2 is not null, and 0 for every other row.

The desired result is as follows:

    COL1    COL2    COL3
0    a        0      0
1    b       NaN     0
2    b        1      2

Thanks in advance!

Dance Party asked Jan 17 '16 06:01



3 Answers

Define a function that returns the value for a row based on its other columns.

def value_handle(row):
    # 2 when COL1 is 'b' and COL2 is not null, otherwise 0
    if row['COL1'] == 'b' and not pd.isnull(row['COL2']):
        return 2
    else:
        return 0

Then apply the function row by row (axis=1) when creating the new column.

DF['COL3'] = DF.apply(value_handle, axis=1)
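For reference, here is the approach above as a self-contained sketch that rebuilds the DataFrame from the question. The name col3_values is only an illustrative helper for inspecting the result and is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the question.
DF = pd.DataFrame({'COL1': ['a', 'b', 'b'],
                   'COL2': [0, np.nan, 1]})

def value_handle(row):
    # 2 when COL1 is 'b' and COL2 is present, otherwise 0.
    if row['COL1'] == 'b' and not pd.isnull(row['COL2']):
        return 2
    return 0

# axis=1 passes each row (as a Series) to the function.
DF['COL3'] = DF.apply(value_handle, axis=1)

# Hypothetical helper variable, used only to inspect the new column.
col3_values = DF['COL3'].tolist()
print(col3_values)
```

Row-wise apply calls the Python function once per row, so it is easy to read but can be slow on large frames.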
madawa answered Oct 30 '22 07:10


This can be achieved using the apply method on the DataFrame. Pass in the function to apply and set axis=1 so that it is called once per row instead of once per column (the default, axis=0).

Here's a working example:

def row_handler(row):
    if row['COL1'] == 'b' and not np.isnan(row['COL2']):
        return 2
    return 0

DF['COL3'] = DF.apply(row_handler, axis=1)

Which returns this:

>>> print(DF)
  COL1  COL2  COL3
0    a     0     0
1    b   NaN     0
2    b     1     2
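One caveat worth sketching: np.isnan only accepts numeric input, so it works here because COL2 is a float column. If the column could hold None or strings, pd.isnull is the safer check. The helper is_missing_numpy and the checks dictionary below are hypothetical names used only for this illustration.

```python
import numpy as np
import pandas as pd

def is_missing_numpy(value):
    # np.isnan raises TypeError for non-numeric input such as None or str.
    try:
        return bool(np.isnan(value))
    except TypeError:
        return None  # signals that np.isnan could not handle the value

# Each entry pairs the np.isnan-based result with the pd.isnull result.
checks = {
    'float nan': (is_missing_numpy(np.nan), bool(pd.isnull(np.nan))),
    'None': (is_missing_numpy(None), bool(pd.isnull(None))),
    'string': (is_missing_numpy('x'), bool(pd.isnull('x'))),
}

for label, (numpy_result, pandas_result) in checks.items():
    print(label, numpy_result, pandas_result)
```

In this sketch a result of None in the first position means np.isnan rejected the value, while pd.isnull still returns a plain True or False.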
lextoumbourou answered Oct 30 '22 07:10


You can use numpy.where with isin and notnull:

DF['COL3'] = np.where(DF['COL1'].isin(['b']) & DF['COL2'].notnull(), 2, 0)
print(DF)


  COL1  COL2  COL3
0    a     0     0
1    b   NaN     0
2    b     1     2
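As a variation on the same vectorized idea, the condition can be stored as a boolean mask and used with .loc to overwrite only the matching rows. This is a sketch of an alternative, not part of the original answer; mask, loc_values, and where_values are illustrative names.

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the question.
DF = pd.DataFrame({'COL1': ['a', 'b', 'b'],
                   'COL2': [0, np.nan, 1]})

# Boolean mask: True where COL1 is 'b' and COL2 is not missing.
mask = (DF['COL1'] == 'b') & DF['COL2'].notnull()

# Start from the default value, then overwrite only the matching rows.
DF['COL3'] = 0
DF.loc[mask, 'COL3'] = 2

# np.where on the same mask should give the same values.
where_values = np.where(mask, 2, 0).tolist()
loc_values = DF['COL3'].tolist()
print(loc_values, where_values)
```

The .loc form is convenient when only some rows of an existing column need to change, while np.where builds the whole column in one expression.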
jezrael answered Oct 30 '22 08:10