
Pandas/Python: Set value of one column based on value in another column

I need to set the value of one column based on the value of another in a Pandas dataframe. This is the logic:

if df['c1'] == 'Value':
    df['c2'] = 10
else:
    df['c2'] = df['c3']

I am unable to get this to do what I want, which is to simply create a column with new values (or change the value of an existing column: either one works for me).

If I try to run the code above or if I write it as a function and use the apply method, I get the following:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). 
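
For context, here is a minimal sketch of why this error appears, assuming a small made-up frame (the values below are illustrative, not the original data). The comparison produces one boolean per row, while `if` needs a single True or False:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1, 2, 3]})

# The comparison is element-wise: it returns a boolean Series
mask = df['c1'] == 'Value'
print(mask.tolist())  # [False, True, False]

# `if` needs a single boolean, so pandas refuses to guess and raises
raised = False
try:
    if mask:
        pass
except ValueError as exc:
    raised = True
    print(type(exc).__name__)  # ValueError
```

The same thing happens inside a function passed to apply if that function is still handed a whole Series rather than a single value.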
NLR asked Mar 07 '18 21:03



1 Answer

One way to do this is to use indexing with .loc.

Example

In the absence of an example dataframe, I'll make one up here:

import numpy as np
import pandas as pd

df = pd.DataFrame({'c1': list('abcdefg')})
df.loc[5, 'c1'] = 'Value'

>>> df
      c1
0      a
1      b
2      c
3      d
4      e
5  Value
6      g

Suppose you want to create a new column c2 that is identical to c1, except that it holds 10 wherever c1 is 'Value'.

First, create the new column c2 as a copy of c1, using either of the following two lines (they have essentially the same effect):

df = df.assign(c2=df['c1'])
# OR:
df['c2'] = df['c1']
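
The two lines differ in one detail that can matter in longer scripts. A small sketch, reusing the made-up frame from above (the name `df_new` is introduced here only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'c1': list('abcdefg')})

# assign() returns a new DataFrame and leaves the original untouched
df_new = df.assign(c2=df['c1'])
print('c2' in df.columns)      # False
print('c2' in df_new.columns)  # True

# Item assignment adds the column to the existing DataFrame
df['c2'] = df['c1']
print(df.equals(df_new))       # True
```

That is why the assign form needs the result bound back to a name, while the item-assignment form does not.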

Then, find all the indices where c1 is equal to 'Value' using .loc, and assign your desired value in c2 at those indices:

df.loc[df['c1'] == 'Value', 'c2'] = 10 

And you end up with this:

>>> df
      c1  c2
0      a   a
1      b   b
2      c   c
3      d   d
4      e   e
5  Value  10
6      g   g
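
The same two steps carry over to the logic in the question, where the fallback value comes from a third column. A sketch, assuming a made-up c3 column (its values are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({
    'c1': ['a', 'Value', 'b', 'Value'],
    'c3': [1, 2, 3, 4],  # hypothetical values
})

# Step 1: start c2 as a copy of the fallback column
df['c2'] = df['c3']

# Step 2: overwrite only the rows where c1 matches
df.loc[df['c1'] == 'Value', 'c2'] = 10

print(df['c2'].tolist())  # [1, 10, 3, 10]
```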

If, as you suggested in your question, you would rather replace values in the column you already have than create a new column, skip the column creation and do the following:

# Preferred: a single .loc call on the DataFrame itself
df.loc[df['c1'] == 'Value', 'c1'] = 10
# Avoid the chained form df['c1'].loc[df['c1'] == 'Value'] = 10:
# it can trigger a chained-assignment warning, and with copy-on-write
# enabled it does not modify df at all

Giving you:

>>> df
      c1
0      a
1      b
2      c
3      d
4      e
5     10
6      g
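
As an alternative to .loc (a different technique from the one above, not a replacement for it), the if/else from the question could also be written in a single step with numpy's `where`. A sketch, again assuming a hypothetical c3 column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'c1': ['a', 'Value', 'b'],
    'c3': [1, 2, 3],  # hypothetical values
})

# Where the condition holds take 10, otherwise take the value from c3
df['c2'] = np.where(df['c1'] == 'Value', 10, df['c3'])

print(df['c2'].tolist())  # [1, 10, 3]
```

This evaluates the condition for every row at once, which is what the original `if` statement could not do.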
sacuL answered Oct 06 '22 09:10