
If condition with a dataframe

I want to multiply df["tg"] by five in the rows where it is greater than 10 and less than 32, and divide it by two in all other rows. However, I get the following error:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

import pandas as pd

d = {'year': [2001, 2001, 2001, 2001, 2001, 2001, 2001, 2001],
     'day': [1, 2, 3, 4, 1, 2, 3, 4],
     'month': [1, 1, 1, 1, 2, 2, 2, 2],
     'tg': [10, 11, 12, 13, 50, 21, -1, 23],
     'rain': [1, 2, 3, 2, 4, 1, 2, 1]}
df = pd.DataFrame(data=d)
print(df)


[OUT]

   year  day  month  tg  rain
0  2001    1      1  10     1
1  2001    2      1  11     2
2  2001    3      1  12     3
3  2001    4      1  13     2
4  2001    1      2  50     4
5  2001    2      2  21     1
6  2001    3      2  -1     2
7  2001    4      2  23     1

df["score"] = (df["tg"] * 5) if ((df[df["tg"] > 10]) and (df[df["tg"] < 32])) else (df["tg"] / 2) 

[OUT]
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What I want

   year  day  month  tg  rain  score
0  2001    1      1  10     1    5.0
1  2001    2      1  11     2   55.0
2  2001    3      1  12     3   60.0
3  2001    4      1  13     2   65.0
4  2001    1      2  50     4   25.0
5  2001    2      2  21     1  105.0
6  2001    3      2  -1     2   -0.5
7  2001    4      2  23     1  115.0

Mr. Hankey asked Nov 04 '21 16:11



2 Answers

You can use where:

df['score'] = (df['tg'] * 5).where(df['tg'].between(10, 32, inclusive='neither'), df['tg'] / 2)
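
For context on why the original attempt fails: Python's `and` needs a single True/False value from each operand, and a DataFrame or Series holding many values cannot supply one, hence the ValueError. A minimal sketch of the same idea with the mask written out explicitly, assuming the strict bounds stated in the question (> 10 and < 32) and using only the tg column for brevity:

```python
import pandas as pd

df = pd.DataFrame({"tg": [10, 11, 12, 13, 50, 21, -1, 23]})

# `&` combines the two boolean Series row by row.
# `and` would ask each Series for one truth value and raise ValueError.
# The parentheses are required because `&` binds tighter than `>` and `<`.
mask = (df["tg"] > 10) & (df["tg"] < 32)

# Keep tg * 5 where the mask is True, otherwise use tg / 2.
df["score"] = (df["tg"] * 5).where(mask, df["tg"] / 2)
print(df)
```

The mask is an ordinary boolean Series, so it can be printed or reused to check which rows take which branch.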
mozway answered Sep 25 '22 16:09


Use np.where:

import numpy as np

# Strict bounds, matching `> 10` and `< 32` in the question.
# pandas >= 1.3 takes a string here; older versions used inclusive=False.
mask = df['tg'].between(10, 32, inclusive='neither')
df['score'] = df['tg'] * np.where(mask, 5, 1/2)

# or
# df['score'] = np.where(mask, df['tg'] * 5, df['tg'] / 2)

Output:

   year  day  month  tg  rain  score
0  2001    1      1  10     1    5.0
1  2001    2      1  11     2   55.0
2  2001    3      1  12     3   60.0
3  2001    4      1  13     2   65.0
4  2001    1      2  50     4   25.0
5  2001    2      2  21     1  105.0
6  2001    3      2  -1     2   -0.5
7  2001    4      2  23     1  115.0
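
Whether the bounds are inclusive only matters for rows where tg is exactly 10 or 32. A small sketch of the difference, written with explicit comparisons instead of between so that it does not depend on the pandas version; the variable names are illustrative and not from the answer above:

```python
import numpy as np
import pandas as pd

tg = pd.Series([10, 11, 12, 13, 50, 21, -1, 23], name="tg")

strict = (tg > 10) & (tg < 32)       # endpoints excluded
inclusive = (tg >= 10) & (tg <= 32)  # endpoints included

# np.where picks tg * 5 where the mask is True and tg / 2 elsewhere.
score_strict = np.where(strict, tg * 5, tg / 2)
score_inclusive = np.where(inclusive, tg * 5, tg / 2)

# Only the tg == 10 row differs between the two variants.
print(score_strict[0], score_inclusive[0])
```

With this data the strict version gives 5.0 for the first row, which is what the output table shows, while the inclusive version would give 50.0.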
Quang Hoang answered Sep 23 '22 16:09