I need to create a list that classifies the patients in a df as 'High', 'Medium', or 'Low' risk depending on their BMI and whether they smoke. When I run my current code, I get 'Medium' for all six entries. (Pseudonyms and made-up data have been used.)
import pandas as pd

df = pd.DataFrame({'Name': ['Jordan', 'Jess', 'Jake', 'Alice', 'Alan', 'Lauren'],
                   'Age': [26, 23, 19, 20, 24, 28],
                   'Sex': ['M', 'F', 'M', 'F', 'M', 'F'],
                   'BMI': [26, 22, 24, 17, 35, 20],
                   'Smokes': ['No', 'No', 'Yes', 'No', 'Yes', 'No']})

risk_list = []
for i in df.Name:
    if df.BMI.any() > 30 | df.BMI.any() < 19.99 | df.Smokes.any() == "Yes":
        risk_list.append("High")
    elif df.BMI.any() >= 25 & df.BMI.any() <= 29.99:
        risk_list.append("Medium")
    elif df.BMI.any() < 24.99 & df.BMI.any() > 19.99 and df.Smokes.any() == "No":
        risk_list.append("Low")
print(risk_list)
Output:
['Medium', 'Medium', 'Medium', 'Medium', 'Medium', 'Medium']
I am new to pandas, and to Python for that matter. I think I am close, but I cannot figure out why my data is not being classified correctly.
Thanks.
There are several problems in your code. Just to name a few:
- You need parentheses around each comparison: df.BMI.any() > 30 | df.BMI.any() < 19.99 should be (df.BMI.any() > 30) | (df.BMI.any() < 19.99), because | binds more tightly than > and <.
- & is different from and: & and | are bitwise operators (element-wise on a Series), while and and or work on single truth values.
- Everything inside the loop, e.g. df.BMI.any(), is independent of the row you are looking at (i.e. Name), so you get the same value on every iteration.
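To make the last point concrete, here is a small sketch (the variable names any_bmi and high_mask are only illustrative) showing that .any() collapses the whole column into a single value, while a parenthesized comparison on the column itself gives one result per row:

```python
import pandas as pd

df = pd.DataFrame({'BMI': [26, 22, 24, 17, 35, 20],
                   'Smokes': ['No', 'No', 'Yes', 'No', 'Yes', 'No']})

# .any() reduces the entire column to one truth value,
# so it cannot depend on which row the loop is currently on.
any_bmi = df.BMI.any()
print(any_bmi)

# Comparing the column directly is element-wise: one boolean per row.
# Each comparison is wrapped in parentheses because | binds tighter than > and <.
high_mask = (df.BMI > 30) | (df.BMI < 19.99) | (df.Smokes == 'Yes')
print(high_mask.tolist())
```

The second print gives one True/False per patient, which is the kind of mask that np.select or boolean indexing expects.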
I think you can use np.select:

import numpy as np

np.select([df.BMI.gt(30) | df.BMI.lt(19.99) | df.Smokes.eq('Yes'),
           df.BMI.between(25, 29.99)],
          ['High', 'Medium'], 'Low')
Output:
array(['Medium', 'Low', 'High', 'High', 'High', 'Low'], dtype='<U6')
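If you would rather keep the result next to the patient data than in a separate array, one option is to assign it to a new column. This is only a sketch: the column name Risk and the variable conditions are my own choices, not something from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['Jordan', 'Jess', 'Jake', 'Alice', 'Alan', 'Lauren'],
                   'BMI': [26, 22, 24, 17, 35, 20],
                   'Smokes': ['No', 'No', 'Yes', 'No', 'Yes', 'No']})

# Conditions are checked in order; the first one that is True for a row wins.
conditions = [
    df.BMI.gt(30) | df.BMI.lt(19.99) | df.Smokes.eq('Yes'),  # High
    df.BMI.between(25, 29.99),                                # Medium (both ends inclusive)
]

# Rows matching neither condition fall back to the default, 'Low'.
df['Risk'] = np.select(conditions, ['High', 'Medium'], default='Low')
print(df)
```

Note that with default='Low', every row that is neither High nor Medium is labelled Low, including BMI values that sit in the small gaps between the thresholds.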
In addition to @QuangHoang's answer: iterating over a DataFrame row by row is fairly straightforward. Use .iterrows() rather than looping over your Name column, because a DataFrame isn't a dictionary.
risk_list = []
for _, i in df.iterrows():
    if i.BMI > 30 or i.BMI < 19.99 or i.Smokes == "Yes":
        risk_list.append("High")
    elif i.BMI >= 25 and i.BMI <= 29.99:
        risk_list.append("Medium")
    elif i.BMI < 24.99 and i.BMI > 19.99 and i.Smokes == "No":
        risk_list.append("Low")
>>> print(risk_list)
['Medium', 'Low', 'High', 'High', 'High', 'Low']
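One thing to watch with the loop: the three branches do not cover every case. A non-smoker with a BMI of exactly 30, for example, matches none of them, so nothing is appended and risk_list ends up shorter than df. A possible way around that, sketched here with a hypothetical helper classify_risk and a made-up fallback label 'Unclassified', is to put the same rules in a function with an explicit final return and apply it to each row:

```python
import pandas as pd

def classify_risk(row):
    """Same thresholds as the loop, plus an explicit fallback."""
    if row.BMI > 30 or row.BMI < 19.99 or row.Smokes == "Yes":
        return "High"
    if 25 <= row.BMI <= 29.99:
        return "Medium"
    if 19.99 < row.BMI < 24.99 and row.Smokes == "No":
        return "Low"
    # Hypothetical label for values that fall between the thresholds
    return "Unclassified"

df = pd.DataFrame({'Name': ['Jordan', 'Jess', 'Jake', 'Alice', 'Alan', 'Lauren'],
                   'BMI': [26, 22, 24, 17, 35, 20],
                   'Smokes': ['No', 'No', 'Yes', 'No', 'Yes', 'No']})

# axis=1 passes each row to the function as a Series
df["Risk"] = df.apply(classify_risk, axis=1)
print(df)
```

Because the function always returns something, the new column has exactly one entry per patient, and any gaps in the thresholds show up as 'Unclassified' instead of silently disappearing.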