Python Pandas: Create Column That Acts As A Conditional Running Variable

Question

I'm trying to create a new dataframe column that acts as a running variable that resets to zero or "passes" under certain conditions. Below is a simplified example of what I'm looking to accomplish. Let's say I'm trying to quit drinking coffee and I'm tracking the number of days in a row I've gone without drinking any. On days where I forgot to note whether I drank coffee, I put "forgot", and my tally is left unchanged.

Below is how I'm currently accomplishing this, though I suspect there's a much more efficient way of going about it.

Thanks in advance!

import pandas as pd

Day = [1,2,3,4,5,6,7,8,9,10,11]  
DrankCoffee = ['no','no','forgot','yes','no','no','no','no','no','yes','no']

df = pd.DataFrame(list(zip(Day,DrankCoffee)), columns=['Day','DrankCoffee'])

df['Streak'] = 0  

s = 0

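# Walk the rows in order, carrying the current streak length in s.
for index, row in df.iterrows():
    if row['DrankCoffee'] == 'no':
        s += 1  # another coffee-free day extends the streak
    elif row['DrankCoffee'] == 'yes':
        s = 0   # drinking coffee resets the streak
    # any other value (e.g. 'forgot') leaves s unchanged

    df.loc[index, 'Streak'] = s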


Maarten Fabré · Accepted Answer

You can use groupby.transform.

Within each streak, what you're looking for is a running count of the 'no' days, something like this:

def my_func(group):
    return (group == 'no').cumsum()

You can separate the different streaks with a simple comparison and cumsum, so that each 'yes' starts a new group:

streak = (df['DrankCoffee'] == 'yes').cumsum()
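
To see how this key splits the rows, here is a small sketch that rebuilds the question's data and prints the group id for each day. The name group_id is only illustrative and is not part of the answer.

```python
import pandas as pd

# Sample data copied from the question.
drank_coffee = ['no', 'no', 'forgot', 'yes', 'no', 'no',
                'no', 'no', 'no', 'yes', 'no']
df = pd.DataFrame({'Day': list(range(1, 12)), 'DrankCoffee': drank_coffee})

# The comparison gives True on 'yes' rows; cumsum turns that into a counter
# that increases by one at every 'yes'. All rows from one 'yes' up to the
# row before the next 'yes' therefore share the same id.
group_id = (df['DrankCoffee'] == 'yes').cumsum()

print(group_id.tolist())
# [0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2]
```

Days 1 to 3 fall in group 0, days 4 to 9 in group 1, and days 10 and 11 in group 2, so a per-group count restarts exactly where the tally should reset.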

Then apply the transform:

df['Streak'] = df.groupby(streak)['DrankCoffee'].transform(my_func)
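
Put together, the pieces can be run end to end on the question's data. The sketch below is one way to assemble them. The variable alt is an assumption added for comparison: it is a variant that skips the helper function and is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Day': list(range(1, 12)),
    'DrankCoffee': ['no', 'no', 'forgot', 'yes', 'no', 'no',
                    'no', 'no', 'no', 'yes', 'no'],
})

def my_func(group):
    # Running count of 'no' days within one group.
    return (group == 'no').cumsum()

# A new group starts at every 'yes', which is where the streak resets.
streak = (df['DrankCoffee'] == 'yes').cumsum()
df['Streak'] = df.groupby(streak)['DrankCoffee'].transform(my_func)

# Variant without a helper function (not from the answer): mark the 'no'
# rows as 1 and take the cumulative sum inside each group.
alt = (df['DrankCoffee'] == 'no').astype(int).groupby(streak).cumsum()

print(df['Streak'].tolist())
# [1, 2, 2, 0, 1, 2, 3, 4, 5, 0, 1]
```

The 'forgot' row on day 3 keeps the tally at 2 because it adds nothing to the count and does not start a new group, which matches the result of the loop in the question.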

Tags:

python

python-3.x

pandas

dataframe

conditional-statements

crowsnest
