I have a dataframe of individuals who each have multiple records. I want to enumerate the record in the sequence for each individual in python. Essentially I would like to create the 'sequence' column in the following table:
patient date sequence
145 20Jun2009 1
145 24Jun2009 2
145 15Jul2009 3
582 09Feb2008 1
582 21Feb2008 2
987 14Mar2010 1
987 02May2010 2
987 12May2010 3
This is essentially the same question as here, but I am working in python and unable to implement the sql solution. I suspect I can use a groupby statement with an iterable count, but have so far been unsuccessful. Thanks!
Use the Python range() Function to Generate Sequences of Numbers. In Python, range is an immutable sequence type, meaning it's a class that generates a sequence of numbers that cannot be modified. The main advantage to the range class over other data types is that it is memory efficient.
You call . groupby() and pass the name of the column that you want to group on, which is "state" . Then, you use ["last_name"] to specify the columns on which you want to perform the actual aggregation. You can pass a lot more than just a single column name to .
I stumbled upon the answer which was embarrassingly simple. The groupby statement has a 'cumcount()' option which will enumerate group items.
df['sequence']=df.groupby('patient').cumcount()
The caveat is that the records have to be in the order you want them enumerated.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With