Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Add a sequence number to each element in a group using python

I have a dataframe of individuals who each have multiple records. I want to enumerate the record in the sequence for each individual in python. Essentially I would like to create the 'sequence' column in the following table:

patient  date      sequence
145      20Jun2009        1
145      24Jun2009        2
145      15Jul2009        3
582      09Feb2008        1
582      21Feb2008        2
987      14Mar2010        1
987      02May2010        2
987      12May2010        3

This is essentially the same question as here, but I am working in python and unable to implement the sql solution. I suspect I can use a groupby statement with an iterable count, but have so far been unsuccessful. Thanks!

like image 207
DKA Avatar asked Mar 30 '15 18:03

DKA


People also ask

How do you assign a number to a sequence in Python?

Use the Python range() Function to Generate Sequences of Numbers. In Python, range is an immutable sequence type, meaning it's a class that generates a sequence of numbers that cannot be modified. The main advantage to the range class over other data types is that it is memory efficient.

How do you group data together in Python?

You call . groupby() and pass the name of the column that you want to group on, which is "state" . Then, you use ["last_name"] to specify the columns on which you want to perform the actual aggregation. You can pass a lot more than just a single column name to .


1 Answers

I stumbled upon the answer which was embarrassingly simple. The groupby statement has a 'cumcount()' option which will enumerate group items.

df['sequence']=df.groupby('patient').cumcount()

The caveat is that the records have to be in the order you want them enumerated.

like image 74
DKA Avatar answered Oct 13 '22 00:10

DKA