
Fill the numbers between two columns in Pandas data frame

I have a Pandas DataFrame with the following columns:

id  start  end
1   101    101
2   102    104
3   108    109

I want to fill the gaps between start and end with additional rows, so that the output looks like this:

id  number
1    101
2    102
2    103
2    104
3    108
3    109

Is there any way to do this in Pandas? Thanks.

Calvin asked Sep 14 '25 04:09


1 Answer

Use a nested list comprehension with range to build a flat list of (id, number) tuples, then pass that list to the DataFrame constructor:

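import pandas as pd

# Pair each id with its start and end values, row by row
zipped = zip(df['id'], df['start'], df['end'])

# Emit one (id, number) tuple for every integer from start to end inclusive
df = pd.DataFrame([(i, y) for i, s, e in zipped for y in range(s, e + 1)],
                  columns=['id', 'number'])
print(df)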
   id  number
0   1     101
1   2     102
2   2     103
3   2     104
4   3     108
5   3     109
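
For comparison, here is a self-contained sketch of a different route that uses DataFrame.explode (available from pandas 0.25) instead of the list comprehension above. It is not part of the original answer. The sample frame is rebuilt from the question, the helper name expand_ranges is hypothetical, and the code assumes start <= end in every row, since an empty range would produce a missing value that cannot be cast to an integer.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "id": [1, 2, 3],
    "start": [101, 102, 108],
    "end": [101, 104, 109],
})


def expand_ranges(frame):
    """Hypothetical helper: one output row per integer in [start, end]."""
    # Build one list of integers per input row (assumes start <= end)
    numbers = pd.Series(
        [list(range(s, e + 1)) for s, e in zip(frame["start"], frame["end"])],
        index=frame.index,
    )
    return (
        frame[["id"]]
        .assign(number=numbers)
        .explode("number")               # one row per list element
        .astype({"number": "int64"})     # explode leaves an object column
        .reset_index(drop=True)
    )


result = expand_ranges(df)
print(result)
```

Both approaches build Python-level ranges row by row, so this variant is mainly a matter of style: it keeps the expansion step inside a pandas method chain and leaves the original df untouched.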
jezrael answered Sep 15 '25 19:09