Fill rows with consecutive values and above rows using pandas

I have a data frame like this:

df

col1    col2
 1        A 
 3        B
 6        A
 10       C

I want to build a new data frame from the df above such that, wherever the col1 values are not consecutive, a row is added for each missing col1 value, with col2 taken from the row just above.

The data frame I am looking for should be:

df
col1    col2
 1        A
 2        A
 3        B
 4        B
 5        B
 6        A
 7        A
 8        A
 9        A
 10       C

I could do it with a simple for loop, but is there a more Pythonic and efficient way to do it using pandas?

Kallol asked Sep 18 '19 12:09


2 Answers

Here is one way, using set_index(), reindex() and ffill():

df.set_index('col1').reindex(range(df.col1.min(),df.col1.max()+1)).ffill().reset_index()

#df.set_index('col1').reindex(range(df.col1.min(),df.col1.max()+1),method='ffill').reset_index()

   col1 col2
0     1    A
1     2    A
2     3    B
3     4    B
4     5    B
5     6    A
6     7    A
7     8    A
8     9    A
9    10    C
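
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same chain, split into steps. It assumes the sample frame from the question and integer col1 values; the names full_range and filled are only illustrative.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"col1": [1, 3, 6, 10], "col2": ["A", "B", "A", "C"]})

# Every integer from the smallest to the largest col1 value
full_range = range(df["col1"].min(), df["col1"].max() + 1)

filled = (
    df.set_index("col1")    # make col1 the index so reindex aligns on it
      .reindex(full_range)  # missing col1 values become rows with NaN in col2
      .ffill()              # copy the previous col2 value down into the gaps
      .reset_index()        # turn col1 back into a regular column
)

print(filled)
```

Using min() and max() means the range is correct even if col1 is not sorted, although the forward fill only gives the intended result when the original rows are in increasing col1 order.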
anky answered Nov 14 '22 21:11


One way is using reindex with ffill:

(df.set_index('col1')
   .reindex(range(df.col1.iloc[0], df.col1.iloc[-1]+1))
   .ffill()
   .reset_index())

    col1 col2
0     1    A
1     2    A
2     3    B
3     4    B
4     5    B
5     6    A
6     7    A
7     8    A
8     9    A
9    10    C

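Or another way, using Series.repeat (this returns only the filled col2 values as a Series; the gap sizes must be integers, and the last row is kept once via fillna(1)):

df.col2.repeat(df.col1.diff().shift(-1).fillna(1).astype(int)).reset_index(drop=True)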
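
To get both columns back from the repeat approach, col1 has to be rebuilt separately. A rough sketch, assuming col1 holds sorted, strictly increasing integers as in the question (the names counts, col2 and result are only illustrative):

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"col1": [1, 3, 6, 10], "col2": ["A", "B", "A", "C"]})

# Gap to the next col1 value = number of rows each col2 value must cover.
# The last row has no successor, so fillna(1) keeps it exactly once.
counts = df["col1"].diff().shift(-1).fillna(1).astype(int)

# Repeat each col2 value by its count and renumber the index from 0
col2 = df["col2"].repeat(counts).reset_index(drop=True)

# Rebuild col1 as the consecutive range spanned by the repeated values
# (only valid under the sorted, strictly increasing assumption above)
result = pd.DataFrame({
    "col1": list(range(df["col1"].iloc[0], df["col1"].iloc[-1] + 1)),
    "col2": col2,
})

print(result)
```

Unlike the reindex version, this one would silently give wrong results if col1 contained duplicates or was out of order, so it is worth sorting first if that is not guaranteed.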
yatu answered Nov 14 '22 22:11