I want to create a new column that repeats the value of another column every 4 rows, using the first row of each block to fill the rows in between. For example, for the following df:
import numpy as np
import pandas as pd
d = {'col1': range(1,10)}
df = pd.DataFrame(data=d)
I would like to create a col2 that looks like this:
col1 col2
1 1
2 1
3 1
4 1
5 5
6 5
7 5
8 5
9 9
This is what I tried:
df['col2'] = np.concatenate([np.repeat(df.col1.values[0::4], 4),
np.repeat(np.NaN, len(df)%3)])
It raises the error: ValueError: Length of values does not match length of index
If I change 4 to 3, the code works because len(df) is 9. I am looking for code that works more generally.
Here is an approach that uses DataFrameGroupBy.cumcount and pandas.Series.shift to create a mask. Use the mask to fill col2 with col1, then use Series.ffill to fill the missing values.
g = df.groupby(df.index % 4).cumcount()
mask = g.ne(g.shift(1))
0 True
1 False
2 False
3 False
4 True
5 False
6 False
7 False
8 True
dtype: bool
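To see why the mask is True on every fourth row, it can help to look at the intermediate cumcount values. The sketch below assumes the default RangeIndex (0, 1, 2, ...), since the grouping key is computed from df.index.

```python
import pandas as pd

df = pd.DataFrame({'col1': range(1, 10)})

# cumcount numbers the rows within each group of equal (index % 4),
# so rows 0-3 are the first member of their group (0), rows 4-7 the
# second (1), and row 8 the third (2).
g = df.groupby(df.index % 4).cumcount()
print(g.tolist())     # [0, 0, 0, 0, 1, 1, 1, 1, 2]

# A row starts a new block when its count differs from the previous row's.
# The first row is compared with NaN, so it is always marked True.
mask = g.ne(g.shift(1))
print(mask.tolist())  # [True, False, False, False, True, False, False, False, True]
```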
df.loc[mask, 'col2'] = df.loc[mask, 'col1']
col1 col2
0 1 1.0
1 2 NaN
2 3 NaN
3 4 NaN
4 5 5.0
5 6 NaN
6 7 NaN
7 8 NaN
8 9 9.0
df['col2'] = df['col2'].ffill()
col1 col2
0 1 1.0
1 2 1.0
2 3 1.0
3 4 1.0
4 5 5.0
5 6 5.0
6 7 5.0
7 8 5.0
8 9 9.0
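As an alternative to the mask and forward fill, the same result can be sketched with plain positional arithmetic. This is a different technique from the one above, not part of the original answer: integer division maps every row to the position of the first row in its block. The helper name repeat_every and its parameter n are made up for illustration. Because no NaN is ever introduced, col2 keeps the integer dtype of col1, and the result does not depend on the index labels.

```python
import numpy as np
import pandas as pd


def repeat_every(series: pd.Series, n: int) -> pd.Series:
    """Hypothetical helper: each block of n rows takes the block's first value."""
    # Positional row numbers 0, 1, 2, ... regardless of the index labels.
    pos = np.arange(len(series))
    # Position of the first row of each block, e.g. 0,0,0,0,4,4,4,4,8 for n=4.
    block_start = (pos // n) * n
    # Pick the values at those positions and keep the original index.
    return pd.Series(series.to_numpy()[block_start],
                     index=series.index, name=series.name)


df = pd.DataFrame({'col1': range(1, 10)})
df['col2'] = repeat_every(df['col1'], 4)
print(df)
```

Calling repeat_every(df['col1'], 3) works the same way for blocks of 3 rows, so the block size no longer has to divide len(df).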