I have a DataFrame like this:
a b
0 c1 y
1 c2 n
2 c3 n
3 c4 y
4 c5 y
I want to duplicate the same DataFrame n times.
To do that I used:
pd.concat([df]*3).reset_index(drop=True)
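For reference, a self-contained sketch of that duplication step is shown here. The sample frame is rebuilt from the table above, the variable names `n` and `dup` are only illustrative, and `ignore_index=True` is used as an equivalent of calling `reset_index(drop=True)` afterwards.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "a": ["c1", "c2", "c3", "c4", "c5"],
    "b": ["y", "n", "n", "y", "y"],
})

n = 3  # illustrative: number of copies to stack

# [df] * n is a list holding the same frame n times;
# ignore_index=True renumbers the rows 0..len-1 in the result
dup = pd.concat([df] * n, ignore_index=True)
print(dup)
```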
But now I have a DataFrame like the one below:
a b c
0 c1 y 2017-10-10
1 c2 n 2017-10-10
2 c3 n 2017-10-10
3 c4 y 2017-10-10
4 c5 y 2017-10-10
Here I want to do the same operation, but column c should be incremented by one day for each copy, i.e.:
a b c
0 c1 y 2017-10-10
1 c2 n 2017-10-10
2 c3 n 2017-10-10
3 c4 y 2017-10-10
4 c5 y 2017-10-10
0 c1 y 2017-10-11
1 c2 n 2017-10-11
2 c3 n 2017-10-11
3 c4 y 2017-10-11
4 c5 y 2017-10-11
0 c1 y 2017-10-12
1 c2 n 2017-10-12
2 c3 n 2017-10-12
3 c4 y 2017-10-12
4 c5 y 2017-10-12
0 c1 y 2017-10-13
1 c2 n 2017-10-13
2 c3 n 2017-10-13
3 c4 y 2017-10-13
4 c5 y 2017-10-13
I tried this:
import datetime
import pandas as pd

# one copy per extra day, each with column c shifted forward
df1 = df.copy()
df2 = df.copy()
df3 = df.copy()
df1['c'] = df['c'] + datetime.timedelta(days=1)
df2['c'] = df['c'] + datetime.timedelta(days=2)
df3['c'] = df['c'] + datetime.timedelta(days=3)
print(pd.concat([df, df1, df2, df3]))
My code works, but I'm looking for a more Pythonic, efficient way to solve this.
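One way is to use pd.DataFrame.assign within a list comprehension. Note that this sets c to a fixed start date plus i days, so it assumes every row starts on the same date: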
initial_date = pd.Timestamp('2017-10-10')
# original dataframe already loaded in df
res = pd.concat([df.assign(c=initial_date + pd.Timedelta(days=i)) for i in range(4)])
print(res)
a b c
0 c1 y 2017-10-10
1 c2 n 2017-10-10
2 c3 n 2017-10-10
3 c4 y 2017-10-10
4 c5 y 2017-10-10
0 c1 y 2017-10-11
1 c2 n 2017-10-11
2 c3 n 2017-10-11
3 c4 y 2017-10-11
4 c5 y 2017-10-11
0 c1 y 2017-10-12
1 c2 n 2017-10-12
2 c3 n 2017-10-12
3 c4 y 2017-10-12
4 c5 y 2017-10-12
0 c1 y 2017-10-13
1 c2 n 2017-10-13
2 c3 n 2017-10-13
3 c4 y 2017-10-13
4 c5 y 2017-10-13
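If the rows do not all share one start date, a variant of the same idea can shift the existing column instead of overwriting it. The following is a minimal sketch under a few assumptions: column c already has a datetime dtype, the name `n_copies` is hypothetical, and `ignore_index=True` is added only in case a fresh 0..N-1 index is wanted (drop it to keep the repeated 0..4 index shown above).

```python
import pandas as pd

# Sample data rebuilt from the question; c is stored as datetime64
df = pd.DataFrame({
    "a": ["c1", "c2", "c3", "c4", "c5"],
    "b": ["y", "n", "n", "y", "y"],
    "c": pd.to_datetime(["2017-10-10"] * 5),
})

n_copies = 4  # hypothetical: the original frame plus three shifted copies

# Each copy keeps a and b and shifts every row's own date by i days
res = pd.concat(
    [df.assign(c=df["c"] + pd.Timedelta(days=i)) for i in range(n_copies)],
    ignore_index=True,
)
print(res)
```

Because the offset is added to `df["c"]` row by row, each row keeps its own starting date, whereas a single `initial_date` would replace them all with one value.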