I have input data like this:
NAME | PLACE | DATE
A | X | 2020-04-30
B | Y | 2019-04-30
I want to duplicate each row 5 times and increase the year of the date by one in each copy:
NAME | PLACE | DATE
A | X | 2020-04-30
A | X | 2021-04-30
A | X | 2022-04-30
A | X | 2023-04-30
A | X | 2024-04-30
A | X | 2025-04-30
B | Y | 2019-04-30
B | Y | 2020-04-30
B | Y | 2021-04-30
B | Y | 2022-04-30
B | Y | 2023-04-30
B | Y | 2024-04-30
Is it possible to do this using pandas repeat?
The pandas Series.repeat() function repeats the elements of a Series. It returns a new Series in which each element of the current Series is repeated consecutively a given number of times. The number of repetitions for each element must be a non-negative integer.
For a NumPy array, the numpy.repeat() function repeats each element of the array a given number of times.
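As a minimal sketch of what these two functions do (the sample values are made up for illustration), note that Series.repeat keeps the original index labels on the repeated elements, which is what makes it usable for duplicating DataFrame rows:

```python
import numpy as np
import pandas as pd

s = pd.Series(["A", "B"])

# Series.repeat repeats each element consecutively and keeps
# the original index label on every copy.
repeated = s.repeat(3)
print(repeated.tolist())        # ['A', 'A', 'A', 'B', 'B', 'B']
print(repeated.index.tolist())  # [0, 0, 0, 1, 1, 1]

# numpy.repeat behaves the same way on a plain array.
arr = np.repeat(np.array([2020, 2019]), 3)
print(arr.tolist())             # [2020, 2020, 2020, 2019, 2019, 2019]
```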
Use:
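import numpy as np
import pandas as pd
df['Date'] = pd.to_datetime(df['Date'])
# One DateOffset per output row: years 0..5, repeated once per input row
y = np.array([pd.offsets.DateOffset(years=i) for i in np.tile(range(6), len(df.index))])
df = df.reindex(df.index.repeat(6)).assign(Date=lambda x: x['Date'] + y)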
Details:
Create a np.array of DateOffset objects that need to be added to the Date column to get the desired year offset.
print(y)
array([<DateOffset: years=0>, <DateOffset: years=1>,
<DateOffset: years=2>, <DateOffset: years=3>,
<DateOffset: years=4>, <DateOffset: years=5>,
<DateOffset: years=0>, <DateOffset: years=1>,
<DateOffset: years=2>, <DateOffset: years=3>,
<DateOffset: years=4>, <DateOffset: years=5>], dtype=object)
Use reindex to repeat each row of the dataframe as required, and use assign to add the year offsets to the Date column.
print(df)
Name Place Date
0 A X 2020-04-30
0 A X 2021-04-30
0 A X 2022-04-30
0 A X 2023-04-30
0 A X 2024-04-30
0 A X 2025-04-30
1 B Y 2019-04-30
1 B Y 2020-04-30
1 B Y 2021-04-30
1 B Y 2022-04-30
1 B Y 2023-04-30
1 B Y 2024-04-30
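For reference, the same idea can be written as a self-contained script. This is only a sketch under a few assumptions: the sample frame is rebuilt from the question's data, the names copies, years_ahead and out are illustrative, and the offsets are added in a list comprehension instead of through an object array, which is a variation on the answer above and not the answer's exact code.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (column names as used in this answer).
df = pd.DataFrame({
    "Name": ["A", "B"],
    "Place": ["X", "Y"],
    "Date": ["2020-04-30", "2019-04-30"],
})
df["Date"] = pd.to_datetime(df["Date"])

copies = 6  # the original row plus 5 later years

# Repeat every row `copies` times, then renumber the index 0..11.
out = df.reindex(df.index.repeat(copies)).reset_index(drop=True)

# 0..5 for the copies of the first row, then 0..5 again for the second row.
years_ahead = np.tile(np.arange(copies), len(df))

# Shift each date by its own number of years.
out["Date"] = [
    d + pd.DateOffset(years=int(n)) for d, n in zip(out["Date"], years_ahead)
]
print(out)
```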
Let's try this: convert each single date to a list of dates for the given range, then use DataFrame.explode to turn each element of that list into its own row.
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B"],
    "Place": ["X", "Y"],
    "Date": ["2020-04-30", "2020-04-30"]
})
expand = 5
print(
    df.assign(
        Date=pd.to_datetime(df.Date)
        .apply(lambda x: [x.replace(year=x.year + i) for i in range(0, expand + 1)])
    ).explode("Date").reset_index(drop=True)
)
Name Place Date
0 A X 2020-04-30
1 A X 2021-04-30
2 A X 2022-04-30
3 A X 2023-04-30
4 A X 2024-04-30
5 A X 2025-04-30
6 B Y 2020-04-30
7 B Y 2021-04-30
8 B Y 2022-04-30
9 B Y 2023-04-30
10 B Y 2024-04-30
11 B Y 2025-04-30
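One difference between the approaches is worth checking if the data can contain a leap day. The sketch below uses a hypothetical date of 2020-02-29, which is not in the question's data: Timestamp.replace raises an error when the target year has no 29 February, while adding a DateOffset falls back to the last valid day of the month.

```python
import pandas as pd

leap = pd.Timestamp("2020-02-29")  # hypothetical leap-day input

# DateOffset clips to the last valid day of February in the target year.
shifted = leap + pd.DateOffset(years=1)
print(shifted.date())  # 2021-02-28

# Timestamp.replace cannot build 2021-02-29 and raises ValueError.
try:
    leap.replace(year=2021)
    replace_failed = False
except ValueError:
    replace_failed = True
print(replace_failed)  # True
```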
Here is a way to do it:
df_out = df.reindex(df.index.repeat(6))
df_out['DATE'] = pd.to_datetime(df_out['DATE'])  # no-op if DATE is already datetime
df_out['DATE'] += pd.Series([pd.DateOffset(years=i)
                             for i in df_out.groupby('NAME').cumcount()],
                            index=df_out.index)
df_out.reset_index(drop=True)
Output:
NAME PLACE DATE
0 A X 2020-04-30
1 A X 2021-04-30
2 A X 2022-04-30
3 A X 2023-04-30
4 A X 2024-04-30
5 A X 2025-04-30
6 B Y 2019-04-30
7 B Y 2020-04-30
8 B Y 2021-04-30
9 B Y 2022-04-30
10 B Y 2023-04-30
11 B Y 2024-04-30
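Counting within each name works here because every input row has a distinct NAME. As a hedged variation, assuming the input could contain two rows with the same name (the duplicated "A" below is made up for illustration), the counter can be taken over the repeated index labels instead:

```python
import pandas as pd

# Hypothetical input in which two different rows share the same NAME.
df = pd.DataFrame({
    "NAME": ["A", "A"],
    "PLACE": ["X", "Y"],
    "DATE": pd.to_datetime(["2020-04-30", "2019-04-30"]),
})

df_out = df.reindex(df.index.repeat(3))

# Grouping by NAME keeps counting across both source rows...
by_name = df_out.groupby("NAME").cumcount()
print(by_name.tolist())  # [0, 1, 2, 3, 4, 5]

# ...while grouping by the repeated index restarts for each source row.
step = df_out.groupby(level=0).cumcount()
print(step.tolist())     # [0, 1, 2, 0, 1, 2]

# Shift each date by its per-row counter and renumber the index.
df_out["DATE"] = [
    d + pd.DateOffset(years=int(n)) for d, n in zip(df_out["DATE"], step)
]
result = df_out.reset_index(drop=True)
print(result)
```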