
pandas: repeat each row n times and change a column value

Tags: python, pandas

I have input data like this.

NAME | PLACE | DATE
  A  |   X   | 2020-04-30
  B  |   Y   | 2019-04-30

I want to duplicate each row 5 times (6 rows per original row) and increase the year in DATE by one for each copy:

NAME | PLACE | DATE
  A  |   X   | 2020-04-30
  A  |   X   | 2021-04-30
  A  |   X   | 2022-04-30
  A  |   X   | 2023-04-30
  A  |   X   | 2024-04-30
  A  |   X   | 2025-04-30
  B  |   Y   | 2019-04-30
  B  |   Y   | 2020-04-30
  B  |   Y   | 2021-04-30
  B  |   Y   | 2022-04-30
  B  |   Y   | 2023-04-30
  B  |   Y   | 2024-04-30

Is this possible using pandas repeat?

asked Jul 25 '20 06:07 by Tech




3 Answers

Use:

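import numpy as np
import pandas as pd

df['Date'] = pd.to_datetime(df['Date'])

# One DateOffset per output row: years 0..5, repeated for every input row
y = np.array([pd.offsets.DateOffset(years=i) for i in np.tile(range(6), len(df.index))])
df = df.reindex(df.index.repeat(6)).assign(Date=lambda x: x['Date'] + y)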

Details:

Create a NumPy array of DateOffset objects, one per output row, holding the number of years to add to the Date column.

print(y)
array([<DateOffset: years=0>, <DateOffset: years=1>,
       <DateOffset: years=2>, <DateOffset: years=3>,
       <DateOffset: years=4>, <DateOffset: years=5>,
       <DateOffset: years=0>, <DateOffset: years=1>,
       <DateOffset: years=2>, <DateOffset: years=3>,
       <DateOffset: years=4>, <DateOffset: years=5>], dtype=object)
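
To see why np.tile pairs with index.repeat here, a small sketch can help. The two-row frame and the variable names below are only for illustration; the point is that repeat groups copies of the same row together, while tile restarts the 0..5 count for every row.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Name": ["A", "B"]})
n = 6  # copies per row (illustrative name, not from the answer)

# repeat: each index label appears n times in a row
row_labels = df.index.repeat(n)
# tile: the whole 0..n-1 sequence is repeated once per input row
year_steps = np.tile(np.arange(n), len(df))

print(row_labels.tolist())  # [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
print(year_steps.tolist())  # [0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5]
```

Position k in year_steps is therefore the year offset for the k-th repeated row.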

Use reindex with df.index.repeat(6) to repeat each row six times, then use assign to add the year offsets to the Date column.

print(df)
  Name Place       Date
0    A     X 2020-04-30
0    A     X 2021-04-30
0    A     X 2022-04-30
0    A     X 2023-04-30
0    A     X 2024-04-30
0    A     X 2025-04-30
1    B     Y 2019-04-30
1    B     Y 2020-04-30
1    B     Y 2021-04-30
1    B     Y 2022-04-30
1    B     Y 2023-04-30
1    B     Y 2024-04-30
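
If the number of copies needs to vary, the same idea can be wrapped in a small helper. This is only a sketch: repeat_with_year_steps and its parameters are hypothetical names (nothing like it ships with pandas), it assumes the frame has a unique index, and it adds the offsets one element at a time rather than through an object array.

```python
import numpy as np
import pandas as pd


def repeat_with_year_steps(df, date_col, n):
    """Repeat every row n times, adding 0..n-1 years to date_col (hypothetical helper)."""
    # Repeat each row n times; copies of one row stay next to each other
    out = df.reindex(df.index.repeat(n))
    dates = pd.to_datetime(out[date_col])
    # 0..n-1 restarted for every input row, aligned with the repeated rows
    steps = np.tile(np.arange(n), len(df))
    out[date_col] = [d + pd.DateOffset(years=int(k)) for d, k in zip(dates, steps)]
    return out.reset_index(drop=True)


df = pd.DataFrame({
    "Name": ["A", "B"],
    "Place": ["X", "Y"],
    "Date": ["2020-04-30", "2019-04-30"],
})
result = repeat_with_year_steps(df, "Date", 6)
print(result)
```

With n=6 this should give the same twelve rows as above, just with a fresh 0..11 index.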
answered Oct 22 '22 06:10 by Shubham Sharma


Let's try this: convert each date into a list of dates covering the given range, then use DataFrame.explode to turn each element of that list into its own row.

import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B"],
    "Place": ["X", "Y"],
    "Date": ["2020-04-30", "2020-04-30"]
})

expand = 5
print(
    df.assign(
        Date=pd.to_datetime(df.Date)
            .apply(lambda x: [x.replace(year=x.year + i) for i in range(0, expand + 1)])
    ).explode("Date").reset_index(drop=True)
)

   Name Place       Date
0     A     X 2020-04-30
1     A     X 2021-04-30
2     A     X 2022-04-30
3     A     X 2023-04-30
4     A     X 2024-04-30
5     A     X 2025-04-30
6     B     Y 2020-04-30
7     B     Y 2021-04-30
8     B     Y 2022-04-30
9     B     Y 2023-04-30
10    B     Y 2024-04-30
11    B     Y 2025-04-30
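
One caveat worth checking if the data can contain February 29: Timestamp.replace builds the new date literally, while DateOffset clips to the last valid day of the month. A minimal sketch of the difference (the variable names are illustrative):

```python
import pandas as pd

leap_day = pd.Timestamp("2020-02-29")

# DateOffset rolls back to the last valid day of February
shifted = leap_day + pd.DateOffset(years=1)
print(shifted.date())  # 2021-02-28

# replace() tries to build 2021-02-29, which does not exist
try:
    leap_day.replace(year=2021)
    replace_failed = False
except ValueError:
    replace_failed = True
print(replace_failed)  # True
```

For dates like 2020-04-30 both approaches give the same result, so this only matters for leap days.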
answered Oct 22 '22 07:10 by sushanth


Here is a way to do it:

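import pandas as pd

df['DATE'] = pd.to_datetime(df['DATE'])  # offsets need a datetime column
df_out = df.reindex(df.index.repeat(6))  # repeat each row 6 times

# cumcount numbers the copies 0..5 within each NAME; use it as the year offset
df_out['DATE'] += pd.Series([pd.DateOffset(years=i)
                              for i in df_out.groupby('NAME').cumcount()],
                            index=df_out.index)
df_out.reset_index(drop=True)

Output:

     NAME    PLACE       DATE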
0     A       X    2020-04-30
1     A       X    2021-04-30
2     A       X    2022-04-30
3     A       X    2023-04-30
4     A       X    2024-04-30
5     A       X    2025-04-30
6     B       Y    2019-04-30
7     B       Y    2020-04-30
8     B       Y    2021-04-30
9     B       Y    2022-04-30
10    B       Y    2023-04-30
11    B       Y    2024-04-30
answered Oct 22 '22 06:10 by Scott Boston