I have a dataframe where each row has a range of years. This is the code to build it.
import pandas as pd
original = pd.DataFrame({'City': ['Paris','Rome','New York', 'Tokyo'], 'Color': ['red', 'orange', 'blue', 'purple'], 'Years': ['2010-2012', '2019-2020', '2015-2018', '2002-2003']})
The table looks something like this.
City Color Years
Paris red 2010-2012
Rome orange 2019-2020
New York blue 2015-2018
Tokyo purple 2002-2003
I want to create a new row for each year in the range of 'Years'. The dataframe should look like this.
City Color Years
Paris red 2010
Paris red 2011
...
New York blue 2018
Tokyo purple 2002
Tokyo purple 2003
This is the code I'm using right now. I'm trying to add a new row for each year, but it only returns an empty dataframe, and I'm not sure why.
df_empty = pd.DataFrame({'City': [], 'Color': [], 'Years': []})
for index, row in original.iterrows():
    dates = [int(s) for s in row['Years'].split("-") if s.isdigit()]
    for i in range(dates[0],dates[1] + 1):
        newrow = row
        newrow.append(pd.Series([str(i)]))
        df_empty.add(newrow)
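For reference, the loop above leaves df_empty untouched for reasons that can be checked against the pandas documentation: DataFrame.add performs element-wise addition and returns a new object rather than appending a row, and Series.append likewise returned a new Series instead of modifying newrow in place (it was removed entirely in pandas 2.0). In addition, newrow = row only binds a second name to the same Series. A minimal sketch of a working loop, assuming the same original frame, collects plain dicts and builds the result once at the end; the names rows and expanded are illustrative and not from the question.

```python
import pandas as pd

original = pd.DataFrame({
    'City': ['Paris', 'Rome', 'New York', 'Tokyo'],
    'Color': ['red', 'orange', 'blue', 'purple'],
    'Years': ['2010-2012', '2019-2020', '2015-2018', '2002-2003'],
})

rows = []  # hypothetical accumulator: one dict per output row
for _, row in original.iterrows():
    # 'YYYY-YYYY' -> two integers
    start, end = (int(s) for s in row['Years'].split('-'))
    for year in range(start, end + 1):
        rows.append({'City': row['City'], 'Color': row['Color'], 'Years': year})

# Build the frame once instead of growing it row by row
expanded = pd.DataFrame(rows, columns=['City', 'Color', 'Years'])
print(expanded)
```

This keeps the shape of the original attempt, but on larger frames the vectorized approaches are likely to be faster than iterrows.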
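The idea is to split the Years column with Series.str.split into a new DataFrame holding the start and end years, then repeat each index value by the number of years in its range. GroupBy.cumcount then numbers the repeated rows within each original index value, and adding that counter to the start year yields every year in the range: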
df = original['Years'].str.split('-', expand=True).astype(int)
original['Years'] = df[0]
df = original.loc[original.index.repeat(df[1] - df[0] + 1)]
df['Years'] += df.groupby(level=0).cumcount()
df = df.reset_index(drop=True)
print (df)
City Color Years
0 Paris red 2010
1 Paris red 2011
2 Paris red 2012
3 Rome orange 2019
4 Rome orange 2020
5 New York blue 2015
6 New York blue 2016
7 New York blue 2017
8 New York blue 2018
9 Tokyo purple 2002
10 Tokyo purple 2003
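To make the repeat and cumcount steps easier to follow, here is a small sketch that prints the intermediate values for the same sample data. The variable names bounds, counts, repeated_index and offsets are introduced only for illustration.

```python
import pandas as pd

original = pd.DataFrame({
    'City': ['Paris', 'Rome', 'New York', 'Tokyo'],
    'Color': ['red', 'orange', 'blue', 'purple'],
    'Years': ['2010-2012', '2019-2020', '2015-2018', '2002-2003'],
})

# Column 0 holds the start year, column 1 the end year
bounds = original['Years'].str.split('-', expand=True).astype(int)

# Number of rows each original row should become
counts = bounds[1] - bounds[0] + 1
print(counts.tolist())          # [3, 2, 4, 2]

# Each index label repeated that many times
repeated_index = original.index.repeat(counts)
print(repeated_index.tolist())  # [0, 0, 0, 1, 1, 2, 2, 2, 2, 3, 3]

# Running counter within each repeated label: the offset from the start year
offsets = original.loc[repeated_index].groupby(level=0).cumcount()
print(offsets.tolist())         # [0, 1, 2, 0, 1, 0, 1, 2, 3, 0, 1]
```

Adding offsets to the repeated start year is what turns 2010, 2010, 2010 into 2010, 2011, 2012.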
Another solution uses DataFrame.explode together with a list comprehension that builds each range from the first 4 and last 4 characters of the Years string:
original['Years'] = [list(range(int(x[:4]), int(x[-4:]) + 1))
                     for x in original['Years']]
original = original.explode('Years').reset_index(drop=True)
print (original)
City Color Years
0 Paris red 2010
1 Paris red 2011
2 Paris red 2012
3 Rome orange 2019
4 Rome orange 2020
5 New York blue 2015
6 New York blue 2016
7 New York blue 2017
8 New York blue 2018
9 Tokyo purple 2002
10 Tokyo purple 2003
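One detail that may matter downstream: according to the pandas documentation, explode leaves the exploded rows with object dtype, so the Years column is not yet numeric even though it prints as integers. A minimal sketch, assuming the same sample data, casts it back afterwards; the name exploded is illustrative.

```python
import pandas as pd

original = pd.DataFrame({
    'City': ['Paris', 'Rome', 'New York', 'Tokyo'],
    'Color': ['red', 'orange', 'blue', 'purple'],
    'Years': ['2010-2012', '2019-2020', '2015-2018', '2002-2003'],
})

# Build one list of years per row from the first and last 4 characters
original['Years'] = [list(range(int(x[:4]), int(x[-4:]) + 1))
                     for x in original['Years']]

exploded = original.explode('Years').reset_index(drop=True)
print(exploded['Years'].dtype)  # dtype right after explode

# Cast to integers so comparisons and arithmetic behave as expected
exploded['Years'] = exploded['Years'].astype(int)
print(exploded['Years'].dtype)
```

The cast is optional if the column is only displayed, but it avoids surprises when filtering or sorting by year.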