Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas convert yearly to monthly

Tags:

I'm working on pulling financial data, in which some is formatted in yearly and other is monthly. My model will need all of it monthly, therefore I need that same yearly value repeated for each month. I've been using this stack post and trying to adapt the code to my data.

Here is my dataframe:

df.head()

   date ticker value
0 1999-12-31  ECB/RA6  1.0
1 2000-12-31  ECB/RA6  4.0
2 2001-12-31  ECB/RA6  2.0
3 2002-12-31  ECB/RA6  3.0
4 2003-12-31  ECB/RA6  2.0

Here is my desired output first 5 rows:

   date ticker value
0 1999-12-31  ECB/RA6  1.0
1 2000-01-31  ECB/RA6  4.0
2 2000-02-28  ECB/RA6  4.0
3 2000-13-31  ECB/RA6  4.0
4 2000-04-30  ECB/RA6  4.0

And my code:

df['date'] = pd.to_datetime(df['date'], format='%Y-%m')
df = df.pivot(index='date', columns='ticker')
start_date = df.index.min() - pd.DateOffset(day=1)
end_date = df.index.max() + pd.DateOffset(day=31)
dates = pd.date_range(start_date, end_date, freq='M')
dates.name = 'date'
df = df.reindex(dates, method='ffill')

df = df.stack('ticker')
df = df.sortlevel(level=1)
df = df.reset_index()

However, it is not repeating the months as expected

like image 947
jKraut Avatar asked Jul 25 '17 14:07

jKraut


People also ask

How do you convert a date to month and year in Python?

Method 1: Use DatetimeIndex. month attribute to find the month and use DatetimeIndex. year attribute to find the year present in the Date.

How can I convert daily data to monthly in pandas?

You can also convert to month just by using “m” instead of “w”. For Eg.:df. resample(“m”). mean() .

How do you get pandas month name and year?

Pandas dt. month_name() function returns the month names of the DateTimeIndex with specified locale.


1 Answers

You want resample

First, you need to set the index so that resample will work. Then you backfill and reset the index.

df.set_index('date').resample('M').bfill().reset_index()

         date   ticker  value
0  1999-12-31  ECB/RA6    1.0
1  2000-01-31  ECB/RA6    4.0
2  2000-02-29  ECB/RA6    4.0
3  2000-03-31  ECB/RA6    4.0
4  2000-04-30  ECB/RA6    4.0
5  2000-05-31  ECB/RA6    4.0
6  2000-06-30  ECB/RA6    4.0
7  2000-07-31  ECB/RA6    4.0
8  2000-08-31  ECB/RA6    4.0
9  2000-09-30  ECB/RA6    4.0
10 2000-10-31  ECB/RA6    4.0
11 2000-11-30  ECB/RA6    4.0
12 2000-12-31  ECB/RA6    4.0
13 2001-01-31  ECB/RA6    2.0
14 2001-02-28  ECB/RA6    2.0
15 2001-03-31  ECB/RA6    2.0
...

To handle this per ticker

df.set_index('date').groupby('ticker', group_keys=False) \
    .resample('M').bfill().reset_index()
like image 99
piRSquared Avatar answered Sep 30 '22 13:09

piRSquared