
converting daily stock data to weekly-based via pandas in Python

I've got a DataFrame storing daily-based data which is as below:

    Date              Open        High         Low       Close   Volume
    2010-01-04   38.660000   39.299999   38.509998   39.279999  1293400
    2010-01-05   39.389999   39.520000   39.029999   39.430000  1261400
    2010-01-06   39.549999   40.700001   39.020000   40.250000  1879800
    2010-01-07   40.090000   40.349998   39.910000   40.090000   836400
    2010-01-08   40.139999   40.310001   39.720001   40.290001   654600
    2010-01-11   40.209999   40.520000   40.040001   40.290001   963600
    2010-01-12   40.160000   40.340000   39.279999   39.980000  1012800
    2010-01-13   39.930000   40.669998   39.709999   40.560001  1773400
    2010-01-14   40.490002   40.970001   40.189999   40.520000  1240600
    2010-01-15   40.570000   40.939999   40.099998   40.450001  1244200

What I intend to do is to merge it into weekly-based data. After grouping:

  1. Date should be every Monday (holidays must be considered here: when Monday is not a trading day, the first trading day of the current week should be used as the Date).
  2. Open should be Monday's (or the first trading day of the current week's) Open.
  3. Close should be Friday's (or the last trading day of the current week's) Close.
  4. High should be the highest High of the trading days in the current week.
  5. Low should be the lowest Low of the trading days in the current week.
  6. Volume should be the sum of all Volumes of the trading days in the current week.

which should look like this:

    Date              Open        High         Low       Close   Volume
    2010-01-04   38.660000   40.700001   38.509998   40.290001  5925600
    2010-01-11   40.209999   40.970001   39.279999   40.450001  6234600

Currently, my code snippet is as below. Which function should I use to map the daily-based data to the expected weekly-based data? Many thanks!

    import datetime

    import pandas_datareader.data as web

    start = datetime.datetime(2010, 1, 1)
    end = datetime.datetime(2016, 12, 31)
    # The original call also passed session=session, but no session was defined in the snippet.
    f = web.DataReader("MNST", "yahoo", start, end)
    print(f)
Judking asked Jan 04 '16 18:01


2 Answers

You can resample (to weekly), offset (shift), and apply aggregation rules as follows:

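    import pandas as pd

    # Aggregation rule for each column
    logic = {'Open'  : 'first',
             'High'  : 'max',
             'Low'   : 'min',
             'Close' : 'last',
             'Volume': 'sum'}

    f = pd.read_clipboard(parse_dates=['Date'], index_col=['Date'])

    # 'W' bins end on Sunday, so shift the labels back six days to Monday.
    # (Older pandas did this with resample('W', loffset=...); loffset has since been removed.)
    weekly = f.resample('W').agg(logic)
    weekly.index = weekly.index - pd.Timedelta(days=6)
    weekly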

to get:

                     Open       High        Low      Close   Volume
    Date
    2010-01-04  38.660000  40.700001  38.509998  40.290001  5925600
    2010-01-11  40.209999  40.970001  39.279999  40.450001  6234600
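As a variation that is not part of the answer above, the same weekly bars can be obtained without shifting the index afterwards, by anchoring the bins on Mondays. This is a minimal sketch, assuming the ten sample rows from the question are loaded into a DataFrame (here called `daily`, a name chosen for illustration):

```python
import pandas as pd

# The ten sample rows from the question.
rows = [
    ("2010-01-04", 38.660000, 39.299999, 38.509998, 39.279999, 1293400),
    ("2010-01-05", 39.389999, 39.520000, 39.029999, 39.430000, 1261400),
    ("2010-01-06", 39.549999, 40.700001, 39.020000, 40.250000, 1879800),
    ("2010-01-07", 40.090000, 40.349998, 39.910000, 40.090000, 836400),
    ("2010-01-08", 40.139999, 40.310001, 39.720001, 40.290001, 654600),
    ("2010-01-11", 40.209999, 40.520000, 40.040001, 40.290001, 963600),
    ("2010-01-12", 40.160000, 40.340000, 39.279999, 39.980000, 1012800),
    ("2010-01-13", 39.930000, 40.669998, 39.709999, 40.560001, 1773400),
    ("2010-01-14", 40.490002, 40.970001, 40.189999, 40.520000, 1240600),
    ("2010-01-15", 40.570000, 40.939999, 40.099998, 40.450001, 1244200),
]
daily = pd.DataFrame(rows, columns=["Date", "Open", "High", "Low", "Close", "Volume"])
daily["Date"] = pd.to_datetime(daily["Date"])
daily = daily.set_index("Date")

logic = {"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"}

# 'W-MON' anchors the bin edges on Mondays. closed='left' and label='left'
# make each bin cover [Monday, next Monday) and label it with its Monday.
weekly = daily.resample("W-MON", closed="left", label="left").agg(logic)
print(weekly)
```

Note that the label is always the calendar Monday, even when that Monday was not a trading day, so this does not cover the holiday case from point 1 of the question.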
Stefan answered Oct 19 '22 23:10


In general, assuming that you have the dataframe in the form you specified, you need to do the following steps:

  1. put Date in the index
  2. resample the index.

What you have is a case of applying different functions to different columns.

You can resample in various ways, e.g. you can take the mean of the values, the count, and so on. Check the pandas resample documentation.

You can also apply custom aggregators (check the same link). With that in mind, the code snippet for your case can be given as:

    import pandas as pd

    f['Date'] = pd.to_datetime(f['Date'])
    f.set_index('Date', inplace=True)
    f.sort_index(inplace=True)

    # Positional access; assumes every week has at least one row.
    def take_first(array_like):
        return array_like.iloc[0]

    def take_last(array_like):
        return array_like.iloc[-1]

    # Weekly resample. Older pandas took these rules via how= and shifted the
    # labels via loffset=; both arguments have since been removed.
    output = f.resample('W').agg({'Open': take_first,
                                  'High': 'max',
                                  'Low': 'min',
                                  'Close': take_last,
                                  'Volume': 'sum'})
    output.index = output.index - pd.Timedelta(days=6)  # to put the labels to Monday

    output = output[['Open', 'High', 'Low', 'Close', 'Volume']]

Here, W signifies a weekly resampling, which by default spans from Monday to Sunday and is labelled with the Sunday. To move the labels to Monday, the index is shifted back by six days. There are several predefined day specifiers; take a look at pandas offsets. You can even define custom offsets.

Coming back to the resampling method: for Open and Close you can specify custom methods that take the first value, the last value, and so on, and pass the functions in the dict given to agg.

This answer assumes that the data is daily, i.e. there is only one entry per day, and that no data is present for non-business days (Saturday and Sunday). Taking the last data point of the week as the one for Friday is therefore OK. If you prefer, you can use a business week instead of 'W'. Also, for more complex data you may want to use groupby to group the data by week and then work on the time indices within each group.
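The groupby route mentioned above could look roughly like the following. It is only a sketch, not the code from the gist: `to_weekly` and `LOGIC` are names made up for illustration, and the Monday 2010-01-11 row is dropped from the question's sample data purely to simulate a holiday. Each week is then labelled with its first actual trading day, as point 1 of the question asks.

```python
import pandas as pd

LOGIC = {"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"}


def to_weekly(daily):
    """Aggregate daily OHLCV rows into weekly bars labelled by the first trading day."""
    daily = daily.sort_index()
    # Monday-to-Sunday period that each row falls into.
    week = daily.index.to_period("W")
    weekly = daily.groupby(week).agg(LOGIC)
    # Earliest date actually present in each week (not necessarily a Monday).
    first_days = daily.index.to_series().groupby(week).min()
    weekly.index = pd.DatetimeIndex(first_days.to_numpy(), name="Date")
    return weekly


# The ten sample rows from the question.
rows = [
    ("2010-01-04", 38.660000, 39.299999, 38.509998, 39.279999, 1293400),
    ("2010-01-05", 39.389999, 39.520000, 39.029999, 39.430000, 1261400),
    ("2010-01-06", 39.549999, 40.700001, 39.020000, 40.250000, 1879800),
    ("2010-01-07", 40.090000, 40.349998, 39.910000, 40.090000, 836400),
    ("2010-01-08", 40.139999, 40.310001, 39.720001, 40.290001, 654600),
    ("2010-01-11", 40.209999, 40.520000, 40.040001, 40.290001, 963600),
    ("2010-01-12", 40.160000, 40.340000, 39.279999, 39.980000, 1012800),
    ("2010-01-13", 39.930000, 40.669998, 39.709999, 40.560001, 1773400),
    ("2010-01-14", 40.490002, 40.970001, 40.189999, 40.520000, 1240600),
    ("2010-01-15", 40.570000, 40.939999, 40.099998, 40.450001, 1244200),
]
daily = pd.DataFrame(rows, columns=["Date", "Open", "High", "Low", "Close", "Volume"])
daily["Date"] = pd.to_datetime(daily["Date"])
daily = daily.set_index("Date")

# Simulated holiday: pretend Monday 2010-01-11 was not a trading day.
with_holiday = daily.drop(pd.Timestamp("2010-01-11"))

weekly = to_weekly(with_holiday)
print(weekly)
```

With the Monday removed, the second bar is labelled 2010-01-12 and opens at that Tuesday's Open, whereas a fixed six-day shift would still label it 2010-01-11.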

btw a gist for the solution can be found at: https://gist.github.com/prithwi/339f87bf9c3c37bb3188

goofd answered Oct 20 '22 00:10