Can't index by timestamp in pandas dataframe

I have an Excel sheet containing dates and some values. I want to load it into a pandas DataFrame and select only the rows that fall between certain dates.

For some reason I cannot select a row by its date index.

Raw Data in Excel file

MCU                         
Timestamp   50D 10P1    10P2    10P3    10P6    10P9    10P12
12-Feb-15   25.17   5.88    5.92    5.98    6.18    6.23    6.33
11-Feb-15   25.9    6.05    6.09    6.15    6.28    6.31    6.39
10-Feb-15   26.38   5.94    6.05    6.15    6.33    6.39    6.46

Code

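import pandas as pd

xls = pd.ExcelFile('e:/Data.xlsx')
# asset is a string defined earlier (not shown); header=1 reads column names from the sheet's second row
vols = xls.parse(asset.upper() + 'VOL', header=1)
vols.set_index('Timestamp', inplace=True)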

Data before set_index

      Timestamp    50D  10P1  10P2  10P3  10P6  10P9  10P12  25P1  25P2  \
0    2015-02-12  25.17  5.88  5.92  5.98  6.18  6.23   6.33  2.98  3.08   
1    2015-02-11  25.90  6.05  6.09  6.15  6.28  6.31   6.39  3.12  3.17   
2    2015-02-10  26.38  5.94  6.05  6.15  6.33  6.39   6.46  3.01  3.16  

Data after set_index

              50D  10P1  10P2  10P3  10P6  10P9  10P12  25P1  25P2  25P3  \
Timestamp                                                                  
2015-02-12  25.17  5.88  5.92  5.98  6.18  6.23   6.33  2.98  3.08  3.21   
2015-02-11  25.90  6.05  6.09  6.15  6.28  6.31   6.39  3.12  3.17  3.32   
2015-02-10  26.38  5.94  6.05  6.15  6.33  6.39   6.46  3.01  3.16  3.31  

Output

>>> vols.index
<class 'pandas.tseries.index.DatetimeIndex'>
[2015-02-12, ..., NaT]
Length: 1478, Freq: None, Timezone: None

>>> vols[date(2015,2,12)]
*** KeyError: datetime.date(2015, 2, 12)

I would expect this not to fail, and I should also be able to select a range of dates. I have tried many combinations without success.

Delta_Fore asked Feb 14 '15 18:02

1 Answer

A datetime.date instance won't work for looking up the index here; you just need a string representation of the date, e.g. '2015-02-12' or '2015/02/14'.

Secondly, vols[date(2015,2,12)] actually looks in your DataFrame's column headings, not the index. Use loc to select rows by index label instead; for example, vols.loc['2015-02-12'].
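
As a rough sketch of both lookups asked about in the question (a single date and a range of dates), the snippet below rebuilds the first three rows of the sheet by hand rather than reading the Excel file. The names one_day, vols_sorted, date_range, mask and masked are only illustrative. The sort before slicing is an assumption on my part: the index in the question is in descending order and contains NaT, and label slicing is most predictable on a sorted index.

```python
import pandas as pd

# Hand-built stand-in for the sheet in the question (three rows, two columns).
vols = pd.DataFrame(
    {
        "Timestamp": pd.to_datetime(["2015-02-12", "2015-02-11", "2015-02-10"]),
        "50D": [25.17, 25.90, 26.38],
        "10P1": [5.88, 6.05, 5.94],
    }
).set_index("Timestamp")

# Single date: .loc searches the row index, and a date string is accepted.
one_day = vols.loc["2015-02-12"]

# Range of dates: sort ascending first, then slice with strings.
# Label slices with .loc include both endpoints.
vols_sorted = vols.sort_index()
date_range = vols_sorted.loc["2015-02-10":"2015-02-11"]

# Alternative that does not depend on the index order: a boolean mask.
mask = (vols.index >= pd.Timestamp("2015-02-10")) & (
    vols.index <= pd.Timestamp("2015-02-11")
)
masked = vols[mask]

print(one_day)
print(date_range)
print(masked)
```

If you already hold a datetime.date object, wrapping it as pd.Timestamp(date(2015, 2, 12)) before passing it to loc is another option.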

Alex Riley answered Oct 05 '22 04:10