Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get the closest single row after a specific datetime index using Python Pandas

Tags:

python

pandas

DataFrame I have:

            A   B   C 
2012-01-01  1   2   3 
2012-01-05  4   5   6 
2012-01-10  7   8   9 
2012-01-15  10  11  12 

What I am using now:

date_after = dt.datetime( 2012, 1, 7 )
frame.ix[date_after:].ix[0:1]
Out[1]: 
            A  B  C
2012-01-10  7  8  9

Is there any better way of doing this? I do not like that I have to specify .ix[0:1] instead of .ix[0], but if I don't the output changes to a TimeSeries instead of a single row in a DataFrame. I find it harder to work with a rotated TimeSeries back on top of the original DataFrame.

Without .ix[0:1]:

frame.ix[date_after:].ix[0]
Out[1]: 
A    7
B    8
C    9
Name: 2012-01-10 00:00:00

Thanks,

John

like image 267
John Cornwell Avatar asked Mar 26 '12 18:03

John Cornwell


2 Answers

You might want to go directly do the index:

i = frame.index.searchsorted(date)
frame.ix[frame.index[i]]

A touch verbose but you could put it in a function. About as good as you'll get (O(log n))

like image 121
Wes McKinney Avatar answered Sep 18 '22 06:09

Wes McKinney


Couldn't resist answering this, even though the question was asked, and answered, in 2012, by Wes himself, and again in 2015, by ajsp. Yes, besides 'truncate', you can also use get_loc with the option 'backfill' to get the nearest date after the specific date. By the way, if you want to nearest date before the specific date, use 'ffill'. If you just want nearby, use 'nearest'.

df.iloc[df.index.get_loc(datetime.datetime(2016,2,2),method='backfill')]
like image 35
Kevin Zhu Avatar answered Sep 19 '22 06:09

Kevin Zhu