Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I slice a pandas time series on dates not in the index?

I have a time series indexed by datetime.date. Here are the first knots of the series:

1999-12-31  0
2000-06-30  170382.118454
2000-12-29  -319260.443362

I want to slice from the beginning of the series until Dec 28th 2000, but this doesn't work since that date is not in the index (I get a KeyError when I try original_series[:datetime.date(2000,12,28)]. I've also tried converting the index to timestamps, but that gives very spurious results (it manufactures fake knots, see below), so I wondered if there's a good approach to this problem.

test = pd.Series(original_series.values, map(pd.Timestamp, original_series.index))

At a first glance, this looks alright:

1999-12-31         0.000000
2000-06-30    170382.118454
2000-12-29   -319260.443362

But then I try to do my slicing (where do those extra days in January 2000 come from?):

In [84]: test[:'2000-12-28']
Out[84]: 
1999-12-31         0.000000
2000-06-30    170382.118454
2000-01-03    -71073.979016
2000-01-04    100498.744748
2000-01-05     91104.743684
2000-01-06     82290.255459
like image 810
Lars Larsson Avatar asked Oct 20 '15 10:10

Lars Larsson


People also ask

How do I work with dates and times in Pandas?

Pandas has a built-in function called to_datetime()that converts date and time in string format to a DateTime object. As you can see, the 'date' column in the DataFrame is currently of a string-type object. Thus, to_datetime() converts the column to a series of the appropriate datetime64 dtype.

How do I select a specific date in Pandas?

In order to select rows between two dates in pandas DataFrame, first, create a boolean mask using mask = (df['InsertedDates'] > start_date) & (df['InsertedDates'] <= end_date) to represent the start and end of the date range. Then you select the DataFrame that lies within the range using the DataFrame.


1 Answers

You can simply do, if ts is your time.serie:

In [77]: ts = pd.Series([99,65],index=pd.to_datetime(['2000-12-24','2000-12-30']))

In [78]: ts
Out[78]:
2000-12-24    99
2000-12-30    65
dtype: int64

In [79]: ts[ts.index<=pd.to_datetime('2000-12-28')]
Out[79]:
2000-12-24    99
dtype: int64

If you have index as string just proceed with:

ts[ts.index.map(pd.to_datetime)<=pd.to_datetime('2000-12-28')]
like image 93
Colonel Beauvel Avatar answered Sep 28 '22 18:09

Colonel Beauvel