I know that you can select rows from a DataFrame indexed by a pandas.DatetimeIndex using pandas.DataFrame.between_time. Is there a convenient way to exclude the rows between two times in pandas?
For example, to exclude data between 16:00 and 17:00, I am currently doing the following.
In [1]: import pandas as pd
   ...: import numpy as np
In [2]: df = pd.DataFrame(np.random.randn(24 * 60 + 1, 2), columns=list("AB"), index=pd.date_range(start="20161013 00:00:00", freq="1min", periods=24 * 60 + 1))
In [3]: idx = df.index.hour == 16
In [4]: df = df[~idx]
In [5]: df.between_time("16:00", "17:00")
Out[5]:
                            A         B
2016-10-13 17:00:00 -0.745892  1.832912
EDIT
I have been able to use this:
In [41]: df2 = df.loc[np.setdiff1d(df.index, df.between_time("16:00", "17:00").index)]
In [42]: df2.between_time("15:59", "17:01")
Out[42]:
                            A         B
2016-10-13 15:59:00  1.190678  0.783776
2016-10-13 17:01:00 -0.590931 -1.059962
Is there a better way?
You can combine between_time with drop:
df2 = df.drop(df.between_time("16:00", "17:00").index)
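For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The frame mirrors the one in the question; the names index and to_exclude are just illustrative, and freq="min" is used as the minute alias on the assumption that you are on a reasonably recent pandas.

```python
import numpy as np
import pandas as pd

# Illustrative frame: one row per minute from 2016-10-13 00:00 to 2016-10-14 00:00.
index = pd.date_range(start="2016-10-13 00:00:00", periods=24 * 60 + 1, freq="min")
df = pd.DataFrame(np.random.randn(len(index), 2), columns=list("AB"), index=index)

# between_time keeps both endpoints by default, so these labels cover 16:00 through 17:00.
to_exclude = df.between_time("16:00", "17:00").index

# drop removes rows by index label and returns a new frame; df itself is unchanged.
df2 = df.drop(to_exclude)

# Nothing should be left inside the excluded window.
remaining_in_window = df2.between_time("16:00", "17:00")
```

Note that drop works on labels, so if the index contained duplicate timestamps inside the window, every row carrying one of those labels would be removed, which is usually what you want here.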
Edit
An alternative method is to exploit the fact that between_time operates circularly, so you can switch the order of your input times to exclude the range between them:
df.between_time("17:00", "16:00", include_start=False, include_end=False)
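On newer pandas the include_start and include_end keywords are gone: they were deprecated in 1.4 and removed in 2.0 in favour of a single inclusive argument. A sketch of the same circular trick, assuming pandas 1.4 or later and the same illustrative frame as in the question, with a cross-check against the drop approach:

```python
import numpy as np
import pandas as pd

# Illustrative frame: one row per minute from 2016-10-13 00:00 to 2016-10-14 00:00.
index = pd.date_range(start="2016-10-13 00:00:00", periods=24 * 60 + 1, freq="min")
df = pd.DataFrame(np.random.randn(len(index), 2), columns=list("AB"), index=index)

# Start later than end, so the selection wraps around midnight.
# inclusive="neither" leaves out both 17:00 and 16:00 (assumes pandas >= 1.4).
outside = df.between_time("17:00", "16:00", inclusive="neither")

# Cross-check: this should match dropping the inclusive 16:00 to 17:00 window.
via_drop = df.drop(df.between_time("16:00", "17:00").index)
same = outside.equals(via_drop)
```

If you want to keep the boundary rows instead, inclusive also accepts "both", "left" and "right".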