I've got a dataframe of the form:
Contract Date
201501 2014-04-29 1416.0
2014-04-30 1431.1
2014-05-01 1430.6
2014-05-02 1443.9
2014-05-05 1451.6
2014-05-06 1461.4
2014-05-07 1456.0
2014-05-08 1441.1
2014-05-09 1437.8
2014-05-12 1445.2
2014-05-13 1458.2
2014-05-14 1487.6
2014-05-15 1477.6
2014-05-16 1467.9
2014-05-19 1484.9
2014-05-20 1470.5
2014-05-21 1476.9
2014-05-22 1490.0
2014-05-23 1473.3
2014-05-27 1462.5
2014-05-28 1456.3
2014-05-29 1460.5
201507 2014-05-30 1463.5
2014-06-02 1447.5
2014-06-03 1444.4
2014-06-04 1444.7
2014-06-05 1455.9
2014-06-06 1464.0
Where Contract & Date are indices of type int
and datetime64
respectively.
What I want is to select a date range. It works by doing:
df.reset_index('Contract', drop=True).loc['2014-09']
But I hate this as it loses the index/is not very pleasant (I have to do a lot of these).
I think I should be able to do it like this:
df.loc[:,'2014-09']
to bring back all of September 2014's data. In reality, this doesn't work. I can only select a single day by doing:
df.loc[:,'2014-09-02']
Why doesn't my multi-index slicer work?
Pandas needs you to be explicit about whether you're selecting columns or sub-levels of a hierarchical index. In this case, df.loc[:,'2014-09']
fails because pandas tries to get all rows and then look for a column labelled '2014-09'
(which doesn't exist).
Instead, you need to give both levels of the multi-index and the column labels/slice.
To select all of the May 2014 data from your example you could write:
>>> df.loc[(slice(None), '2014-05'), :]
Contract Date
201501 2014-05-01 1430.6
2014-05-02 1443.9
2014-05-05 1451.6
2014-05-06 1461.4
2014-05-07 1456.0
2014-05-08 1441.1
2014-05-09 1437.8
2014-05-12 1445.2
2014-05-13 1458.2
2014-05-14 1487.6
2014-05-15 1477.6
2014-05-16 1467.9
2014-05-19 1484.9
2014-05-20 1470.5
2014-05-21 1476.9
2014-05-22 1490.0
2014-05-23 1473.3
2014-05-27 1462.5
2014-05-28 1456.3
2014-05-29 1460.5
201507 2014-05-30 1463.5
Here [(slice(None), '2014-05'), :]
translates to a slice of [:, '2014-05']
for the rows and [:]
for the columns.
The pd.IndexSlice
object was introduced to make these slice semantics a little easier:
>>> idx = pd.IndexSlice
>>> df.loc[idx[:, '2014-05'], :]
# same slice of DataFrame
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With