With single indexed dataframe I can do the following:
df2 = DataFrame(data={'data': [1,2,3]},
index=Index([dt(2016,1,1),
dt(2016,1,2),
dt(2016,2,1)]))
>>> df2['2016-01 : '2016-01']
data
2016-01-01 1
2016-01-02 2
>>> df2['2016-01-01' : '2016-01-01']
data
2016-01-01 1
Date time slicing works when you give it a complete day (i.e. 2016-01-01), and it also works when you give it a partial date, like just the year and month (2016-01). All this works great, but when you introduce a multiindex, it only works for complete dates. The partial date slicing doesn't seem to work anymore
df = DataFrame(data={'data': [1, 2, 3]},
index=MultiIndex.from_tuples([(dt(2016, 1, 1), 2),
(dt(2016, 1, 1), 3),
(dt(2016, 1, 2), 2)],
names=['date', 'val']))
>>> df['2016-01-01 : '2016-01-02']
data
date val
2016-01-01 2 1
3 2
2016-01-02 2 3
ok, thats fine, but the partial date:
>>> df['2016-01' : '2016-01']
File "pandas/index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas/index.c:3824)
File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:3704)
File "pandas/hashtable.pyx", line 686, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12280)
File "pandas/hashtable.pyx", line 694, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12231)
KeyError: '2016-01'
(I shortened the traceback).
Any idea if this is possible? Is this a bug? Is there any way to do what I want to do without having to resort to something like:
df.loc[(df.index.get_level_values('date') >= start_date) &
(df.index.get_level_values('date') <= end_date)]
Any tips, comments, suggestions, etc are MOST appreciated! I've tried a lot of other things to no avail!
from_tuples() function is used to convert list of tuples to MultiIndex. It is one of the several ways in which we construct a MultiIndex.
A multi-level index DataFrame is a type of DataFrame that contains multiple level or hierarchical indexing. You can create a MultiIndex (multi-level index) in the following ways. From a list of arrays using MultiIndex.from_arrays() From an array of tuples using MultiIndex.from_tuples()
A multi-index dataframe has multi-level, or hierarchical indexing. We can easily convert the multi-level index into the column by the reset_index() method. DataFrame. reset_index() is used to reset the index to default and make the index a column of the dataframe.
Cross-section should work:
df.xs(slice('2016-01-01', '2016-01-01'), level='date')
Documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.xs.html
Use the pandas IndexSlice for a more pandtastic syntax.
idx = pd.IndexSlice
df.loc[idx['2016-01-01':'2016-01-01', :], :]
Remember pandas slices are left and right inclusive.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With