Pandas Dataframe datetime slicing with Index vs MultiIndex

Tags: python, datetime, slice, pandas, dataframe

With a single-index DataFrame I can do the following:

from datetime import datetime as dt
from pandas import DataFrame, Index, MultiIndex

df2 = DataFrame(data={'data': [1, 2, 3]},
                index=Index([dt(2016, 1, 1),
                             dt(2016, 1, 2),
                             dt(2016, 2, 1)]))

>>> df2['2016-01':'2016-01']
                data
    2016-01-01     1
    2016-01-02     2

>>> df2['2016-01-01' : '2016-01-01']
                data
    2016-01-01     1

Datetime slicing works when you give it a complete date (e.g. 2016-01-01), and it also works when you give it a partial date, such as just the year and month (2016-01). All this works great, but once you introduce a MultiIndex, it only works for complete dates. Partial date slicing no longer seems to work:

df = DataFrame(data={'data': [1, 2, 3]},
               index=MultiIndex.from_tuples([(dt(2016, 1, 1), 2),
                                             (dt(2016, 1, 1), 3),
                                             (dt(2016, 1, 2), 2)],
                                             names=['date', 'val']))


 >>> df['2016-01-01':'2016-01-02']
                            data
     date       val     
     2016-01-01 2           1
                3           2
     2016-01-02 2           3

OK, that's fine, but the partial date fails:

>>> df['2016-01' : '2016-01']
  File "pandas/index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas/index.c:3824)
  File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:3704)
  File "pandas/hashtable.pyx", line 686, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12280)
  File "pandas/hashtable.pyx", line 694, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12231)
KeyError: '2016-01'

(I shortened the traceback).

Any idea if this is possible? Is this a bug? Is there any way to do what I want to do without having to resort to something like:

df.loc[(df.index.get_level_values('date') >= start_date) &
       (df.index.get_level_values('date') <= end_date)]

Any tips, comments, suggestions, etc are MOST appreciated! I've tried a lot of other things to no avail!
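For reference, the mask-based fallback above can be made runnable as a minimal sketch. The `start_date` and `end_date` values and the extra February row are assumptions added here for illustration only; they are not part of the original data.

```python
from datetime import datetime as dt

import pandas as pd

# The question's frame plus one hypothetical February row,
# so the date filter has something to exclude.
df = pd.DataFrame(
    {'data': [1, 2, 3, 4]},
    index=pd.MultiIndex.from_tuples(
        [(dt(2016, 1, 1), 2),
         (dt(2016, 1, 1), 3),
         (dt(2016, 1, 2), 2),
         (dt(2016, 2, 1), 2)],
        names=['date', 'val'],
    ),
)

# Hypothetical bounds covering January 2016.
start_date = dt(2016, 1, 1)
end_date = dt(2016, 1, 31)

# Pull the 'date' level out once, then build a boolean mask over the rows.
dates = df.index.get_level_values('date')
january = df.loc[(dates >= start_date) & (dates <= end_date)]
```

This is verbose, but it does not depend on how a given pandas version treats partial date strings inside a MultiIndex.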

asked Apr 14 '16 11:04

Bryant

2 Answers

Cross-section should work:

df.xs(slice('2016-01-01', '2016-01-01'), level='date')

Documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.xs.html

answered Oct 07 '22 18:10

IanS

Use the pandas IndexSlice for a more pandtastic syntax.

import pandas as pd

idx = pd.IndexSlice
df.loc[idx['2016-01-01':'2016-01-01', :], :]

Remember that pandas label-based slices include both the left and the right endpoint.
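Here is a minimal sketch of the IndexSlice approach, assuming a reasonably recent pandas version and a sorted MultiIndex. It reuses the question's data plus one extra February row (an assumption, added only so a month-level slice has something to exclude). Partial date strings such as '2016-01' appear to be accepted on a datetime level in recent releases; older versions, like the one in the question, may raise a KeyError instead.

```python
from datetime import datetime as dt

import pandas as pd

# The question's frame plus one hypothetical February row.
df = pd.DataFrame(
    {'data': [1, 2, 3, 4]},
    index=pd.MultiIndex.from_tuples(
        [(dt(2016, 1, 1), 2),
         (dt(2016, 1, 1), 3),
         (dt(2016, 1, 2), 2),
         (dt(2016, 2, 1), 2)],
        names=['date', 'val'],
    ),
)

idx = pd.IndexSlice

# Full dates: both endpoints are included, so this keeps
# every row dated 2016-01-01.
one_day = df.loc[idx['2016-01-01':'2016-01-01', :], :]

# Full dates spanning two days.
two_days = df.loc[idx['2016-01-01':'2016-01-02', :], :]

# Partial date (year and month only) on the 'date' level.
# Behavior here may depend on the installed pandas version.
january = df.loc[idx['2016-01':'2016-01', :], :]
```

If the index is not sorted, calling `df.sort_index()` first is usually needed before slicing a MultiIndex this way.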

answered Oct 07 '22 18:10

Little Bobby Tables
