How can you slice with string keys instead of integers on a python OrderedDict?

Tags:

Since an OrderedDict has the features of both a list (with ordered elements), and a dictionary (with keys instead of indexes), it would seem natural that you could slice using keys.

>>> from collections import OrderedDict
>>> cities = OrderedDict((('san francisco', 650), ('new york', 212), ('shanghai', 8621), ('barcelona', 42423)))
>>> test['shanghai':]  # I want all the cities from shanghai to the end of the list
TypeError: unhashable type

What's interesting about this is that it's not the error you'd see due to OrderedDictionary.__getslice__ not being implemented. I tried adding my own __getslice__ method to OrderedDict, but I keep running into this TypeError problem. It seems like Python is doing some kind of type checking to enforce that slice keys are only integers, before they even get passed to the __getslice__ function, how unpythonic!

>>> class BetterOrderedDict(OrderedDict):
        def __getslice__(self, start=None, end=None, step=1):
            return 'potato'

>>> test = BetterOrderedDict((('one', 1), ('two', 2), ('three', 3), ('four', 4)))
>>> print test[1:4]
'potato'                           # ok this makes sense so far

>>> test['one':'four']
TypeError: unhashable type         # WTF, strings are hashable!

So my question is, why can't I implement non-int slices, what kind of type-checking is preventing the slice keys from even reaching my __getslice__ function, and can I override it by implementing my BetterOrderedDict in C with bindings?

914

asked Jan 12 '15 23:01

Nick Sweeting

3 Answers

__getslice__ is deprecated way of implementing slicing. Instead you should handle slice objects with __getitem__:

from collections import OrderedDict

class SlicableDict(OrderedDict):
    def __getitem__(self, key):
        if isinstance(key, slice):
            return 'potato({},{},{})'.format(key.start, key.stop, key.step)
        return super(SlicableDict, self).__getitem__(key)

>>> s = SlicableDict(a=1, b=2, c=3)
>>> s
SlicableDict([('a', 1), ('c', 3), ('b', 2)])
>>> s['a']
1
>>> s['a':'c']
'potato(a,c,None)'

And if you need more than potato, than you can implement all three slicing operations following way:

def _key_slice_to_index_slice(items, key_slice):
    try:
        if key_slice.start is None:
            start = None
        else:
            start = next(idx for idx, (key, value) in enumerate(items)
                         if key == key_slice.start)
        if key_slice.stop is None:
            stop = None
        else:
            stop = next(idx for idx, (key, value) in enumerate(items)
                        if key == key_slice.stop)
    except StopIteration:
        raise KeyError
    return slice(start, stop, key_slice.step)

class SlicableDict(OrderedDict):
    def __getitem__(self, key):
        if isinstance(key, slice):
            items = self.items()
            index_slice = _key_slice_to_index_slice(items, key)
            return SlicableDict(items[index_slice])
        return super(SlicableDict, self).__getitem__(key)

    def __setitem__(self, key, value):
        if isinstance(key, slice):
            items = self.items()
            index_slice = _key_slice_to_index_slice(items, key)
            items[index_slice] = value.items()
            self.clear()
            self.update(items)
            return
        return super(SlicableDict, self).__setitem__(key, value)

    def __delitem__(self, key):
        if isinstance(key, slice):
            items = self.items()
            index_slice = _key_slice_to_index_slice(items, key)
            del items[index_slice]
            self.clear()
            self.update(items)
            return
        return super(SlicableDict, self).__delitem__(key)

150

answered Oct 02 '22 16:10

zch

This is the actual implementation of the slicing feature you are expecting.

OrderedDict internally maintains the order of the keys in the form of a doubly linked list. Quoting the actual comment from Python 2.7.9,

# The internal self.__map dict maps keys to links in a doubly linked list.
# The circular doubly linked list starts and ends with a sentinel element.
# The sentinel element never gets deleted (this simplifies the algorithm).
# Each link is stored as a list of length three:  [PREV, NEXT, KEY].

Now, to slice the dictionary, we need to iterate the doubly linked list, __root, which is actually a private variable, protected by the name mangling mechanism.

Note: This involves hacky name unmangling to use the OrderedDict's internal data structures.

from collections import OrderedDict

class SlicableDict(OrderedDict):
    def __getitem__(self, key):
        if isinstance(key, slice):
            # Unmangle `__root` to access the doubly linked list
            root = getattr(self, "_OrderedDict__root")
            # By default, make `start` as the first element, `end` as the last
            start, end = root[1][2], root[0][2]
            start = key.start or start
            end = key.stop or end
            step = key.step or 1
            curr, result, begun, counter = root[1], [], False, 0

            # Begin iterating
            curr, result, begun = root[1], [], False
            while curr is not root:
                # If the end value is reached, `break` and `return`
                if curr[2] == end:
                    break
                # If starting value is matched, start appending to `result`
                if curr[2] == start:
                    begun = True
                if begun:
                    if counter % step == 0:
                        result.append((curr[2], self[curr[2]]))
                    counter += 1

                # Make the `curr` point to the next element
                curr = curr[1]

            return result

        return super(SlicableDict, self).__getitem__(key)

Few sample runs:

>>> s = SlicableDict(a=1, b=2, c=3, d=4)
>>> s
SlicableDict([('a', 1), ('c', 3), ('b', 2), ('e', 5), ('d', 4), ('f', 6)])
>>> s['a':'c']
[('a', 1)]
>>> s['a':]
[('a', 1), ('c', 3), ('b', 2), ('e', 5), ('d', 4)]
>>> s[:'a']
[]
>>> s['a':'f':2]
[('a', 1), ('b', 2), ('d', 4)]

answered Oct 02 '22 16:10

thefourtheye

Try this (very ugly) implementation

class SliceOrdered(OrderedDict):

    def __getitem__(self, key):
        if isinstance(key, slice):
            tmp = OrderedDict()
            i_self = iter(self)
            for k in i_self:
                if key.start <= k <= key.stop:
                    tmp[k] = self[k]
                    if key.step is not None and key.step > 1:
                        for _ in range(key.step-1):
                            try:
                                next(i_self)
                            except StopIteration:
                                break
            return tmp
        else:
            return super(SliceOrdered, self).__getitem__(key)

DEMO (Python3.4)

>>> s = SliceOrdered([('a',2), ('b',2), ('c',3), ('d',4)])
>>> s['a':'c']
OrderedDict([('a', 2), ('b', 2), ('c', 3)])
>>> s['a':'d':2]
OrderedDict([('a', 2), ('c', 3)])

N.B. this probably only works because in this example, the OrderedDict was not only ordered, but also sorted. In an unsorted dictionary the slice 'a':'c' does not necessary contain 'b', so my if key.start <= k <= key.stop logic probably fails. The following code should respect that:

class SliceOrdered(OrderedDict):
    def __getitem__(self, key):
        if not isinstance(key, slice):
            return super(SliceOrdered,self).__getitem__(key)
        tmp = OrderedDict()
        step = key.step or 1
        accumulating = False
        i_self = iter(self)
        for k in i_self:
            if k == key.start:
                accumulating = True
            if accumulating:
                tmp[k] = self[k]
                for _ in range(step-1):
                    next(i_self)
            if k == key.stop:
                accumulating = False
                break
        return tmp

answered Sep 29 '22 16:09

Adam Smith

Related questions
                            
                                Inline SVG Served By Python Script in Google App Engine Not Appearing
                            
                                determining whether a MIME type is binary or text-based
                            
                                Adding and removing audio sources to/from GStreamer pipeline on-the-go
                            
                                What is PyObjC?
                            
                                How should I comment partial Python functions?
                            
                                SQLAlchemy equivalent to Django's annotate() method
                            
                                a good solution to set up a rdf triplestore in python?
                            
                                Is there any way to generate tornado localization CSV file like django makemessage?
                            
                                Appending tuples to lists
                            
                                Simple sqlite question
                            
                                String coverage optimization in Python
                            
                                Any push back like function in python?
                            
                                How to set up "front page" documentation on PYPI for a project?
                            
                                How to compare 2 iframes and get difference visually?
                            
                                Disable all `pylint` 'Convention' messages
                            
                                How to group the choices in a Django Select widget?
                            
                                Python Web Crawlers and "getting" html source code
                            
                                Seeding random number generators in parallel programs
                            
                                How do I get an ECDSA public key from just a Bitcoin signature? ... SEC1 4.1.6 key recovery for curves over (mod p)-fields
                            
                                Is there a way to make Colab give an Audio Notification when cell has finished running

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How can you slice with string keys instead of integers on a python OrderedDict?

Tags:

python

dictionary

python-internals

python-2.7

ordereddictionary

Nick Sweeting

People also ask

3 Answers

zch

thefourtheye

Adam Smith

Recent Activity

Donate For Us