
Subclassing Python dictionary to override __setitem__

I am building a class which subclasses dict, and overrides __setitem__. I would like to be certain that my method will be called in all instances where dictionary items could possibly be set.

I have discovered three situations where Python (in this case, 2.6.4) does not call my overridden __setitem__ method when setting values, and instead calls PyDict_SetItem directly:

  1. In the constructor
  2. In the setdefault method
  3. In the update method

As a very simple test:

    class MyDict(dict):
        def __setitem__(self, key, value):
            print "Here"
            super(MyDict, self).__setitem__(key, str(value).upper())

    >>> a = MyDict(abc=123)
    >>> a['def'] = 234
    Here
    >>> a.update({'ghi': 345})
    >>> a.setdefault('jkl', 456)
    456
    >>> print a
    {'jkl': 456, 'abc': 123, 'ghi': 345, 'def': '234'}

You can see that the overridden method is only called when setting the items explicitly. To get Python to always call my __setitem__ method, I have had to reimplement those three methods, like this:

    class MyUpdateDict(dict):
        def __init__(self, *args, **kwargs):
            self.update(*args, **kwargs)

        def __setitem__(self, key, value):
            print "Here"
            super(MyUpdateDict, self).__setitem__(key, value)

        def update(self, *args, **kwargs):
            if args:
                if len(args) > 1:
                    raise TypeError("update expected at most 1 arguments, got %d" % len(args))
                other = dict(args[0])
                for key in other:
                    self[key] = other[key]
            for key in kwargs:
                self[key] = kwargs[key]

        def setdefault(self, key, value=None):
            if key not in self:
                self[key] = value
            return self[key]

Are there any other methods which I need to override, in order to know that Python will always call my __setitem__ method?

UPDATE

Per gs's suggestion, I've tried subclassing UserDict (actually, IterableUserDict, since I want to iterate over the keys) like this:

    from UserDict import *

    class MyUserDict(IterableUserDict):
        def __init__(self, *args, **kwargs):
            UserDict.__init__(self, *args, **kwargs)

        def __setitem__(self, key, value):
            print "Here"
            UserDict.__setitem__(self, key, value)

This class seems to correctly call my __setitem__ on setdefault, but it doesn't call it on update, or when initial data is provided to the constructor.

UPDATE 2

Peter Hansen's suggestion got me to look more carefully at dictobject.c, and I realised that the update method could be simplified a bit, since the built-in dictionary constructor simply calls the built-in update method anyway. It now looks like this:

    def update(self, *args, **kwargs):
        if len(args) > 1:
            raise TypeError("update expected at most 1 arguments, got %d" % len(args))
        other = dict(*args, **kwargs)
        for key in other:
            self[key] = other[key]
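A minimal sketch of how this simplified update might slot into the full class. It is written so that it also runs on Python 3, which is an assumption, since the question targets 2.6.4. The class name `LoggingDict` and its `seen` list are hypothetical, used only to record which keys pass through `__setitem__`.

```python
class LoggingDict(dict):
    """dict subclass that routes bulk writes through __setitem__."""

    def __init__(self, *args, **kwargs):
        self.seen = []  # hypothetical: keys observed by __setitem__
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        self.seen.append(key)
        super(LoggingDict, self).__setitem__(key, value)

    def update(self, *args, **kwargs):
        if len(args) > 1:
            raise TypeError("update expected at most 1 arguments, got %d"
                            % len(args))
        # dict() accepts a mapping, an iterable of pairs, and keyword
        # arguments, so it normalises every calling form into one dict.
        other = dict(*args, **kwargs)
        for key in other:
            self[key] = other[key]


d = LoggingDict([('b', 2)], a=1)
d.update({'c': 3}, c=4)  # the keyword value wins, as in the built-in
```

One consequence of building the temporary dict first: if an iterable of pairs repeats a key, `__setitem__` runs once for that key with the last value, not once per pair. Whether that matters depends on what the override does.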
Ian Clelland asked Jan 13 '10 23:01




1 Answer

I'm answering my own question, since I eventually decided that I really do want to subclass dict rather than create a new mapping class, and UserDict still defers to the underlying dict object in some cases instead of using the provided __setitem__.
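For comparison, the "new mapping class" route that was decided against might look roughly like this. This is a sketch assuming Python 3, where the ABC lives in `collections.abc` (in 2.6 it is `collections.MutableMapping`). The class name `StoreBackedDict`, the `_store` attribute, and the upper-casing transform are hypothetical.

```python
from collections.abc import MutableMapping


class StoreBackedDict(MutableMapping):
    """Mapping backed by a private dict; every write goes via __setitem__."""

    def __init__(self, *args, **kwargs):
        self._store = {}
        # The mixin update() is implemented on top of __setitem__.
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        self._store[key] = str(value).upper()  # example processing

    def __delitem__(self, key):
        del self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


m = StoreBackedDict(abc='x')
m.update({'d': 'y'})
m.setdefault('e', 'z')
```

Here update and setdefault are inherited mixin methods that call `__setitem__`, so nothing else needs overriding. One likely cost is that `isinstance(m, dict)` is False, which matters to code that checks for a real dict.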

After reading and re-reading the Python 2.6.4 source (mostly Objects/dictobject.c, but I grepped everywhere else to see where the various methods are used), my understanding is that the following code is sufficient to have my __setitem__ called every time the object is changed, and to otherwise behave exactly like a Python dict:

Peter Hansen's suggestion got me to look more carefully at dictobject.c, and I realised that the update method in my original answer could be simplified a bit, since the built-in dictionary constructor simply calls the built-in update method anyway. So the second update in my answer has been added to the code below (by some helpful person ;-).

    class MyUpdateDict(dict):
        def __init__(self, *args, **kwargs):
            self.update(*args, **kwargs)

        def __setitem__(self, key, value):
            # optional processing here
            super(MyUpdateDict, self).__setitem__(key, value)

        def update(self, *args, **kwargs):
            if args:
                if len(args) > 1:
                    raise TypeError("update expected at most 1 arguments, "
                                    "got %d" % len(args))
                other = dict(args[0])
                for key in other:
                    self[key] = other[key]
            for key in kwargs:
                self[key] = kwargs[key]

        def setdefault(self, key, value=None):
            if key not in self:
                self[key] = value
            return self[key]

I've tested it with this code:

    def test_updates(dictish):
        dictish['abc'] = 123
        dictish.update({'def': 234})
        dictish.update(red=1, blue=2)
        dictish.update([('orange', 3), ('green', 4)])
        dictish.update({'hello': 'kitty'}, black='white')
        dictish.update({'yellow': 5}, yellow=6)
        dictish.setdefault('brown', 7)
        dictish.setdefault('pink')
        # Two positional arguments must be rejected, as with the built-in.
        try:
            dictish.update({'gold': 8}, [('purple', 9)], silver=10)
        except TypeError:
            pass
        else:
            raise RuntimeError("Error did not occur as planned")

    python_dict = dict([('b', 2), ('c', 3)], a=1)
    test_updates(python_dict)

    my_dict = MyUpdateDict([('b', 2), ('c', 3)], a=1)
    test_updates(my_dict)

and it passes. All other implementations I've tried have failed at some point. I'll still accept any answers that show me that I've missed something, but otherwise, I'm ticking the checkmark beside this one in a couple of days, and calling it the right answer :)
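The test above only confirms that no call raises unexpectedly. A stricter check, sketched here with syntax that also runs on Python 3 (an assumption), counts how many writes reach `__setitem__` and compares the final contents with a plain dict. `CountingDict`, its `writes` attribute, and `exercise` are hypothetical names, not part of the original answer.

```python
class CountingDict(dict):
    """Same structure as MyUpdateDict, plus a write counter."""

    def __init__(self, *args, **kwargs):
        self.writes = 0
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        self.writes += 1
        super(CountingDict, self).__setitem__(key, value)

    def update(self, *args, **kwargs):
        if args:
            if len(args) > 1:
                raise TypeError("update expected at most 1 arguments, "
                                "got %d" % len(args))
            other = dict(args[0])
            for key in other:
                self[key] = other[key]
        for key in kwargs:
            self[key] = kwargs[key]

    def setdefault(self, key, value=None):
        if key not in self:
            self[key] = value
        return self[key]


def exercise(dictish):
    dictish['abc'] = 123                            # 1 write
    dictish.update({'def': 234})                    # 1 write
    dictish.update(red=1, blue=2)                   # 2 writes
    dictish.update([('orange', 3), ('green', 4)])   # 2 writes
    dictish.update({'yellow': 5}, yellow=6)         # 2 writes, same key
    dictish.setdefault('brown', 7)                  # 1 write
    dictish.setdefault('brown', 8)                  # key exists, no write
    return dictish


plain = exercise(dict([('b', 2), ('c', 3)], a=1))
counted = exercise(CountingDict([('b', 2), ('c', 3)], a=1))
```

With three writes from the constructor and nine from `exercise`, `counted.writes` should come to 12, and `counted == plain` should hold if the subclass otherwise behaves like a dict.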

Ian Clelland answered Sep 22 '22 14:09
