
Python: Easily access deeply nested dict (get and set)


I'm building some Python code to read and manipulate deeply nested dicts (ultimately for interacting with JSON services, though it would be useful for other purposes too). I'm looking for a way to easily read, set, and update values deep within the dict without needing a lot of code.

See also "Python: Recursively access dict via attributes as well as index access?"; Curt Hagenlocher's "DotDictify" solution there is pretty elegant. I also like what Ben Alman presents for JavaScript in http://benalman.com/projects/jquery-getobject-plugin/. It would be great to somehow combine the two.

Building off of Curt Hagenlocher and Ben Alman's examples, it would be great in Python to have a capability like:

    >>> my_obj = DotDictify()
    >>> my_obj.a.b.c = {'d':1, 'e':2}
    >>> print my_obj
    {'a': {'b': {'c': {'d': 1, 'e': 2}}}}
    >>> print my_obj.a.b.c.d
    1
    >>> print my_obj.a.b.c.x
    None
    >>> print my_obj.a.b.c.d.x
    None
    >>> print my_obj.a.b.c.d.x.y.z
    None

Any idea if this is possible, and if so, how to go about modifying the DotDictify solution?

Alternatively, the get method could be made to accept dot notation (with a complementary set method added); however, the object notation sure is cleaner.

    >>> my_obj = DotDictify()
    >>> my_obj.set('a.b.c', {'d':1, 'e':2})
    >>> print my_obj
    {'a': {'b': {'c': {'d': 1, 'e': 2}}}}
    >>> print my_obj.get('a.b.c.d')
    1
    >>> print my_obj.get('a.b.c.x')
    None
    >>> print my_obj.get('a.b.c.d.x')
    None
    >>> print my_obj.get('a.b.c.d.x.y.z')
    None

This type of interaction would be great to have for dealing with deeply nested dicts. Does anybody know another strategy (or sample code snippet/library) to try?

Hal asked Sep 26 '10 13:09



2 Answers

Attribute Tree

The problem with your first specification is that, at my_obj.a.b.c.d, Python can't tell in __getitem__ what you will do next. If you proceed farther down a nonexistent tree, it needs to return an object with a __getitem__ method so you won't get an AttributeError thrown at you; if you want a value, it needs to return None.
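One way to see the dilemma is a chainable placeholder. This is only a sketch under my own assumptions, not part of dotdictify: the names `MISSING` and `AttrView` are hypothetical. Lookups down a nonexistent tree never fail, but what comes back is a falsy placeholder object rather than None itself.

```python
class _Missing(object):
    """Falsy placeholder returned for any path that does not exist."""

    def __getattr__(self, name):
        # Let private/dunder lookups (copy, pickle, ...) fail normally.
        if name.startswith('_'):
            raise AttributeError(name)
        return self  # keep chaining: MISSING.x.y.z is still MISSING

    def __bool__(self):
        return False

    __nonzero__ = __bool__  # Python 2 spelling of __bool__

    def __repr__(self):
        return 'MISSING'


MISSING = _Missing()


class AttrView(object):
    """Read-only attribute access over a plain nested dict (hypothetical)."""

    def __init__(self, data):
        self._data = data

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith('_'):
            raise AttributeError(name)
        try:
            value = self._data[name]
        except KeyError:
            return MISSING
        # Wrap nested dicts so chaining continues; return leaves as-is.
        return AttrView(value) if isinstance(value, dict) else value


view = AttrView({'a': {'b': {'c': {'d': 1, 'e': 2}}}})
print(view.a.b.c.d)      # 1
print(view.a.b.c.x)      # MISSING
print(view.a.x.y.z)      # MISSING
```

Note that chaining past a real leaf, such as `view.a.b.c.d.x`, still raises AttributeError, because the leaf is a plain int. Avoiding that would mean wrapping every value, which is the trade-off described above.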

I would argue that in every case you have above, you should expect it to throw a KeyError instead of returning None. The reason is that you can't tell whether None means "no key" or "someone actually stored None at that location". For this behavior, all you have to do is take dotdictify, remove marker, and replace __getitem__ with:

    def __getitem__(self, key):
        # Delegate to dict directly; self[key] here would recurse forever.
        return dict.__getitem__(self, key)

Because what you really want is a dict with __getattr__ and __setattr__.
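As a minimal sketch of that idea (the class name `AttrDict` is hypothetical, and this is not the original dotdictify): unlike aliasing `__getattr__` to `__getitem__`, this version raises AttributeError for a missing attribute so that `hasattr()` and similar tools behave as usual, while index access still raises KeyError. Nested plain dicts are not converted here.

```python
class AttrDict(dict):
    """A dict whose string keys can also be read and written as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            # AttributeError keeps hasattr() and getattr(obj, name, default)
            # working the way callers expect.
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value

    def __delattr__(self, name):
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name)


config = AttrDict()
config.timeout = None              # None stored on purpose
print(config.timeout)              # None (the key exists)
print('timeout' in config)         # True
print(hasattr(config, 'retries'))  # False (the key is missing)
```

Because a missing key raises instead of returning None, a stored None and an absent key stay distinguishable.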

There may be a way to remove __getitem__ entirely and say something like __getattr__ = dict.__getitem__, but I think this may be over-optimization, and will be a problem if you later decide you want __getitem__ to create the tree as it goes like dotdictify originally does, in which case you would change it to:

    def __getitem__(self, key):
        if key not in self:
            dict.__setitem__(self, key, dotdictify())
        return dict.__getitem__(self, key)

I don't like the marker business in the original dotdictify.
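For comparison, the same create-as-you-go behavior can be sketched with the standard `__missing__` hook that dict subclasses may define, instead of overriding `__getitem__`. This is a swapped-in technique rather than what dotdictify does, and `Tree` is a hypothetical name.

```python
class Tree(dict):
    """Nested dict that creates missing branches on item access."""

    def __missing__(self, key):
        # dict.__getitem__ calls this hook only when key is absent.
        branch = self[key] = type(self)()
        return branch


tree = Tree()
tree['a']['b']['c'] = {'d': 1, 'e': 2}
print(tree == {'a': {'b': {'c': {'d': 1, 'e': 2}}}})  # True

# Caveat: merely reading a missing key also creates it.
tree['typo']
print('typo' in tree)  # True
```

The caveat at the end is the price of auto-creation: a mistyped key silently becomes an empty branch instead of raising KeyError.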

Path Support

The problem with the second specification (override get() and set()) is that a normal dict has a get() that operates differently from what you describe and doesn't have a set() at all (though it has a setdefault(), which is a sort of inverse of get()). People expect get() to take two parameters, the second being a default if the key isn't found.
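One way to sidestep the clash with dict.get() is to leave the dict alone and use free functions instead. The sketch below is an assumption-laden illustration, not part of dotdictify: `deep_get` and `deep_set` are hypothetical names, keys are assumed to be strings without dots, and the default follows the familiar two-argument get() convention.

```python
_MISSING = object()  # sentinel: distinguishes "no default" from default=None


def deep_get(mapping, path, default=_MISSING):
    """Return the value at a dotted path such as 'a.b.c'."""
    current = mapping
    for part in path.split('.'):
        try:
            current = current[part]
        except (KeyError, TypeError):
            # KeyError: key absent; TypeError: walked into a non-container.
            if default is _MISSING:
                raise KeyError(path)
            return default
    return current


def deep_set(mapping, path, value):
    """Set the value at a dotted path, creating intermediate dicts."""
    parts = path.split('.')
    current = mapping
    for part in parts[:-1]:
        current = current.setdefault(part, {})
        if not isinstance(current, dict):
            raise KeyError('%r is not a dict in path %r' % (part, path))
    current[parts[-1]] = value


data = {}
deep_set(data, 'a.b.c', {'d': 1, 'e': 2})
print(deep_get(data, 'a.b.c.d'))              # 1
print(deep_get(data, 'a.b.c.x', None))        # None
print(deep_get(data, 'a.b.c.d.x.y.z', None))  # None
```

Without an explicit default, `deep_get` raises KeyError, so a stored None is never confused with a missing key.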

If you want to extend __getitem__ and __setitem__ to handle dotted-key notation, you'll need to modify dotdictify to:

    class dotdictify(dict):
        def __init__(self, value=None):
            if value is None:
                pass
            elif isinstance(value, dict):
                # Route every item through __setitem__ so nested dicts convert.
                for key in value:
                    self.__setitem__(key, value[key])
            else:
                raise TypeError('expected dict')

        def __setitem__(self, key, value):
            if '.' in key:
                # Split off the first path component and recurse into it.
                myKey, restOfKey = key.split('.', 1)
                target = self.setdefault(myKey, dotdictify())
                if not isinstance(target, dotdictify):
                    raise KeyError('cannot set "%s" in "%s" (%s)' % (restOfKey, myKey, repr(target)))
                target[restOfKey] = value
            else:
                if isinstance(value, dict) and not isinstance(value, dotdictify):
                    value = dotdictify(value)
                dict.__setitem__(self, key, value)

        def __getitem__(self, key):
            if '.' not in key:
                return dict.__getitem__(self, key)
            myKey, restOfKey = key.split('.', 1)
            target = dict.__getitem__(self, myKey)
            if not isinstance(target, dotdictify):
                raise KeyError('cannot get "%s" in "%s" (%s)' % (restOfKey, myKey, repr(target)))
            return target[restOfKey]

        def __contains__(self, key):
            if '.' not in key:
                return dict.__contains__(self, key)
            myKey, restOfKey = key.split('.', 1)
            # A missing intermediate key means the dotted path is absent.
            if not dict.__contains__(self, myKey):
                return False
            target = dict.__getitem__(self, myKey)
            if not isinstance(target, dotdictify):
                return False
            return restOfKey in target

        def setdefault(self, key, default):
            if key not in self:
                self[key] = default
            return self[key]

        # Attribute access reuses the item methods (missing keys raise KeyError).
        __setattr__ = __setitem__
        __getattr__ = __getitem__

Test code:

    >>> life = dotdictify({'bigBang': {'stars': {'planets': {}}}})
    >>> life.bigBang.stars.planets
    {}
    >>> life.bigBang.stars.planets.earth = { 'singleCellLife' : {} }
    >>> life.bigBang.stars.planets
    {'earth': {'singleCellLife': {}}}
    >>> life['bigBang.stars.planets.mars.landers.vikings'] = 2
    >>> life.bigBang.stars.planets.mars.landers.vikings
    2
    >>> 'landers.vikings' in life.bigBang.stars.planets.mars
    True
    >>> life.get('bigBang.stars.planets.mars.landers.spirit', True)
    True
    >>> life.setdefault('bigBang.stars.planets.mars.landers.opportunity', True)
    True
    >>> 'landers.opportunity' in life.bigBang.stars.planets.mars
    True
    >>> life.bigBang.stars.planets.mars
    {'landers': {'opportunity': True, 'vikings': 2}}

Mike DeSimone answered Oct 29 '22 13:10


The older answers have some pretty good tips in them, but they all require replacing standard Python data structures (dicts, etc.) with custom ones, and would not work with keys that are not valid attribute names.

These days we can do better, using a pure-Python, Python 2/3-compatible library, built for exactly this purpose, called glom. Using your example:

    import glom

    target = {}  # a plain dictionary we will deeply set on

    glom.assign(target, 'a.b.c', {'d': 1, 'e': 2}, missing=dict)
    # {'a': {'b': {'c': {'e': 2, 'd': 1}}}}

Notice the missing=dict, used to autocreate dictionaries. We can easily get the value back using glom's deep-get:

    glom.glom(target, 'a.b.c.d')
    # 1

There's a lot more you can do with glom, especially around deep getting and setting. I should know, since (full disclosure) I created it. That means if you find a gap, you should let me know!

Mahmoud Hashemi answered Oct 29 '22 14:10