I am looking to efficiently merge two (fairly arbitrary) data structures: one representing a set of defaults values and one representing overrides. Example data below. (Naively iterating over the structures works, but is very slow.) Thoughts on the best approach for handling this case?
_DEFAULT = { 'A': 1122, 'B': 1133, 'C': [ 9988, { 'E': [ { 'F': 6666, }, ], }, ], } _OVERRIDE1 = { 'B': 1234, 'C': [ 9876, { 'D': 2345, 'E': [ { 'F': 6789, 'G': 9876, }, 1357, ], }, ], } _ANSWER1 = { 'A': 1122, 'B': 1234, 'C': [ 9876, { 'D': 2345, 'E': [ { 'F': 6789, 'G': 9876, }, 1357, ], }, ], } _OVERRIDE2 = { 'C': [ 6543, { 'E': [ { 'G': 9876, }, ], }, ], } _ANSWER2 = { 'A': 1122, 'B': 1133, 'C': [ 6543, { 'E': [ { 'F': 6666, 'G': 9876, }, ], }, ], } _OVERRIDE3 = { 'B': 3456, 'C': [ 1357, { 'D': 4567, 'E': [ { 'F': 6677, 'G': 9876, }, 2468, ], }, ], } _ANSWER3 = { 'A': 1122, 'B': 3456, 'C': [ 1357, { 'D': 4567, 'E': [ { 'F': 6677, 'G': 9876, }, 2468, ], }, ], }
This is an example of how to run the tests: (The dictionary update doesn't work, just an stub function.)
import itertools def mergeStuff( default, override ): # This doesn't work result = dict( default ) result.update( override ) return result def main(): for override, answer in itertools.izip( _OVERRIDES, _ANSWERS ): result = mergeStuff(_DEFAULT, override) print('ANSWER: %s' % (answer) ) print('RESULT: %s\n' % (result) )
You cannot do that by "iterating", you'll need a recursive routine like this:
def merge(a, b):
if isinstance(a, dict) and isinstance(b, dict):
d = dict(a)
d.update({k: merge(a.get(k, None), b[k]) for k in b})
return d
if isinstance(a, list) and isinstance(b, list):
return [merge(x, y) for x, y in itertools.izip_longest(a, b)]
return a if b is None else b
If you want your code to be fast, don't copy like crazy
You don't really need to merge two dicts. You can just chain them.
A ChainMap class is provided for quickly linking a number of mappings so they can be treated as a single unit. It is often much faster than creating a new dictionary and running multiple update() calls.
class ChainMap(UserDict.DictMixin):
"""Combine multiple mappings for sequential lookup"""
def __init__(self, *maps):
self._maps = maps
def __getitem__(self, key):
for mapping in self._maps:
try:
return mapping[key]
except KeyError:
pass
raise KeyError(key)
def main():
for override, answer in itertools.izip( _OVERRIDES, _ANSWERS ):
result = ChainMap(override, _DEFAULT)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With