Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python fold/reduce composition of multiple dictionaries

I want to achieve the following. It's essentially the composition or merging of an arbitrary number of dictionaries, with reference to a 'seed' or root dictionary, accumulating all unchanged and updated values in the final result.

seed = {
    'update': False,
    'data': {
        'subdata': {
            'field1': 5,
            'field2': '2018-01-30 00:00:00'
        },
        'field3': 2,
        'field4': None
    },
    'data_updates': {},
    'subdata_updates': {},
    'diffs': {}
}

update_1 = {
    'update': True,
    'data': {
        'subdata': {
            'field1': 6,
            'field2': '2018-01-30 00:00:00'
        },
        'field3': 2,
        'field4': None
    },
    'data_updates': {},
    'subdata_updates': {'field1': 6},
    'diffs': {
        'field1': {
            'field': 'field1',
            'before': 5,
            'after': 6
        }
    }
}

update_2 = {
    'update': True,
    'data': {
        'subdata': {
            'field1': 5,
            'field2': '2018-01-30 00:00:00',
        },
        'field3': 2,
        'field4': 1
    },
    'data_updates': {'field4': 1},
    'subdata_updates': {},
    'diffs': {
        'field4': {
            'field': 'field4',
            'before': None,
            'after': 1
        }
    }
}

# I want to be able to pass in an arbitrary number of updates.
assert reduce_maps(seed, *[update_1, update_2]) == {
    'update': True,
    'data': {
        'subdata': {
            'field1': 6,
            'field2': '2018-01-30 00:00:00',
        },
        'field3': 2,
        'field4': 1
    },
    'data_updates': {'field4': 1},
    'subdata_updates': {'field1': 6},
    'diffs': {
        'field1': {
            'field': 'field1',
            'before': 5,
            'after': 6
        },
        'field4': {
            'field': 'field4',
            'before': None,
            'after': 1
        }
    }
}

You can assume the data will always be in this shape, you can also assume that each payload only ever updates one field and that no two updates will ever update the same field.

I can dimly perceive an analogue of fold lurking in the background here building up the data in passes around seed.

like image 588
jhrr Avatar asked Jun 28 '18 14:06

jhrr


People also ask

How do you flatten nested dictionary?

Basically the same way you would flatten a nested list, you just have to do the extra work for iterating the dict by key/value, creating new keys for your new dictionary and creating the dictionary at final step. For Python >= 3.3, change the import to from collections.

What is flattened dictionary?

a : to make level or smooth. b : to knock down also : to defeat decisively. c : to make dull or uninspired —often used with out. d : to make lusterless flatten paint. e : to stabilize especially at a lower level.

Which method is used to remove all items from the dictionaries?

Python Dictionary clear() Method The clear() method removes all the elements from a dictionary.


1 Answers

Here you go:

from pprint import pprint


def merge_working(pre, post):
    if not (isinstance(pre, dict) and isinstance(post, dict)):
        return post

    new = pre.copy()  # values for unique keys of pre will be preserved
    for key, post_value in post.items():
        new[key] = merge_working(new.get(key), post_value)

    return new


def merge_simplest(pre, post):
    if not isinstance(pre, dict):
        return post
    return {key: merge_simplest(pre[key], post[key])
            for key in pre}


merge = merge_working


def reduce_maps(*objects):
    new = objects[0]
    for post in objects[1:]:
        new = merge(new, post)
    return new


seed = {
    'update': False,
    'data': {
        'subdata': {
            'field1': 5,
            'field2': '2018-01-30 00:00:00'
        },
        'field3': 2,
        'field4': None
    },
    'data_updates': {},
    'subdata_updates': {},
    'diffs': {}
}

update_1 = {
    'update': True,
    'data': {
        'subdata': {
            'field1': 6,
            'field2': '2018-01-30 00:00:00'
        },
        'field3': 2,
        'field4': None
    },
    'data_updates': {},
    'subdata_updates': {'field1': 6},
    'diffs': {
        'field1': {
            'field': 'field 1',
            'before': 5,
            'after': 6
        }
    }
}

update_2 = {
    'update': True,
    'data': {
        'subdata': {
            'field1': 5,
            'field2': '2018-01-30 00:00:00',
        },
        'field3': 2,
        'field4': 1
    },
    'data_updates': {'field4': 1},
    'subdata_updates': {},  # was subdata_update
    'diffs': {
        'field4': {
            'field': 'field 4',
            'before': None,
            'after': 1
        }
    }
}

result = reduce_maps(*[seed, update_1, update_2])

golden = {
    'update': True,
    'data': {
        'subdata': {
            'field1': 5,  # was 6
            'field2': '2018-01-30 00:00:00',
        },
        'field3': 2,
        'field4': 1
    },
    'data_updates': {'field4': 1},
    'subdata_updates': {'field1': 6},  # was subdata_update
    'diffs': {
        'field1': {
            'field': 'field 1',
            'before': 5,
            'after': 6
        },
        'field4': {
            'field': 'field 4',
            'before': None,
            'after': 1
        }
    }
}

pprint(result)
pprint(golden)

assert result == golden

I've fixed what I think were typos in your data (see comments in the code).

Note that merge may need tweaking according to exact merging rules and possible data. To see what I mean, use merge = merge_simplest and understand why it fails. It wouldn't if the "data-agnostic" shape (understood as the dictionary tree disregarding values of leaves) were really the same.

like image 148
Kirill Bulygin Avatar answered Sep 28 '22 06:09

Kirill Bulygin