
Bidirectional data structure conversion in Python

Note: this is not a simple two-way map; the conversion is the important part.

I'm writing an application that will send and receive messages with a certain structure, which I must convert from and to an internal structure.

For example, the message:

{
    "Person": {
        "name": {
            "first": "John",
            "last": "Smith"
        }
    },
    "birth_date": "1997.01.12",
    "points": "330"
}

This must be converted to:

{ 
    "Person": {
        "firstname": "John",
        "lastname": "Smith",
        "birth": datetime.date(1997, 1, 12),
        "points": 330
    }
}

And vice-versa.

These messages have a lot of information, so I want to avoid having to manually write converters for both directions. Is there any way in Python to specify the mapping once, and use it for both cases?

In my research, I found an interesting Haskell library called JsonGrammar which allows for this (it's for JSON, but that's irrelevant here). But my knowledge of Haskell isn't good enough to attempt a port.

André Paramés asked Apr 26 '18 11:04


4 Answers

That's actually quite an interesting problem. You could define a list of transformations, for example in the form (key1, func_1to2, key2, func_2to1), or a similar format, where each key can contain separators to indicate different levels of the dict, like "Person.name.first".

import datetime

# Each entry: (key in A, function A -> B, key in B, function B -> A)
noop = lambda x: x
relations = [("Person.name.first", noop, "Person.firstname", noop),
             ("Person.name.last", noop, "Person.lastname", noop),
             ("birth_date", lambda s: datetime.date(*map(int, s.split("."))),
              "Person.birth", lambda d: d.strftime("%Y.%m.%d")),
             ("points", int, "Person.points", str)]

Then, iterate over the elements in that list and transform the entries in the dictionary according to whether you want to go from form A to B or vice versa. You will also need some helper functions for accessing keys in nested dictionaries using those dot-separated keys.

def deep_get(d, key):
    for k in key.split("."):
        d = d[k]
    return d

def deep_set(d, key, val):
    *first, last = key.split(".")
    for k in first:
        d = d.setdefault(k, {})
    d[last] = val

def convert(d, mapping, atob):
    res = {}
    for a, x, b, y in mapping:
        a, b, f = (a, b, x) if atob else (b, a, y)
        deep_set(res, b, f(deep_get(d, a)))
    return res

Example:

>>> d1 = {"Person": { "name": { "first": "John", "last": "Smith" } },
...       "birth_date": "1997.01.12",
...       "points": "330" }
...
>>> print(convert(d1, relations, True))    
{'Person': {'birth': datetime.date(1997, 1, 12),
            'firstname': 'John',
            'lastname': 'Smith',
            'points': 330}}
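
The example above only goes from form A to form B. As a minimal sketch of the other direction, the block below runs the same relations list both ways and checks that the round trip gives back the original message. The helpers are repeated so the block runs on its own; the names d2 and round_trip are only illustrative and are not part of the answer.

```python
import datetime

def deep_get(d, key):
    # Walk down the nested dicts following the dot-separated key.
    for k in key.split("."):
        d = d[k]
    return d

def deep_set(d, key, val):
    # Create intermediate dicts as needed, then set the last key.
    *first, last = key.split(".")
    for k in first:
        d = d.setdefault(k, {})
    d[last] = val

def convert(d, mapping, atob):
    res = {}
    for a, x, b, y in mapping:
        src, dst, f = (a, b, x) if atob else (b, a, y)
        deep_set(res, dst, f(deep_get(d, src)))
    return res

noop = lambda x: x
relations = [("Person.name.first", noop, "Person.firstname", noop),
             ("Person.name.last", noop, "Person.lastname", noop),
             ("birth_date", lambda s: datetime.date(*map(int, s.split("."))),
              "Person.birth", lambda d: d.strftime("%Y.%m.%d")),
             ("points", int, "Person.points", str)]

d1 = {"Person": {"name": {"first": "John", "last": "Smith"}},
      "birth_date": "1997.01.12",
      "points": "330"}

d2 = convert(d1, relations, True)           # A -> B
round_trip = convert(d2, relations, False)  # B -> A
print(round_trip == d1)  # True
```

The round trip only holds when each pair of functions are true inverses of each other; a lossy converter would break the equality.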

tobias_k answered Sep 24 '22 23:09


Tobias has answered it quite well. If you are looking for a library that handles model transformation dynamically, you could explore PyEcore, a modeling library for Python.

PyEcore lets you handle models and metamodels (structured data models) and gives you the building blocks for Model-Driven-Engineering-based tools and other applications built on a structured data model. Out of the box, it supports:

Data inheritance, two-way relationship management (opposite references), XMI (de)serialization, JSON (de)serialization, etc.

Edit

I have found something more interesting for you, with an example similar to yours: check out JsonBender.

import json
from jsonbender import bend, S

MAPPING = {
    'Person': {
        'firstname': S('Person', 'name', 'first'),
        'lastname': S('Person', 'name', 'last'),
        'birth': S('birth_date'),
        'points': S('points')
    }
}

source = {
    "Person": {
        "name": {
            "first": "John",
            "last": "Smith"
        }
    },
    "birth_date": "1997.01.12",
    "points": "330"
}

result = bend(MAPPING, source)
print(json.dumps(result))

Output (this mapping only moves values, so birth and points are still strings):

{"Person": {"lastname": "Smith", "points": "330", "firstname": "John", "birth": "1997.01.12"}}

NoorJafri answered Sep 23 '22 23:09


Here is my take on this (converter lambdas and dot-based notation idea taken from tobias_k):

import datetime

converters = {
    (str, datetime.date): lambda s: datetime.date(*map(int, s.split("."))),
    (datetime.date, str): lambda d: d.strftime("%Y.%m.%d"),
}
mapping = [
    ('Person.name.first', str, 'Person.firstname', str),
    ('Person.name.last', str, 'Person.lastname', str),
    ('birth_date', str, 'Person.birth', datetime.date),
    ('points', str, 'Person.points', int),
]

def convert_doc(doc, mapping, converters, inverse=False):
    converted = {}
    for keys1, type1, keys2, type2 in mapping:
        if inverse:
            keys1, type1, keys2, type2 = keys2, type2, keys1, type1
        # Fall back to the target type itself (e.g. int, str) if no converter is registered
        converter = converters.get((type1, type2), type2)
        keys1 = keys1.split('.')
        keys2 = keys2.split('.')
        # Walk down the source document to the value
        obj1 = doc
        while keys1:
            k, *keys1 = keys1
            obj1 = obj1[k]
        # Walk down the result, creating nested dicts as needed
        dict2 = converted
        while len(keys2) > 1:
            k, *keys2 = keys2
            dict2 = dict2.setdefault(k, {})
        dict2[keys2[0]] = converter(obj1)
    return converted

# Test
doc1 = {
    "Person": {
        "name": {
            "first": "John",
            "last": "Smith"
        }
    },
    "birth_date": "1997.01.12",
    "points": "330"
}
doc2 = {
    "Person": {
        "firstname": "John",
        "lastname": "Smith",
        "birth": datetime.date(1997, 1, 12),
        "points": 330
    }
}
assert doc2 == convert_doc(doc1, mapping, converters)
assert doc1 == convert_doc(doc2, mapping, converters, inverse=True)

The nice things are that you can reuse converters (even to convert different document structures) and that you only need to define non-trivial conversions. The drawback is that, as it stands, every pair of types must always use the same conversion (maybe it could be extended to add optional alternative conversions).
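
One possible way to add such alternative conversions is sketched below. It assumes a hypothetical optional fifth element in a mapping entry, holding a (forward, inverse) pair of functions that overrides the type-based lookup for that field only. The "joined" field and its date format are made up for illustration and do not appear in the question.

```python
import datetime

converters = {
    (str, datetime.date): lambda s: datetime.date(*map(int, s.split("."))),
    (datetime.date, str): lambda d: d.strftime("%Y.%m.%d"),
}

mapping = [
    ("Person.name.first", str, "Person.firstname", str),
    ("birth_date", str, "Person.birth", datetime.date),
    # Hypothetical field whose dates use another format: the optional fifth
    # element is a (forward, inverse) pair used instead of the shared converter.
    ("joined", str, "Person.joined", datetime.date,
     (lambda s: datetime.datetime.strptime(s, "%d/%m/%Y").date(),
      lambda d: d.strftime("%d/%m/%Y"))),
]

def convert_doc(doc, mapping, converters, inverse=False):
    converted = {}
    for keys1, type1, keys2, type2, *override in mapping:
        if inverse:
            keys1, type1, keys2, type2 = keys2, type2, keys1, type1
        if override:
            # Per-field override: pick the function for the current direction.
            converter = override[0][1 if inverse else 0]
        else:
            converter = converters.get((type1, type2), type2)
        # Read the value from the source document.
        value = doc
        for k in keys1.split("."):
            value = value[k]
        # Write it into the result, creating nested dicts as needed.
        *parents, leaf = keys2.split(".")
        target = converted
        for k in parents:
            target = target.setdefault(k, {})
        target[leaf] = converter(value)
    return converted

message = {
    "Person": {"name": {"first": "John"}},
    "birth_date": "1997.01.12",
    "joined": "26/04/2018",
}
internal = convert_doc(message, mapping, converters)
restored = convert_doc(internal, mapping, converters, inverse=True)
print(restored == message)  # True
```

Entries without the fifth element behave as before, so existing mappings would not need to change under this scheme.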

jdehesa answered Sep 24 '22 23:09


You can use lists to describe the paths to values in the objects, each paired with a type-converting function, for example:

import datetime

# Each entry: (path to the value, converter applied when reading from that path)
from_paths = [
    (['Person', 'name', 'first'], None),
    (['Person', 'name', 'last'], None),
    (['birth_date'], lambda s: datetime.date(*map(int, s.split(".")))),
    (['points'], int)
]
to_paths = [
    (['Person', 'firstname'], None),
    (['Person', 'lastname'], None),
    (['Person', 'birth'], lambda d: d.strftime("%Y.%m.%d")),
    (['Person', 'points'], str)
]

and a little function to convert from and to (much like tobias suggests, but without string separation and using reduce to get values from the dict):

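import operator
from functools import reduce  # built in on Python 2, must be imported on Python 3

def convert(from_paths, to_paths, obj):
    to_obj = {}
    for (from_keys, convfn), (to_keys, _) in zip(from_paths, to_paths):
        # Follow the key path down into the source object
        value = reduce(operator.getitem, from_keys, obj)
        if convfn:
            value = convfn(value)
        # Create intermediate dicts on the target side as needed
        curr_lvl_dict = to_obj
        for key in to_keys[:-1]:
            curr_lvl_dict = curr_lvl_dict.setdefault(key, {})
        curr_lvl_dict[to_keys[-1]] = value
    return to_obj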
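
To see the reduce-based lookup in isolation, here is a small sketch (Python 3, where reduce lives in functools). The variable names are only illustrative.

```python
import operator
from functools import reduce

obj = {"Person": {"name": {"first": "John", "last": "Smith"}}}

# reduce applies obj["Person"], then ["name"], then ["first"] in turn.
first = reduce(operator.getitem, ["Person", "name", "first"], obj)
print(first)  # John

# With an empty path, reduce simply returns the starting object.
whole = reduce(operator.getitem, [], obj)
print(whole is obj)  # True
```

A missing key along the path raises KeyError, just as plain indexing would.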

Test:

import json

from_json = '''{
    "Person": {
        "name": {
            "first": "John",
            "last": "Smith"
        }
    },
    "birth_date": "1997.01.12",
    "points": "330"
}'''
>>> obj = json.loads(from_json)
>>> new_obj = convert(from_paths, to_paths, obj)
>>> new_obj
{'Person': {'lastname': u'Smith',
            'points': 330,
            'birth': datetime.date(1997, 1, 12), 'firstname': u'John'}}
>>> convert(to_paths, from_paths, new_obj)
{'birth_date': '1997.01.12',
 'Person': {'name': {'last': u'Smith', 'first': u'John'}},
 'points': '330'}
>>> 

ndpu answered Sep 24 '22 23:09