Assume a simple schema defined in marshmallow:

from marshmallow import Schema, fields

class AddressSchema(Schema):
    street = fields.String(required=True)
    city = fields.String(required=True)
    country = fields.String(default='USA')

class PersonSchema(Schema):
    name = fields.String(required=True)
    address = fields.Nested(AddressSchema())
The use case here is applications working with in-memory objects, and serialization/deserialization to JSON, i.e. no SQL database.
Using the standard json library I can parse JSON objects that conform to this schema, and access objects in a manner such as person1['address']['city'], but the use of typo-prone strings in verbose syntax is somewhat unsatisfactory.
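To make the complaint concrete, here is a minimal sketch of the plain json approach. The sample data and the misspelled key are made up for illustration.

```python
import json

# Sample document (invented for illustration) matching the schema's shape.
raw = '{"name": "Ada", "address": {"street": "1 Main St", "city": "Boston", "country": "USA"}}'
person1 = json.loads(raw)

# Nested access goes through string keys at every level.
city = person1["address"]["city"]

# A misspelled key is only caught at runtime, as a KeyError.
try:
    person1["address"]["ctiy"]
    typo_caught = False
except KeyError:
    typo_caught = True
```

Nothing here tells an editor or a reader which keys exist; the only check is running the code.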
I could define a parallel OO model, and annotate my schema with @post_load decorators, for example:
class Address(object):
    def __init__(self, street, city, country='USA'):
        self.street = street
        self.city = city
        self.country = country

class Person(object):
    # Parameters mirror PersonSchema: a name and a nested address.
    def __init__(self, name, address=None):
        self.name = name
        self.address = address
But the repetition is not very nice (and I haven't even included descriptions in the schema).
Arguably the explicit OO model doesn't buy much: it's basic data accessors, no behavior. I could get some syntactic sugar using jsobject, so that I could write for example person1.address.city. But this doesn't seem quite right either. As a developer I have no explicit Python class API to consult to determine what fields to use. I can reference the marshmallow schema, but this feels very indirect.
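The attribute-style sugar can be approximated with the standard library alone. The sketch below is a stand-in for the idea, not jsobject's actual API: it uses json.loads with an object_hook that wraps every JSON object in a types.SimpleNamespace. The sample data is invented.

```python
import json
from types import SimpleNamespace

raw = '{"name": "Ada", "address": {"street": "1 Main St", "city": "Boston", "country": "USA"}}'

# object_hook is called for every decoded JSON object, innermost first,
# so nested objects become namespaces too.
person1 = json.loads(raw, object_hook=lambda d: SimpleNamespace(**d))

city = person1.address.city
```

The syntax is nicer, but the attributes still exist only because they happened to be in the input; there is no class definition to read.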
It would be fairly easy to generate the OO code above from the marshmallow schema definitions. I'm surprised there seems to be no such library. Perhaps code generation is considered very unpythonic? It would of course be suitable only for data-access style class definitions; adding non-generic behavior would be strictly a no-no.
Users of the code would not need to know a codegen approach was used: everything would be there with an explicit API, with docs visible alongside the rest of the code on Read the Docs etc.
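A rough sketch of what such a generator could look like follows. It works from a list of (field_name, required) tuples, which is a hypothetical intermediate format. With marshmallow, something like [(n, f.required) for n, f in AddressSchema().fields.items()] would presumably produce it, but that step is left out here and the function name generate_class_source is made up.

```python
def generate_class_source(class_name, field_specs):
    """Return Python source for a plain data-access class.

    field_specs is a list of (field_name, required) tuples
    (a hypothetical format, not something marshmallow defines).
    """
    # Required fields first: parameters with defaults must come last.
    ordered = sorted(field_specs, key=lambda spec: not spec[1])
    params = ["self"] + [
        name if required else f"{name}=None" for name, required in ordered
    ]
    # An empty field list still needs a body to be valid Python.
    body = [f"        self.{name} = {name}" for name, _ in ordered] or ["        pass"]
    lines = [
        f"class {class_name}(object):",
        f"    def __init__({', '.join(params)}):",
    ] + body
    return "\n".join(lines) + "\n"


address_source = generate_class_source(
    "Address", [("street", True), ("city", True), ("country", False)]
)
print(address_source)
```

The returned string would be written to a module and committed, so readers and documentation tools see an ordinary class. Defaults such as 'USA' and docstrings are ignored in this sketch.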
The other approach would be dynamic classes derived from the marshmallow definitions. Again, as far as I can tell there is no such library (although the range of dynamic class generation approaches in Python is impressive, so I may have missed some). Arguably this would not buy you that much over the jsobject approach, but there may be some advantages: it would be possible to interweave this with some explicit code with defined behaviors. The downside of a dynamic approach is that explicit is favored over implicit in the Python world.
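As a sketch of the dynamic route, the standard library's dataclasses.make_dataclass can build a class at runtime from the same kind of (field_name, required) tuples. That tuple format and the names build_class and LabelledAddress are assumptions made for illustration; deriving the tuples from a marshmallow schema is not shown.

```python
from dataclasses import field, make_dataclass


def build_class(class_name, field_specs):
    """Create a data class at runtime from (field_name, required) tuples."""
    # Required fields first: fields with defaults must come last.
    ordered = sorted(field_specs, key=lambda spec: not spec[1])
    dc_fields = [
        (name, object) if required else (name, object, field(default=None))
        for name, required in ordered
    ]
    return make_dataclass(class_name, dc_fields)


Address = build_class(
    "Address", [("street", True), ("city", True), ("country", False)]
)


# Explicit behavior can be layered on top of the generated class.
class LabelledAddress(Address):
    def label(self):
        return f"{self.street}, {self.city}"


home = LabelledAddress(street="1 Main St", city="Boston")
```

This shows the interweaving mentioned above, and also the downside: Address has no source code to read, so its fields are only discoverable at runtime.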
The lack of libraries here means I'm either not finding something, or am not looking at this in a suitably pythonic way. I'm happy to contribute something to PyPI, but before adding yet another meta-OO library I wanted to be sure I had done due diligence here.
In short, marshmallow schemas can be used to validate input data, deserialize input data to app-level objects, and serialize app-level objects to primitive Python types. The serialized objects can then be rendered to standard formats such as JSON for use in an HTTP API.
Marshmallow is a Python library that converts complex data types to and from Python data types. It is a powerful tool for both validating and converting data.
The main component of Marshmallow is a Schema. A schema defines the rules that guide deserialization, called load, and serialization, called dump. It lets us define the fields that will be loaded or dumped and add requirements on those fields, such as validation or being required.
Your question is quite vague, and so my answer will be too, and quite subjective; I hope that's OK. I am just some dude who spent the day reading about serialization options in Python.
I think that Marshmallow is fundamentally unpythonic, and there isn't a great way to use it; I don't intend to use it. I'll give what for me are two definitive examples.
General serializing libraries are kind of a deep rabbit hole, because what you're really talking about requires elements of a type system, a parser and a graph traversal algorithm all in one go. Marshmallow doesn't do recursive parsing by default, so it fails point (2). On point (1) (partly because of point (2)), it either requires extensive hacking or requires you to accept a Java-like type system (everything is of a known, enumerated type).
You asked about general serializing libraries. I found the library camel interesting, along with the blog post around it. For pickle, there's a powerful extension called dill and a mixin that handles versioning.