I am trying to subclass json.JSONEncoder such that named tuples (defined using the new Python 3.6+ syntax, but it probably still applies to the output of collections.namedtuple) are serialised to JSON objects, where the tuple fields correspond to object keys.
For example:
from typing import NamedTuple

class MyModel(NamedTuple):
    foo: int
    bar: str = "Hello, World!"

a = MyModel(123)           # Expected JSON: {"foo": 123, "bar": "Hello, World!"}
b = MyModel(456, "xyzzy")  # Expected JSON: {"foo": 456, "bar": "xyzzy"}
My understanding is that I subclass json.JSONEncoder and override its default method to provide serialisations for new types. The rest of the class will then do the right thing with respect to recursion, etc. I thus came up with the following:
import json
from datetime import datetime

class MyJSONEncoder(json.JSONEncoder):
    def default(self, o):
        to_encode = None

        if isinstance(o, tuple) and hasattr(o, "_asdict"):
            # Dictionary representation of a named tuple
            to_encode = o._asdict()

        if isinstance(o, datetime):
            # String representation of a datetime
            to_encode = o.strftime("%Y-%m-%dT%H:%M:%S")

        # Why not super().default(to_encode or o)??
        return to_encode or o
This works when it tries to serialise a datetime value (i.e., when the class is passed as the cls parameter to json.dumps), which at least partially proves my hypothesis, but the check for named tuples is never hit and it defaults to serialising the value as a tuple (i.e., to a JSON array). Weirdly, I had presumed that I should call the superclass' default method on my transformed object, but this then raises an exception when it tries to serialise a datetime: "TypeError: Object of type 'str' is not JSON serializable", which frankly makes no sense!
I get the same behaviour if I make the named tuple type check more specific (e.g., isinstance(o, MyModel)). I did find, however, that I can almost get the behaviour I'm looking for if I also override the encode method, by moving the named tuple check to there:
class AlmostWorkingJSONEncoder(json.JSONEncoder):
    def default(self, o):
        to_encode = None

        if isinstance(o, datetime):
            # String representation of a datetime
            to_encode = o.strftime("%Y-%m-%dT%H:%M:%S")

        return to_encode or o

    def encode(self, o):
        to_encode = None

        if isinstance(o, tuple) and hasattr(o, "_asdict"):
            # Dictionary representation of a named tuple
            to_encode = o._asdict()

        # Here we *do* need to call the superclass' encode method??
        return super().encode(to_encode or o)
This works, but not recursively: it successfully serialises top-level named tuples into JSON objects, per my requirement, but any named tuples that exist within that named tuple will be serialised with the default behaviour (JSON array). This is also the behaviour if I put the named tuple type check in both the default and encode methods.
The documentation implies that only the default method should be changed in subclasses. I presume, for example, that overriding encode in AlmostWorkingJSONEncoder will cause it to break when it's doing chunked encoding. However, no amount of hackery has so far yielded what I want (or expect to happen, given the scant documentation).
Where is my misunderstanding?
EDIT Reading the code for json.JSONEncoder explains why the default method raises a type error when you pass it a string. It's not clear (at least to me) from the documentation, but the default method is meant to transform values of some unsupported type into a serialisable type, which is then returned; if the unsupported type is not transformed into anything in your overridden method, then you should call super().default(o) at the end to raise a type error. So something like this:
class SubJSONEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, Foo):
            return SerialisableFoo(o)

        if isinstance(o, Bar):
            return SerialisableBar(o)

        # etc., etc.

        # No more serialisation options available, so raise a type error
        return super().default(o)
I believe the problem I'm experiencing is that the default method is only called by the encoder when it can't match any supported types. A named tuple is still a tuple (which is supported), so it matches that first before delegating to my overridden default method. In Python 2.7, the functions that did this matching are part of the JSONEncoder object, but in Python 3, they seem to have been moved outside into the module namespace (and, thus, not accessible to userland). I thus believe it is not possible to subclass JSONEncoder to serialise named tuples in a generic way without doing a lot of rewriting and hard-coupling to your own implementation :(
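To illustrate the point above, here is a minimal sketch that records every object handed to default. The Point type and RecordingEncoder class are hypothetical names made up for this demonstration. If the reasoning is right, a set (unsupported) should reach default, while a named tuple (still a tuple) should not:

```python
import json
from typing import NamedTuple


class Point(NamedTuple):  # hypothetical example type
    x: int
    y: int


class RecordingEncoder(json.JSONEncoder):  # hypothetical name
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.seen = []  # type names of every object passed to default()

    def default(self, o):
        self.seen.append(type(o).__name__)
        if isinstance(o, set):
            # Sets are not supported natively, so default() is consulted
            return sorted(o)
        return super().default(o)


encoder = RecordingEncoder()
point_json = encoder.encode(Point(1, 2))  # handled by the built-in tuple path
set_json = encoder.encode({3, 1, 2})      # falls through to default()

print(point_json)
print(set_json)
print(encoder.seen)
```

On CPython this should show the named tuple encoded as a JSON array, with only the set recorded in encoder.seen.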
EDIT 2 I submitted this as a bug.
Hmm, I just looked at the source and there doesn't appear to be a public hook to control how instances of list or tuple get serialized.
An unsafe approach is to monkey patch the _make_iterencode() private function.
Another approach is to preprocess the input, converting the named tuples into dicts:
from json import JSONEncoder
from typing import NamedTuple
from datetime import datetime

def preprocess(tree):
    if isinstance(tree, dict):
        return {k: preprocess(v) for k, v in tree.items()}
    if isinstance(tree, tuple) and hasattr(tree, '_asdict'):
        return preprocess(tree._asdict())
    if isinstance(tree, (list, tuple)):
        return list(map(preprocess, tree))
    return tree

class MD(JSONEncoder):
    def default(self, o):
        if isinstance(o, datetime):
            return o.strftime("%Y-%m-%dT%H:%M:%S")
        return super().default(o)
Applied to these models:
class MyModel(NamedTuple):
    foo: int
    bar: str = "Hello, World!"

class LayeredModel(NamedTuple):
    baz: MyModel
    fob: list

a = MyModel(123)
b = MyModel(456, "xyzzy")
c = LayeredModel(a, [a, b])
outer = dict(a=a, b=b, c=c, d=datetime.now(), e=10)
print(MD().encode(preprocess(outer)))
Gives this output:
{"a": {"foo": 123, "bar": "Hello, World!"},
 "b": {"foo": 456, "bar": "xyzzy"},
 "c": {"baz": {"foo": 123, "bar": "Hello, World!"},
       "fob": [{"foo": 123, "bar": "Hello, World!"},
               {"foo": 456, "bar": "xyzzy"}]},
 "d": "2019-11-03T10:46:17",
 "e": 10}
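If you would rather keep the json.dumps(obj, cls=...) calling convention, one possible variation is to run the same preprocessing inside the encoder itself. This is only a sketch: NamedTupleEncoder, Inner and Outer are names I made up, and the iterencode override relies on the _one_shot parameter that CPython's JSONEncoder.iterencode accepts, which is an implementation detail rather than a documented part of the API.

```python
import json
from datetime import datetime
from typing import NamedTuple


def preprocess(tree):
    """Recursively replace named tuples with plain dicts."""
    if isinstance(tree, dict):
        return {k: preprocess(v) for k, v in tree.items()}
    if isinstance(tree, tuple) and hasattr(tree, "_asdict"):
        return preprocess(tree._asdict())
    if isinstance(tree, (list, tuple)):
        return [preprocess(item) for item in tree]
    return tree


class NamedTupleEncoder(json.JSONEncoder):  # hypothetical name
    def encode(self, o):
        # Used by json.dumps()
        return super().encode(preprocess(o))

    def iterencode(self, o, _one_shot=False):
        # Used by json.dump(); _one_shot mirrors CPython's signature (assumption)
        return super().iterencode(preprocess(o), _one_shot)

    def default(self, o):
        if isinstance(o, datetime):
            return o.strftime("%Y-%m-%dT%H:%M:%S")
        return super().default(o)


class Inner(NamedTuple):  # hypothetical example types
    foo: int
    bar: str = "Hello, World!"


class Outer(NamedTuple):
    baz: Inner
    fob: list


nested = Outer(Inner(123), [Inner(456, "xyzzy")])
result = json.dumps(nested, cls=NamedTupleEncoder)
print(result)
```

The preprocessing may run twice on the encode path (once in encode, once in iterencode), which is harmless because it leaves already-converted data unchanged. Note that named tuples nested inside a value returned by default are not converted, since preprocessing only sees the original input.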