It is easy to change the format of an object that is not JSON serializable, e.g. datetime.datetime.
My requirement, for debugging purposes, is to alter the way some custom objects extended from base types like dict and list get serialized in JSON format. Code:
import datetime
import json


def json_debug_handler(obj):
    print("object received:")
    print(type(obj))
    print("\n\n")
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    elif isinstance(obj, mDict):
        return {'orig': obj, 'attrs': vars(obj)}
    elif isinstance(obj, mList):
        return {'orig': obj, 'attrs': vars(obj)}
    else:
        return None


class mDict(dict):
    pass


class mList(list):
    pass


def test_debug_json():
    games = mList(['mario', 'contra', 'tetris'])
    games.src = 'console'
    scores = mDict({'dp': 10, 'pk': 45})
    scores.processed = "unprocessed"
    test_json = {'games': games, 'scores': scores,
                 'date': datetime.datetime.now()}
    print(json.dumps(test_json, default=json_debug_handler))


if __name__ == '__main__':
    test_debug_json()
Output:
{"date": "2013-05-07T01:03:13.098727", "games": ["mario", "contra", "tetris"], "scores": {"pk": 45, "dp": 10}}
Desired output:
{"date": "2013-05-07T01:03:13.098727", "games": {"orig": ["mario", "contra", "tetris"], "attrs": {"src": "console"}}, "scores": {"orig": {"pk": 45, "dp": 10}, "attrs": {"processed": "unprocessed"}}}
Does the default handler not work for serializable objects? If not, how can I override this without adding toJSON methods to the extended classes?
Also, here is a version using a JSONEncoder subclass, which does not work either:
class JsonDebugEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            return obj.isoformat()
        elif isinstance(obj, mDict):
            return {'orig': obj, 'attrs': vars(obj)}
        elif isinstance(obj, mList):
            return {'orig': obj, 'attrs': vars(obj)}
        else:
            return json.JSONEncoder.default(self, obj)
If there is a hack with pickle, __getstate__, __setstate__, and then using json.dumps over the pickle.loads object, I am open to that as well. I tried it, but that did not work.
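On the question of whether the default handler is called for serializable objects: the json module consults default only for objects it cannot encode natively, and instances of dict and list subclasses are encoded as plain dicts and lists before default is ever reached. A minimal Python 3 sketch that records which types reach the hook (debug_default and seen are names made up for this illustration):

```python
import datetime
import json


class mList(list):
    pass


seen = []  # names of the types that actually reach the default hook


def debug_default(obj):
    # Hypothetical hook: record the type, then handle datetime only.
    seen.append(type(obj).__name__)
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    raise TypeError("Object of type %s is not JSON serializable"
                    % type(obj).__name__)


games = mList(["mario", "contra"])
games.src = "console"
payload = {"games": games, "date": datetime.datetime(2013, 5, 7, 1, 3, 13)}

encoded = json.dumps(payload, default=debug_default)
print(encoded)  # {"games": ["mario", "contra"], "date": "2013-05-07T01:03:13"}
print(seen)     # ['datetime']
```

The mList never shows up in seen, which is why the src attribute is silently dropped from the output.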
Use a toJSON() method to make a class JSON serializable, so that we don't need to write a custom JSONEncoder. This toJSON() serializer method returns the JSON representation of the object, i.e. it converts the custom Python object to a JSON string.
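A minimal sketch of the toJSON() idea in Python 3, assuming a simple class whose attributes are all JSON-friendly values. The Player class and its fields are invented for this example:

```python
import json


class Player:
    """Hypothetical example class; not part of any library."""

    def __init__(self, name, score):
        self.name = name
        self.score = score

    def toJSON(self):
        # vars(self) is the instance attribute dict; sort_keys keeps
        # the output order stable.
        return json.dumps(vars(self), sort_keys=True)


player_json = Player("mario", 45).toJSON()
print(player_json)  # {"name": "mario", "score": 45}
```

This only works when every attribute is itself serializable; nested custom objects would need their own conversion.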
If you try to convert a set to JSON, you will get this error: TypeError: Object of type set is not JSON serializable. This is because the built-in Python json module can only handle primitive data types with a direct JSON equivalent, not complex data types like set.
Serialization is the process of transforming objects of complex data types (custom-defined classes, object-relational mappers, datetime, etc.) to native data types so that they can then be easily converted to JSON notation.
The Python "TypeError: Object of type method is not JSON serializable" occurs when we try to serialize a method to JSON. To solve the error, make sure to call the method and serialize the object that the method returns.
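The difference between passing a method and passing its return value can be sketched as follows (Python 3; the Scores class and as_dict method are hypothetical names used only for illustration):

```python
import json


class Scores:
    """Hypothetical container used only for this illustration."""

    def __init__(self):
        self._data = {"dp": 10, "pk": 45}

    def as_dict(self):
        return dict(self._data)


scores = Scores()

try:
    json.dumps(scores.as_dict)  # passes the bound method itself
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Calling the method first hands json.dumps a plain dict.
fixed = json.dumps(scores.as_dict(), sort_keys=True)
print(fixed)  # {"dp": 10, "pk": 45}
```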
Let's see how to write a custom encoder that makes a Python set JSON serializable. The Python json module provides a JSONEncoder class to encode Python types into JSON. We can extend it by overriding its default() method so that it can serialize a set. The json.dump() and json.dumps() functions of the json module accept a cls keyword argument for passing such an encoder.
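A minimal Python 3 sketch of such an encoder. SetEncoder is a name chosen for this example, and converting the set to a sorted list is one possible choice that assumes the elements are mutually comparable:

```python
import json


class SetEncoder(json.JSONEncoder):
    def default(self, obj):
        # default() is only called for objects json cannot encode natively.
        if isinstance(obj, (set, frozenset)):
            return sorted(obj)  # assumes comparable elements
        # Fall back to the base class, which raises TypeError.
        return super().default(obj)


encoded_set = json.dumps({"games": {"tetris", "mario"}}, cls=SetEncoder)
print(encoded_set)  # {"games": ["mario", "tetris"]}
```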
Serialization is the process of encoding data from a native data type to JSON format. The Python json module converts a Python dictionary into a JSON object, lists and tuples into JSON arrays, int and float values into JSON numbers, and None into JSON null.
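The mapping can be checked with a short round trip (Python 3; the sample data is made up):

```python
import json

sample = {"scores": [10, 4.5], "pair": ("a", "b"), "missing": None}

text = json.dumps(sample)
print(text)  # {"scores": [10, 4.5], "pair": ["a", "b"], "missing": null}

# Decoding does not restore the tuple: JSON arrays come back as lists.
round_trip = json.loads(text)
```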
It seems that to achieve the behavior you want, with the given restrictions, you'll have to delve into the JSONEncoder class a little. Below I've written out a custom JSONEncoder that overrides the iterencode method to pass a custom isinstance method to _make_iterencode. It isn't the cleanest thing in the world, but seems to be the best given the options and it keeps customization to a minimum.
# customencoder.py
from json.encoder import (_make_iterencode, JSONEncoder,
                          encode_basestring_ascii, FLOAT_REPR, INFINITY,
                          c_make_encoder, encode_basestring)


class CustomObjectEncoder(JSONEncoder):

    def iterencode(self, o, _one_shot=False):
        """
        Most of the original method has been left untouched.

        _one_shot is forced to False to prevent c_make_encoder from
        being used. c_make_encoder is a function defined in C, so it's easier
        to avoid using it than overriding/redefining it.

        The keyword argument isinstance for _make_iterencode has been set
        to self.isinstance. This allows for a custom isinstance function
        to be defined, which can be used to defer the serialization of custom
        objects to the default method.
        """
        # Force the use of _make_iterencode instead of c_make_encoder
        _one_shot = False

        if self.check_circular:
            markers = {}
        else:
            markers = None
        if self.ensure_ascii:
            _encoder = encode_basestring_ascii
        else:
            _encoder = encode_basestring
        if self.encoding != 'utf-8':
            def _encoder(o, _orig_encoder=_encoder, _encoding=self.encoding):
                if isinstance(o, str):
                    o = o.decode(_encoding)
                return _orig_encoder(o)

        def floatstr(o, allow_nan=self.allow_nan,
                     _repr=FLOAT_REPR, _inf=INFINITY, _neginf=-INFINITY):
            if o != o:
                text = 'NaN'
            elif o == _inf:
                text = 'Infinity'
            elif o == _neginf:
                text = '-Infinity'
            else:
                return _repr(o)

            if not allow_nan:
                raise ValueError(
                    "Out of range float values are not JSON compliant: " +
                    repr(o))

            return text

        # Instead of forcing _one_shot to False, you can also just
        # remove the first part of this conditional statement and only
        # call _make_iterencode
        if (_one_shot and c_make_encoder is not None
                and self.indent is None and not self.sort_keys):
            _iterencode = c_make_encoder(
                markers, self.default, _encoder, self.indent,
                self.key_separator, self.item_separator, self.sort_keys,
                self.skipkeys, self.allow_nan)
        else:
            _iterencode = _make_iterencode(
                markers, self.default, _encoder, self.indent, floatstr,
                self.key_separator, self.item_separator, self.sort_keys,
                self.skipkeys, _one_shot, isinstance=self.isinstance)
        return _iterencode(o, 0)
You can now subclass the CustomObjectEncoder so it correctly serializes your custom objects. The CustomObjectEncoder can also do cool stuff like handle nested objects.
# test.py
import json
import datetime
from customencoder import CustomObjectEncoder


class MyEncoder(CustomObjectEncoder):

    def isinstance(self, obj, cls):
        if isinstance(obj, (mList, mDict)):
            return False
        return isinstance(obj, cls)

    def default(self, obj):
        """
        Defines custom serialization.

        To avoid circular references, any object that will always fail
        self.isinstance must be converted to something that is
        deserializable here.
        """
        if isinstance(obj, datetime.datetime):
            return obj.isoformat()
        elif isinstance(obj, mDict):
            return {"orig": dict(obj), "attrs": vars(obj)}
        elif isinstance(obj, mList):
            return {"orig": list(obj), "attrs": vars(obj)}
        else:
            return None


class mList(list):
    pass


class mDict(dict):
    pass


def main():
    zelda = mList(['zelda'])
    zelda.src = "oldschool"
    games = mList(['mario', 'contra', 'tetris', zelda])
    games.src = 'console'
    scores = mDict({'dp': 10, 'pk': 45})
    scores.processed = "unprocessed"
    test_json = {'games': games, 'scores': scores,
                 'date': datetime.datetime.now()}
    print(json.dumps(test_json, cls=MyEncoder))


if __name__ == '__main__':
    main()
The answer by FastTurtle might be a much cleaner solution.
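For comparison, here is a sketch of a different technique from the encoder subclassing shown in these answers: convert the object tree into plain dicts and lists first, then hand the result to json.dumps. It avoids the private json.encoder internals at the cost of a separate pre-processing pass. This is a Python 3 illustration, and to_debug is a hypothetical helper name:

```python
import datetime
import json


class mDict(dict):
    pass


class mList(list):
    pass


def to_debug(obj):
    """Recursively replace mDict/mList with {'orig': ..., 'attrs': ...}."""
    if isinstance(obj, dict):
        plain = {key: to_debug(value) for key, value in obj.items()}
    elif isinstance(obj, (list, tuple)):
        plain = [to_debug(item) for item in obj]
    elif isinstance(obj, datetime.datetime):
        return obj.isoformat()
    else:
        return obj  # leave scalars for json.dumps to handle
    if isinstance(obj, (mDict, mList)):
        # vars(obj) holds the extra attributes set on the instance.
        return {"orig": plain, "attrs": to_debug(vars(obj))}
    return plain


zelda = mList(["zelda"])
zelda.src = "oldschool"
games = mList(["mario", zelda])
games.src = "console"
scores = mDict({"dp": 10})
scores.processed = "unprocessed"

debug_json = json.dumps(to_debug({"games": games, "scores": scores}),
                        sort_keys=True)
print(debug_json)
```

Nested custom objects such as zelda are wrapped as well, because the conversion recurses before the wrapper is applied.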
Here's something close to what you want based on the technique as explained in my question/answer: Overriding nested JSON encoding of inherited default supported objects like dict, list
import json
import datetime


class mDict(dict):
    pass


class mList(list):
    pass


class JsonDebugEncoder(json.JSONEncoder):
    def _iterencode(self, o, markers=None):
        if isinstance(o, mDict):
            yield '{"__mDict__": '
            # Encode dictionary
            yield '{"orig": '
            for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers):
                yield chunk
            yield ', '
            # / End of Encode dictionary
            # Encode attributes as one JSON object, comma-separated
            yield '"attr": {'
            for i, (key, value) in enumerate(o.__dict__.iteritems()):
                if i:
                    yield ', '
                yield '"' + key + '": '
                for chunk in super(JsonDebugEncoder, self)._iterencode(value, markers):
                    yield chunk
            yield '}'
            # / End of Encode attributes
            yield '}'
            yield '}'
        elif isinstance(o, mList):
            yield '{"__mList__": '
            # Encode list
            yield '{"orig": '
            for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers):
                yield chunk
            yield ', '
            # / End of Encode list
            # Encode attributes as one JSON object, comma-separated
            yield '"attr": {'
            for i, (key, value) in enumerate(o.__dict__.iteritems()):
                if i:
                    yield ', '
                yield '"' + key + '": '
                for chunk in super(JsonDebugEncoder, self)._iterencode(value, markers):
                    yield chunk
            yield '}'
            # / End of Encode attributes
            yield '}'
            yield '}'
        else:
            for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers=markers):
                yield chunk

    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            return obj.isoformat()


class JsonDebugDecoder(json.JSONDecoder):
    def decode(self, s):
        obj = super(JsonDebugDecoder, self).decode(s)
        obj = self.recursiveObjectDecode(obj)
        return obj

    def recursiveObjectDecode(self, obj):
        if isinstance(obj, dict):
            decoders = [("__mList__", self.mListDecode),
                        ("__mDict__", self.mDictDecode)]
            for placeholder, decoder in decoders:
                if placeholder in obj:  # We assume it's supposed to be converted
                    return decoder(obj[placeholder])
            else:
                for k in obj:
                    obj[k] = self.recursiveObjectDecode(obj[k])
        elif isinstance(obj, list):
            for x in range(len(obj)):
                obj[x] = self.recursiveObjectDecode(obj[x])
        return obj

    def mDictDecode(self, o):
        res = mDict()
        for key, value in o['orig'].iteritems():
            res[key] = self.recursiveObjectDecode(value)
        for key, value in o['attr'].iteritems():
            res.__dict__[key] = self.recursiveObjectDecode(value)
        return res

    def mListDecode(self, o):
        res = mList()
        for value in o['orig']:
            res.append(self.recursiveObjectDecode(value))
        for key, value in o['attr'].iteritems():
            res.__dict__[key] = self.recursiveObjectDecode(value)
        return res


def test_debug_json():
    games = mList(['mario', 'contra', 'tetris'])
    games.src = 'console'
    scores = mDict({'dp': 10, 'pk': 45})
    scores.processed = "unprocessed"
    test_json = {'games': games, 'scores': scores,
                 'date': datetime.datetime.now()}
    jsonDump = json.dumps(test_json, cls=JsonDebugEncoder)
    print jsonDump
    test_pyObject = json.loads(jsonDump, cls=JsonDebugDecoder)
    print test_pyObject


if __name__ == '__main__':
    test_debug_json()
This results in:
{"date": "2013-05-06T22:28:08.967000", "games": {"__mList__": {"orig": ["mario", "contra", "tetris"], "attr": {"src": "console"}}}, "scores": {"__mDict__": {"orig": {"pk": 45, "dp": 10}, "attr": {"processed": "unprocessed"}}}}
This way you can encode it and decode it back to the python object it came from.
EDIT:
Here's a version that actually encodes it to the output you wanted and can decode it as well. Whenever a dictionary contains both 'orig' and 'attr', the decoder checks whether 'orig' holds a dictionary or a list; if so, it converts the object back to an mDict or an mList respectively.
import json
import datetime


class mDict(dict):
    pass


class mList(list):
    pass


class JsonDebugEncoder(json.JSONEncoder):
    def _iterencode(self, o, markers=None):
        if isinstance(o, mDict):
            # Encode mDict
            yield '{"orig": '
            for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers):
                yield chunk
            yield ', '
            # Encode attributes as one JSON object, comma-separated
            yield '"attr": {'
            for i, (key, value) in enumerate(o.__dict__.iteritems()):
                if i:
                    yield ', '
                yield '"' + key + '": '
                for chunk in super(JsonDebugEncoder, self)._iterencode(value, markers):
                    yield chunk
            yield '}'
            yield '}'
            # / End of Encode attributes
        elif isinstance(o, mList):
            # Encode mList
            yield '{"orig": '
            for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers):
                yield chunk
            yield ', '
            # Encode attributes as one JSON object, comma-separated
            yield '"attr": {'
            for i, (key, value) in enumerate(o.__dict__.iteritems()):
                if i:
                    yield ', '
                yield '"' + key + '": '
                for chunk in super(JsonDebugEncoder, self)._iterencode(value, markers):
                    yield chunk
            yield '}'
            yield '}'
        else:
            for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers=markers):
                yield chunk

    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            # Encode datetime
            return obj.isoformat()


class JsonDebugDecoder(json.JSONDecoder):
    def decode(self, s):
        obj = super(JsonDebugDecoder, self).decode(s)
        obj = self.recursiveObjectDecode(obj)
        return obj

    def recursiveObjectDecode(self, obj):
        if isinstance(obj, dict):
            if "orig" in obj and "attr" in obj and isinstance(obj["orig"], list):
                return self.mListDecode(obj)
            elif "orig" in obj and "attr" in obj and isinstance(obj['orig'], dict):
                return self.mDictDecode(obj)
            else:
                for k in obj:
                    obj[k] = self.recursiveObjectDecode(obj[k])
        elif isinstance(obj, list):
            for x in range(len(obj)):
                obj[x] = self.recursiveObjectDecode(obj[x])
        return obj

    def mDictDecode(self, o):
        res = mDict()
        for key, value in o['orig'].iteritems():
            res[key] = self.recursiveObjectDecode(value)
        for key, value in o['attr'].iteritems():
            res.__dict__[key] = self.recursiveObjectDecode(value)
        return res

    def mListDecode(self, o):
        res = mList()
        for value in o['orig']:
            res.append(self.recursiveObjectDecode(value))
        for key, value in o['attr'].iteritems():
            res.__dict__[key] = self.recursiveObjectDecode(value)
        return res


def test_debug_json():
    games = mList(['mario', 'contra', 'tetris'])
    games.src = 'console'
    scores = mDict({'dp': 10, 'pk': 45})
    scores.processed = "unprocessed"
    test_json = {'games': games, 'scores': scores,
                 'date': datetime.datetime.now()}
    jsonDump = json.dumps(test_json, cls=JsonDebugEncoder)
    print jsonDump
    test_pyObject = json.loads(jsonDump, cls=JsonDebugDecoder)
    print test_pyObject
    print test_pyObject['games'].src


if __name__ == '__main__':
    test_debug_json()
Here's some more info about the output:
# Encoded
{"date": "2013-05-06T22:41:35.498000", "games": {"orig": ["mario", "contra", "tetris"], "attr": {"src": "console"}}, "scores": {"orig": {"pk": 45, "dp": 10}, "attr": {"processed": "unprocessed"}}}

# Decoded ('games' contains the mList with the src attribute and 'scores' contains the mDict processed attribute)
# Note that printing the python objects doesn't directly show the processed and src attributes, as seen below.
{u'date': u'2013-05-06T22:41:35.498000', u'games': [u'mario', u'contra', u'tetris'], u'scores': {u'pk': 45, u'dp': 10}}
Sorry for any bad naming conventions, it's a quick setup. ;)
Note: The datetime doesn't get decoded back to the python representation. Implementing that could be done by checking for any dict key that is called 'date' and contains a valid string representation of a datetime.
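The idea in the note above could look roughly like this. It is a Python 3 sketch that uses the standard object_hook parameter of json.loads rather than the decoder class from the answer; decode_dates is a hypothetical helper name, and datetime.fromisoformat requires Python 3.7 or later:

```python
import datetime
import json


def decode_dates(obj):
    """object_hook: turn the value under 'date' back into a datetime."""
    value = obj.get("date")
    if isinstance(value, str):
        try:
            obj["date"] = datetime.datetime.fromisoformat(value)
        except ValueError:
            pass  # not an ISO 8601 timestamp; leave the string untouched
    return obj


decoded = json.loads(
    '{"date": "2013-05-06T22:41:35.498000", "games": ["mario"]}',
    object_hook=decode_dates,
)
print(repr(decoded["date"]))

# A 'date' key holding something else is left as a plain string.
untouched = json.loads('{"date": "soon"}', object_hook=decode_dates)
```

Keying on the name 'date' is fragile: any other field holding a timestamp stays a string, so a real decoder would need a more explicit marker.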