How to change json encoding behaviour for serializable python object?

Tags: python, json

It is easy to change the format of an object that is not JSON serializable, e.g. datetime.datetime.

My requirement, for debugging purposes, is to alter the way some custom objects that extend base types like dict and list get serialized to JSON. Code:

    import datetime
    import json


    def json_debug_handler(obj):
        print("object received:")
        print(type(obj))
        print("\n\n")
        if isinstance(obj, datetime.datetime):
            return obj.isoformat()
        elif isinstance(obj, mDict):
            return {'orig': obj, 'attrs': vars(obj)}
        elif isinstance(obj, mList):
            return {'orig': obj, 'attrs': vars(obj)}
        else:
            return None


    class mDict(dict):
        pass


    class mList(list):
        pass


    def test_debug_json():
        games = mList(['mario', 'contra', 'tetris'])
        games.src = 'console'
        scores = mDict({'dp': 10, 'pk': 45})
        scores.processed = "unprocessed"
        test_json = {'games': games, 'scores': scores,
                     'date': datetime.datetime.now()}
        print(json.dumps(test_json, default=json_debug_handler))


    if __name__ == '__main__':
        test_debug_json()

DEMO : http://ideone.com/hQJnLy

Output:

{"date": "2013-05-07T01:03:13.098727", "games": ["mario", "contra", "tetris"], "scores": {"pk": 45, "dp": 10}}

Desired output:

{"date": "2013-05-07T01:03:13.098727", "games": {"orig": ["mario", "contra", "tetris"], "attrs": {"src": "console"}}, "scores": {"orig": {"pk": 45, "dp": 10}, "attrs": {"processed": "unprocessed"}}}

Does the default handler not work for serializable objects? If not, how can I override this without adding toJSON methods to the extended classes?

Also, there is this version of a JSON encoder, which does not work:

    class JsonDebugEncoder(json.JSONEncoder):
        def default(self, obj):
            if isinstance(obj, datetime.datetime):
                return obj.isoformat()
            elif isinstance(obj, mDict):
                return {'orig': obj, 'attrs': vars(obj)}
            elif isinstance(obj, mList):
                return {'orig': obj, 'attrs': vars(obj)}
            else:
                return json.JSONEncoder.default(self, obj)

If there is a hack involving pickle, __getstate__ and __setstate__, followed by json.dumps over the pickle.loads object, I am open to that as well. I tried it, but it did not work.

DhruvPathak asked May 06 '13 19:05


2 Answers

It seems that to achieve the behavior you want, with the given restrictions, you'll have to delve into the JSONEncoder class a little. Below I've written out a custom JSONEncoder that overrides the iterencode method to pass a custom isinstance method to _make_iterencode. It isn't the cleanest thing in the world, but seems to be the best given the options and it keeps customization to a minimum.
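Before the encoder itself, a small sketch of why the `default` hook alone cannot do this: the encoder consults `default` only for objects it does not already know how to serialize, and subclasses of dict and list are handled natively. The sketch is written for Python 3, and the names `recording_default` and `seen` are hypothetical helpers introduced only for this illustration.

```python
import datetime
import json


class mDict(dict):
    pass


class mList(list):
    pass


seen = []  # type names handed to the default hook, in call order


def recording_default(obj):
    """Record every object the encoder could not serialize by itself."""
    seen.append(type(obj).__name__)
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    raise TypeError("not serializable: %r" % (obj,))


games = mList(["mario", "contra", "tetris"])
games.src = "console"
scores = mDict({"dp": 10, "pk": 45})
scores.processed = "unprocessed"
payload = {"games": games, "scores": scores,
           "date": datetime.datetime(2013, 5, 7, 1, 3, 13)}

encoded = json.dumps(payload, default=recording_default, sort_keys=True)
print(seen)     # ['datetime']
print(encoded)  # games and scores come out as a plain array and object
```

Only the datetime reaches the hook. The mList and mDict instances never do, so their extra attributes are dropped without any error.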

    # customencoder.py
    from json.encoder import (_make_iterencode, JSONEncoder,
                              encode_basestring_ascii, FLOAT_REPR, INFINITY,
                              c_make_encoder, encode_basestring)


    class CustomObjectEncoder(JSONEncoder):

        def iterencode(self, o, _one_shot=False):
            """
            Most of the original method has been left untouched.

            _one_shot is forced to False to prevent c_make_encoder from
            being used. c_make_encoder is a function defined in C, so it's easier
            to avoid using it than overriding/redefining it.

            The keyword argument isinstance for _make_iterencode has been set
            to self.isinstance. This allows for a custom isinstance function
            to be defined, which can be used to defer the serialization of custom
            objects to the default method.
            """
            # Force the use of _make_iterencode instead of c_make_encoder
            _one_shot = False

            if self.check_circular:
                markers = {}
            else:
                markers = None
            if self.ensure_ascii:
                _encoder = encode_basestring_ascii
            else:
                _encoder = encode_basestring
            if self.encoding != 'utf-8':
                def _encoder(o, _orig_encoder=_encoder, _encoding=self.encoding):
                    if isinstance(o, str):
                        o = o.decode(_encoding)
                    return _orig_encoder(o)

            def floatstr(o, allow_nan=self.allow_nan,
                         _repr=FLOAT_REPR, _inf=INFINITY, _neginf=-INFINITY):
                if o != o:
                    text = 'NaN'
                elif o == _inf:
                    text = 'Infinity'
                elif o == _neginf:
                    text = '-Infinity'
                else:
                    return _repr(o)

                if not allow_nan:
                    raise ValueError(
                        "Out of range float values are not JSON compliant: " +
                        repr(o))

                return text

            # Instead of forcing _one_shot to False, you can also just
            # remove the first part of this conditional statement and only
            # call _make_iterencode
            if (_one_shot and c_make_encoder is not None
                    and self.indent is None and not self.sort_keys):
                _iterencode = c_make_encoder(
                    markers, self.default, _encoder, self.indent,
                    self.key_separator, self.item_separator, self.sort_keys,
                    self.skipkeys, self.allow_nan)
            else:
                _iterencode = _make_iterencode(
                    markers, self.default, _encoder, self.indent, floatstr,
                    self.key_separator, self.item_separator, self.sort_keys,
                    self.skipkeys, _one_shot, isinstance=self.isinstance)
            return _iterencode(o, 0)

You can now subclass the CustomObjectEncoder so it correctly serializes your custom objects. The CustomObjectEncoder can also do cool stuff like handle nested objects.

    # test.py
    import json
    import datetime
    from customencoder import CustomObjectEncoder


    class MyEncoder(CustomObjectEncoder):

        def isinstance(self, obj, cls):
            # Pretend mList/mDict are not list/dict so that the encoder
            # hands them to default() instead of serializing them natively.
            if isinstance(obj, (mList, mDict)):
                return False
            return isinstance(obj, cls)

        def default(self, obj):
            """
            Defines custom serialization.

            To avoid circular references, any object that will always fail
            self.isinstance must be converted to something that is
            serializable here.
            """
            if isinstance(obj, datetime.datetime):
                return obj.isoformat()
            elif isinstance(obj, mDict):
                return {"orig": dict(obj), "attrs": vars(obj)}
            elif isinstance(obj, mList):
                return {"orig": list(obj), "attrs": vars(obj)}
            else:
                return None


    class mList(list):
        pass


    class mDict(dict):
        pass


    def main():
        zelda = mList(['zelda'])
        zelda.src = "oldschool"
        games = mList(['mario', 'contra', 'tetris', zelda])
        games.src = 'console'
        scores = mDict({'dp': 10, 'pk': 45})
        scores.processed = "unprocessed"
        test_json = {'games': games, 'scores': scores,
                     'date': datetime.datetime.now()}
        print(json.dumps(test_json, cls=MyEncoder))


    if __name__ == '__main__':
        main()
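The encoder above leans on private helpers from json.encoder and on self.encoding, which JSONEncoder does not have on Python 3. As an alternative sketch (a different technique, not the approach of this answer), the custom objects can be converted to plain {"orig": ..., "attrs": ...} dicts before json.dumps ever sees them. This also needs no toJSON methods on the classes. The helper names `to_debug` and `dumps_debug` are hypothetical and exist only for this illustration; the code assumes Python 3.

```python
import datetime
import json


class mDict(dict):
    pass


class mList(list):
    pass


def to_debug(obj):
    """Recursively replace mDict/mList with {'orig': ..., 'attrs': ...}."""
    if isinstance(obj, dict):
        plain = {key: to_debug(value) for key, value in obj.items()}
        if isinstance(obj, mDict):
            return {"orig": plain, "attrs": to_debug(vars(obj))}
        return plain
    if isinstance(obj, (list, tuple)):
        plain = [to_debug(item) for item in obj]
        if isinstance(obj, mList):
            return {"orig": plain, "attrs": to_debug(vars(obj))}
        return plain
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    return obj  # anything else is left for json.dumps to handle or reject


def dumps_debug(obj, **kwargs):
    """Hypothetical wrapper: pre-process, then use the stock encoder."""
    return json.dumps(to_debug(obj), **kwargs)


if __name__ == "__main__":
    zelda = mList(["zelda"])
    zelda.src = "oldschool"
    games = mList(["mario", "contra", "tetris", zelda])
    games.src = "console"
    scores = mDict({"dp": 10, "pk": 45})
    scores.processed = "unprocessed"
    print(dumps_debug({"games": games, "scores": scores,
                       "date": datetime.datetime.now()}, sort_keys=True))
```

The trade-off is an extra pass that copies the whole structure, which is usually acceptable for debugging output but may matter for very large payloads.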
FastTurtle answered Sep 22 '22 20:09


The answer by FastTurtle might be a much cleaner solution.

Here's something close to what you want based on the technique as explained in my question/answer: Overriding nested JSON encoding of inherited default supported objects like dict, list

    import json
    import datetime


    class mDict(dict):
        pass


    class mList(list):
        pass


    class JsonDebugEncoder(json.JSONEncoder):
        # Note: the attribute loops below emit valid JSON only when the
        # object carries exactly one attribute.
        def _iterencode(self, o, markers=None):
            if isinstance(o, mDict):
                yield '{"__mDict__": '
                # Encode dictionary
                yield '{"orig": '
                for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers):
                    yield chunk
                yield ', '
                # / End of Encode dictionary
                # Encode attributes
                yield '"attr": '
                for key, value in o.__dict__.iteritems():
                    yield '{"' + key + '": '
                    for chunk in super(JsonDebugEncoder, self)._iterencode(value, markers):
                        yield chunk
                    yield '}'
                yield '}'
                # / End of Encode attributes
                yield '}'
            elif isinstance(o, mList):
                yield '{"__mList__": '
                # Encode list
                yield '{"orig": '
                for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers):
                    yield chunk
                yield ', '
                # / End of Encode list
                # Encode attributes
                yield '"attr": '
                for key, value in o.__dict__.iteritems():
                    yield '{"' + key + '": '
                    for chunk in super(JsonDebugEncoder, self)._iterencode(value, markers):
                        yield chunk
                    yield '}'
                yield '}'
                # / End of Encode attributes
                yield '}'
            else:
                for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers=markers):
                    yield chunk

        def default(self, obj):
            if isinstance(obj, datetime.datetime):
                return obj.isoformat()


    class JsonDebugDecoder(json.JSONDecoder):
        def decode(self, s):
            obj = super(JsonDebugDecoder, self).decode(s)
            obj = self.recursiveObjectDecode(obj)
            return obj

        def recursiveObjectDecode(self, obj):
            if isinstance(obj, dict):
                decoders = [("__mList__", self.mListDecode),
                            ("__mDict__", self.mDictDecode)]
                for placeholder, decoder in decoders:
                    if placeholder in obj:
                        # We assume it's supposed to be converted
                        return decoder(obj[placeholder])
                    else:
                        for k in obj:
                            obj[k] = self.recursiveObjectDecode(obj[k])
            elif isinstance(obj, list):
                for x in range(len(obj)):
                    obj[x] = self.recursiveObjectDecode(obj[x])
            return obj

        def mDictDecode(self, o):
            res = mDict()
            for key, value in o['orig'].iteritems():
                res[key] = self.recursiveObjectDecode(value)
            for key, value in o['attr'].iteritems():
                res.__dict__[key] = self.recursiveObjectDecode(value)
            return res

        def mListDecode(self, o):
            res = mList()
            for value in o['orig']:
                res.append(self.recursiveObjectDecode(value))
            for key, value in o['attr'].iteritems():
                res.__dict__[key] = self.recursiveObjectDecode(value)
            return res


    def test_debug_json():
        games = mList(['mario', 'contra', 'tetris'])
        games.src = 'console'
        scores = mDict({'dp': 10, 'pk': 45})
        scores.processed = "unprocessed"
        test_json = {'games': games, 'scores': scores,
                     'date': datetime.datetime.now()}
        jsonDump = json.dumps(test_json, cls=JsonDebugEncoder)
        print jsonDump
        test_pyObject = json.loads(jsonDump, cls=JsonDebugDecoder)
        print test_pyObject


    if __name__ == '__main__':
        test_debug_json()

This results in:

{"date": "2013-05-06T22:28:08.967000", "games": {"__mList__": {"orig": ["mario", "contra", "tetris"], "attr": {"src": "console"}}}, "scores": {"__mDict__": {"orig": {"pk": 45, "dp": 10}, "attr": {"processed": "unprocessed"}}}} 

This way you can encode it and decode it back to the python object it came from.

EDIT:

Here's a version that encodes it to the output you wanted (with the key 'attr' rather than 'attrs') and can decode it as well. Whenever a dictionary contains both 'orig' and 'attr', the decoder checks whether 'orig' holds a dictionary or a list and, if so, converts the object back to an mDict or an mList respectively.

    import json
    import datetime


    class mDict(dict):
        pass


    class mList(list):
        pass


    class JsonDebugEncoder(json.JSONEncoder):
        # Note: the attribute loops below emit valid JSON only when the
        # object carries exactly one attribute.
        def _iterencode(self, o, markers=None):
            if isinstance(o, mDict):    # Encode mDict
                yield '{"orig": '
                for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers):
                    yield chunk
                yield ', '
                yield '"attr": '
                for key, value in o.__dict__.iteritems():
                    yield '{"' + key + '": '
                    for chunk in super(JsonDebugEncoder, self)._iterencode(value, markers):
                        yield chunk
                    yield '}'
                yield '}'
                # / End of Encode attributes
            elif isinstance(o, mList):    # Encode mList
                yield '{"orig": '
                for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers):
                    yield chunk
                yield ', '
                yield '"attr": '
                for key, value in o.__dict__.iteritems():
                    yield '{"' + key + '": '
                    for chunk in super(JsonDebugEncoder, self)._iterencode(value, markers):
                        yield chunk
                    yield '}'
                yield '}'
            else:
                for chunk in super(JsonDebugEncoder, self)._iterencode(o, markers=markers):
                    yield chunk

        def default(self, obj):
            if isinstance(obj, datetime.datetime):    # Encode datetime
                return obj.isoformat()


    class JsonDebugDecoder(json.JSONDecoder):
        def decode(self, s):
            obj = super(JsonDebugDecoder, self).decode(s)
            obj = self.recursiveObjectDecode(obj)
            return obj

        def recursiveObjectDecode(self, obj):
            if isinstance(obj, dict):
                if "orig" in obj and "attr" in obj and isinstance(obj["orig"], list):
                    return self.mListDecode(obj)
                elif "orig" in obj and "attr" in obj and isinstance(obj['orig'], dict):
                    return self.mDictDecode(obj)
                else:
                    for k in obj:
                        obj[k] = self.recursiveObjectDecode(obj[k])
            elif isinstance(obj, list):
                for x in range(len(obj)):
                    obj[x] = self.recursiveObjectDecode(obj[x])
            return obj

        def mDictDecode(self, o):
            res = mDict()
            for key, value in o['orig'].iteritems():
                res[key] = self.recursiveObjectDecode(value)
            for key, value in o['attr'].iteritems():
                res.__dict__[key] = self.recursiveObjectDecode(value)
            return res

        def mListDecode(self, o):
            res = mList()
            for value in o['orig']:
                res.append(self.recursiveObjectDecode(value))
            for key, value in o['attr'].iteritems():
                res.__dict__[key] = self.recursiveObjectDecode(value)
            return res


    def test_debug_json():
        games = mList(['mario', 'contra', 'tetris'])
        games.src = 'console'
        scores = mDict({'dp': 10, 'pk': 45})
        scores.processed = "unprocessed"
        test_json = {'games': games, 'scores': scores,
                     'date': datetime.datetime.now()}
        jsonDump = json.dumps(test_json, cls=JsonDebugEncoder)
        print jsonDump
        test_pyObject = json.loads(jsonDump, cls=JsonDebugDecoder)
        print test_pyObject
        print test_pyObject['games'].src


    if __name__ == '__main__':
        test_debug_json()

Here's some more info about the output:

    # Encoded
    {"date": "2013-05-06T22:41:35.498000", "games": {"orig": ["mario", "contra", "tetris"], "attr": {"src": "console"}}, "scores": {"orig": {"pk": 45, "dp": 10}, "attr": {"processed": "unprocessed"}}}

    # Decoded ('games' contains the mList with the src attribute and 'scores' contains the mDict processed attribute)
    # Note that printing the python objects doesn't directly show the processed and src attributes, as seen below.
    {u'date': u'2013-05-06T22:41:35.498000', u'games': [u'mario', u'contra', u'tetris'], u'scores': {u'pk': 45, u'dp': 10}}

Sorry for any bad naming conventions, it's a quick setup. ;)

Note: The datetime doesn't get decoded back to the python representation. Implementing that could be done by checking for any dict key that is called 'date' and contains a valid string representation of a datetime.
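A minimal sketch of that idea, assuming Python 3 and assuming the date is always stored under a key literally named 'date' in the form produced by isoformat(). It uses a plain object_hook rather than the JsonDebugDecoder above, and the names `ISO_FORMATS`, `parse_iso` and `date_hook` are hypothetical.

```python
import datetime
import json

# isoformat() omits the fractional part when microseconds are zero,
# so both layouts are tried.
ISO_FORMATS = ("%Y-%m-%dT%H:%M:%S.%f", "%Y-%m-%dT%H:%M:%S")


def parse_iso(value):
    """Return a datetime for an isoformat() string, else the value unchanged."""
    if not isinstance(value, str):
        return value
    for fmt in ISO_FORMATS:
        try:
            return datetime.datetime.strptime(value, fmt)
        except ValueError:
            continue
    return value


def date_hook(dct):
    """object_hook: convert the value stored under 'date' if it parses."""
    if "date" in dct:
        dct["date"] = parse_iso(dct["date"])
    return dct


decoded = json.loads('{"date": "2013-05-06T22:41:35.498000", "n": 1}',
                     object_hook=date_hook)
print(type(decoded["date"]))
```

Strings under 'date' that do not match either layout are left untouched, so an unrelated 'date' field does not raise.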

Roy Nieterau answered Sep 24 '22 20:09