
Is there a way to override python's json handler?

Tags: python, json

I'm having trouble encoding infinity in JSON.

json.dumps will convert this to Infinity, but I would like it to convert it to null or another value of my choosing.

Unfortunately, setting the default argument only seems to work if dumps doesn't already understand the object; otherwise the default handler appears to be bypassed.

Is there a way I can pre-encode the object, change the default way a type/class is encoded, or convert a certain type/class into a different object prior to normal encoding?

asked Jul 06 '13 14:07 by cammil




2 Answers

Look at the source here: http://hg.python.org/cpython/file/7ec9255d4189/Lib/json/encoder.py

If you subclass JSONEncoder, you can override just the iterencode(self, o, _one_shot=False) method, which has explicit special casing for Infinity (inside an inner function).

To make this reusable, you'll also want to extend __init__ to accept some new options and store them on the instance.

Alternatively, you could pick a JSON library from PyPI that has the extensibility you are looking for: https://pypi.python.org/pypi?%3Aaction=search&term=json&submit=search

Here's an example. It adds a nan_str option that replaces NaN; the Infinity branches inside floatstr can be changed in the same way:

import json

class FloatEncoder(json.JSONEncoder):

    def __init__(self, nan_str="null", **kwargs):
        super(FloatEncoder, self).__init__(**kwargs)
        # Text emitted in place of NaN; must be set inside __init__.
        self.nan_str = nan_str

    # uses code from official python json.encoder module.
    # Same licence applies.
    def iterencode(self, o, _one_shot=False):
        """Encode the given object and yield each string
        representation as available.

        For example::

            for chunk in JSONEncoder().iterencode(bigobject):
                mysocket.write(chunk)
        """
        if self.check_circular:
            markers = {}
        else:
            markers = None
        if self.ensure_ascii:
            _encoder = json.encoder.encode_basestring_ascii
        else:
            _encoder = json.encoder.encode_basestring
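        # `encoding` exists only on Python 2's JSONEncoder; skip this branch on Python 3.
        if getattr(self, 'encoding', 'utf-8') != 'utf-8':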
            def _encoder(o, _orig_encoder=_encoder,
                         _encoding=self.encoding):
                if isinstance(o, str):
                    o = o.decode(_encoding)
                return _orig_encoder(o)

        def floatstr(o, allow_nan=self.allow_nan,
                     _repr=json.encoder.FLOAT_REPR,
                     _inf=json.encoder.INFINITY,
                     _neginf=-json.encoder.INFINITY,
                     nan_str = self.nan_str):
            # Check for specials. Note that this type of test is 
            # processor and/or platform-specific, so do tests which
            # don't depend on the internals.

            if o != o:
                text = nan_str
            elif o == _inf:
                text = 'Infinity'
            elif o == _neginf:
                text = '-Infinity'
            else:
                return _repr(o)

            if not allow_nan:
                raise ValueError(
                    "Out of range float values are not JSON compliant: " +
                    repr(o))

            return text

        _iterencode = json.encoder._make_iterencode(
                markers, self.default, _encoder, self.indent, floatstr,
                self.key_separator, self.item_separator, self.sort_keys,
                self.skipkeys, _one_shot)
        return _iterencode(o, 0)


example_obj = {
    'name': 'example',
    'body': [
        1.1,
        {"3.3": 5, "1.1": float('Nan')},
        [float('inf'), 2.2]
    ]}

print(json.dumps(example_obj, cls=FloatEncoder))

ideone: http://ideone.com/dFWaNj

answered Sep 20 '22 01:09 by Marcin


No, there is no simple way to achieve this. In fact, according to the standard, NaN and Infinity floating-point values shouldn't be serialized as JSON at all; Python uses an extension of the standard. You can make the Python encoding standard-compliant by passing allow_nan=False to dumps, but this will raise a ValueError for infinities and NaNs even if you provide a default function.
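A minimal sketch of that behaviour, assuming Python 3 and the standard json module. The name fallback is a hypothetical handler used only for illustration:

```python
import json

calls = []

def fallback(obj):
    # Hypothetical handler: record every object the encoder hands over.
    calls.append(obj)
    return None

# float is a natively supported type, so `default` is never consulted.
encoded = json.dumps([float("inf")], default=fallback)
print(encoded)  # [Infinity]
print(calls)    # []

# With allow_nan=False the encoder raises instead of calling `default`.
try:
    json.dumps([float("inf")], default=fallback, allow_nan=False)
except ValueError as exc:
    print("ValueError:", exc)
```

In both calls the handler is never invoked, which matches the bypass described in the question.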

You have two ways of doing what you want:

  1. Subclass JSONEncoder and change how these values are encoded. Note that you will have to handle nested cases, such as a sequence that contains an infinity value. AFAIK there is no API to redefine how objects of a specific class are encoded.

  2. Make a copy of the object to encode and replace any occurrence of infinity/nan with None or some other object that is encoded as you want.
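The second option could be sketched as follows, assuming Python 3. The helper name replace_non_finite is hypothetical, and it only walks dicts, lists and tuples; dictionary keys and other container types are left untouched:

```python
import json
import math

def replace_non_finite(obj, replacement=None):
    """Return a copy of obj with NaN and +/-Infinity floats replaced."""
    if isinstance(obj, float):
        return obj if math.isfinite(obj) else replacement
    if isinstance(obj, dict):
        # Only values are rewritten; keys are kept as they are.
        return {k: replace_non_finite(v, replacement) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [replace_non_finite(v, replacement) for v in obj]
    return obj

data = {"name": "example",
        "body": [1.1, {"x": float("nan")}, [float("inf"), 2.2]]}

# allow_nan=False acts as a safety net: any non-finite float that
# slipped through the copy step raises ValueError instead of being written.
result = json.dumps(replace_non_finite(data), allow_nan=False)
print(result)  # {"name": "example", "body": [1.1, {"x": null}, [null, 2.2]]}
```

The cost is one extra copy of the data, but the approach uses only public APIs and does not depend on the encoder's internals.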

A less robust yet much simpler solution is to modify the encoded data, for example by replacing all Infinity substrings with null:

>>> import re
>>> infty_regex = re.compile(r'-?\bInfinity\b')
>>> def replace_infinities(encoded):
...     # Reuse the precompiled pattern; the optional '-' also covers -Infinity.
...     return infty_regex.sub('null', encoded)
... 
>>> import json
>>> replace_infinities(json.dumps([1, 2, 3, float('inf'), 4]))
'[1, 2, 3, null, 4]'

Obviously you would also have to account for the text Infinity appearing inside strings, so even here a robust solution is neither immediate nor elegant.
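A short illustration of that caveat, assuming Python 3. The sample string is made up for the example:

```python
import json
import re

infinity_regex = re.compile(r'-?\bInfinity\b')

# The first element is an ordinary string that happens to contain the word.
encoded = json.dumps(["To Infinity and beyond", float("inf")])
print(encoded)  # ["To Infinity and beyond", Infinity]

# The substitution cannot tell the string content from the float literal.
patched = infinity_regex.sub("null", encoded)
print(patched)  # ["To null and beyond", null]
```

The result is valid JSON, but the string value has been silently altered.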

answered Sep 18 '22 01:09 by Bakuriu