
How to limit the number of float digits JSONEncoder produces?

I am trying to use the Python json library to save a dictionary to a file; its values include other dictionaries. It contains many floats, and I would like to limit them to, for example, 7 digits.

According to other posts on SO, json.encoder.FLOAT_REPR should be used. However, it is not working.

For example, the code below, run in Python 3.7.1, prints all the digits:

import json
json.encoder.FLOAT_REPR = lambda o: format(o, '.7f' )
d = dict()
d['val'] = 5.78686876876089075543
d['name'] = 'kjbkjbkj'
f = open('test.json', 'w')
json.dump(d, f, indent=4)
f.close()

How can I solve that?

It might be irrelevant but I am on macOS.

EDIT

This question was marked as a duplicate. However, the accepted answer to the original post (so far the only one) clearly states:

Note: This solution doesn't work on python 3.6+

So that solution is not the proper one. Plus, it uses the simplejson library, not json.

Francesco Boi asked Jan 25 '19 17:01



2 Answers

Option 1: Use regular expression matching to round.

You can dump your object to a string using json.dumps and then use the technique shown on this post to find and round your floating point numbers.

To test it out, I added some more complicated nested structures on top of the example you provided:

import json
import re

d = dict()
d['val'] = 5.78686876876089075543
d['name'] = 'kjbkjbkj'
d["mylist"] = [1.23456789, 12, 1.23, {"foo": "a", "bar": 9.87654321}]
d["mydict"] = {"bar": "b", "foo": 1.92837465}

# dump the object to a string
d_string = json.dumps(d, indent=4)

# find numbers with 8 or more digits after the decimal point
pat = re.compile(r"\d+\.\d{8,}")
def mround(match):
    # reformat the matched number with exactly 7 decimal places
    return "{:.7f}".format(float(match.group()))

# write the modified string to a file
with open('test.json', 'w') as f:
    f.write(re.sub(pat, mround, d_string))

The output test.json looks like:

{
    "val": 5.7868688,
    "name": "kjbkjbkj",
    "mylist": [
        1.2345679,
        12,
        1.23,
        {
            "foo": "a",
            "bar": 9.8765432
        }
    ],
    "mydict": {
        "bar": "b",
        "foo": 1.9283747
    }
}

One limitation of this method is that it will also match numbers that are within double quotes (floats represented as strings). You could come up with a more restrictive regex to handle this, depending on your needs.
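A different way to sidestep the quoted-number problem, rather than tightening the regex, is to round the floats in the data before serializing, so strings are never touched. The following is only a minimal sketch of that swapped-in technique; the helper name round_floats and the sample key as_text are made up for illustration.

```python
import json

def round_floats(obj, ndigits=7):
    """Return a copy of obj with every float rounded to ndigits decimal places."""
    if isinstance(obj, float):
        return round(obj, ndigits)
    if isinstance(obj, dict):
        return {key: round_floats(value, ndigits) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        # tuples become lists, which is how json would serialize them anyway
        return [round_floats(value, ndigits) for value in obj]
    # strings, ints, bools and None are returned unchanged
    return obj

data = {
    "val": 5.78686876876089075543,
    "as_text": "3.141592653589793",  # a float stored as a string stays as-is
    "items": [1.23456789, 12],
}
result = json.dumps(round_floats(data))
```

Unlike the '.7f' format, round() does not pad with trailing zeroes, and very small values may still be written in exponent notation.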

Option 2: subclass json.JSONEncoder

Here is an encoder that works on your example and handles floats nested inside dicts and lists (other containers, such as tuples, fall through to the default encoder and keep all their digits):

import json

class MyCustomEncoder(json.JSONEncoder):
    # _one_shot is accepted so that json.dumps(), which passes it, also works
    def iterencode(self, obj, _one_shot=False):
        if isinstance(obj, float):
            # fixed-point formatting with 7 decimal places
            yield format(obj, '.7f')
        elif isinstance(obj, dict):
            last_index = len(obj) - 1
            yield '{'
            for i, (key, value) in enumerate(obj.items()):
                # json.dumps quotes and escapes the key (string keys assumed)
                yield json.dumps(key) + ': '
                yield from MyCustomEncoder.iterencode(self, value)
                if i != last_index:
                    yield ", "
            yield '}'
        elif isinstance(obj, list):
            last_index = len(obj) - 1
            yield "["
            for i, o in enumerate(obj):
                yield from MyCustomEncoder.iterencode(self, o)
                if i != last_index:
                    yield ", "
            yield "]"
        else:
            # everything else is handled by the default encoder
            yield from json.JSONEncoder.iterencode(self, obj)

Now write the file using the custom encoder.

with open('test.json', 'w') as f:
    json.dump(d, f, cls = MyCustomEncoder)

The output file test.json:

{"val": 5.7868688, "name": "kjbkjbkj", "mylist": [1.2345679, 12, 1.2300000, {"foo": "a", "bar": 9.8765432}], "mydict": {"bar": "b", "foo": 1.9283747}}

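To make other keyword arguments such as indent work, the easiest approach is to read back the file that was just written and write it out again with the default encoder. The reloaded floats were parsed from the already rounded text, so the default encoder prints them in their short form: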

# write d using custom encoder
with open('test.json', 'w') as f:
    json.dump(d, f, cls = MyCustomEncoder)

# load output into new_d
with open('test.json', 'r') as f:
    new_d = json.load(f)

# write new_d out using default encoder
with open('test.json', 'w') as f:
    json.dump(new_d, f, indent=4)

Now the output file is the same as shown in option 1.
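The same round trip can also be done in memory. As a hedged sketch that does not need the custom encoder at all, the standard parse_float argument of json.loads can round each float while the text is parsed back; the function name parse_rounded and the sample data are made up for illustration.

```python
import json

d = {"val": 5.78686876876089075543, "mylist": [1.23456789, 12, 1.23]}

def parse_rounded(text):
    # json.loads calls this with the text of every JSON float it decodes
    return round(float(text), 7)

# serialize with full precision, then round each float while parsing back
rounded = json.loads(json.dumps(d), parse_float=parse_rounded)

# the default encoder can now be used with any keyword argument, e.g. indent
pretty = json.dumps(rounded, indent=4)
```

Integers are not passed to parse_float, so 12 stays an int.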

pault answered Oct 13 '22 11:10


It is still possible to monkey-patch json in Python 3, but instead of setting FLOAT_REPR, you need to override the name float inside the json.encoder module. Make sure to disable c_make_encoder, just as in Python 2.

import json

class RoundingFloat(float):
    __repr__ = staticmethod(lambda x: format(x, '.2f'))

json.encoder.c_make_encoder = None
if hasattr(json.encoder, 'FLOAT_REPR'):
    # Python 2
    json.encoder.FLOAT_REPR = RoundingFloat.__repr__
else:
    # Python 3
    json.encoder.float = RoundingFloat

print(json.dumps({'number': 1.0 / 81}))

Upsides: simplicity, and it can do other formatting (e.g. scientific notation or stripping trailing zeroes). Downside: it looks more dangerous than it is, although the patch does apply to every later json call in the process.
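Because the patch changes module-level state, it stays active for all later json calls. One way to limit that is to apply it only inside a with block. This is only a sketch, assuming Python 3 and the same undocumented json.encoder internals that the snippet above relies on; the rounded_floats context manager is a hypothetical helper, not part of the json library.

```python
import json
import json.encoder
from contextlib import contextmanager

@contextmanager
def rounded_floats(fmt='.2f'):
    """Temporarily make json format floats with the given format spec."""
    class RoundingFloat(float):
        __repr__ = staticmethod(lambda x: format(x, fmt))

    module_vars = vars(json.encoder)
    had_float = 'float' in module_vars
    saved_float = module_vars.get('float')
    saved_c_encoder = json.encoder.c_make_encoder

    # same patch as above: disable the C encoder and shadow the name float
    json.encoder.c_make_encoder = None
    json.encoder.float = RoundingFloat
    try:
        yield
    finally:
        # restore the module to the state it had before the block
        json.encoder.c_make_encoder = saved_c_encoder
        if had_float:
            json.encoder.float = saved_float
        else:
            del json.encoder.float

with rounded_floats('.3f'):
    inside = json.dumps({'number': 1.0 / 81})

# outside the block the default float formatting is back
outside = json.dumps({'number': 0.1234567})
```

The patch is still process-wide while the block runs, so this sketch is not safe if other threads serialize JSON at the same time.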

proski answered Oct 13 '22 11:10