I've noticed some strange behavior in Python 3's implementation of json.dumps: the key order changes from execution to execution when I dump the same object. Googling didn't help, because I don't care about sorting the keys; I just want them to stay the same! Here is an example script:
import json

data = {
    'number': 42,
    'name': 'John Doe',
    'email': '[email protected]',
    'balance': 235.03,
    'isadmin': False,
    'groceries': [
        'apples',
        'bananas',
        'pears',
    ],
    'nested': {
        'complex': True,
        'value': 2153.23412
    }
}
print(json.dumps(data, indent=2))
When I run this script I get different outputs every time, for example:
$ python print_data.py
{
  "groceries": [
    "apples",
    "bananas",
    "pears"
  ],
  "isadmin": false,
  "nested": {
    "value": 2153.23412,
    "complex": true
  },
  "email": "[email protected]",
  "number": 42,
  "name": "John Doe",
  "balance": 235.03
}
But then I run it again and I get:
$ python print_data.py
{
  "email": "[email protected]",
  "balance": 235.03,
  "name": "John Doe",
  "nested": {
    "value": 2153.23412,
    "complex": true
  },
  "isadmin": false,
  "groceries": [
    "apples",
    "bananas",
    "pears"
  ],
  "number": 42
}
I understand that dictionaries are unordered collections and that the order is based on a hash function; however, in Python 2 the order (whatever it is) is fixed and doesn't change from one execution to the next. This makes my tests difficult to run, because I need to compare the JSON output of two different modules!
Any idea what is going on? How to fix it? Note that I would like to avoid using an OrderedDict or performing any sorting and what matters is that the string representation remains the same between executions. Also this is for testing purposes only and doesn't have any effect on the implementation of my module.
Yes, the order of elements in JSON arrays is preserved. From RFC 7159, The JavaScript Object Notation (JSON) Data Interchange Format: "An object is an unordered collection of zero or more name/value pairs, where a name is a string and a value is a string, number, boolean, null, object, or array. An array is an ordered sequence of zero or more values."
Objects, by contrast, carry no such guarantee: because the standard defines them as unordered, an implementation does not need to preserve any specific order of object keys.
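Since key order in an object carries no meaning while array order does, one option for tests is to compare the parsed values instead of the raw strings. A minimal sketch, with sample strings made up for illustration:

```python
import json

# Two serialisations of the same object, differing only in key order.
first = '{"name": "John Doe", "groceries": ["apples", "bananas"]}'
second = '{"groceries": ["apples", "bananas"], "name": "John Doe"}'

# Parsed objects become dicts, and dict equality ignores key order.
same_object = json.loads(first) == json.loads(second)

# Arrays become lists, and list equality is order-sensitive.
same_array = json.loads('["apples", "bananas"]') == json.loads('["bananas", "apples"]')

print(same_object)  # True
print(same_array)   # False
```

This only helps when the test is free to parse the output; if the exact string has to match, the key order still needs to be pinned down some other way.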
The json.dumps() function can also emit object keys in sorted order. Pass sort_keys=True and the keys of every dictionary in the output, including nested ones, are written in sorted order.
The json.dump() function (without the "s") writes a Python object as JSON-formatted data to a file. The json.dumps() function encodes a serializable Python object into a JSON-formatted string.
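The difference between the two functions can be sketched as follows. The sample dictionary is made up, and io.StringIO stands in for a real file here:

```python
import io
import json

data = {"number": 42, "isadmin": False}

# json.dumps() returns the JSON text as a str.
text = json.dumps(data)

# json.dump() writes the same text to a file-like object instead of returning it.
buffer = io.StringIO()
json.dump(data, buffer)

print(text == buffer.getvalue())  # True
```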
JSON objects are unordered, and so are Python dictionaries before Python 3.7 (from 3.7 on, a dict preserves insertion order). You can ask json.dumps() to sort the keys in the output; this is meant to ease testing. Set the sort_keys parameter to True:
print(json.dumps(data, indent=2, sort_keys=True))
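As a rough illustration of why this helps when comparing the output of two modules, here is a sketch with two made-up dictionaries that hold the same data in different key order:

```python
import json

# The same data built with two different key orders (sample values only).
a = {"number": 42, "name": "John Doe", "nested": {"value": 1.5, "complex": True}}
b = {"nested": {"complex": True, "value": 1.5}, "name": "John Doe", "number": 42}

# sort_keys=True sorts the keys of the top-level and the nested dictionaries,
# so both calls produce the same string.
out_a = json.dumps(a, indent=2, sort_keys=True)
out_b = json.dumps(b, indent=2, sort_keys=True)

print(out_a == out_b)  # True
```

The sorting happens only while serialising; the dictionaries themselves are left untouched, so nothing in the module under test has to change.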
See "Why is the order in Python dictionaries and sets arbitrary?" for why you see a different order each time.
You can set the PYTHONHASHSEED environment variable to an integer value to 'lock' the dictionary order; use this only to run tests and not in production, as the whole point of hash randomisation is to prevent an attacker from trivially DoS-ing your program.
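For a test harness, one way to apply this is to launch the code under test in a child interpreter with the variable set. The sketch below is only an illustration: run_with_seed and CHILD_CODE are hypothetical names, and the child prints a set of strings because set iteration order depends on string hashing.

```python
import os
import subprocess
import sys

# Child program whose output depends on string hashing (set iteration order).
CHILD_CODE = "print(list({'number', 'name', 'email', 'balance'}))"


def run_with_seed(seed):
    """Run CHILD_CODE in a fresh interpreter with PYTHONHASHSEED set to seed."""
    # Copy the current environment and pin the hash seed for the child only.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


first = run_with_seed(0)
second = run_with_seed(0)
print(first == second)  # True: same seed, same order
```

From a shell, the same effect should be available by prefixing the command, for example PYTHONHASHSEED=0 python print_data.py.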