I want to parse a bytes
string in JSON format to convert it into python objects. This is the source I have:
my_bytes_value = b'[{\'Date\': \'2016-05-21T21:35:40Z\', \'CreationDate\': \'2012-05-05\', \'LogoType\': \'png\', \'Ref\': 164611595, \'Classe\': [\'Email addresses\', \'Passwords\'],\'Link\':\'http://some_link.com\'}]'
And this is the desired outcome I want to have:
[{ "Date": "2016-05-21T21:35:40Z", "CreationDate": "2012-05-05", "LogoType": "png", "Ref": 164611595, "Classes": [ "Email addresses", "Passwords" ], "Link": "http://some_link.com"}]
First, I converted the bytes to string:
my_new_string_value = my_bytes_value.decode("utf-8")
but when I try to invoke loads
to parse it as JSON:
my_json = json.loads(my_new_string_value)
I get this error:
json.decoder.JSONDecodeError: Expecting value: line 1 column 174 (char 173)
Use json. dumps() to convert a list to a JSON array in Python. The dumps() function takes a list as an argument and returns a JSON String.
json now lets you put a byte[] object directly into a json and it remains readable. you can even send the resulting object over a websocket and it will be readable on the other side.
You convert the whole array to JSON as one object by calling JSON. stringify() on the array, which results in a single JSON string. To convert back to an array from JSON, you'd call JSON. parse() on the string, leaving you with the original array.
Your bytes
object is almost JSON, but it's using single quotes instead of double quotes, and it needs to be a string. So one way to fix it is to decode the bytes
to str
and replace the quotes. Another option is to use ast.literal_eval
; see below for details. If you want to print the result or save it to a file as valid JSON you can load the JSON to a Python list and then dump it out. Eg,
import json my_bytes_value = b'[{\'Date\': \'2016-05-21T21:35:40Z\', \'CreationDate\': \'2012-05-05\', \'LogoType\': \'png\', \'Ref\': 164611595, \'Classe\': [\'Email addresses\', \'Passwords\'],\'Link\':\'http://some_link.com\'}]' # Decode UTF-8 bytes to Unicode, and convert single quotes # to double quotes to make it valid JSON my_json = my_bytes_value.decode('utf8').replace("'", '"') print(my_json) print('- ' * 20) # Load the JSON to a Python list & dump it back out as formatted JSON data = json.loads(my_json) s = json.dumps(data, indent=4, sort_keys=True) print(s)
output
[{"Date": "2016-05-21T21:35:40Z", "CreationDate": "2012-05-05", "LogoType": "png", "Ref": 164611595, "Classe": ["Email addresses", "Passwords"],"Link":"http://some_link.com"}] - - - - - - - - - - - - - - - - - - - - [ { "Classe": [ "Email addresses", "Passwords" ], "CreationDate": "2012-05-05", "Date": "2016-05-21T21:35:40Z", "Link": "http://some_link.com", "LogoType": "png", "Ref": 164611595 } ]
As Antti Haapala mentions in the comments, we can use ast.literal_eval
to convert my_bytes_value
to a Python list, once we've decoded it to a string.
from ast import literal_eval import json my_bytes_value = b'[{\'Date\': \'2016-05-21T21:35:40Z\', \'CreationDate\': \'2012-05-05\', \'LogoType\': \'png\', \'Ref\': 164611595, \'Classe\': [\'Email addresses\', \'Passwords\'],\'Link\':\'http://some_link.com\'}]' data = literal_eval(my_bytes_value.decode('utf8')) print(data) print('- ' * 20) s = json.dumps(data, indent=4, sort_keys=True) print(s)
Generally, this problem arises because someone has saved data by printing its Python repr
instead of using the json
module to create proper JSON data. If it's possible, it's better to fix that problem so that proper JSON data is created in the first place.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With