JSON objects are printed into my syslog file. I need to extract the string from the log and convert it into JSON. I don't have any problems extracting the string between '{' and '}', but certain strings contain an escape character, and this causes json.loads to fail.
Here is the problem:
>>> import json
>>> resp = '{"from_hostname": {"value": "mysite.edu\"", "value2": 0, "value3": 1}}'
>>> json.loads(resp)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 381, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 1 column 41 (char 40)
>>> resp[40]
'"'
>>> resp[41]
','
>>> resp[39]
'"'
>>>
When json sees the \" I guess it only sees " and assumes that the string is over, so it throws the delimiter error.
I tried replacing \" with \\" but that doesn't seem to work.
NOTE: The \" can occur at the beginning, at the end, or in the middle of the string.
How do I get this working?
If \" can occur in your string, you have to escape both the \ and the " in the Python string literal, like this:
import json
resp = '{"from_hostname": {"value": "mysite.edu\\\"", "value2": 0, "value3": 1}}'
print(json.loads(resp))
It prints:
{u'from_hostname': {u'value3': 1, u'value2': 0, u'value': u'mysite.edu"'}}
Is this the right interpretation of your question?
The problem is that the backslash escapes the double quote in the Python string literal, so Python consumes it and it is never actually present in the string. Printing the string demonstrates this:
>>> print '{"from_hostname": {"value": "mysite.edu\"", "value2": 0, "value3": 1}}'
{"from_hostname": {"value": "mysite.edu"", "value2": 0, "value3": 1}}
This shows that the backslash is not in the string. For the string to be valid JSON, the double quote must be escaped, which means the backslash must actually be present in the string. You can do that by escaping the backslash itself with another backslash, i.e. \\:
>>> print '{"from_hostname": {"value": "mysite.edu\\"", "value2": 0, "value3": 1}}'
{"from_hostname": {"value": "mysite.edu\"", "value2": 0, "value3": 1}}
and json.loads() now works:
>>> json.loads('{"from_hostname": {"value": "mysite.edu\\"", "value2": 0, "value3": 1}}')
{u'from_hostname': {u'value3': 1, u'value2': 0, u'value': u'mysite.edu"'}}
Or you could use a raw string:
>>> json.loads(r'{"from_hostname": {"value": "mysite.edu\"", "value2": 0, "value3": 1}}')
{u'from_hostname': {u'value3': 1, u'value2': 0, u'value': u'mysite.edu"'}}
However, json.loads() fails on the JSON strings that you extracted from the log file, which strongly suggests that the problem is in the extraction. You should post the extraction code in your question so that it can be checked.
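Since the extraction code was not posted, the following is only a sketch of one possible approach, not the asker's method. Text read from a file already contains its backslashes verbatim; the escaping issue above only applies to string literals in Python source. The function name extract_json and the sample log line are assumptions made for illustration, and the raw-string literal stands in for a line read from the syslog file. It relies on json.JSONDecoder.raw_decode, which parses one JSON value starting at a given index and ignores whatever follows it (Python 3 syntax):

```python
import json


def extract_json(line):
    """Parse the first JSON object embedded in a log line, or return None."""
    start = line.find("{")
    if start == -1:
        return None
    try:
        # raw_decode returns the parsed object and the index where it ended,
        # so trailing log text after the closing brace does not matter.
        obj, _end = json.JSONDecoder().raw_decode(line, start)
    except ValueError:
        return None
    return obj


# Hypothetical syslog line; the raw string keeps the backslash in the text,
# just as it would be when the line is read from a file.
line = r'Jan  1 00:00:00 host app: {"from_hostname": {"value": "mysite.edu\"", "value2": 0, "value3": 1}} trailing text'

print(extract_json(line))
```

Letting the decoder find the end of the object avoids having to match braces by hand, which can go wrong when a '}' or an escaped quote appears inside a string value.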