Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In Python, have json not escape a string

Tags:

python

json

I am caching some JSON data, and in storage it is represented as a JSON-encode string. No work is performed on the JSON by the server before sending it to the client, other than collation of multiple cached objects, like this:

def get_cached_items():
  item1 = cache.get(1)
  item2 = cache.get(2)
  return json.dumps(item1=item1, item2=item2, msg="123")

There may be other items included with the return value, in this case represented by msg="123".

The issue is that the cached items are double-escaped. It would behoove the library to allow a pass-through of the string without escaping it.

I have looked at the documentation for json.dumps default argument, as it seems to be the place where one would address this, and searched on google/SO but found no useful results.

It would be unfortunate, from a performance perspective, if I had to decode the JSON of each cached items to send it to the browser. It would be unfortunate from a complexity perspective to not be able to use json.dumps.

My inclination is to write a class that stores the cached string and when the default handler encounters an instance of this class it uses the string without perform escaping. I have yet to figure out how to achieve this though, and I would be grateful for thoughts and assistance.

EDIT For clarity, here is an example of the proposed default technique:

class RawJSON(object):
   def __init__(self, str):
       self.str = str

class JSONEncoderWithRaw(json.JSONEncoder):
   def default(self, o):
       if isinstance(o, RawJSON): 
          return o.str # but avoid call to `encode_basestring` (or ASCII equiv.)
       return super(JSONEncoderWithRaw, self).default(o)

Here is a degenerate example of the above:

>>> class M():
       str = ''
>>> m = M()
>>> m.str = json.dumps(dict(x=123))
>>> json.dumps(dict(a=m), default=lambda (o): o.str)
'{"a": "{\\"x\\": 123}"}'

The desired output would include the unescaped string m.str, being:

'{"a": {"x": 123}}'

It would be good if the json module did not encode/escape the return of the default parameter, or if same could be avoided. In the absence of a method via the default parameter, one may have to achieve the objective here by overloading the encode and iterencode method of JSONEncoder, which brings challenges in terms of complexity, interoperability, and performance.

like image 800
Brian M. Hunt Avatar asked May 09 '13 15:05

Brian M. Hunt


1 Answers

A quick-n-dirty way is to patch json.encoder.encode_basestring*() functions:

import json

class RawJson(unicode):
    pass

# patch json.encoder module
for name in ['encode_basestring', 'encode_basestring_ascii']:
    def encode(o, _encode=getattr(json.encoder, name)):
        return o if isinstance(o, RawJson) else _encode(o)
    setattr(json.encoder, name, encode)


print(json.dumps([1, RawJson(u'["abc", 2]'), u'["def", 3]']))
# -> [1, ["abc", 2], "[\"def\", 3]"]
like image 163
jfs Avatar answered Oct 16 '22 23:10

jfs