
Parse only one level of json

Tags:

python

json

I have the following string:

'{
    "key1": "val1",
    "key2": ["a","b", 3],
    "key3": {"foo": 27, "bar": [1, 2, 3]}
}'

I want to parse only one level, so the result should be a one-level dictionary: the keys are the top-level keys, and each value is just a string (the nested values do not need to be parsed).

For the given string it should return the following dictionary:

{
    "key1": "val1",
    "key2": "['a','b', 3]",
    "key3": "{'foo': 27, 'bar': [1, 2, 3]}"
}

Is there a fast way to do it, without parsing the whole string into JSON and converting all the values back to strings?

Eugene Nagorny asked Aug 30 '12 11:08



2 Answers

Hardly an answer, but I only see two possibilities:

  1. Load the full JSON and dump back the values, which you have ruled out in your question
  2. Modify the content by wrapping the values in quotes, so that the JSON load yields string values

To be honest, I don't think there is such a thing as 'performance-critical JSON parsing code'; it just sounds wrong, so I'd go with the first option.
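A variant of the first option can be sketched with `json.JSONDecoder.raw_decode`, which decodes one value starting at a given index and returns the index where it ended. The sketch below still decodes every nested value, so it only saves the dump step: it slices the original text instead of re-serializing. The names `one_level` and `_skip_ws` are hypothetical helpers, and the input is assumed to be a single JSON object.

```python
import json

_WS = " \t\n\r"  # the whitespace characters JSON allows between tokens


def _skip_ws(text, pos):
    """Return the index of the first non-whitespace character at or after pos."""
    while pos < len(text) and text[pos] in _WS:
        pos += 1
    return pos


def one_level(text):
    """Map each top-level key to the raw text of its value (hypothetical helper).

    String values are returned decoded (without quotes); every other value is
    returned exactly as it was written in the input. Anything after the
    closing brace is ignored.
    """
    decoder = json.JSONDecoder()
    pos = _skip_ws(text, 0)
    if text[pos:pos + 1] != "{":
        raise ValueError("expected a JSON object")
    pos = _skip_ws(text, pos + 1)
    result = {}
    if text[pos:pos + 1] == "}":
        return result
    while True:
        # raw_decode does not skip leading whitespace, so pos must sit on the token.
        key, pos = decoder.raw_decode(text, pos)
        if not isinstance(key, str):
            raise ValueError("object keys must be strings")
        pos = _skip_ws(text, pos)
        if text[pos:pos + 1] != ":":
            raise ValueError("expected ':' after key")
        start = _skip_ws(text, pos + 1)
        value, end = decoder.raw_decode(text, start)
        result[key] = value if isinstance(value, str) else text[start:end]
        pos = _skip_ws(text, end)
        if text[pos:pos + 1] == ",":
            pos = _skip_ws(text, pos + 1)
        elif text[pos:pos + 1] == "}":
            return result
        else:
            raise ValueError("expected ',' or '}'")


sample = '{"key1": "val1", "key2": ["a","b", 3], "key3": {"foo": 27, "bar": [1, 2, 3]}}'
print(one_level(sample))
```

Because the values are sliced from the input, their original spacing and double quotes are preserved rather than normalized.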

icecrime answered Oct 22 '22 17:10



I think you can solve this using a regex; it works for me:

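import re

# json_string is the JSON text to split (the string from the question).
# Raw string, so that \s, \[ and \{ reach the regex engine unchanged.
pattern = re.compile(r'"([a-zA-Z0-9]+)"\s*:\s*(".*"|\[.*\]|\{.*\})')
dict(pattern.findall(json_string))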

But I don't know whether this is faster; you need to try it with your own data.
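Before relying on the pattern it may help to check what it actually captures. The sketch below runs it on the question's string and on the same data written on a single line (the single-line variant is an assumption added for illustration). It suggests three caveats: string values keep their double quotes, the greedy `.*` depends on each pair sitting on its own line, and bare scalars such as numbers are not matched by any of the three alternatives.

```python
import re

# Same pattern as above, written as a raw string.
pattern = re.compile(r'"([a-zA-Z0-9]+)"\s*:\s*(".*"|\[.*\]|\{.*\})')

multi_line = '''{
    "key1": "val1",
    "key2": ["a","b", 3],
    "key3": {"foo": 27, "bar": [1, 2, 3]}
}'''

# Illustrative assumption: the same data with no line breaks.
single_line = '{"key1": "val1", "key2": ["a","b", 3], "key3": {"foo": 27, "bar": [1, 2, 3]}}'

by_line = dict(pattern.findall(multi_line))
# '.' does not match a newline, so each match stays within one line.
print(by_line["key1"])   # "val1"  (the double quotes are part of the value)
print(by_line["key3"])   # {"foo": 27, "bar": [1, 2, 3]}

one_line = dict(pattern.findall(single_line))
# Without line breaks the greedy '.*' runs to the last quote in the text,
# so the first value swallows the following pairs.
print(sorted(one_line))  # ['key1']
```

If the quotes around string values are unwanted, they would have to be stripped in a separate step.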

[EDIT]

Yes, it is faster. I tried the scripts below and the regex version is 5 times faster.

Using the json module:

import json

val='''
{
    "key1": "val1",
    "key2": ["a","b", 3],
    "key3": {"foo": 27, "bar": [1, 2, 3]}
}
'''

for n in range(100000):
    dict((k,json.dumps(v)) for k,v in json.loads(val).items())

Using a regex:

import re

val='''{
    "key1": "val1",
    "key2": ["a","b", 3],
    "key3": {"foo": 27, "bar": [1, 2, 3]}
}'''

# Raw string, so the backslash escapes are passed to the regex engine as written.
pattern = re.compile(r'"([a-zA-Z0-9]+)"\s*:\s*(".*"|\[.*\]|\{.*\})')
for n in range(100000):
    dict(re.findall(pattern, val))
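The two scripts above can also be timed in a single process with the standard `timeit` module, which makes the comparison easier to repeat on your own data. This is only a sketch: the function names `with_json` and `with_regex` and the iteration count are assumptions, and the ratio you measure will depend on the input and the Python version.

```python
import json
import re
import timeit

VAL = '''{
    "key1": "val1",
    "key2": ["a","b", 3],
    "key3": {"foo": 27, "bar": [1, 2, 3]}
}'''

PATTERN = re.compile(r'"([a-zA-Z0-9]+)"\s*:\s*(".*"|\[.*\]|\{.*\})')


def with_json():
    """Parse everything, then dump each top-level value back to a string."""
    return {k: json.dumps(v) for k, v in json.loads(VAL).items()}


def with_regex():
    """Capture each top-level value as written, one key/value pair per line."""
    return dict(PATTERN.findall(VAL))


if __name__ == "__main__":
    for fn in (with_json, with_regex):
        # Take the best of three runs to reduce noise from other processes.
        best = min(timeit.repeat(fn, number=20000, repeat=3))
        print(f"{fn.__name__}: {best:.3f} s for 20000 calls")
```

Note that the two functions do not return identical strings: `json.dumps` normalizes spacing, while the regex keeps the original text.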
olivecoder answered Oct 22 '22 17:10
