
Storing Large JSON Object as JSON or as String?

I have a large set of JSON documents that I want to store in MongoDB.

However, since I search and retrieve against only a few fields, I am wondering which of the following approaches performs better.

One option is to store the large object as a nested JSON/BSON object, so the document looks like this:

{
    "key_1": "Value1",
    "key_2": "Value2",
    "external_data": {
        "large": {
            "data": [
                "comes",
                "here"
            ]
        }
    }
}

Alternatively, the large object can be stored as a serialized string:

{
    "key_1": "Value1",
    "key_2": "Value2",
    "external_data": '{"large":{"data":["comes","here"]}}'
}
Tzury Bar Yochay asked Sep 19 '25 20:09


2 Answers

Interesting question, so I took the trouble to test it.

Short answer: there is no significant performance difference in writes.
Here is the code I used to test it with the PyMongo driver, along with the results:


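    import cProfile

    # dbcl is not defined in the original post; dbcl.client is assumed to be a pymongo MongoClient.
    # insert() and remove() are the legacy PyMongo calls (removed in PyMongo 4).
    docdict = dict(zip(["key" + str(i) for i in range(1, 101)], ["a" * i for i in range(1, 101)]))
    docstr = str(docdict)  # Python repr of the dict, stored as one string

    def addIdtoStr(s, id): return {'_id': id, 'payload': s}
    def addIdtoDict(d, id): d.update({'_id': id}); return d

    # 100,000 unacknowledged inserts (w=0) of the document with 100 separate fields
    cProfile.run("for i in range(0,100000):x=dbcl.client.tests.test2.insert(addIdtoDict(docdict,i),w=0,j=0)")
    # 12301152 function calls (12301128 primitive calls) in 56.089 seconds
    dbcl.client.tests.test2.remove({}, multi=True)
    # 100,000 unacknowledged inserts (w=0) of the same data as a single string payload
    cProfile.run("for i in range(0,100000):x=dbcl.client.tests.test2.insert(addIdtoStr(docstr,i),w=0,j=0)")
    # 12201194 function calls (12115631 primitive calls) in 54.665 seconds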

nickmilon answered Sep 21 '25 23:09


I believe that storing the data as BSON is efficient in both performance and space. It is also an investment in the future: if you store the data as BSON now, you will be able to query it in the database later, should that requirement appear.
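
To illustrate the point about querying later, here is a minimal sketch in plain Python with no database involved. The helper `get_path` is hypothetical and only imitates MongoDB's dot notation. With the nested form, a filter such as `{"external_data.large.data": "comes"}` can be evaluated by the server, whereas the string form has to be fetched and parsed by the client before the inner fields are reachable.

```python
import json

# The two document shapes from the question.
nested_doc = {
    "key_1": "Value1",
    "key_2": "Value2",
    "external_data": {"large": {"data": ["comes", "here"]}},
}
string_doc = {
    "key_1": "Value1",
    "key_2": "Value2",
    "external_data": json.dumps(nested_doc["external_data"], separators=(",", ":")),
}

def get_path(doc, dotted_path):
    """Hypothetical helper: walk a dotted path through nested dicts.

    Returns None if any step is missing or is not a dict.
    """
    current = doc
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current

# Nested form: the inner field is directly addressable.
print(get_path(nested_doc, "external_data.large.data"))

# String form: the same path stops at the opaque string.
print(get_path(string_doc, "external_data.large.data"))

# The client has to parse the string before the path resolves.
parsed = dict(string_doc, external_data=json.loads(string_doc["external_data"]))
print(get_path(parsed, "external_data.large.data"))
```

The trade-off is that the string variant pushes every such lookup to the application, which is acceptable only as long as the large object is never searched.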

In any case, if your concern is performance, you have to profile it in the production environment; there is no way to tell in advance whether it will be faster.
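
One part that can be measured without a server is the client-side cost of the string approach: serializing before a write and parsing after a read. The following is a rough sketch using the standard `timeit` module and a 100-field test document; the function names `to_string` and `from_string` are made up for illustration, the timings depend on your machine and data, and this says nothing about server or network time.

```python
import json
import timeit

# Test document: 100 keys whose values grow from 1 to 100 characters.
doc = {"key" + str(i): "a" * i for i in range(1, 101)}

def to_string(d):
    """Serialize the document before writing the string variant."""
    return json.dumps(d)

def from_string(s):
    """Parse the stored string after reading the string variant."""
    return json.loads(s)

encoded = to_string(doc)
assert from_string(encoded) == doc  # the round trip is lossless for this data

runs = 10_000
dump_seconds = timeit.timeit(lambda: to_string(doc), number=runs)
load_seconds = timeit.timeit(lambda: from_string(encoded), number=runs)

print(f"json.dumps: {dump_seconds / runs * 1e6:.1f} microseconds per document")
print(f"json.loads: {load_seconds / runs * 1e6:.1f} microseconds per document")
```

Running the same measurement against real documents and a real deployment is what the advice above amounts to.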

Zaur Nasibov answered Sep 22 '25 00:09