 

Speed of parsing JSON structures

I want to make a simple database system, and possibly use JSON as the main data format for importing and exporting (including full database backups). So my question is: how fast is it to parse JSON, even from big JSON structures (think gigabytes), compared to importing from other formats, such as (faster) binary files or (slower) XML?

EDIT: to clarify, I am wondering how fast it is to parse JSON (into some internal database format), but not how fast it would be as an internal storage mechanism. So this JSON data would not be queried etc., but just parsed into another format.

Also, my main reason for asking is that I am wondering whether JSON is any easier to parse than XML because of its smaller delimiters (']' or '}' instead of '<tag>' or '</tag>'), and whether it might even come close to binary formats in speed because the delimiters are so simple. (For example, maybe JSON can be parsed something like this: record delimiter = ASCII code xx (xx being a brace or bracket), except where preceded by ASCII xx (xx being some escape character).)
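
To illustrate that idea (and why a bare "split on braces" pass is not quite enough), here is a minimal, hypothetical JavaScript sketch: it scans for record-delimiting braces, but it also has to track string literals and escape characters, because '}' can legitimately appear inside a string value. The function name and sample input are invented for the example.

```js
// Minimal sketch: find the offsets of top-level records inside a JSON array,
// skipping any braces/brackets that appear inside string literals or after
// escape characters.
function topLevelRecordOffsets(json) {
  const offsets = [];
  let depth = 0;          // nesting depth of {} and []
  let inString = false;   // are we inside a "..." literal?
  let escaped = false;    // was the previous character a backslash?

  for (let i = 0; i < json.length; i++) {
    const ch = json[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') { inString = true; continue; }
    if (ch === '{' || ch === '[') {
      if (depth === 1 && ch === '{') offsets.push(i); // a record inside the outer array
      depth++;
    } else if (ch === '}' || ch === ']') {
      depth--;
    }
  }
  return offsets;
}

console.log(topLevelRecordOffsets('[{"a":"}"},{"b":2}]')); // [1, 11]
```

Even this toy version needs per-character state, which is part of why a delimiter scan over JSON cannot be as cheap as seeking through a fixed-layout binary format.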

Asked Dec 14 '10 by Abbafei

People also ask

Is parsing JSON slow?

JSON.parse(JSON.stringify(obj)) is a slow way to create a copy of an object.
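
That snippet refers to the common round-trip-through-JSON deep-copy idiom; a small illustration (the sample object is invented here):

```js
// Deep-copying by round-tripping through JSON: works for plain data, but is
// slow for large structures and silently drops anything JSON cannot represent
// (functions, undefined, Dates become strings, etc.).
const original = { user: 'abbafei', tags: ['db', 'json'], meta: { retries: 3 } };
const copy = JSON.parse(JSON.stringify(original));

copy.meta.retries = 0;
console.log(original.meta.retries); // 3, so the copy is independent
```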

Is parsing JSON faster than XML?

JSON is faster because it is designed specifically for data interchange. JSON encoding is terse, which requires fewer bytes in transit. JSON parsers are less complex, which means less processing time and memory overhead. XML is slower because it is designed for much more than just data interchange.

Is JSON parse CPU intensive?

Parsing JSON is a CPU-intensive task, and JavaScript is single-threaded, so the parsing has to block the main thread at some point.
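
As a rough sketch of how that blocking can be kept off the main thread in Node.js, the built-in worker_threads module can run the parse in a worker; the payload and row count below are invented for illustration:

```js
// Self-contained worker_threads sketch: the main thread hands a large JSON
// string to a worker and stays responsive while the worker does the
// CPU-heavy JSON.parse.
const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

if (isMainThread) {
  const bigJson = JSON.stringify({ rows: Array.from({ length: 1e6 }, (_, i) => ({ id: i })) });
  const worker = new Worker(__filename, { workerData: bigJson });
  worker.on('message', (rowCount) => console.log(`worker parsed ${rowCount} rows`));
  console.log('main thread is free to do other work while the worker parses');
} else {
  const parsed = JSON.parse(workerData);     // heavy work happens off the main thread
  parentPort.postMessage(parsed.rows.length);
}
```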

Is JSON parse heavy?

To read a JSON document, software needs to transform the JSON text into a tree-like structure. This process is called JSON parsing. Parsing large JSON documents is a common but heavy-duty task. Big-data applications can spend 80–90% of their time parsing JSON documents.
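
If you want a ballpark figure for your own data, a quick and very rough way to measure JSON.parse throughput in Node.js looks something like this; the synthetic payload shape is made up for the sketch, and results vary a lot by engine and data shape:

```js
// Build a synthetic JSON payload, then time JSON.parse on it. Treat the
// result as an order-of-magnitude estimate only.
const rows = Array.from({ length: 500000 }, (_, i) => ({
  id: i,
  name: `row-${i}`,
  active: i % 2 === 0,
}));
const payload = JSON.stringify(rows);

const start = process.hrtime.bigint();
JSON.parse(payload);
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;

const mb = payload.length / (1024 * 1024);
console.log(`parsed ${mb.toFixed(1)} MB in ${elapsedMs.toFixed(0)} ms ` +
            `(~${(mb / (elapsedMs / 1000)).toFixed(0)} MB/s)`);
```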


1 Answer

It's definitely much, much slower than MySQL (for a server) or SQLite (for a client), which are preferable.

Also, JSON parsing speed depends almost entirely on the implementation. For instance, you could eval() it, but not only is that very risky, it's also slower than a real parser. At any rate, there are probably much better-optimized XML parsers than JSON parsers, simply because XML is the more widely used format. (So grab a GB-sized XML file and imagine the same results, but slower.)

Seriously, JSON was never meant for big things. Use a real database if possible.
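
As a sketch of what "use a real database" could look like for the import scenario in the question: parse the JSON export once, then load it into SQLite so later reads are queries rather than re-parses. This assumes the third-party better-sqlite3 package is installed (npm install better-sqlite3); the file name and table layout are invented for the sketch.

```js
// Hypothetical one-shot import: JSON export in, SQLite database out.
const fs = require('fs');
const Database = require('better-sqlite3');

const records = JSON.parse(fs.readFileSync('export.json', 'utf8')); // expected: an array of objects

const db = new Database('imported.db');
db.exec('CREATE TABLE IF NOT EXISTS records (id INTEGER PRIMARY KEY, name TEXT, payload TEXT)');

const insert = db.prepare('INSERT INTO records (id, name, payload) VALUES (?, ?, ?)');
const insertAll = db.transaction((rows) => {
  for (const row of rows) insert.run(row.id, row.name, JSON.stringify(row));
});

insertAll(records);
console.log(db.prepare('SELECT COUNT(*) AS n FROM records').get().n, 'rows imported');
```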

Edit: why is JSON much slower than a database?

Many reasons. I'll try to list a few.

  • JSON relies on matching delimiter pairs such as {} (much like XML's <>).

This means a parser has to scan ahead to find where an object block ends. The same goes for [] and "" pairs. In a conventional database there is no "ending tag" or "ending bracket", so it's easier to read.

  • JSON parsers need to read each and every character before being able to understand the whole object structure.

So before you can even use part of the JSON, you have to read the whole file. For the sizes you mention that means waiting a few minutes at best, whereas a database is ready to be queried in less than a second (because its hierarchy is stored at the beginning of the file).
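
One common workaround, sketched below, is not to store everything as one giant document. With newline-delimited JSON (one record per line, an assumption made here rather than anything the question specifies), you can start consuming records while the file is still being read:

```js
// Stream a (hypothetical) export.ndjson file line by line: each line is a
// small, independent JSON.parse, so the first record is usable long before
// the whole file has been read.
const fs = require('fs');
const readline = require('readline');

const rl = readline.createInterface({
  input: fs.createReadStream('export.ndjson'),
  crlfDelay: Infinity,
});

let count = 0;
rl.on('line', (line) => {
  const record = JSON.parse(line);
  count++;
  if (count === 1) console.log('first record available:', record);
});
rl.on('close', () => console.log(`processed ${count} records`));
```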

  • In JSON you can't precalculate offsets.

In a database, size is traded for performance. You can declare a column as CHAR(512) and every value will be padded to occupy exactly 512 bytes. Why? Because that way you know, for example, that the fifth value starts at offset 4 × 512 = 2048. You can't do that with JSON, hence performance suffers.
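
A toy illustration of that offset trick, with the record size and sample data invented for the sketch: fixed-size records let you seek straight to record N with simple arithmetic instead of scanning the file.

```js
// Write a few fixed-width (zero-padded) records, then read one back by
// computing its offset directly.
const fs = require('fs');

const RECORD_SIZE = 512; // every record padded to 512 bytes

function writeFixture(path, names) {
  const buf = Buffer.alloc(RECORD_SIZE * names.length); // zero-padded
  names.forEach((name, i) => buf.write(name, i * RECORD_SIZE, 'utf8'));
  fs.writeFileSync(path, buf);
}

function readRecord(path, index) {
  const buf = Buffer.alloc(RECORD_SIZE);
  const fd = fs.openSync(path, 'r');
  fs.readSync(fd, buf, 0, RECORD_SIZE, index * RECORD_SIZE); // seek by arithmetic
  fs.closeSync(fd);
  return buf.toString('utf8').replace(/\0+$/, '');
}

writeFixture('records.bin', ['alice', 'bob', 'carol', 'dave', 'eve']);
console.log(readRecord('records.bin', 3)); // "dave": offset 3 * 512 = 1536, no scan needed
```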

  • JSON is optimized for small file sizes.

...Because it's a web format.
This may look like a pro but it's a con from a performance perspective.

  • JSON is a JavaScript subset.

So some parsers might accept extra, non-JSON input such as comments. Chrome's native JSON parser used to allow comments, for example (not anymore).
And no database engine parses its data files with eval(), right?
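
A quick way to see the difference: a strict JSON parser rejects comments outright, while eval() accepts them (and anything else, which is exactly why it is dangerous). The sample string is made up:

```js
// JSON.parse is strict; eval() is not.
const withComment = '{"a": 1 /* not allowed in JSON */, "b": 2}';

try {
  JSON.parse(withComment);
} catch (e) {
  console.log('JSON.parse rejects it:', e.message);
}

// For illustration only: never eval() untrusted input.
const viaEval = eval('(' + withComment + ')');
console.log('eval accepts it:', viaEval); // { a: 1, b: 2 }
```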

  • JSON is meant to have some error resilience.

People might put anything into a JSON file, so parsers are defensive and sometimes try to read invalid files anyway. Databases aren't supposed to silently repair a broken file.
You might hand-write a JSON file, but not a database file!

  • JSON is a new, unsupported and badly tested format

There are bugs in some native parsers (like IE8's), and support in most browsers is still preliminary and slower than, say, the fastest XML parsers out there. That's simply because XML has been around for ages, and Steve Ballmer has an XML fetish, so companies please him by making almost everything under the sun XML-compatible, while JSON is one of Crockford's successful weekend pastimes.

  • The best JSON parsers are in browsers

If you pick a random open-source JSON parser for your favourite language, what are the chances that it's the best possible parser under the sun? For XML you do have mature, heavily optimized parsers to choose from. But what is there for JSON?

Need more reasons why JSON should be relegated to its intended use case?

Answered by Camilo Martin