Recently, I needed to parse the JSON that the Chrome web browser produces when you record events in its dev tools, and get some timing data out of it. Chrome can produce a pretty large amount of data in a small amount of time, so the Ruby parser I originally built was quite slow.
Since I'm learning Go, I decided to write scripts in both Go and JavaScript/Node and compare them.
The simplest possible form of the JSON file is what I have in this Gist. It contains an event representing the request sent to fetch a page, and the event representing the response. Typically, there's a huge amount of extra data to sift through. That's its own problem, but not what I'm worried about in this question.
The JavaScript script that I wrote is here, and the Go program I wrote is here. This is the first useful thing I've written in Go, so I'm sure it's all sorts of bad. However, one thing I noticed is that it's much slower than JavaScript at parsing a large JSON file.
Time with a 119Mb JSON file in Go:
$ time ./parse data.json
= 22 Requests
Min Time: 0.77
Max Time: 0.77
Average Time: 0.77
./gm data.json 4.54s user 0.16s system 99% cpu 4.705 total
Time with a 119Mb JSON file in JavaScript/Node:
$ time node parse.js data.json
= 22 Requests
Min Time: 0.77
Max Time: 0.77
Avg Time: 0.77
node jm.js data.json 1.73s user 0.24s system 100% cpu 1.959 total
(The min/max/average times are all identical in this example because I duplicated JSON objects so as to have a very large data set, but that's irrelevant.)
I'm curious if it's just that JavaScript/Node is just way faster at parsing JSON (which wouldn't be particularly surprising, I guess), or if there's something I'm doing totally wrong in the Go program. I'm also just curious what I'm doing wrong in the Go program in general, because I'm sure there's plenty wrong with it.
Note that while these two scripts do more than parsing, it's definitely json.Unmarshal()
in Go that is adding lots of time in the program.
Update
I added a Ruby script:
$ ruby parse.rb
= 22 Requests
Min Time: 0.77
Max Time: 0.77
Avg Time: 0.77
ruby parse.rb 4.82s user 0.82s system 99% cpu 5.658 total
With Go, you are parsing the JSON into statically-typed structures. With JS and Ruby, you are parsing it into hash tables.
In order to parse JSON into the structures that you defined, the json package needs to find out the names and types of their fields. To do this, it uses the reflect package, which is much slower than accessing those fields directly.
Depending on what you do with the data after you parse it, the extra parsing time may pay for itself. The Go data structures use less memory than hash tables, and they are much faster to access. So if you do a lot with the data, the savings on processing time may outweigh the extra parsing time.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With