I am using Go, the Revel web application framework, and Redis.
I have to store large JSON data in Redis (maybe 20 MB), and json.Unmarshal() takes roughly 5 seconds. What would be a better way to do it?
I tried JsonLib, encoding/json, ffjson, and megajson, but none of them was fast enough.
I thought about using groupcache, but the JSON is updated in real time.
This is the sample code:
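package main

import (
	"log"

	"github.com/garyburd/redigo/redis"
	json "github.com/pquerna/ffjson/ffjson"
)

func main() {
	c, err := redis.Dial("tcp", ":6379")
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close() // only deferred once the connection is known to be valid
	pointTable, err := redis.String(c.Do("GET", "data"))
	if err != nil {
		log.Fatal(err)
	}
	var hashPoint map[string][]float64
	// Problem: this call takes about 5 seconds for ~20 MB of JSON.
	if err := json.Unmarshal([]byte(pointTable), &hashPoint); err != nil {
		log.Fatal(err)
	}
}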
To parse JSON in Go, use the Unmarshal() function in package encoding/json, which decodes JSON into a Go value such as a struct or a map. Unmarshal parses the JSON-encoded data and stores the result in the value pointed to by v. Note: if v is nil or not a pointer, Unmarshal returns an InvalidUnmarshalError.
encoding/json reportedly takes more than twice as long as easyjson and requires more allocations, so it is the slower of the two.
Unmarshaling is the inverse of marshaling: it converts byte data back into the original data structure. In Go, unmarshaling is handled by the json.Unmarshal() function.
Parsing large JSON data does seem to be slower than it should be. It would be worthwhile to pinpoint the cause and submit a patch to the Go authors.
In the meantime, if you can avoid JSON and use a binary format, you will not only avoid this issue; you will also gain the time your code is now spending parsing ASCII decimal representations of numbers into their binary IEEE 754 equivalents (and possibly introducing rounding errors while doing so.)
If both your sender and receiver are written in Go, I suggest using Go's binary format: gob.
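A gob round trip for the map type from the question might look like the sketch below. The helper names encodeGob and decodeGob are invented for this example; the resulting byte slice is what you would store in Redis in place of the JSON string.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// encodeGob (hypothetical helper) serializes the map with encoding/gob.
func encodeGob(points map[string][]float64) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(points); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeGob (hypothetical helper) restores the map from gob data.
func decodeGob(data []byte) (map[string][]float64, error) {
	var points map[string][]float64
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&points); err != nil {
		return nil, err
	}
	return points, nil
}

func main() {
	blob, err := encodeGob(map[string][]float64{"a": {1.5, 2.5}})
	if err != nil {
		panic(err)
	}
	points, err := decodeGob(blob)
	fmt.Println(points, err)
}
```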
Doing a quick test, generating a map with 2000 entries, each a slice with 1050 simple floats, gives me 20 MB of JSON, which takes 1.16 sec to parse on my machine.
For these quick benchmarks, I take the best of three runs, but I make sure to only measure the actual parsing time, with t0 := time.Now() before the Unmarshal call and printing time.Now().Sub(t0) after it.
Using GOB, the same map results in 18 MB of data, which takes 115 ms to parse: one tenth the time.
Your results will vary depending on how many actual floats you have there. If your floats have a lot of significant digits, deserving their float64 representation, then 20 MB of JSON will contain far fewer than my two million floats. In that case the difference between JSON and GOB will be even starker.
BTW, this shows that the problem does lie in the JSON parser, not in the amount of data to parse, nor in the memory structures to create (because both tests parse about 20 MB of data and recreate the same slices of floats). Replacing all the floats with strings in the JSON gives me a parsing time of 1.02 sec, confirming that the conversion from string representation to binary floats does take a certain time (compared to just moving bytes around) but is not the main culprit.
If the sender and the parser are not both Go, or if you want to squeeze the performance even further than GOB, you should use your own customised binary format, either using Protocol Buffers or manually with "encoding/binary" and friends.