I need to sign a JSON but I noticed that unmarshaling/marshaling can change JSON's order which might make the signature invalid.
Is there any way to produce the same hash from a JSON string regardless of its key order?
I've had a look at JOSE but couldn't find the function that actually hashes JSON.
Referring to JSON dictionaries as hash tables would be technically incorrect, as there is no particular data structure implementation associated with the JSON data itself. A hash, in the sense used here, is a random-looking number which is generated from a piece of data and is always the same for the same input.
In JSON, an object is a collection of key/value pairs, which most languages decode into a hash table (a map or dictionary).
The JSON RFC (RFC 4627, since superseded by RFC 8259) describes an object as an unordered collection of name/value pairs, so the order of object members carries no meaning. Note to future self when Google brings me back here: MongoDB is sensitive to the ordering of keys when checking object equality.
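To make the problem concrete, here is a minimal sketch (not from the original posts) showing that two documents differing only in member order decode to equal values, yet hash differently when you hash the raw bytes. The helper name `sameJSON` and the sample documents are made up for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
	"log"
	"reflect"
)

// sameJSON (hypothetical helper) reports whether two JSON documents decode
// to equal values, ignoring member order and insignificant whitespace.
func sameJSON(a, b string) (bool, error) {
	var va, vb interface{}
	if err := json.Unmarshal([]byte(a), &va); err != nil {
		return false, err
	}
	if err := json.Unmarshal([]byte(b), &vb); err != nil {
		return false, err
	}
	return reflect.DeepEqual(va, vb), nil
}

func main() {
	a := `{"foo": 1, "bar": 2}`
	b := `{"bar": 2, "foo": 1}`

	// Hashing the raw text is order-sensitive, so these digests differ.
	fmt.Println("raw hashes equal:", sha256.Sum256([]byte(a)) == sha256.Sum256([]byte(b)))

	// The decoded values are the same map, so they compare equal.
	eq, err := sameJSON(a, b)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("decoded values equal:", eq)
}
```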
JOSE JWS will absolutely do what you want at the cost of having to manage keys for signatures and verification.
But let's assume that you don't really need the whole key management stuff and general crypto functionality in JOSE and you're not SUPER concerned about performance (so a little string mangling in this process is OK).
You could dumbly unmarshal your JSON and re-marshal it, then just hash that. This works because Go's encoding/json writes map keys in sorted order when marshaling:
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"log"
)

// NB: These docs are, strictly speaking, the same.
const DOCA = "{ \"foo\": 1.23e1, \"bar\": { \"baz\": true, \"abc\": 12 } }"
const DOCB = "{ \"bar\": { \"abc\": 12, \"baz\": true }, \"foo\": 12.3 }"

// hash returns the hex SHA-256 of doc after a round trip through
// encoding/json, which writes map keys in sorted order.
func hash(doc string) (string, error) {
	// Dumb af, but interface{} is a cheap way to specify the most generic
	// thing you can :-/
	var v interface{}
	if err := json.Unmarshal([]byte(doc), &v); err != nil {
		return "", err
	}
	cdoc, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(cdoc)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	for _, doc := range []string{DOCA, DOCB} {
		h, err := hash(doc)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(doc)
		fmt.Printf("Hash: %s\n", h)
	}
}
The output of this program (at least in the golang docker container) is:
{ "foo": 1.23e1, "bar": { "baz": true, "abc": 12 } }
Hash: d50756fbb830f8335187a3f427603944c566772365d8d8e6f6760cd2868c8a73
{ "bar": { "abc": 12, "baz": true }, "foo": 12.3 }
Hash: d50756fbb830f8335187a3f427603944c566772365d8d8e6f6760cd2868c8a73
The nice thing about this approach is that, for the cost of some performance, you get insulated from whatever dumb junk you did while marshalling your JSON in the first place (so, unlike other suggestions, you don't have to think about what you might be doing with custom Marshallers and whatnot). This is especially a big deal when you forget that this was an issue at all in version 3.8 of your code a year from now, implement something that messes with the marshal order, and start breaking things.
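Since the original goal was a signature rather than a bare hash, the same round-tripped bytes can be fed to a signing function. The following is only a sketch under my own assumptions: it uses Ed25519 from the standard library (the question doesn't name an algorithm), a throwaway key pair, and hypothetical helper names (`canonicalize`, `signDoc`, `verifyDoc`, `newKeyPair`). Real key handling is exactly the part JOSE would take care of for you.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/json"
	"fmt"
	"log"
)

// canonicalize round-trips doc through encoding/json so that object keys
// come out in sorted order and whitespace is normalized.
func canonicalize(doc string) ([]byte, error) {
	var v interface{}
	if err := json.Unmarshal([]byte(doc), &v); err != nil {
		return nil, err
	}
	return json.Marshal(v)
}

// signDoc signs the round-tripped form of doc, not its raw text.
func signDoc(priv ed25519.PrivateKey, doc string) ([]byte, error) {
	c, err := canonicalize(doc)
	if err != nil {
		return nil, err
	}
	return ed25519.Sign(priv, c), nil
}

// verifyDoc checks sig against the round-tripped form of doc.
func verifyDoc(pub ed25519.PublicKey, doc string, sig []byte) (bool, error) {
	c, err := canonicalize(doc)
	if err != nil {
		return false, err
	}
	return ed25519.Verify(pub, c, sig), nil
}

// newKeyPair generates a throwaway key pair for the demo.
func newKeyPair() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(rand.Reader)
}

func main() {
	pub, priv, err := newKeyPair()
	if err != nil {
		log.Fatal(err)
	}
	sig, err := signDoc(priv, `{ "foo": 1.23e1, "bar": { "baz": true, "abc": 12 } }`)
	if err != nil {
		log.Fatal(err)
	}
	// Verify against the same document with its members reordered.
	ok, err := verifyDoc(pub, `{ "bar": { "abc": 12, "baz": true }, "foo": 12.3 }`, sig)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("signature valid for reordered doc:", ok)
}
```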
And, of course, you could always add the hash to the decoded value (a map, for a JSON object) and marshal again with the extra item in the map. Obviously you want to optimize a bit for performance if you're worried about it at all and properly handle errors, but this is a good prototype anyway :-)
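Embedding the hash in the document could look roughly like the sketch below. The field name `"hash"` and the helpers `digest`, `seal`, and `check` are assumptions of mine, not anything from the original answer, and it only handles documents whose top level is a JSON object. The check recomputes the hash over everything except the hash field itself.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
	"log"
)

// hashField is a hypothetical key under which the digest is stored.
const hashField = "hash"

// digest hashes the marshaled map; encoding/json sorts the keys.
func digest(m map[string]interface{}) (string, error) {
	b, err := json.Marshal(m)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

// seal returns doc re-marshaled with a hash of its other members added.
func seal(doc string) (string, error) {
	var m map[string]interface{}
	if err := json.Unmarshal([]byte(doc), &m); err != nil {
		return "", err
	}
	if m == nil { // a top-level JSON null decodes to a nil map
		return "", errors.New("top-level JSON object required")
	}
	delete(m, hashField) // never hash a stale hash
	h, err := digest(m)
	if err != nil {
		return "", err
	}
	m[hashField] = h
	out, err := json.Marshal(m)
	return string(out), err
}

// check recomputes the hash over everything except hashField and compares.
func check(doc string) (bool, error) {
	var m map[string]interface{}
	if err := json.Unmarshal([]byte(doc), &m); err != nil {
		return false, err
	}
	want, ok := m[hashField].(string)
	if !ok {
		return false, errors.New("missing hash field")
	}
	delete(m, hashField)
	got, err := digest(m)
	if err != nil {
		return false, err
	}
	return got == want, nil
}

func main() {
	sealed, err := seal(`{ "foo": 12.3, "bar": { "baz": true, "abc": 12 } }`)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(sealed)
	ok, err := check(sealed)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("hash matches:", ok)
}
```

Note that a plain hash stored next to the data only detects accidental changes; anyone who can edit the document can recompute it, which is why a signature or keyed MAC is needed for tamper resistance.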
Oh, and if you're super-worried about edge cases biting you, you could also use a canonical JSON library to marshal, since canonical JSON is specifically designed for this type of use (though, honestly, I couldn't come up with an example in my testing where canonical JSON worked but Go's default JSON didn't).
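One edge case that might be worth testing (my own suggestion, not something the answer above covers) is large integers. Decoding into `interface{}` turns every number into a `float64`, so two different integers above 2^53 can round to the same value and produce the same hash. The sketch below shows that, along with a standard-library mitigation, `Decoder.UseNumber`, which keeps number literals as written. The trade-off is that `1.23e1` and `12.3` then no longer hash the same. The helper name `hashJSON` is hypothetical, and whether a given canonical JSON library handles this better is something you'd have to check for that library.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"log"
	"strings"
)

// hashJSON hashes the re-marshaled form of the first JSON value in doc.
// With useNumber set, numbers are kept as their original literals
// (json.Number) instead of being converted to float64.
func hashJSON(doc string, useNumber bool) (string, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	if useNumber {
		dec.UseNumber()
	}
	var v interface{}
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	cdoc, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(cdoc)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	// Two different IDs; 9007199254740993 is 2^53 + 1 and is not
	// representable as a float64.
	a := `{"id": 9007199254740993}`
	b := `{"id": 9007199254740992}`

	for _, useNumber := range []bool{false, true} {
		ha, err := hashJSON(a, useNumber)
		if err != nil {
			log.Fatal(err)
		}
		hb, err := hashJSON(b, useNumber)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("UseNumber=%v, hashes collide: %v\n", useNumber, ha == hb)
	}
}
```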