The functions decode and decode' from the aeson package are almost identical, but they have a subtle difference described in the documentation (only the relevant part of the docs is shown here):
-- This function parses immediately, but defers conversion. See
-- 'json' for details.
decode :: (FromJSON a) => L.ByteString -> Maybe a
decode = decodeWith jsonEOF fromJSON
-- This function parses and performs conversion immediately. See
-- 'json'' for details.
decode' :: (FromJSON a) => L.ByteString -> Maybe a
decode' = decodeWith jsonEOF' fromJSON
I tried reading the descriptions of the json and json' functions, but I still don't understand which one I should use and when, because the documentation is not clear enough. Can anybody describe the difference between the two functions more precisely and, if possible, provide an example with an explanation of the behavior?
UPDATE:
There are also the decodeStrict and decodeStrict' functions. I'm not asking what the difference is between, for example, decode' and decodeStrict (which, by the way, is an interesting question as well), but what is lazy and what is strict in all these functions is not obvious at all.
The difference between these two is subtle. There is a difference, but it’s a little complicated. We can start by taking a look at the types.
Value type
It’s important to note that the Value type that aeson provides has been strict for a very long time (specifically, since version 0.4.0.0). This means that there cannot be any thunks between a constructor of Value and its internal representation. This immediately means that Bool (and, of course, Null) must be completely evaluated once a Value is evaluated to WHNF.
Next, let’s consider String and Number. The String constructor contains a value of type strict Text, so there can’t be any laziness there, either. Similarly, the Number constructor contains a Scientific value, which is internally represented by two strict values. Both String and Number must also be completely evaluated once a Value is evaluated to WHNF.
We can now turn our attention to Object and Array, the only nontrivial datatypes that JSON provides. These are more interesting. Objects are represented in aeson by a lazy HashMap. Lazy HashMaps only evaluate their keys to WHNF, not their values, so the values could very well remain unevaluated thunks. Similarly, Arrays are Vectors, which are not strict in their values, either. Both of these sorts of Values can contain thunks.
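To make the strict-versus-lazy distinction between constructors concrete, here is a small sketch that builds Values by hand instead of decoding them. It assumes the aeson and vector packages are available; whnfSucceeds is a hypothetical helper written for this illustration, not part of aeson.

```haskell
import Control.Exception (SomeException, evaluate, handle)
import Data.Aeson (Value (..))
import qualified Data.Vector as V

-- Hypothetical helper: True if evaluating the argument to WHNF
-- completes without throwing an exception.
whnfSucceeds :: a -> IO Bool
whnfSucceeds x = handle onError (evaluate x >> pure True)
  where
    onError :: SomeException -> IO Bool
    onError _ = pure False

main :: IO ()
main = do
  -- Array is strict in its Vector, but a boxed Vector is lazy in its
  -- elements, so the undefined element is never touched here.
  lazyElements <- whnfSucceeds (Array (V.fromList [Number 1, undefined]))
  -- Number is strict in its field, so forcing the constructor also
  -- forces the undefined payload.
  strictField <- whnfSucceeds (Number undefined)
  print (lazyElements, strictField)
```

If the strictness described above holds for the aeson version in use, the first check should succeed and the second should fail.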
With this in mind, we know that, once we have a Value, the only place where decode and decode' may differ is in the production of objects and arrays.
The next thing we can try is to actually evaluate some things in GHCi and see what happens. We’ll start with a bunch of imports and definitions:
:seti -XOverloadedStrings
import Control.Exception
import Control.Monad
import Data.Aeson
import Data.ByteString.Lazy (ByteString)
import Data.List (foldl')
import qualified Data.HashMap.Lazy as M
import qualified Data.Vector as V
:{
forceSpine :: [a] -> IO ()
forceSpine = evaluate . foldl' const ()
:}
Next, let’s actually parse some JSON:
let jsonDocument = "{ \"value\": [1, { \"value\": [2, 3] }] }" :: ByteString
let parsed = decode jsonDocument :: Maybe Value
let parsed' = decode' jsonDocument :: Maybe Value
-- Force both results to WHNF, and no further
void (evaluate parsed)
void (evaluate parsed')
Now we have two bindings, parsed and parsed', one of which is parsed with decode and the other with decode'. They are forced to WHNF so we can at least see what they are, but we can use the :sprint command in GHCi to see how much of each value is actually evaluated:
ghci> :sprint parsed
parsed = Just _
ghci> :sprint parsed'
parsed' = Just
(Object
(unordered-containers-0.2.8.0:Data.HashMap.Base.Leaf
15939318180211476069 (Data.Text.Internal.Text _ 0 5)
(Array (Data.Vector.Vector 0 2 _))))
Would you look at that! The version parsed with decode is still unevaluated, but the one parsed with decode' has some data. This leads us to our first meaningful difference between the two: decode' forces its immediate result to WHNF, but decode defers it until it is needed.
Let’s look inside these values to see if we can’t find more differences. What happens once we evaluate those outer objects?
let (Just outerObjValue) = parsed
let (Just outerObjValue') = parsed'
-- Force the outer values to WHNF
void (evaluate outerObjValue)
void (evaluate outerObjValue')
ghci> :sprint outerObjValue
outerObjValue = Object
(unordered-containers-0.2.8.0:Data.HashMap.Base.Leaf
15939318180211476069 (Data.Text.Internal.Text _ 0 5)
(Array (Data.Vector.Vector 0 2 _)))
ghci> :sprint outerObjValue'
outerObjValue' = Object
(unordered-containers-0.2.8.0:Data.HashMap.Base.Leaf
15939318180211476069 (Data.Text.Internal.Text _ 0 5)
(Array (Data.Vector.Vector 0 2 _)))
This is pretty obvious. We explicitly forced both of the objects, so they are now both evaluated to hash maps. The real question is whether or not their elements are evaluated.
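-- Unwrap the hash maps held by the outer Object values
let (Object outerObj) = outerObjValue
let (Object outerObj') = outerObjValue'
let (Array outerArr) = outerObj M.! "value"
let (Array outerArr') = outerObj' M.! "value"
let outerArrLst = V.toList outerArr
let outerArrLst' = V.toList outerArr'
forceSpine outerArrLst
forceSpine outerArrLst'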
ghci> :sprint outerArrLst
outerArrLst = [_,_]
ghci> :sprint outerArrLst'
outerArrLst' = [Number (Data.Scientific.Scientific 1 0),
Object
(unordered-containers-0.2.8.0:Data.HashMap.Base.Leaf
15939318180211476069 (Data.Text.Internal.Text _ 0 5)
(Array (Data.Vector.Vector 0 2 _)))]
Another difference! For the array decoded with decode, the values are not forced, but the ones decoded with decode' are. As you can see, this means decode doesn’t perform conversion to Haskell values until they are actually needed, which is what the documentation means when it says it “defers conversion”.
Clearly, these two functions are slightly different, and clearly, decode' is stricter than decode. What’s the meaningful difference, though? When would you prefer one over the other?
Well, it’s worth mentioning that decode never does more work than decode', so decode is probably the right default. Of course, decode' will never do significantly more work than decode, either, since the entire JSON document needs to be parsed before any value can be produced. The only significant difference is that decode avoids allocating Values if only a small part of the JSON document is actually used.
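As a rough illustration of "only a small part of the document is used", here is a sketch of a consumer that looks at just the first element of a top-level array. firstElement is a hypothetical helper made up for this example, and the sketch assumes the aeson, bytestring and vector packages. With decode, the elements it never touches would be expected to stay unconverted, as in the GHCi session above, though the details may vary between aeson versions.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (Value (..), decode)
import Data.ByteString.Lazy (ByteString)
import qualified Data.Vector as V

-- Hypothetical helper: return only the first element of a top-level
-- JSON array, or Nothing if the input is not a non-empty array.
firstElement :: ByteString -> Maybe Value
firstElement doc = do
  Array xs <- decode doc  -- a failed pattern match yields Nothing in Maybe
  xs V.!? 0               -- safe indexing; the other elements are not inspected

main :: IO ()
main = print (firstElement "[1, {\"big\": [2, 3, 4]}, \"unused\"]")
```

Swapping decode for decode' here should give the same answer; the difference is only in how much of the Value gets built along the way.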
Of course, laziness is not free, either. Being lazy means adding thunks, which can cost space and time. If all of the thunks are going to be evaluated, anyway, then decode is simply wasting memory and runtime adding useless indirection.
In this sense, the situations when you might want to use decode' are situations in which the whole Value structure is going to be forced, anyway, which is probably dependent on which FromJSON instance you’re using. In general, I wouldn’t worry about picking between them unless performance really matters and you’re decoding a lot of JSON or doing JSON decoding in a tight loop. In either case, you should benchmark. Choosing between decode and decode' is a very specific manual optimization, and I would not feel very confident that either would actually improve the runtime characteristics of my program without benchmarks.
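For the benchmarking itself, something along these lines could serve as a starting point. This is a sketch that assumes the criterion package is installed; sampleDoc is a placeholder input, and decodeValue and decodeValue' are names introduced here only to pin down the result type. The whnf group approximates a caller that barely touches the result, and the nf group one that consumes all of it.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Criterion.Main (bench, bgroup, defaultMain, nf, whnf)
import Data.Aeson (Value, decode, decode')
import Data.ByteString.Lazy (ByteString)

-- Placeholder input; substitute a document representative of your workload.
sampleDoc :: ByteString
sampleDoc = "{ \"value\": [1, { \"value\": [2, 3] }] }"

-- Monomorphic wrappers so both benchmarks decode to Value.
decodeValue, decodeValue' :: ByteString -> Maybe Value
decodeValue = decode
decodeValue' = decode'

main :: IO ()
main = defaultMain
  [ bgroup "whnf"  -- only the outer Maybe constructor is demanded
      [ bench "decode"  (whnf decodeValue sampleDoc)
      , bench "decode'" (whnf decodeValue' sampleDoc)
      ]
  , bgroup "nf"    -- the entire Value is demanded
      [ bench "decode"  (nf decodeValue sampleDoc)
      , bench "decode'" (nf decodeValue' sampleDoc)
      ]
  ]
```

A document this small will mostly measure overhead, so the numbers only become meaningful with realistic input and the FromJSON instance you actually use.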
Haskell is a lazy language. When you call a function, it doesn't actually execute right then, but instead the information about the call is "remembered" and returned up the stack (this remembered call information is referred to as a "thunk" in the docs), and the actual call only happens if somebody up the stack actually tries to do something with the returned value.
This is the default behavior, and this is how json and decode work. But there is a way to "cheat" the laziness and tell the compiler to execute code and evaluate values right then and there. And this is what json' and decode' do.
The tradeoff there is obvious: decode saves computation time in case you never actually do anything with the value, while decode' avoids having to "remember" the call information (the "thunk") at the cost of executing everything in place.
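The same tradeoff can be seen without aeson at all. The sketch below uses only base; lazyWrap, strictWrap and succeeds are hypothetical names made up for the illustration, and this is an analogy for the lazy and strict variants rather than how aeson implements them.

```haskell
import Control.Exception (SomeException, evaluate, handle)

-- Lazy: wraps the argument in Just without evaluating it (it stays a thunk).
lazyWrap :: a -> Maybe a
lazyWrap x = Just x

-- Strict: evaluates the argument to WHNF before wrapping it.
strictWrap :: a -> Maybe a
strictWrap x = Just $! x

-- True if evaluating the argument to WHNF does not throw.
succeeds :: a -> IO Bool
succeeds x = handle onError (evaluate x >> pure True)
  where
    onError :: SomeException -> IO Bool
    onError _ = pure False

main :: IO ()
main = do
  -- The undefined payload is never looked at, so this is fine.
  lazyOk <- succeeds (lazyWrap (undefined :: Int))
  -- The payload is evaluated up front, so the error surfaces immediately.
  strictOk <- succeeds (strictWrap (undefined :: Int))
  print (lazyOk, strictOk)
```

The lazy version gets away with never doing the work, while the strict version pays for it up front and, in exchange, holds no thunk.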