I've seen many resources strongly encouraging programmers not to use Debug.Trace in production code because it lacks referential transparency. I still don't fully understand the problem, though, and can't seem to locate the underlying reason.
My understanding is that tracing cannot alter the value of any expression (though it can in some cases cause expressions to be evaluated in a different order, or force an expression that would otherwise have been lazily skipped). So tracing cannot affect the result of a pure function. It can't throw any sort of error. So why is it fought against so strongly? (If any of my assumptions above are incorrect, please point that out!) Is it just a philosophical debate, or a performance thing, or could it actually introduce a bug in some way?
When programming in other, less strict languages, I often find it valuable to have a production application log values of interest to help diagnose issues that I am unable to reproduce locally.
Of course trace foo bar can throw more errors than bar: it can (and will) throw any errors that foo throws!
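As a minimal sketch of that failure mode (the names here are invented for illustration): if evaluating the message string itself raises an error, that error escapes from the otherwise pure expression as soon as the traced value is forced.

import Control.Exception (SomeException, evaluate, try)
import Debug.Trace (trace)

-- The message is forced when the trace fires, so an error hiding inside it
-- escapes into the surrounding, otherwise pure, computation.
bad :: Int
bad = trace ("divisor was " ++ show (1 `div` 0 :: Int)) 42

main :: IO ()
main = do
  r <- try (evaluate bad) :: IO (Either SomeException Int)
  print r  -- Left divide by zero, rather than Right 42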
But that's not really the reason to avoid it. The real reason is that you typically want the programmer to have control over the order in which output happens. You don't want your debug statement saying "Foo is happening!" to interrupt itself and come out as "Foo is hapBar is happening!pening!", for example; and you don't want a latent debugging statement to languish unprinted just because the value it wraps never ended up being needed. The right way to control that order is to admit you are doing IO and reflect that in your type.
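For contrast, here is a small sketch (function names invented for the example) of the trace style next to the IO style recommended above: the trace message appears whenever the thunk happens to be forced, or never, while the IO version prints in program order.

import Debug.Trace (trace)
import System.IO (hPutStrLn, stderr)

-- trace: the message fires whenever (and only if) the result is demanded.
withTrace :: Int -> Int
withTrace n = trace "computing square" (n * n)

-- IO: the message is sequenced with the rest of the program's output.
withLogging :: Int -> IO Int
withLogging n = do
  hPutStrLn stderr "computing square"
  pure (n * n)

main :: IO ()
main = do
  let y = withTrace 7     -- nothing printed yet; y is an unevaluated thunk
  putStrLn "about to use the result"
  print y                 -- "computing square" shows up here, or never if y goes unused
  z <- withLogging 7      -- message printed right now, in program order
  print z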
"I often find it valuable to have a production application log values of interest to help diagnose issues that I am unable to reproduce locally."
This can be okayish. For example, GHC can be run with options that turn on various tracing features, and the resulting logs may be useful for investigating bugs. That said, the wild nature of output from "pure" code can make it a bit hard to put the various events in a sensible order: for example, -ddump-inlinings and -ddump-rule-rewrites output may be interleaved with other output. So developers get the convenience of working with more "pure" code, but at the expense of logs that are trickier to pick through. It's a trade-off.
Debug.Trace.trace breaks referential transparency. An expression such as let x = e in x+x must, by referential transparency, be equivalent to e+e, no matter what e is.
However,

let x = trace "x computed!" 12
in x+x

will (likely) print the debug message once, while

trace "x computed!" 12 + trace "x computed!" 12

will (likely) print the debug message twice. This should not happen.
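A small, self-contained program makes the difference observable (the behaviour shown is what GHC typically does today, without aggressive optimisation):

import Debug.Trace (trace)

shared :: Int
shared = let x = trace "x computed!" 12 in x + x

duplicated :: Int
duplicated = trace "x computed!" 12 + trace "x computed!" 12

main :: IO ()
main = do
  print shared      -- typically one "x computed!" on stderr, then 24
  print duplicated  -- typically two "x computed!" messages, then 24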
The only pragmatic way out of this is to regard the output as a "side effect we should not depend on". We already do this in pure code: we disregard observable "side effects" such as elapsed time, used space, and consumed energy. Pragmatically, we should not rely on an expression consuming exactly 1324 bytes during evaluation, and then write code that breaks once a newer compiler optimizes it further and saves 2 more bytes. Similarly, production code should never rely on the presence of trace messages.
(Above, I write "likely" since this is what I think GHC does at the moment, but in principle another compiler could optimize let x = e in ... by inlining e, which would cause multiple trace messages.)