What rules are there about a function a -> () being evaluated in Haskell?

Tags:

Just as the title says: what guarantees are there for a Haskell function returning unit to be evaluated? One would think that there is no need to run any kind of evaluation in such a case, the compiler could replace all such calls with an immediate () value unless explicit requests for strictness are present, in which case the code might have to decide whether it should return () or bottom.
I have experimented with this in GHCi, and it seems like the opposite happens, that is, such a function appears to be evaluated. A very primitive example would be

f :: a -> ()
f _ = undefined

Evaluating f 1 throws an error due to the presence of undefined, so some evaluation definitely happens. It is not clear how deep the evaluation goes, though; sometimes it appears to go as deep as it is necessary to evaluate all calls to functions returning (). Example:

g :: [a] -> ()
g [] = ()
g (_:xs) = g xs

This code loops indefinitely if presented with g (let x = 1:x in x). But then

f :: a -> ()
f _ = undefined
h :: a -> ()
h _ = ()

can be used to show that h (f 1) returns (), so in this case not all unit-valued subexpressions are evaluated. What is the general rule here?

ETA: of course I know about laziness. I'm asking what prevents compiler writers from making this particular case even lazier than usually possible.

ETA2: summary of the examples: GHC appears to treat () exactly as any other type, i.e. as if there was a question about which regular value inhabiting the type should be returned from a function. The fact that there's only one such value does not seem to be (ab)used by the optimization algorithms.

ETA3: when I say Haskell, I mean Haskell-as-defined-by-the-Report, not Haskell-the-H-in-GHC. Seems to be an assumption not shared as widely as I imagined (which was 'by 100% of the readers'), or I would probably have been able to formulate a clearer question. Even so, I regret changing the title of the question, as it originally asked what guarantees are there for such a function being called.

ETA4: it would seem that this question has run its course, and I'm considering it unanswered. (I was looking for a 'close question' function but only found 'answer your own question' and as it cannot be answered, I did not go down that route.) No one brought up anything from the Report that would decide it either way, which I'm tempted to interpret as a strong but not definite 'no guarantee for the language as such' answer. All we know is that the current GHC implementation will not skip the evaluation of such a function.

I've run into the actual problem when porting an OCaml app to Haskell. The original app had a mutually recursive structure of many types, and the code declared a number of functions called assert_structureN_is_correct for N in 1..6 or 7, each of which returned unit if the structure was indeed correct and threw an exception if it was not. In addition, these functions called each other as they decomposed the correctness conditions. In Haskell this is better handled using the Either String monad, so I transcribed it that way, but the question as a theoretical issue remained. Thanks for all the inputs and replies.

729

asked Nov 27 '19 13:11

Kryptozoon

1 Answers

You appear to come from the assumption that the type () has only one possible value, (), and thus expect that any function call returning a value of type () should automatically be assumed to indeed produce the value ().

This is not how Haskell works. Every type has one more value in Haskell, namely no value, error, or so called "bottom", encoded by undefined. Thus an evaluation is actually taking place:

main = print (f 1)

is equivalent to the Core language's

main = _Case (f 1) _Of x -> print x   -- pseudocode illustration

or even^(*)

main = _Case (f 1) _Of x -> putStr "()"

and Core's _Case is forcing:

"Evaluating a %case [expression] forces the evaluation of the expression being tested (the “scrutinee”). The value of the scrutinee is bound to the variable following the %of keyword, ...".

The value is forced to weak head normal form. This is part of the language definition.

Haskell is not a declarative programming language.

^(*)print x = putStr (show x) and show () = "()", so the show call can be compiled away altogether.

The value is indeed known in advance as (), and even the value of show () is known in advance as "()". Still the accepted Haskell semantics demand that the value of (f 1) is forced to weak head normal form before proceeding to print that known in advance string, "()".

edit: Consider concat (repeat []). Should it be [], or should it be an infinite loop?

A "declarative language"'s answer to this is most probably []. Haskell's answer is, the infinite loop.

Laziness is "poor man's declarative programming", but it still isn't the real thing.

edit2: print $ h (f 1) == _Case (h (f 1)) _Of () -> print () and only h is forced, not f; and to produce its answer h doesn't have to force anything, according to its definition h _ = ().

parting remarks: Laziness may have raison d'etre but it's not its definition. Laziness is what it is. It is defined as all values being initially thunks which are forced to WHNF according to the demands coming from main. If it helps avoid bottom in a certain specific case according to its specific circumstances, then it does. If not, not. That is all.

It helps to implement it yourself, in your favorite language, to get a feel for it. But we can also trace evaluation of any expression by carefully naming all interim values as they come into being.

Going by the Report, we have

f :: a -> ()
f = \_ -> (undefined :: ())

then

print (f 1)
 = print ((\ _ ->  undefined :: ()) 1)
 = print          (undefined :: ())
 = putStrLn (show (undefined :: ()))

and with

instance Show () where
    show :: () -> String
    show x = case x of () -> "()"

it continues

 = putStrLn (case (undefined :: ()) of () -> "()")

Now, section 3.17.3 Formal Semantics of Pattern Matching of the Report says

The semantics of case expressions [are given] in Figures 3.1–3.3. Any implementation should behave so that these identities hold [...].

and case (r) in Figure 3.2 states

(r)     case ⊥ of { K x1 … xn -> e; _ -> e′ } =  ⊥
        where K is a data constructor of arity n

() is data constructor of arity 0, so it is the same as

(r)     case ⊥ of { () -> e; _ -> e′ } =  ⊥

and the overall result of the evaluation is thus ⊥.

127

answered Oct 20 '22 21:10

Will Ness

Related questions
                            
                                Generating a Rank2Type within a Monad
                            
                                Where are the magic rules for GHC assert?
                            
                                How much overhead does sparking incur?
                            
                                Is there any way to describe function parameters first in Haddock?
                            
                                Exit quietly (without exception) haskell
                            
                                Haskell - resolving cyclical module dependency
                            
                                How do I get stack to use dynamic linking?
                            
                                Why can't we define closed data families?
                            
                                In what sense is (# a #), the one-element unboxed tuple, different from 'a'?
                            
                                Reading data from Excel in Haskell
                            
                                binary heap vs binomial heap vs fibonacci heap, regarding performance for a priority queue
                            
                                Static Guarantee on Key/Value Relationships in Data.Map
                            
                                Debugging performance bottlenecks for longest common subsequence algorithm
                            
                                instance Alternative ZipList in Haskell?
                            
                                Trying to apply CPS to an interpreter
                            
                                Understanding the casts involved in patterns matching a datatype that is indexed over a user defined kind
                            
                                How to get generic (polymorphic) lambda in scala?
                            
                                More efficient tail of church encoded list
                            
                                Efficiently creating strict ByteStrings
                            
                                When does `modify` copy the vector?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

What rules are there about a function a -> () being evaluated in Haskell?

Tags:

semantics

language-lawyer

haskell

lazy-evaluation

unit-type

Kryptozoon

People also ask

1 Answers

Will Ness

Recent Activity

Donate For Us