
Perplexed by how this code is processed by Haskell's Layout facility

Whilst browsing https://wiki.haskell.org/IO_inside, I encountered the following commentary and code ...

"Moreover, Haskell layout rules allow us to use the following layout:

main = do a <- readLn
          if (a>=0) then return ()
            else do
          print "a is negative"
          ...

that may be useful for escaping from the middle of a longish 'do' statement."

Hereafter, I'll use the symbol C* to refer to the above code.

I presume that the intention of C* is to read in a number, and then:
(i) If it's non-negative, do nothing.
(ii) If it's negative, display output stating that it is.

My initial reaction was to think that C* would either not parse correctly, or would not behave as expected.

I thought that Layout would insert an empty set of braces and a semicolon immediately following the second 'do' because the lexeme 'print' is not indented more than the indentation level of the current layout context established by the 'a' in "a <- readLn".

That is, my prediction for the layout-insensitive code (hereafter, referred to as C') generated by Layout would be something like:

main = do {
          a <- readLn;
          if (a>=0) then return ()
            else do {};
          print "a is negative"
          ...
          }

I based this prediction on the following sentence from section 2.7 ('Lexical Structure' : 'Layout') of Part 1 of the Haskell 2010 Language Report (https://www.haskell.org/onlinereport/haskell2010/haskellpa1.html):

"If the indentation of the non-brace lexeme immediately following a where, let, do or of is less than or equal to the current indentation level, then instead of starting a layout, an empty list “{}” is inserted, and layout processing occurs for the current level (i.e. insert a semicolon or close brace)."

A more detailed account of the Layout rules is given in section 10.3 ('Syntax Reference' : 'Layout') of Part 1 (URL given above) of the Haskell 2010 Language Report.

On reading this more detailed account, I felt reassured that my prediction for the layout-insensitive code generated by Layout (i.e. C') was correct.

However, to my surprise, when I tried the original code stipulated above (i.e. C*) in GHCi, it worked (parsed correctly and behaved as expected).

Questions ...

  1. Is the sentence I quoted above (from section 2.7) accurate?

  2. Is the detailed account of the Layout rules mentioned above (from section 10.3) accurate?

  3. What flaw(s) are there in the reasoning that led me to predict C' as the layout-insensitive code generated by Layout for the original code C*?

  4. What is the layout-insensitive code produced by Layout for the original code stipulated above (i.e. for C*), and what are the rules / principles that explain it?

  5. In general, is there a way that I can view the layout-insensitive code generated by Layout? If so, what is it (please detail/explain the technique at a level suitable for someone new to Haskell, like me)?

memexor asked Feb 04 '15 13:02



1 Answer

This is a known and documented deviation of GHC from the Haskell standard in default or Haskell 98 mode.

GHC has a language extension called NondecreasingIndentation that can be used to trigger this behaviour. If enabled, a do keyword introduces a new block even if the next token starts at the same indentation level as the surrounding block.
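
As a rough sketch of what this means for the code in the question: with the extension on, the two definitions below should be parsed the same way, because everything after "else do" becomes the body of the else branch. The names viaLayout and viaBraces are made up for illustration, and the number is passed as an argument instead of being read with readLn.

```haskell
{-# LANGUAGE NondecreasingIndentation #-}
module Main (main) where

-- Layout version, shaped like the code in the question.
-- The first `putStrLn` starts at the same column as `if`, i.e. at the
-- indentation level of the enclosing do block.
viaLayout :: Int -> IO ()
viaLayout a = do
  if a >= 0 then return ()
    else do
  putStrLn "a is negative"
  putStrLn "still inside the else branch"

-- The same function written with explicit braces and semicolons.
viaBraces :: Int -> IO ()
viaBraces a = do {
  if a >= 0 then return ()
    else do {
      putStrLn "a is negative";
      putStrLn "still inside the else branch"
    }
  }

main :: IO ()
main = do
  viaLayout 1      -- non-negative: prints nothing
  viaLayout (-1)   -- negative: prints both lines
  viaBraces (-1)   -- negative: prints both lines
```

This is why the trick "escapes" from the middle of the do block: for a non-negative input the then branch returns, and the remaining statements are never reached because they belong to the else branch.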

If you don't want this, say either -XNoNondecreasingIndentation or -XHaskell2010 (or use language pragmas accordingly).
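
A minimal sketch of the pragma form, assuming you want the Haskell 2010 behaviour for a single file only. With the extension switched off, the layout from the question should be rejected, since the inner do then gets an empty block; indenting the body of the inner do further than the enclosing block is accepted. The function name report is hypothetical.

```haskell
{-# LANGUAGE NoNondecreasingIndentation #-}
module Main (main) where

-- With the extension off, the body of the inner `do` has to be indented
-- further than the enclosing do block, as the Haskell 2010 layout rule
-- requires.
report :: Int -> IO ()
report a = do
  if a >= 0 then return ()
    else do
      putStrLn "a is negative"

main :: IO ()
main = mapM_ report [3, -3]
```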

You can view a pretty-printed version of the code that GHC parsed by passing the -ddump-parsed flag to GHC. This will only partially remove layout (it does so for do-blocks, but e.g. not for let), but might still provide clues.
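
For someone new to GHC, one way to try this is sketched below. The file name Demo.hs and the contents of the file are only an illustration; the command is given in the comment at the top. The exact shape of the dump may differ between GHC versions.

```haskell
-- Demo.hs (hypothetical file name)
--
-- To see how GHC parsed this file, compile it from a terminal with:
--
--     ghc -ddump-parsed Demo.hs
--
-- GHC then prints its pretty-printed view of the parsed module in
-- addition to compiling it.
module Main (main) where

main :: IO ()
main = do
  -- A `let` block and a `do` block, both written with layout.
  let greeting = "hello"
      target   = "layout"
  putStrLn (greeting ++ ", " ++ target)
```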

kosmikus answered Oct 12 '22 22:10
