
What does an escaped ampersand mean in Haskell?

I was reading the Haskell 2010 report and noticed a strange escape sequence with an ampersand: \&. I couldn't find an explanation of what this escape sequence stands for. It also seems to be valid only inside string literals. I tried print "\&" in GHCi, and it prints an empty string.

Nolan asked Jul 09 '19 22:07



1 Answer

It escapes... no character: \& stands for the empty string. It is useful for "breaking" escape sequences that would otherwise absorb the characters that follow them. For instance, we might want to express "\12" ++ "3" as a single string literal. If we try the obvious approach, we get

"\123" ==> "{"

which is a single character, because \123 is read as one numeric escape (character code 123). We can, however, use

"\12\&3"

for the intended result.

Also, "\SOH" and "\SO" are both valid single ASCII character escapes, making "\SO" ++ "H" tricky to express as a single literal: we need "\SO\&H" for that.
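The two cases above can be checked directly. This is only a small sketch: the names munched, broken, soh, soThenH and checks are made up for illustration and are not part of any library.

```haskell
-- Illustrative names only; nothing here comes from a library.

-- "\123" is one numeric escape: the character with code 123, i.e. '{'.
munched :: String
munched = "\123"

-- "\12\&3" is the character with code 12 followed by the digit '3'.
broken :: String
broken = "\12\&3"

-- "\SOH" is one character; "\SO\&H" is '\SO' followed by the letter 'H'.
soh, soThenH :: String
soh     = "\SOH"
soThenH = "\SO\&H"

-- Every entry in this list evaluates to True.
checks :: [Bool]
checks =
  [ munched == "{"
  , broken == "\12" ++ "3"
  , length broken == 2
  , length soh == 1
  , soThenH == "\SO" ++ "H"
  , "\&" == ""              -- \& on its own contributes no character
  ]
```

Loading this in GHCi and evaluating and checks should give True.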

This escape trick is also exploited by the standard Show String instance, which has to produce a valid literal syntax. We can see this in action in GHCi:

> "\140" ++ "0"
"\140\&0"
> "\SO" ++ "H"
"\SO\&H"
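The same behaviour can be observed from ordinary code by looking at the text that show produces and reading it back. A minimal sketch, with the hypothetical names shownDigit, shownSoH and roundTrips chosen just for this example:

```haskell
-- Illustrative names only.

-- The text produced by show, surrounding quotes included.
shownDigit :: String
shownDigit = show ("\140" ++ "0")   -- the 9 characters "\140\&0"

shownSoH :: String
shownSoH = show ("\SO" ++ "H")      -- the 8 characters "\SO\&H"

-- read parses string literal syntax, so it should undo show.
roundTrips :: String -> Bool
roundTrips s = read (show s) == s
```

Without the \& that show inserts, reading the output back would yield "\1400" or "\SOH", each a single character, rather than the original two-character strings.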

Further, this greatly helps external programs that generate Haskell code (e.g. for metaprogramming). When emitting the characters of a string literal, such a program can append \& to potentially ambiguous escapes (or even to all escapes) so that it never has to handle unwanted interactions with whatever comes next. E.g. if the program wants to emit \12 now, it can emit \12\& and then be free to emit anything as the next character. Otherwise, the program has to remember that the next character, if it is a digit, must be preceded by \&. It's simpler to always add \&, even when it's not needed: \12\&A is legal, and has the same meaning as \12A.
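As a rough sketch of that "always add \&" strategy, here is a tiny literal emitter written in Haskell itself for convenience. The functions emitChar and emitLiteral are hypothetical names invented for this example, and the choice to emit every non-printable or non-ASCII character as a decimal escape is an assumption, not something the Report requires.

```haskell
import Data.Char (isAscii, isPrint, ord)

-- Hypothetical helper: the source text for one character inside a literal.
emitChar :: Char -> String
emitChar '"'  = "\\\""          -- escape the double quote
emitChar '\\' = "\\\\"          -- escape the backslash
emitChar c
  | isAscii c && isPrint c = [c]
  -- Assumption: everything else becomes a decimal escape, always followed
  -- by \& so the next character can never extend the escape.
  | otherwise = '\\' : show (ord c) ++ "\\&"

-- Hypothetical helper: a complete string literal, quotes included.
emitLiteral :: String -> String
emitLiteral s = "\"" ++ concatMap emitChar s ++ "\""
```

For example, putStrLn (emitLiteral ("\12" ++ "3")) prints "\12\&3", and read (emitLiteral s) should give back s. The emitter keeps no state between characters, which is the simplification the paragraph above describes.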

Finally, a quote from the Haskell Report, explaining \&:

2.6 Character and String Literals

[...]

Consistent with the "maximal munch" rule, numeric escape characters in strings consist of all consecutive digits and may be of arbitrary length. Similarly, the one ambiguous ASCII escape code, "\SOH", is parsed as a string of length 1. The escape character \& is provided as a "null character" to allow strings such as "\137\&9" and "\SO\&H" to be constructed (both of length two). Thus "\&" is equivalent to "" and the character '\&' is disallowed. Further equivalences of characters are defined in Section 6.1.2.

chi answered Nov 12 '22 18:11
