
OCaml comment containing "\" results in "comment not terminated"

Tags: escaping, ocaml

This comment is ok:

(* "\z" foo *)

This comment results in an error:

(* "\" foo *)

Is there a way to include a literal quoted single backslash in an OCaml comment? Why doesn't the obvious approach work? I would expect escapes in comments to simply be ignored.

For what it's worth, I'm trying to document tests for code that handles backslash escaping for its own purposes.

Thanks for reading.

Edit: The plot thickens. The following comments are acceptable:

(* "\" "\"    notice-> " *)
(* "\" "  "\"    notice-> " *)
(* "\" foo  "\" notice-> " " " *)
(* "\" " "  "\"    notice-> " *)
(* "\" "" "  "\"    notice-> " *)
(* "\" arbitrary "s  "\"    notice-> " *)
(* " \" note the spacing  " \"    notice-> " *)
(* "\" <- notice-> " *)
(* "\"  "  <- notice -> " " *)
(* "" *) (* """" *)

But add or remove one quote at the end, and it breaks. The following all fail (the REPL asks for more input):

(* "\"  "\" *)
(* "\" foo  "\" notice->  *)
(* "\" foo  "\" notice-> " " *)
(* "\" foo "\" notice-> " " " " *)
(* "\" foo " "\" notice-> " "  *)
(* "\" foo " " "\" notice-> " "  *)
(* " *) (* """ *) (* """"" *)

I'm pretty lost. It seems like it's trying to balance quotes, but the escaped quotes throw it for a loop.

Asked by Andrew Fleenor, Dec 27 '13 at 05:12



1 Answer

Comments in OCaml must contain legal OCaml lexical units (tokens). This allows you to comment out code easily, even code that itself contains comments, and even code with string constants that happen to contain (* or *).

You can have "\\" in a comment. But you can't have "\" because it's not a legal OCaml token. (It's an unterminated string constant.)
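For the use case in the question (documenting tests for backslash handling), a minimal sketch of a comment that compiles might look like the following. The name single_backslash is hypothetical and only there for illustration.

```ocaml
(* The test input "\\" denotes a one-character string: a single backslash. *)
let single_backslash = "\\"

(* Without the quotes, a bare \ is just an ordinary comment character. *)
let () = assert (String.length single_backslash = 1)
```

Both comments are accepted because every string constant inside them is properly terminated.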

You can find the legal tokens of OCaml described in the Lexical Conventions chapter of the manual.

Edit

As lukstafi points out, it's much more correct just to say that strings appearing in OCaml comments must have the same structure as strings appearing outside comments. This is necessary to allow code (possibly containing string constants that look like parts of comments) to be commented out reliably.
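As a small illustration of why this matters, here is a sketch of commented-out code whose string constants look like comment delimiters. The names closer, opener, and still_compiles are made up for the example.

```ocaml
(* Commented-out code; the string contents and the inner comment are harmless.
let closer = "*)"   (* an inner comment *)
let opener = "(*"
*)
let still_compiles = true
```

If strings inside comments were not recognized as strings, the "*)" on the second line would end the comment early and the rest would be parsed as code.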

Edit 2

(* "\" "\"    notice-> " *)

There's nothing surprising about this (in my opinion). The comment consists of two string constants with the \ character between them. Outside of a string constant the \ character doesn't quote anything. It's just a character. (In OCaml code \ is not a legal character, but it's fine in a comment--note that lukstafi's clarification explains this.)

Maybe it will be clearer if I label all the characters: ( for an opening quote, ) for a closing quote, Q for a backslash inside a string (it quotes the next character), B for a backslash outside a string (it's just an ordinary character), S for any other character inside a string, and C for any other character outside a string.

(* "\" "\"    notice-> " *)
  C(QSS)B(SSSSSSSSSSSSS)C

Here's one of the erroneous cases:

(* " *) (* """ *) (* """"" *)
  C(SSSSSSS)()C     C()()(SSS

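It has an unterminated string constant: the fifth quote in the last group opens a string that is never closed, so the final *) is read as string contents. Note that the first string has two comment-delimiter-like sequences in it, *) and (*. But it's just a string.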
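The scanning rules above can be sketched as a tiny state machine. This is a simplified model written for illustration, not the actual OCaml lexer: the type state and the functions final_state and is_terminated are hypothetical names, and the sketch ignores character literals, quoted-string literals, and string constants outside comments.

```ocaml
(* Simplified model of comment scanning: track the comment nesting depth
   and whether we are currently inside a string constant. *)
type state = Code | Comment of int | String_in_comment of int

let final_state (src : string) : state =
  let n = String.length src in
  let starts_with i a b = i + 1 < n && src.[i] = a && src.[i + 1] = b in
  let rec go i st =
    if i >= n then st
    else
      match st with
      | Code ->
        if starts_with i '(' '*' then go (i + 2) (Comment 1)
        else go (i + 1) Code
      | Comment depth ->
        if starts_with i '(' '*' then go (i + 2) (Comment (depth + 1))
        else if starts_with i '*' ')' then
          go (i + 2) (if depth = 1 then Code else Comment (depth - 1))
        else if src.[i] = '"' then go (i + 1) (String_in_comment depth)
        else go (i + 1) st (* any other character, including a bare backslash *)
      | String_in_comment depth ->
        if src.[i] = '\\' then go (i + 2) st (* backslash quotes the next character *)
        else if src.[i] = '"' then go (i + 1) (Comment depth)
        else go (i + 1) st (* comment delimiters are plain characters here *)
  in
  go 0 Code

(* True when every comment (and every string inside one) has been closed. *)
let is_terminated src = final_state src = Code

let () =
  assert (is_terminated {|(* "\z" foo *)|});
  assert (not (is_terminated {|(* "\" foo *)|}));
  assert (is_terminated {|(* "\" "\"    notice-> " *)|});
  assert (not (is_terminated {|(* " *) (* """ *) (* """"" *)|}))
```

The four assertions replay examples from the question: the accepted ones end back in the Code state, while the rejected ones end inside a string, which is why the REPL keeps asking for more input.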

Answered by Jeffrey Scofield, Nov 15 '22 at 09:11
