Edit 4: It turns out that this is actually just a limitation of TTY input in general; there's nothing specific about OCaml, Coq, or Emacs which is causing the problem.
I'm working on a Coq program using Proof General in Emacs, and I've found a bug with input that's too long. If a region to submit to coqtop
through Proof General contains more than 1023 characters, Proof General (though not Emacs) hangs while waiting for a response, and the *coq*
buffer contains one extra ^G
character for every character over 1023. For instance, if a 1025-character region was sent to coqtop
, then the *coq*
buffer would end with the two extra characters ^G^G
. I can't proceed past this point in the file, and I have to kill the coqtop
process (either with C-c C-x or a kill
/killall
from the terminal).
Something about this limitation arises from coqtop
itself. If one generates a 1024-character or longer string and pipes it in, such as by running
perl -e 'print ("Eval simpl in " . (" " x 1024) . "1.\n")' | coqtop
then everything works fine. (Similarly, coqc
works fine as well.) However, if I run coqtop
in a terminal, I can't type more than 1024 characters on one line, including the ending return character. Thus, typing a 1023-character line and then hitting return works; but after typing 1024 characters, hitting any key, including return (but not including delete, etc.), does nothing but produce a beep. And it turns out that ocaml
(the OCaml toplevel) has the same behavior:
perl -e 'print ((" " x 1024) . "1;;")' | ocaml
works fine, but I can't type more than 1024 characters on one line if running ocaml
from a terminal. Since my understanding is that coqtop
relies on the OCaml toplevel (more obviously when run as coqtop -byte
), I imagine this is a related limitation.
The relevant software versions are:
And my questions are:
ocaml
and coqtop
is enforcing this character limit? And why only for input from the terminal or Emacs, as opposed to input from a pipe or a file?^G
s?Edit 3: After finding that the 1024-character input limitation also exists in the Ocaml toplevel (something which I imagine is related), I've added that information and deleted the original problem description, as it's been completely obscured and superseded. (See the edit history if necessary).
I reported this as issue 5678 on the OCaml bug tracker, and the user dim explained that this wasn't a problem with OCaml per se, but was a limitation of TTY input. The issue is this. Since text isn't sent to running commands until the user hits return, all that waiting input must be stored somewhere. The buffer it's stored in, called the input queue or the type-ahead buffer, has a fixed size, which is controlled by the C constant MAX_INPUT
. This constant is equal to 1024 on Mac OS X. Buffering like this allows useful processing of input, such as deleting characters before they're sent. All commands run from the terminal which don't do something special (such as using the readline
library) will exhibit this behavior; for example, cat
chokes in exactly the same way.
To avoid this behavior, one can unset the ICANON
flag, for instance by running stty -icanon
; this puts the TTY into non-canonical input mode, where input is not processed at all before being sent to the command. This means that editing becomes impossible: delete, left and right arrows, and so on all enter their literal equivalents (^?
, ^[[D
, ^[[C
, …); similarly, ⌃D no longer sends an EOF, but merely a literal control character. However, for my particular use case, this (so far!) seems to be ideal, since Emacs is handling all my input for me. (Edit: But there's a better option!) (Libraries like readline
, as I understand it, change this setting as well, but watch for control characters and handle editing, etc., themselves.) To restore canonical mode, one can run stty icanon
.
The ledit
tool wraps line editing around the program given to it as an argument, so ledit coqtop
works fine (if oddly; I prefer ledit -l 65536
to avoid its scrolling), but interacted oddly with Emacs. The rlwrap
tool does the same thing, but leaves the other program reading from a TTY; thus, while it can receive longer inputs, hitting enter and sending them to the wrapped command behaves very strangely and ends up necessitating that the command be killed.
Edit: In my specific use-case, I can also simply tell Emacs to use a pipe instead of a PTY, solving the problems at a stroke. The Emacs variable process-connection-type
controls how subsidiary processes are communicated with; nil
means to use a pipe, and non-nil
means to use a TTY. Proof General uses the variable proof-shell-process-connection-type
to determine how this should be set. Using a pipe solves all the 1024-character-limit problems.
I'm not sure how the Emacs/coqtop interaction plays here, but I believe there is indeed an OCaml toplevel bug, and it should be reported in the OCaml bugtracker. Would you be ready to report it? If not, I can take care of it.
What about ocaml and coqtop is enforcing this character limit?
There are various input buffer at play in the toplevel code, some of them of length 1024; after a quick look at the code, there is a resizing logic in case the input gets too large, so it should work. I have been able to reproduce the "can't type more than N characters in the interactive toplevel" issue (when not using rlwrap
), but with a N=4096 limit rather than N=1024, so I'm unsure it's the exact same problem.
And why only for input from the terminal or Emacs, as opposed to input from a pipe or a file?
The toplevel code makes a distinction between interactive and non-interactive input; iirc. it affects the way error locations are printed, for example.
Why does Proof General's (apparent) ignorance of this limit cause hanging errors and mysterious ^Gs?
I have no idea. The coqtop
problem you're observing may even be a different bug (than the ocaml
one) caused by a similar buffering logic.
How can I work around this limitation?
How about not sending too long inputs at once in Proof General? Maybe you can factorize your code to use intermediary definitions or something, to stay below the limit.
Regarding the "upstream fix" situation: both OCaml and Coq are, I believe, in the process of getting a new version soon. If people are interested enough in the bug to get a fix real soon (in particular, if you find the fix yourself), it could be integrated upstream reasonably quickly. Otherwise you'll have to wait for the next release cycle, and possibly maintain a local fork in the meantime to avoid the issue. Pragmatically, the "workaround by changing my Coq development" choice is probably the lowest-effort solution, but it won't benefit humanity as whole!
Edit: (to answer comments)
The resizing logic I was thinking of is in Lexing.lex_refill
, found in stdlib/lexing.ml
, called by the closure created by Lexing.from_function
, called from toplevel/toploop.ml
.
I have another "workaround" idea: write your long phrase to an external file foo.v
, and use Load foo.
to get the toplevel to read the file itself. I suspect this will work around the size limitation, but have not tested.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With