
Ending a Loop with EOF (without enter)

Tags: c, linux, windows, eof

I am currently trying to end a while loop with something like this:

#include <stdio.h>
int main() 
{
    while(getchar() != EOF)
    {
        if( getchar() == EOF )
            break;
    }
    return 0;

}

When I press CTRL+D on Ubuntu, the loop ends immediately. But on Windows I have to press CTRL+Z and then ENTER to end the loop. Can I get rid of the ENTER on Windows?

Michael Hübler asked Jan 29 '23 08:01


1 Answer

The getchar behavior

On Linux, end of file is signalled with Ctrl+D, while on Windows it is signalled with Ctrl+Z followed by Enter: the console hands input over a line at a time, so the CRT library only sees the Ctrl+Z once you press Enter (this behaviour is kept for backward compatibility with very old systems). If I'm not wrong, it is called a soft end of file. I don't think you can bypass it with getchar, since the EOF is actually returned by your getchar when you press Enter, not when you press Ctrl+Z.

As reported here:

In Microsoft's DOS and Windows (and in CP/M and many DEC operating systems), reading from the terminal will never produce an EOF. Instead, programs recognize that the source is a terminal (or other "character device") and interpret a given reserved character or sequence as an end-of-file indicator; most commonly this is an ASCII Control-Z, code 26. Some MS-DOS programs, including parts of the Microsoft MS-DOS shell (COMMAND.COM) and operating-system utility programs (such as EDLIN), treat a Control-Z in a text file as marking the end of meaningful data, and/or append a Control-Z to the end when writing a text file. This was done for two reasons:

  • Backward compatibility with CP/M. The CP/M file system only recorded the lengths of files in multiples of 128-byte "records", so by convention a Control-Z character was used to mark the end of meaningful data if it ended in the middle of a record. The MS-DOS filesystem has always recorded the exact byte-length of files, so this was never necessary on MS-DOS.

  • It allows programs to use the same code to read input from both a terminal and a text file.

More information is also reported here:

Some modern text file formats (e.g. CSV-1203[6]) still recommend a trailing EOF character to be appended as the last character in the file. However, typing Control+Z does not embed an EOF character into a file in either MS-DOS or Microsoft Windows, nor do the APIs of those systems use the character to denote the actual end of a file.

Some programming languages (e.g. Visual Basic) will not read past a "soft" EOF when using the built-in text file reading primitives (INPUT, LINE INPUT etc.), and alternate methods must be adopted, e.g. opening the file in binary mode or using the File System Object to progress beyond it.

Character 26 was used to mark "End of file" even though ASCII calls it Substitute and has other characters for this purpose.

If you modify your code like this:

#include <stdio.h>

int main(void) {
  while(1) {
    int c = getchar(); // int, not char: EOF must stay distinct from every byte value
    printf("%d\n", c);
    if (c == EOF)      // also tried with -1 and 26
      break;
  }
  return 0;
}

and test it on Windows, you will see that the EOF (-1) is not printed to the console until you press Enter. Before that, a ^Z is printed by the terminal emulator (I suspect). In my tests, this behavior is the same if:

  • you compile using the Microsoft Compiler
  • you compile using GCC
  • you run the compiled code in CMD window
  • you run the compiled code in bash emulator in windows

Update using Windows Console API

Following the suggestion of @eryksun, I successfully wrote some (ridiculously complex for what it does) code for Windows that changes how input is read from conhost to actually get "exit when pressing Ctrl+D". It does not handle everything; it is only an example. IMHO, this is something to avoid as much as possible, since the portability is less than 0. Also, handling other input cases correctly would require a lot more code, since this approach reads raw console events instead of going through stdin, and you have to handle everything yourself.

The method works more or less as follows:

  • get the current handle for the standard input
  • create an array of input records, a structure that contains information about what happens in the conhost window (keyboard, mouse, resize, etc.)
  • read what happens in the window (ReadConsoleInput reports how many events it stored)
  • iterate over the event vector to handle the keyboard events and intercept Ctrl+D (character code 4, from what I've tested) to exit, or print any other ASCII character.

This is the code:

#include <windows.h>
#include <stdio.h>

#define Kev input_buffer[i].Event.KeyEvent // a shortcut

int main(void) {
  HANDLE h_std_in;                // Handle for the stdin
  DWORD read_count,               // number of events returned by ReadConsoleInput
        i;                        // iterator
  INPUT_RECORD input_buffer[128]; // Vector of events

  h_std_in = GetStdHandle( // Get the stdin handle
    STD_INPUT_HANDLE       // constant for stdin. Others exist for stdout and stderr
  ); 

  while(1) {
    if (!ReadConsoleInput( // Read the input from the handle
          h_std_in,        // our handle
          input_buffer,    // the vector in which events will be saved
          128,             // the dimension of the vector
          &read_count))    // the number of events captured and saved (at most 128)
      return 1;            // stop if the read fails (e.g. stdin is not a console)

    for (i = 0; i < read_count; i++) {    // and here we iterate from 0 to read_count
      switch(input_buffer[i].EventType) { // let's check the type of event 
        case KEY_EVENT:                   // to intercept the keyboard ones
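          if (Kev.bKeyDown) {             // and refine only on key pressed (avoid a second event for key released)
            // Intercepts CTRL + D (character code 4)
            if (Kev.uChar.AsciiChar == 4)
              return 0;
            if (Kev.uChar.AsciiChar != 0) // skip keys that carry no character (Shift, Ctrl, arrows, ...)
              printf("%c", Kev.uChar.AsciiChar);
          }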
          break;
        default:
          break;
      }
    }
  }

  return 0;
}
Matteo Ragni answered Feb 05 '23 17:02