
Why does ungetc fail on some characters?

ungetc() seems to fail on some characters. Here is a simple test program:

#include <stdio.h>

int main(void) {
    int c;

    printf("Type a letter and the enter key: ");

#define TRACE(x)  printf("%s -> %d\n", #x, x)
    TRACE(c = getc(stdin));
    TRACE(ungetc(c, stdin));
    TRACE(getc(stdin));

    TRACE(ungetc('\xFE', stdin));
    TRACE(getc(stdin));

    TRACE(ungetc('\xFF', stdin));
    TRACE(getc(stdin));

    return 0;
}

I ran it on a Unix system and typed a followed by Enter at the prompt.

The output is:

Type a letter and the enter key: a
c = getc(stdin) -> 97
ungetc(c, stdin) -> 97
getc(stdin) -> 97
ungetc('\xFE', stdin) -> 254
getc(stdin) -> 254
ungetc('\xFF', stdin) -> -1
getc(stdin) -> 10

I expected this:

Type a letter and the enter key: a
c = getc(stdin) -> 97
ungetc(c, stdin) -> 97
getc(stdin) -> 97
ungetc('\xFE', stdin) -> 254
getc(stdin) -> 254
ungetc('\xFF', stdin) -> 255
getc(stdin) -> 255

What is causing ungetc() to fail?

EDIT: to make things worse, I tested the same code on a different unix system, and it behaves as expected there. Is there some kind of undefined behavior?

asked Jun 14 '18 23:06 by chqrlie


1 Answer

Working on the following assumptions:

  • You're on a system where plain char is signed.
  • '\xFF' is -1 on your system (the value of out-of-range character constants is implementation-defined, see below).
  • EOF is -1 on your system.
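
The three assumptions above can be checked directly on a given system. This is only a minimal sketch: the helper names plain_char_is_signed, xff_collides_with_eof and report_assumptions are made up for illustration, and the printed values depend on the compiler and platform.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: nonzero when plain char is a signed type here. */
int plain_char_is_signed(void) {
    return CHAR_MIN < 0;
}

/* Hypothetical helper: nonzero when the constant '\xFF' has the same
   value as EOF, in which case ungetc('\xFF', stream) must fail. */
int xff_collides_with_eof(void) {
    return '\xFF' == EOF;
}

/* Prints the three values that the assumptions depend on. */
void report_assumptions(void) {
    printf("plain char signed: %s\n", plain_char_is_signed() ? "yes" : "no");
    printf("'\\xFF' = %d\n", '\xFF');
    printf("EOF    = %d\n", EOF);
}
```

If report_assumptions() shows '\xFF' and EOF with the same value, the failure in the question is expected on that system. If plain char is unsigned, '\xFF' is 255 and cannot collide with the negative EOF, which would explain why the same program behaves differently elsewhere.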

The call ungetc('\xFF', stdin); is the same as ungetc(EOF, stdin); whose behaviour is covered by C11 7.21.7.10/4:

If the value of c equals that of the macro EOF, the operation fails and the input stream is unchanged.


The input range for ungetc is the same as the output range of getchar, i.e. EOF which is negative, or a non-negative value representing a character (with negative characters being represented by their conversion to unsigned char). I presume you were going for ungetc(255, stdin);.
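
One way to avoid the problem when the value comes from a plain char is to convert it to unsigned char before pushing it back, which mirrors what getc does on the way out. A minimal sketch, assuming 8-bit bytes; unget_plain_char is a hypothetical wrapper, not a standard function:

```c
#include <stdio.h>

/* Hypothetical wrapper: push back a plain char value.
   The cast maps a negative char such as '\xFF' (-1 when char is signed)
   to its unsigned char value (255), so it can never be mistaken for EOF. */
int unget_plain_char(char ch, FILE *stream) {
    return ungetc((unsigned char)ch, stream);
}
```

With this wrapper, unget_plain_char('\xFF', stdin) should return 255 whether plain char is signed or unsigned, and the next getc(stdin) should return 255 as well.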


Regarding the value of '\xFF', see C11 6.4.4.4/10:

The value of an integer character constant [...] containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined.

Also, the values of the execution character set are implementation-defined (C11 5.2.1/1). You could check the compiler documentation to be sure, but the observed behaviour suggests that 255 is not in the execution character set. In fact, the behaviour of a gcc version I tested suggests that it takes the range of char, not the range of unsigned char, as the execution character set.

answered Oct 26 '22 00:10 by M.M