Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

fgets(), signals (EINTR) and input data integrity

fgets() was intended for reading some string until EOF or \n occurred. It is very handy for reading text config files, for example, but there are some problems.

First, it may return EINTR in case of signal delivery, so it should be wrapped with loop checking for that.

Second problem is much worse: at least in glibc, it will return EINTR and loss all already read data in case it delivered in middle. This is very unlikely to happen, but I think this may be source of some complicated vulnerabilities in some daemons.

Setting SA_RESTART flag on signals seems to help avoiding this problem but I'm not sure it covers ALL possible cases on all platforms. Is it?

If no, is there a way to avoid the problem at all?

If no, it seems that fgets() is not usable for reading files in daemons because it may lead to random data loss.

Example code for tests:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <signal.h>

static  char buf[1000000];
static volatile int do_exit = 0;
static void int_sig_handle(int signum) { do_exit = 1; }

void try(void) {
  char * r;
  int err1, err2;
  size_t len;

  memset(buf,1,20); buf[20]=0;
  r = fgets(buf, sizeof(buf), stdin);
  if(!r) {
    err1 = errno;
    err2 = ferror(stdin);
    printf("\n\nfgets()=NULL, errno=%d(%s), ferror()=%d\n", err1, strerror(err1), err2);
    len = strlen(buf);
    printf("strlen()=%u, buf=[[[%s]]]\n", (unsigned)len, buf);
  } else if(r==buf) {
    err1 = errno;
    err2 = ferror(stdin);
    len = strlen(buf);
    if(!len) {
      printf("\n\nfgets()=buf, strlen()=0, errno=%d(%s), ferror()=%d\n", err1, strerror(err1), err2);
    } else {
      printf("\n\nfgets()=buf, strlen()=%u, [len-1]=0x%02X, errno=%d(%s), ferror()=%d\n",
        (unsigned)len, (unsigned char)(buf[len-1]), err1, strerror(err1), err2);
    }
  } else {
    printf("\n\nerr\n");
  }
}

int main(int argc, char * * argv) {
  struct sigaction sa;
  sa.sa_flags = 0; sigemptyset(&sa.sa_mask); sa.sa_handler = int_sig_handle;
  sigaction(SIGINT, &sa, NULL);

  printf("attempt 1\n");
  try();
  printf("\nattempt 2\n");
  try();
  printf("\nend\n");
  return 0;
}

This code can be used to test signal delivery in middle of "attempt 1" and ensure that its partially read data become completely lost after that.

How to test:

  1. run program with strace
  2. enter some line (do not press Enter), press Ctrl+D, see read() syscall completed with some data
  3. send SIGINT
  4. see fread() returned NULL, "attempt 2" and enter some data and press Enter
  5. it will print second entered data but will not print first anywhere

FreeBSD 11 libc: same behaviour

FreeBSD 8 libc: first attempt returns partially read data and sets ferror() and errno

EDIT: according with @John Bollinger recommendations I've added dumping of the buffer after NULL return. Results:

glibc and FreeBSD 11 libc: buffer contains that partially read data but NOT NULL-TERM so the only way to get its length is to clear entire buffer before calling fgets() which looks not like intended use

FreeBSD 8 libc: still returns properly null-terminated partially-read data

like image 704
firk Avatar asked Jun 02 '19 11:06

firk


2 Answers

stdio is indeed not reasonably usable with interrupting signal handlers.

Per ISO C 11 7.21.7.2 The fgets function, paragraph 3:

The fgets function returns s if successful. If end-of-file is encountered and no characters have been read into the array, the contents of the array remain unchanged and a null pointer is returned. If a read error occurs during the operation, the array contents are indeterminate and a null pointer is returned.

EINTR is a read error, so the array contents are indeterminate after such a return.

Theoretically, the behavior could be specified for fgets in a way that you could meaningfully recover from an error in the middle of the operation by setting up the buffer appropriately before the call, since you know that fgets does not write '\n' except as the final character before null termination (similar to techniques for using fgets with embedded NULs). However, it's not specified that way, and there would be no analogous way to handle other stdio functions like scanf, which have nowhere to store state for resuming them after EINTR.

Really, signals are just a really backwards way of doing things, and interrupting signals are an even more backwards tool full of race conditions and other unpleasant and unfixable corner cases. If you want to do this kind of thing in a safe and modern way, you probably need to have a thread that forwards stdin through a pipe or socket, and close the writing end of the pipe or socket in the signal handler so that the main part of your program reading from it gets EOF.

like image 118
R.. GitHub STOP HELPING ICE Avatar answered Oct 14 '22 01:10

R.. GitHub STOP HELPING ICE


First, it may return EINTR in case of signal delivery, so it should be wrapped with loop checking for that.

Of course you mean that fgets() will return NULL and set errno to EINTR. Yes, this is a possibility, and not only for fgets(), or even for stdio functions generally -- a wide variety of functions from the I/O realm and others may exhibit this behavior. Most POSIX functions that may block on events external to the program can fail with EINTR and various function-specific associated behaviors. It's a characteristic of the programming and operational environment.

Second problem is much worse: at least in glibc, it will return EINTR and loss all already read data in case it delivered in middle. This is very unlikely to happen, but I think this may be source of some complicated vulnerabilities in some daemons.

No, at least not in my tests. It is your test program that loses data. When fgets() returns NULL to signal an error, that does not imply that it has not transferred any data to the buffer, and if I modify your program to print the buffer after an EINTR is signaled then I indeed see that the data from attempt 1 have been transferred there. But the program ignores that data.

Now it is possible that other programs make the same mistake that yours does, and therefore lose data, but that is not because of a flaw in the implementation of fgets().

FreeBSD 8 libc: first attempt returns partially read data and sets ferror() and errno

I'm inclined to think that this behavior is flawed -- if the function returns before reaching end of line / file then it should signal an error by providing a NULL return value. It may, but is not obligated to, transfer some or all of the data read to that point to the user-provided buffer. (But if it doesn't transfer data then they should remain available to be read.) I also find it surprising that the function sets the file's error flag at all. I'm inclined to think that erroneous, but I'm not prepared to present an argument for that at the moment.

like image 44
John Bollinger Avatar answered Oct 13 '22 23:10

John Bollinger