Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

linux threads and fopen() fclose() fgets()

Tags:

c

linux

pthreads

I'm looking at some legacy Linux code which uses pthreads.

In one thread a file is read via fgets(). The FILE variable is a global variable shared across all threads. (Hey, I didn't write this...)

In another thread every now and again the FILE is closed and reopened with another filename.

For several seconds after this has happened, the thread fgets() acts as if it is continuing to read the last record it read from the previous file: almost as if there was an error but fgets() was not returning NULL. Then it sorts itself out and starts reading from the new file.

The code looks a bit like this (snipped for brevity so I hope it's still intelligible):

In one thread:

while(gRunState != S_EXIT){
  nanosleep(&timer_delay,0);
  flag = fgets(buff, sizeof(buff), gFile);
  if (flag != NULL){
    // do something with buff...
  }
}

In the other thread:

fclose(gFile);
gFile = fopen(newFileName,"r");

There's no lock to make sure that the fgets() is not called at the same time as the fclose()/fopen().

Any thoughts as to failure modes which might cause fgets() to fail but not return NULL?

like image 282
Simon Elliott Avatar asked Jan 24 '23 01:01

Simon Elliott


2 Answers

How the described code goes wrong

The stdio library buffers data, allocating memory to store the buffered data. The GNU C library dynamically allocates file structures (some libraries, notably on Solaris, use pointers to statically allocated file structures, but the buffer is still dynamically allocated unless you set the buffering otherwise).

If your thread works with a copy of a pointer to the global file pointer (because you passed the file pointer to the function as an argument), then it is conceivable that the code would continue to access the data structure that was orginally allocated (even though it was freed by the close), and would read data from the buffer that was already present. It would only be when you exit the function, or read beyond the contents of the buffer, that things start going wrong - or the space that was previously allocated to the file structure is reallocated for a new use.

FILE *global_fp;

void somefunc(FILE *fp, ...)
{
    ...
    while (fgets(buffer, sizeof(buffer), fp) != 0)
        ...
}

void another_function(...)
{
    ...
    /* Pass global file pointer by value */
    somefunc(global_fp, ...);
    ...
}

Proof of Concept Code

Tested on MacOS X 10.5.8 (Leopard) with GCC 4.0.1:

#include <stdio.h>
#include <stdlib.h>

FILE *global_fp;
const char etc_passwd[] = "/etc/passwd";

static void error(const char *fmt, const char *str)
{
    fprintf(stderr, fmt, str);
    exit(1);
}

static void abuse(FILE *fp, const char *filename)
{
    char buffer1[1024];
    char buffer2[1024];
    if (fgets(buffer1, sizeof(buffer1), fp) == 0)
        error("Failed to read buffer1 from %s\n", filename);
    printf("buffer1: %s", buffer1);

    /* Dangerous!!! */
    fclose(global_fp);
    if ((global_fp = fopen(etc_passwd, "r")) == 0)
        error("Failed to open file %s\n", etc_passwd);

    if (fgets(buffer2, sizeof(buffer2), fp) == 0)
        error("Failed to read buffer2 from %s\n", filename);
    printf("buffer2: %s", buffer2);
}

int main(int argc, char **argv)
{
    if (argc != 2)
        error("Usage: %s file\n", argv[0]);

    if ((global_fp = fopen(argv[1], "r")) == 0)
        error("Failed to open file %s\n", argv[1]);

    abuse(global_fp, argv[1]);

    return(0);
}

When run on its own source code, the output was:

Osiris JL: ./xx xx.c
buffer1: #include <stdio.h>
buffer2: ##
Osiris JL:

So, empirical proof that on some systems, the scenario I outlined can occur.

How to fix the code

The fix to the code is discussed well in other answers. If you avoid the problem I illustrated (for example, by avoiding global file pointers), that is simplest. Assuming that is not possible, it may be sufficient to compile with the appropriate flags (on many Unix-like systems, the compiler flag '-D_REENTRANT' does the job), and you will end up using thread-safe versions of the basic standard I/O functions. Failing that, you may need to put explicit thread-safe management policies around the access to the file pointers; a mutex or something similar (and modify the code to ensure that the threads use the mutex before using the corresponding file pointer).

like image 95
Jonathan Leffler Avatar answered Jan 25 '23 14:01

Jonathan Leffler


A FILE * is just a pointer to the various resources. If the fclose does not zero out those resource, it's possible that the values may make enough sense that fgets does not immediately notice it.

That said, until you add some locking, I would consider this code completely broken.

like image 45
R Samuel Klatchko Avatar answered Jan 25 '23 16:01

R Samuel Klatchko