
Understanding K&R's putc macro: K&R Chapter 8 (The Unix System Interface) Exercise 2

I've been trying to understand K&R's version of putc for some time now, and I'm out of resources (Google, Stack Overflow, and clcwiki don't quite have what I'm looking for, and I have no friends or colleagues to turn to). I'll explain the context first and then ask for clarification.

This chapter of the text introduced an example of a data structure that describes a file. The structure includes a character buffer for reading and writing large chunks at a time. They then asked the reader to write a version of the standard library putc.

As a clue for the reader, K&R wrote a version of getc that supports both buffered and unbuffered reading. They also wrote the skeleton of the putc macro, leaving the reader to write the function _flushbuf() for themselves. The putc macro looks like this (p is a pointer to the file structure):

int _flushbuf(int, FILE *);
#define putc(x,p)        (--(p)->cnt >= 0 \
                       ? *(p)->ptr++ = (x) : _flushbuf((x),p)
typedef struct {
        int   cnt;  /*characters left*/
        char *ptr;  /*next character position*/
        char *base; /*location of buffer*/
        int   flag; /*mode of file access*/
        int   fd;   /*file descriptor*/
} FILE;

Confusingly, the conditional in the macro is actually testing if the structure's buffer is full (this is stated in the text) - as a side note, the conditional in getc is exactly the same but means the buffer is empty. Weird?

Here's where I need clarification: I think there's a pretty big problem with buffered writing in putc. Writing to p is only performed in _flushbuf(), but _flushbuf() is only called when the file structure's buffer is full, so writing is only done if the buffer is entirely filled. And the size for buffered reading is always the system's BUFSIZ. Writing anything other than exactly 'BUFSIZ' characters just doesn't happen, because _flushbuf() will never be called in putc.

putc works just fine for unbuffered writing. But the design of the macro makes buffered writing almost entirely pointless. Is this correct, or am I missing something here? Why is it like this? I truly appreciate any and all help here.

ThornyHatcher asked Jan 03 '23 02:01



1 Answer

I think you may be misreading what takes place inside the putc() macro; there are a lot of operators and symbols in there, and they all matter (and their order-of-execution matters!) for this to work. To help understand it better, let's substitute it into a real usage, and then expand it out until you can see what's going on.

Let's start with a simple invocation of putc('a', file), as in the example below:

FILE *file = /* ... get a file pointer from somewhere ... */;

putc('a', file);

Now substitute the macro in place of the call to putc() (this is the easy part, and is performed by the C preprocessor; also, I think you're missing a parenthesis at the end of the version you provided, so I'm going to insert it at the end where it belongs):

FILE *file = /* ... get a file pointer from somewhere ... */;

(--(file)->cnt >= 0 ? *(file)->ptr++ = ('a') : _flushbuf(('a'),file));

Well, isn't that a mess of symbols. Let's strip off the unneeded parentheses, and then convert the ?...: into the if-statement that it actually is under the hood:

FILE *file = /* ... get a file pointer from somewhere ... */;

if (--file->cnt >= 0)
    *file->ptr++ = 'a';
else
    _flushbuf('a', file);

This is closer, but it's still not quite obvious what's going on. Let's move the increments and decrements into separate statements so it's easier to see the order of execution:

FILE *file = /* ... get a file pointer from somewhere ... */;

--file->cnt;
if (file->cnt >= 0) {
    *file->ptr = 'a';
    file->ptr++;
}
else {
    _flushbuf('a', file);
}

Now, with the content reordered, it should be a little easier to see what's going on. First, we decrement cnt, the count of remaining characters. If the result is still non-negative, there's room left, so it's safe to write 'a' into the file's buffer at the file's current write pointer, and then we move the write pointer forward.

If there isn't room left, then we call _flushbuf(), passing it both the file (whose buffer is full) and the character we wanted to write but couldn't. Presumably, _flushbuf() will first write the whole buffer out to the actual underlying I/O system, then reset ptr to the beginning of the buffer and cnt to the buffer's capacity to indicate that the buffer can store lots of data again, and finally deal with the pending character, most likely by storing it as the first byte of the freshly emptied buffer.
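To make that description concrete, here is a minimal sketch of what such a flush routine could look like. It is not K&R's solution: the names MYFILE, my_putc, my_flushbuf, my_flush and MY_BUFSIZ are all made up for this illustration (chosen so they don't clash with the real <stdio.h>), the buffer is deliberately tiny, and it assumes a POSIX system where write() is available. my_flush() is a hypothetical stand-in for what fflush() would do with a partially filled buffer.

```c
#include <stdlib.h>
#include <unistd.h>

#define MY_BUFSIZ 8   /* hypothetical, tiny on purpose so flushes are easy to see */

typedef struct {
    int   cnt;   /* free slots left in the buffer */
    char *ptr;   /* next free position */
    char *base;  /* start of the buffer, NULL until first use */
    int   fd;    /* file descriptor to write to */
} MYFILE;

int my_flushbuf(int c, MYFILE *fp);

/* Same shape as the macro in the question, with the closing parenthesis. */
#define my_putc(x, p) (--(p)->cnt >= 0 \
        ? *(p)->ptr++ = (x) : my_flushbuf((x), (p)))

/* Slow path: called when there is no room (or no buffer yet). */
int my_flushbuf(int c, MYFILE *fp)
{
    if (fp->base == NULL) {
        /* First call: nothing to write yet, just allocate the buffer. */
        fp->base = malloc(MY_BUFSIZ);
        if (fp->base == NULL)
            return -1;
    } else {
        /* Buffer is full: write out everything stored so far. */
        ssize_t n = fp->ptr - fp->base;
        if (write(fp->fd, fp->base, (size_t)n) != n)
            return -1;
    }
    fp->ptr = fp->base;        /* buffer is empty again */
    *fp->ptr++ = (char)c;      /* store the character that did not fit */
    fp->cnt = MY_BUFSIZ - 1;   /* one slot is already taken */
    return (unsigned char)c;
}

/* Write out a partially filled buffer, as a flush or close would. */
int my_flush(MYFILE *fp)
{
    if (fp->base != NULL && fp->ptr > fp->base) {
        ssize_t n = fp->ptr - fp->base;
        if (write(fp->fd, fp->base, (size_t)n) != n)
            return -1;
        fp->ptr = fp->base;
        fp->cnt = MY_BUFSIZ;
    }
    return 0;
}
```

With a MYFILE initialised to {0, NULL, NULL, fd}, the very first my_putc() takes the slow path (cnt starts at 0) and only allocates the buffer. After that, my_flushbuf() runs once per MY_BUFSIZ characters, and whatever remains in the buffer at the end is written by my_flush().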

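So why does this result in buffered writing? The answer is that the _flushbuf() call only gets performed "every once in a while," when the buffer is full. Writing a byte to a buffer is cheap, while performing the actual I/O is expensive, so this results in _flushbuf() being invoked relatively rarely (only once for every BUFSIZ characters). Characters still sitting in a partially filled buffer are not lost, either: they get written when the stream is flushed or closed, which is the job of fflush() and fclose() rather than putc.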
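The "once for every BUFSIZ characters" claim can be checked with a small simulation of just the counter logic. This is only a sketch: count_slow_path() is a hypothetical helper invented for this example, it performs no I/O at all, and it assumes the slow path leaves cnt at bufsize - 1 because the pending character occupies the first slot of the emptied buffer.

```c
#include <stddef.h>

/* Count how many of nchars putc-style calls would take the slow
 * (_flushbuf-like) path, given a buffer of bufsize characters.
 * Only the cnt bookkeeping from the macro is simulated. */
size_t count_slow_path(size_t nchars, int bufsize)
{
    int cnt = 0;        /* starts at 0, so the first call is a slow one */
    size_t slow = 0;

    for (size_t i = 0; i < nchars; i++) {
        if (--cnt >= 0) {
            /* fast path: the character would just be stored in memory */
        } else {
            /* slow path: buffer written out, pending character stored */
            slow++;
            cnt = bufsize - 1;
        }
    }
    return slow;
}
```

For 1000 characters and a 100-character buffer this returns 10, so 990 of the 1000 calls never leave the cheap in-memory path.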

Sean Werkema answered Jan 04 '23 14:01
