
K&R: Chapter 6 - Why does the getword() function not read EOF?

Tags: c, eof

This is my very first post on Stack Overflow, so I hope I don't step on anyone's toes.

Of course, all input is welcome and appreciated, but those best suited to answer will have actually read the book, The C Programming Language, 2nd ed.

I have just finished coding Exercise 6-4, but I cannot seem to figure something out. Why does the getword() function not read EOF until I press Ctrl+D (I code in C in an Arch Linux VM)?

Many of my previous exercises from the book require reading from stdin. One way I would do it is via something like

while ((c = getchar()) != EOF) {...}

In such a case, I never have to press Ctrl+D. I type my input and press Enter, the stdin buffer gets flushed, and EOF is detected automatically. The getword() function also relies on getchar() underneath, so why does it hang my program?

The getword() function is called from main():

while (getword(word, MAX_WORD) != EOF) {
    if (isalpha(word[0])) {
        root = addtree(root, word);
    }
}

The getword() function itself:

int getword(char *word, int lim) {

    char *w = word;
    int c;

    while (isspace(c = getch())) {
    }
    if (c != EOF) {
        *w++ = c;
    }
    // This point is reached
    if (!isalpha(c)) {
        // This point is never reached before Ctrl+D
        *w = '\0';
        return c;
    }
    for ( ; --lim > 0; w++) {
        if (!isalnum(*w = getch())) {
            ungetch(*w);
            break;
        }
    }
    *w = '\0';
    return word[0];
}

I put comments to indicate the point where I determined that EOF is not being read.

The getch() and ungetch() functions are the same ones used in the Polish notation calculator from Chapter 4 (and that program was able to read EOF automatically - by pressing Enter):

#define BUF_SIZE 100

char buf[BUF_SIZE];
int bufp = 0;

int getch(void) {

    return (bufp > 0) ? buf[--bufp] : getchar();
}

void ungetch(int c) {

    if (bufp >= BUF_SIZE) {
        printf("ungetch: too many characters\n");
    }
    else {
        buf[bufp++] = c;
    }
}

This is the first program I have written since starting the book that requires me to enter EOF manually via Ctrl+D. I just can't figure out why.

Much appreciation in advance for explanations...

AK-33 asked Dec 15 '11 02:12


2 Answers

Having to type Ctrl+D to get EOF is the normal behavior for Unix-like systems.

For your code snippet:

while ((c = getchar()) != EOF) {...}

pressing Enter definitely shouldn't terminate the loop (unless your tty settings are badly messed up).

Try compiling and running this program:

#include <stdio.h>
int main( void )
{
    int c;
    while ((c = getchar()) != EOF) {
        putchar(c);
    }
    return 0;
}

It should print everything you type, and it should terminate only when you type Ctrl+D at the beginning of a line (or when you kill it with Ctrl+C).

Keith Thompson answered Oct 15 '22 14:10


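The 'not reached' point is only reached if the first non-whitespace character is not a letter, for example a punctuation mark or a digit, or if getch() returns EOF. If you type a letter, the code falls through to the word-reading loop instead; if you type only whitespace, the first loop keeps reading.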
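As a rough sketch of that branching, the function below classifies a single character the way getword() treats it. The enum and the function name path_for are hypothetical, made up for this illustration; they are not part of the original program.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical names, used only for this illustration. */
enum path { KEEP_SKIPPING, EARLY_RETURN, READ_WORD };

/* Which way getword() goes for one character c returned by getch().
 * c must be EOF or a value that fits in unsigned char. */
enum path path_for(int c)
{
    if (isspace(c))     /* space, tab, newline: the skip loop reads on */
        return KEEP_SKIPPING;
    if (!isalpha(c))    /* punctuation, digits, and EOF (isalpha(EOF) is 0) */
        return EARLY_RETURN;
    return READ_WORD;   /* a letter: fall through to the word loop */
}
```

Calling path_for('\n') gives KEEP_SKIPPING, while path_for(',') and path_for(EOF) both give EARLY_RETURN, so pressing Enter alone never gets you to the commented line.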

When standard input is a terminal, EOF is not detected until you type Control-D (or whatever character the stty -a output lists as eof) right after a newline, or until you type Control-D twice in a row in the middle of a line. Pressing Enter is not enough: the newline character '\n' satisfies isspace(), so the skip loop at the top of getword() reads straight through it and waits for more input.
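To see the skip loop swallow newlines without involving a terminal, here is a minimal sketch that runs the same loop over a string. The helpers next_char and first_non_space are hypothetical stand-ins, and the end of the string plays the role of Control-D, which is an assumption of this model rather than how a real terminal behaves.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical stand-in for getch(): reads from a string and reports EOF
 * at the terminating '\0', which plays the role of Control-D here. */
int next_char(const char **src)
{
    if (**src == '\0')
        return EOF;
    return (unsigned char)*(*src)++;
}

/* Same skip loop as in getword(): returns the first character that is not
 * whitespace, or EOF if the input ends first. */
int first_non_space(const char *input)
{
    int c;

    while (isspace(c = next_char(&input)))
        ;
    return c;
}
```

Here first_non_space("\n\n,") returns ',' and first_non_space("\n\n") returns EOF, but only because the string ends. On a terminal there is no such end after the newline, so getchar() simply blocks until you type more text or Control-D.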

Jonathan Leffler answered Oct 15 '22 15:10