
tempWord[0]='\0' Does not reset String somehow

Tags: c, string

I wrote a program in C. The expected result is:

$ cat poem.txt

Said Hamlet to Ophelia,
I'll draw a sketch of thee,
What kind of pencil shall I use?
2B or not 2B? 

$ ./censor Ophelia < poem.txt

Said Hamlet to CENSORED,
I'll draw a sketch of thee,
What kind of pencil shall I use?
2B or not 2B?

But I got this:

$ ./censor Ophelia < poem.txt

Said Hamlet tomlet CENSORED,
I'lllia drawlia arawlia sketcha ofetcha theecha,
Whatcha kindcha ofndcha pencila shallla Ihallla usellla?
2Bsellla orellla notllla 2Botllla?

I use tempWord to store each word and compare it with the word that needs to be censored. Then I use tempWord[0]='\0' to reset the temporary string so that I can do another comparison. But it does not seem to work. Can anyone help?

# include <stdio.h>
# include <string.h>

int compareWord(char *list1, char *list2);
int printWord(char *list);

int main(int argc, char *argv[]) {

    int character = 0;

    char tempWord[128];
    int count = 0;

    while (character != EOF) {
        character = getchar();

        if ((character <= 'z' && character >= 'a') ||
            (character <= 'Z' && character >= 'A') ||
            character == 39) {              
            tempWord[count] = character;
            count++;
        } else {
            if (count != 0 && compareWord(tempWord, argv[1])) {
                printf("CENSORED");
                count = 0;
                tempWord[0] = '\0';
            }

            if (count != 0 && !compareWord(tempWord, argv[1])) {
                printWord(tempWord);
                count = 0;
                tempWord[0] = '\0';
            }

            if (count == 0) {
                printf("%c", character);
            }
        }
    }
    return 0;
}

int printWord(char *list) {

    // print function
}

int compareWord(char *list1, char *list2) {
         // compareWord function
}
Di Wang asked May 09 '26 12:05


1 Answer

There are multiple issues in your code:

  • You do not test for end of file at the right spot: if getchar() returns EOF, you should exit the loop immediately instead of processing EOF and exiting at the next iteration. The classic C idiom for this is:

    while ((character = getchar()) != EOF) {
        ...
    
  • For portability and readability, you should use isalpha() from <ctype.h> to check whether the byte is a letter. Also avoid hardcoding the value of the apostrophe as 39; use '\'' instead.

  • You have a potential buffer overflow when storing the bytes into the tempWord array. You should compare the offset with the buffer size.

  • You do not null terminate tempWord, hence the compareWord() function cannot determine the length of the first string. The behavior is undefined.

  • You do not check if a command line argument was provided.

  • The second test is redundant: you could just use an else clause.

  • You have undefined behavior when printing the contents of tempWord[] because of the lack of null termination. This explains the unexpected output, but the consequences could be much worse.

  • printWord just prints a C string; use fputs() instead.

  • The compareWord function is essentially the same as strcmp(a, b) == 0.
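
The missing null terminator can be reproduced in isolation. The sketch below is only an illustration: store_word is a hypothetical helper, not part of the question's code, and it assumes the buffer starts out zero-filled, which the question's uninitialized tempWord does not guarantee.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: stores the letters of `word` at the start of `buf`
 * the way the question's loop does, one byte after another.
 * The caller must make sure `buf` has room for strlen(word) + 1 bytes.
 * With terminate == 0 no '\0' is written, so older bytes stay visible. */
void store_word(char *buf, const char *word, int terminate) {
    assert(buf != NULL && word != NULL);
    size_t n = strlen(word);
    memcpy(buf, word, n);       /* copies the letters only, no terminator */
    if (terminate) {
        buf[n] = '\0';          /* the fix: terminate at the current length */
    }
}
```

Starting from a buffer that holds "Hamlet", tempWord[0] = '\0' followed by store_word(tempWord, "to", 0) leaves the string "tomlet", which is the first garbled word in the question's output. With terminate set to 1 the buffer holds "to".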

Here is a simplified and corrected version:

#include <ctype.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
    char tempWord[128];
    size_t count = 0;
    int c;

    while ((c = getchar()) != EOF) {
        if (isalpha(c) || c == '\'') {
            // store the byte only if room remains for the null terminator
            if (count < sizeof(tempWord) - 1) {
                tempWord[count++] = c;
            }
        } else {
            // end of word: terminate the string before comparing or printing
            tempWord[count] = '\0';
            if (argc > 1 && strcmp(tempWord, argv[1]) == 0) {
                printf("CENSORED");
            } else {
                fputs(tempWord, stdout);
            }
            count = 0;      // start collecting the next word
            putchar(c);     // echo the separator
        }
    }
    return 0;
}

EDIT: chux rightly pointed out in a comment that the above code does not handle two special cases:

  • words that are too long are truncated in the output.
  • the last word is omitted if it falls exactly at the end of file.

I also realized the program does not handle the case of long words passed on the command line.

Here is a different approach without a buffer that fixes these shortcomings:

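#include <ctype.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    const char *word = (argc > 1) ? argv[1] : "";
    // count > 0: number of bytes of word matched so far (not yet printed)
    // count == 0: at the start of a word
    // count == -1: the current word already differs from word
    int count = 0;
    int c;

    for (;;) {
        c = getchar();
        if (isalpha(c) || c == '\'') {
            if (count >= 0 && (unsigned char)word[count] == c) {
                // still a prefix of word: hold the byte back
                count++;
            } else {
                // mismatch: print the held-back prefix, then this byte
                if (count > 0) {
                    printf("%.*s", count, word);
                }
                count = -1;
                putchar(c);
            }
        } else {
            // separator or EOF: the current word, if any, ends here
            if (count > 0) {
                if (word[count] == '\0') {
                    printf("CENSORED");
                } else {
                    printf("%.*s", count, word);
                }
            }
            if (c == EOF)
                break;
            count = 0;
            putchar(c);
        }
    }
    return 0;
}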
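
To check this matching logic without redirecting stdin, the same state machine can be sketched over in-memory strings. This is only an illustration: censor_string and append_bytes are hypothetical names introduced here, and output that does not fit in the caller's buffer is silently dropped.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends n bytes of src to out at offset pos,
 * always keeping one byte free for the terminator. Returns the new offset. */
static size_t append_bytes(char *out, size_t cap, size_t pos,
                           const char *src, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (pos + 1 < cap) {
            out[pos++] = src[i];
        }
    }
    return pos;
}

/* Hypothetical wrapper: copies `in` to `out`, replacing every whole-word
 * occurrence of `word` with CENSORED. The end of the string plays the
 * role that EOF plays in the stdin version. Returns `out`. */
char *censor_string(const char *in, const char *word, char *out, size_t cap) {
    assert(cap > 0);
    size_t pos = 0;
    int count = 0;   /* >0 matched bytes, 0 word start, -1 word differs */

    for (;; in++) {
        unsigned char c = (unsigned char)*in;
        if (c != '\0' && (isalpha(c) || c == '\'')) {
            if (count >= 0 && (unsigned char)word[count] == c) {
                count++;                    /* hold back a matching byte */
            } else {
                if (count > 0) {            /* flush the held-back prefix */
                    pos = append_bytes(out, cap, pos, word, (size_t)count);
                }
                count = -1;
                pos = append_bytes(out, cap, pos, in, 1);
            }
        } else {
            if (count > 0) {
                if (word[count] == '\0') {  /* the whole word matched */
                    pos = append_bytes(out, cap, pos, "CENSORED", 8);
                } else {
                    pos = append_bytes(out, cap, pos, word, (size_t)count);
                }
            }
            if (c == '\0') {
                break;
            }
            count = 0;
            pos = append_bytes(out, cap, pos, in, 1);
        }
    }
    out[pos] = '\0';
    return out;
}
```

Called with the first line of the poem and the word Ophelia, censor_string fills the buffer with "Said Hamlet to CENSORED,". A word that ends exactly at the end of the input is handled the same way as one followed by a separator.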
chqrlie answered May 12 '26 06:05


