Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unexpected results when assigning values to an array In c

I am reading C The Programming Language Second Edition. I've come to this exercise [1-13 in Section 1.6, pg. 24 2nd ed].

Write a program to print a histogram of the lengths of words in its input. It is easy to draw the histogram with the bars horizontal; a vertical orientation is more challenging.

Everything works pretty well I think except when I try to determine whether the last word count (VARIABLE tmp) is <=10 or > 10 then assign that to the corresponding index in count[tmp-1] or count[11] if it is greater than. I don't even care about printing an actual histogram I would just like to have a valid array representation for now. Below is the output of my program when it runs.

asdasd 
_________________________________
 1  2  3  4  5  6  7  8  9 10 11 
   (x-axis) Length of words 
---------------------------------
[0 SPACE] [0 NEWLINE] [0 TAB] 
[1 WORD] [0.000 KILOBYTE] [6 CHAR]



6---


ARRAy= 1  1  1  1  1  2  0  0  0  0 

Here is my code

#include <stdio.h>
#define MAX 10


int main (void) {
    //
    int nc,nw,nt,ns,nl;  //nc = bytes, nw = words, nt = tabs, ns = spaces, nl = newlines
    nc = nw = nt = ns = nl = 0;
    //
    int c;                     //getchar()
    int done = 0;             //don't even know I'm a noob just ignore...
    int tmp = 0;              //last word count (this works well)
    int array[MAX + 1];      //For outputting screen formatters like ---
    int count[11];           //THIS is whats broken random values
    int state = 0;
    int waslast = 0;
    float kbcount = 0;
    for (c = 0; c <= MAX; c++)
    count[c] = 0;

    while (done == 0) {
        c = getchar();
        ++nc;

        if (c == ' ' || c == '\n' || c == '\t') {
            waslast = 1;



            if (c == '\t') {
                ++nt;
                state = tmp;
                tmp = 0;
            }
            else if (c == '\n') {
                ++nl;
                state = tmp;
                tmp = 0;
            }
            else if (c == ' ') {
                ++ns;
                state = tmp;
                tmp = 0;
            }
        }

        if (c == EOF) {
            done = 1;
        }
        else if (c != ' ' && c != '\n' && c != '\t') {
            tmp++;
            state = tmp;
            if (waslast == 1) {
                nw++;
                waslast=0;
            }
            if (nc == 1)
                nw++;
        }

        if (tmp <= 10)
            count[tmp-1]++;       //Completely random assignments
        else
            count[11]++;          //This is broken
    }



    // END WHILE
    //
    //



    printf("\n");
    for (c = 1; c <= MAX + 1; c++) {
        printf("___");

    }

    printf("\n");

    for (c = 1; c <= MAX + 1; c++) {
        array[c] = c;
        printf("%2d ", array[c]);
    }

    printf("\n   (x-axis) Length of words \n");

    for (c = 1; c <= MAX + 1; c++){
        printf("---");
    }

    kbcount = (nc-1)/1024;

    printf("\n[%d SPACE] [%d NEWLINE] [%d TAB] \n[%d WORD] [%.3f KILOBYTE] [%d CHAR]\n\n\n\n%d---\n\n\n",
           ns,nl,nt,nw,kbcount,(nc -(nl+nt+ns))-1,state);

    printf("ARRAy=");
    for (c = 0; c<MAX ;++c)
        printf(" %d ",count[c]);
    return 0;


}

~ ~

like image 902
user1114092 Avatar asked Dec 23 '11 23:12

user1114092


3 Answers

c arrays are indexed from 0. count[11]++; is out of bounds of the array.

Valid indices for an 11 element array are index 0 through 10 inclusive. The 11th element in the array count is at count[10].

like image 176
Dave Rager Avatar answered Nov 17 '22 23:11

Dave Rager


You're making three major errors. The first two are stylistic and make me - and most other people - not want to read the code and help. The last is your logic error...well, one, anyway.

  1. Take the time to type out meaningful variable names. Or at least truncated ones. Like NumTabs or whatever fits your style.
  2. Get your indentation style down. Proper indentation allows easier understanding of the program flow.

So, your big glaring error is when you do this:

count[11]++;          //This is broken

In the code, it is defined higher up as:

int count[11];

That array has 11 integers in it. However, you do not start referencing them from 1. You start referencing them from 0. count[0], count[1]...count[10] are all valid. Like so:

0 1 2 3 4 5 6 7 8 9 10

As you can see that's 11 numbers even though it only goes up to 10. If you'd like to be able to access count[11] you have to declare count as

int count[12];

This is known as an off-by-one error. It's a pretty basic one but don't feel too bad, even gurus end up making (usually more complex and grievous) versions of the same basic mistake. String processing is also a real pain in general.

Anyway, if that wasn't enough, if you'd like to imagine it conceptually - this isn't how it WORKS, but conceptually - your array allocates 11 int's and 'count' is a pointer to it. So (pointer+0) would point to the 1st integer, which is the same thing as saying pointer[0]. (pointer+1) which is the same thing as saying pointer[1] would point to the 2nd integer. (pointer+5)/pointer[5] would point to the 6th integer, (pointer+11)/pointer[11] to the 12th - which doesn't exist, since there are only 11 of them.

like image 4
std''OrgnlDave Avatar answered Nov 17 '22 22:11

std''OrgnlDave


Two Three additional points (errors) that your program contain are

  • The division of integers produces a result that is an integer, so in :

    float kbcount = (nc-1)/1024;

The variable kbcount is set to the integer result of the integer division of nc - 1 by 1024.

What I believe you want wanted to say is:

float kbcount = (nc - 1) / 1024.0;

As one of parts (the denominator in this case) is a floating point number, the rules of precedence converts (promotes) both parts into a floating point value, and then the division becomes a floating point division, so the results will be a floating point value with its fractional (decimal) value. Read Section 2.7 (pg. 42) of The C Programming Language, for a better explanation.

Next,

  • You can simplify the logic of determining whether a c is part of a word or part of the "whitespace" (spaces, tabs, newlines) to make you program shorter and easier to understand.

E.g.:

enum state_t = {whitespace, printable};
...

enum state_t state = whitespace;
while (EOF != (c = getchar())) {
    bytes++;

    if (is_whitespace(c)) {
        if (state == printable) { /* the whitespace ends the 'word' state */
            count[length % MAX]++;
            length = 0;
        }
        ...
        state = whitespace;
    } else {
        if (state == whitespace) {
            /* starting a new word */
            wordcount++;
        }
        length++ /* word length */
        state = printable;
    }
    ...
    /* anything else you want to do */
}
/* done reading standard input (stdin), now print report... */

Notice how much easier it is to read with better variable names.

And finally,

  • You are a novice or a beginner, not a noob.

Ref:

//don't even know I'm a noob just ignore...

Honestly, every programmer makes tons of mistakes when they are beginning, and when learning a new programming language. But it is part of the natural learning process. You learn by exploring, trying new things, and discovering both, what works, and what doesn't. It is a process, and hopefully you can come to enjoy the challenges of the effort to not only think of a solution to a problem, but to deconstruct the solution into logical bite-sized steps, and write it correctly in a synthetic language, in this case, C.

like image 1
mctylr Avatar answered Nov 17 '22 23:11

mctylr