Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Counting the number of times a character occurs in a string in C

I'm new to C, and I'm working on my own explode like function. I'm trying to count how many times a specified character occurs in a string.

int count_chars(char * string, char * chr)
{
    int count = 0;
    int i;

    for (i = 0; i < sizeof(string); i++)
    {
        if (string[i] == chr)
        {
            count++;
        }
    }

    return count;
}

It just returns 0 every time. Can anyone explain why, please? :)

like image 905
Rob Avatar asked Sep 08 '11 13:09

Rob


3 Answers

Your code is hopelessly flawed. Here's how it should look like:

int count_chars(const char* string, char ch)
{
    int count = 0;
    int i;

    // We are computing the length once at this point
    // because it is a relatively lengthy operation,
    // and we don't want to have to compute it anew
    // every time the i < length condition is checked.
    int length = strlen(string);

    for (i = 0; i < length; i++)
    {
        if (string[i] == ch)
        {
            count++;
        }
    }

    return count;
}

See this code run on example input.

Here's what you are doing wrong:

  1. Since you want to find a character, the second parameter should be a character (and not a char*), This has implications later (see #3).
  2. sizeof(string) does not give you the length of the string. It gives the size (in bytes) of a pointer in your architecture, which is a constant number (e.g. 4 on 32-bit systems).
  3. You are comparing some value which is not a memory address to the memory address chr points to. This is comparing apples and oranges, and will always return false, so the if will never succeed.
  4. What you want to do instead is compare a character (string[i]) to the second parameter of the function (this is the reason why that one is also a char).

A "better" version of the above

Commenters below have correctly identified portions of the original answer which are not the usual way to do things in C, can result in slow code, and possibly in bugs under (admittedly extraordinary) circumstances.

Since I believe that "the" correct implementation of count_chars is probably too involved for someone who is making their first steps in C, I 'll just append it here and leave the initial answer almost intact.

int count_chars(const char* string, char ch)
{
    int count = 0;
    for(; *string; count += (*string++ == ch)) ;
    return count;
}

Note: I have intentionally written the loop this way to make the point that at some stage you have to draw the line between what is possible and what is preferable.

See this code run on example input.

like image 173
Jon Avatar answered Oct 07 '22 06:10

Jon


You probably want to use a function that actually gets the length of string, instead of sizeof.

sizeof will get the size of the datatype. It will not return the length of the string. strlen will return the length of the string.

like image 34
FrustratedWithFormsDesigner Avatar answered Oct 07 '22 08:10

FrustratedWithFormsDesigner


This is C! It's designed to be terse!

int count_chars(const char* string, char ch)
{
  int c = 0;
  while (*string) c += *(string++) == ch;
  return c;
}

Update

I'll try and explain how it works:

int c = 0;

This will be the count of the number of matches that have been found.

while (*string)

This is the loop control statement and will continue to iterate the loop as long as the condition is true. In this case the condition is *string. In C, strings are stored 'null terminated' which means the last character of the string is a character with value 0 ('\0'). *string evaluates to the character the pointer is pointing to. Expressions in C are 'true' if they evaluate to any non-zero value and 'false' if they evaluate to zero. *string is an expression, so any non-zero character *string points to is true and the '\0' at the end of the string is false. So this will terminate if *string is pointing to the end of the string.

*(string++)

This is an expression which evaluates to the value the pointer is pointing at. The ++ is a post-increment so the value of the pointer is moved one place forward, i.e. it points to the next character in the string. Note the the value of the expression is not the same as the value of *string after the expression has been evaluated because the pointer has moved.

*(string++) == ch

This is a comparison expression, it compares the value of *string (before it was updated) to the value of ch. In C, the result of this is an integer (there's no bool type in C) which has the value '1' if the expression is true and '0' if the expression is false.

c += *(string++) == ch;

We know the bit after the += is a '1' if the character is one we're looking for and '0' if not. The += is shorthand for:

c = c + (*(string++) == ch);

so it will increment the count if a matching character has been found.

In this particular case, there's little advantage to the += syntax, but if c was more complex, say *(variable [index].structure_member [index2]) then it would only be evaluated once.

The ; at the end marks the end of the statement and because there's no { after the while, it also marks the end of the while loop.

like image 38
Skizz Avatar answered Oct 07 '22 08:10

Skizz