How does a strchr implementation work?

I tried to write my own implementation of the strchr() method.

It now looks like this:

char *mystrchr(const char *s, int c) {
    while (*s != (char) c) {
        if (!*s++) {
            return NULL;
        }
    }
    return (char *)s;
}

The last line originally was

return s;

But this didn't work because s is const. I found out that there needs to be this cast (char *), but I honestly don't know what I am doing there :( Can someone explain?

Marc asked Jan 16 '13 20:01



2 Answers

I believe this is actually a flaw in the C Standard's definition of the strchr() function. (I'll be happy to be proven wrong.) (Replying to the comments, it's arguable whether it's really a flaw; IMHO it's still poor design. It can be used safely, but it's too easy to use it unsafely.)

Here's what the C standard says:

char *strchr(const char *s, int c);

The strchr function locates the first occurrence of c (converted to a char) in the string pointed to by s. The terminating null character is considered to be part of the string.

Which means that this program:

#include <stdio.h>
#include <string.h>

int main(void) {
    const char *s = "hello";
    char *p = strchr(s, 'l');
    *p = 'L';
    return 0;
}

even though it carefully defines the pointer to the string literal as a pointer to const char, has undefined behavior, since it modifies the string literal. gcc, at least, doesn't warn about this, and the program dies with a segmentation fault.

The problem is that strchr() takes a const char* argument, which means it promises not to modify the data that s points to -- but it returns a plain char*, which permits the caller to modify the same data.

Here's another example; it quietly modifies a const-qualified object without any casts (which, on further thought, I believe also has undefined behavior):

#include <stdio.h>
#include <string.h>

int main(void) {
    const char s[] = "hello";
    char *p = strchr(s, 'l');
    *p = 'L';
    printf("s = \"%s\"\n", s);
    return 0;
}

Which means, I think, (to answer your question) that a C implementation of strchr() has to cast its result to convert it from const char* to char*, or do something equivalent.

This is why C++, in one of the few changes it makes to the C standard library, replaces strchr() with two overloaded functions of the same name:

const char * strchr ( const char * str, int character );
      char * strchr (       char * str, int character );

Of course C can't do this.
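That said, since C11 a macro built on _Generic can approximate the overload pair by selecting on the type of the argument. The following is only a minimal sketch, assuming a C11 compiler; the macro name STRCHR_Q is made up for illustration and is not part of any standard library.

```c
#include <string.h>

/*
 * STRCHR_Q is a hypothetical name, not a standard facility.
 * &(s)[0] turns an array or pointer argument into a plain pointer, so the
 * selection sees either `const char *` or `char *`. The controlling
 * expression of _Generic is not evaluated, so `s` is evaluated only once.
 */
#define STRCHR_Q(s, c)                                  \
    _Generic(&(s)[0],                                   \
        const char *: (const char *)strchr((s), (c)),   \
        char *:       strchr((s), (c)))
```

With this macro, the result for a const char array has type const char *, so assigning it to a plain char * draws a diagnostic, while a mutable array still yields a char *.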

An alternative would have been to replace strchr by two functions, one taking a const char* and returning a const char*, and another taking a char* and returning a char*. Unlike in C++, the two functions would have to have different names, perhaps strchr and strcchr.
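To make the two-function alternative concrete, here is a rough sketch. The names my_strchr and my_strcchr are hypothetical (chosen to avoid clashing with the real strchr); the point is that neither function needs a cast, at the cost of duplicating the body.

```c
#include <stddef.h>

/* Hypothetical const-preserving variant: const in, const out. No cast needed. */
const char *my_strcchr(const char *s, int c)
{
    const char ch = (char)c;
    for (;; ++s) {
        if (*s == ch)
            return s;       /* also matches the terminator when ch == '\0' */
        if (*s == '\0')
            return NULL;    /* end of string reached without a match */
    }
}

/* Hypothetical mutable variant: char * in, char * out. Same body, no cast. */
char *my_strchr(char *s, int c)
{
    const char ch = (char)c;
    for (;; ++s) {
        if (*s == ch)
            return s;
        if (*s == '\0')
            return NULL;
    }
}
```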

(Historically, const was added to C after strchr() had already been defined. This was probably the only way to keep strchr() without breaking existing code.)

strchr() is not the only C standard library function that has this problem. The list of affected functions (I think this list is complete but I don't guarantee it) is:

void *memchr(const void *s, int c, size_t n);
char *strchr(const char *s, int c);
char *strpbrk(const char *s1, const char *s2);
char *strrchr(const char *s, int c);
char *strstr(const char *s1, const char *s2);

(all declared in <string.h>) and:

void *bsearch(const void *key, const void *base,
    size_t nmemb, size_t size,
    int (*compar)(const void *, const void *));

(declared in <stdlib.h>). All these functions take a pointer to const data that points to the initial element of an array, and return a non-const pointer to an element of that array.
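A caller can put the qualifier back by wrapping such a function. As an illustration only, here is a small sketch around bsearch; the helper names cmp_int and find_int are invented for this example.

```c
#include <stdlib.h>

/* Comparison callback for bsearch over int elements. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: look up key in a sorted const array and keep the result const. */
const int *find_int(const int *sorted, size_t n, int key)
{
    /* bsearch returns void *; converting it to const int * restores the qualifier. */
    return bsearch(&key, sorted, n, sizeof *sorted, cmp_int);
}
```

Code that only ever goes through find_int cannot accidentally write to the array through the returned pointer without an explicit cast.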

Keith Thompson answered Sep 21 '22 09:09


The practice of returning non-const pointers to const data from non-modifying functions is actually an idiom rather widely used in the C language. It is not always pretty, but it is rather well established.

The rationale here is simple: strchr by itself is a non-modifying operation. Yet we need strchr functionality for both constant and non-constant strings, and ideally the constness of the input would propagate to the constness of the output. Neither C nor C++ provides any elegant support for this concept, meaning that in both languages you have to write two virtually identical functions in order to avoid taking any risks with const-correctness.

In C++ you would be able to use function overloading by declaring two functions with the same name

const char *strchr(const char *s, int c);
char *strchr(char *s, int c);

In C you have no function overloading, so in order to fully enforce const-correctness in this case you would have to provide two functions with different names, something like

const char *strchr_c(const char *s, int c);
char *strchr(char *s, int c);

Although in some cases this might be the right thing to do, it is typically (and rightfully) considered too cumbersome and involved by C standards. You can resolve this situation in a more compact (albeit more risky) way by implementing only one function

char *strchr(const char *s, int c);

which returns a non-const pointer into the input string (by using a cast at the exit, exactly as you did). Note that this approach does not violate any rules of the language, although it provides the caller with the means to violate them. By casting away the constness of the data, this approach simply delegates the responsibility to observe const-correctness from the function itself to the caller. As long as the caller is aware of what's going on and remembers to "play nice", i.e. uses a const-qualified pointer to point to const data, any temporary breaches in the wall of const-correctness created by such a function are repaired instantly.
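As a small illustration of "playing nice" on the caller's side, here is a sketch of a helper that receives the result of strchr in a const-qualified pointer. The name index_of is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: offset of the first c in s, or -1 if c does not occur. */
ptrdiff_t index_of(const char *s, int c)
{
    /* Receive the result as const char * so the qualifier dropped by strchr is put back. */
    const char *p = strchr(s, c);
    return p != NULL ? p - s : -1;
}
```

Because p is const-qualified, an accidental write such as *p = 'L' is rejected at compile time even though strchr itself returned a plain char *.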

I see this trick as a perfectly acceptable approach to reducing unnecessary code duplication (especially in the absence of function overloading). The standard library uses it. You have no reason to avoid it either, assuming you understand what you are doing.

Now, as for your implementation of strchr, it looks weird to me from the stylistic point of view. I would use the loop header to iterate over the full range we are operating on (the full string), and use the inner if to catch the early termination condition

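for (; *s != '\0'; ++s)
  if (*s == (char) c)
    return (char *) s;

/* the terminator is part of the string, so a search for '\0' must succeed */
return (char) c == '\0' ? (char *) s : NULL;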

But things like that are always a matter of personal preference. Someone might prefer to just

for (; *s != '\0' && *s != (char) c; ++s)
  ;

return *s == (char) c ? (char *) s : NULL;

Some might say that modifying a function parameter (s) inside the function is bad practice.
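For those who prefer to leave the parameter alone, here is one possible variant that walks the string with a local cursor. It is only a sketch; the name strchr_local is hypothetical, and the comments spell out what the final cast does.

```c
#include <stddef.h>

/* Hypothetical name; behaves like strchr but never modifies its parameter. */
char *strchr_local(const char *s, int c)
{
    const char ch = (char)c;   /* the search compares against c converted to char */
    const char *p = s;         /* local cursor; s itself is left untouched */

    while (*p != ch) {
        if (*p == '\0')
            return NULL;       /* reached the end without a match */
        ++p;
    }

    /*
     * The cast only removes the const that the parameter type added.
     * Whether writing through the result is valid depends on the object
     * the caller passed in, not on this function.
     */
    return (char *)p;
}
```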

AnT answered Sep 19 '22 09:09