I'm trying to implement my own strcmp function, but my strcmp behaves differently from the library one when I use special characters.
#include <stdio.h>
#include <string.h>
int my_strcmp(const char *s1, const char *s2)
{
    const char *str1;
    const char *str2;
    str1 = s1;
    str2 = s2;
    while ((*str1 == *str2) && *str1)
    {
        str1++;
        str2++;
    }
    return (*str1 - *str2);
}
int main()
{
    char *src = "a§bcDef";
    char *des = "acbcDef";
    printf("%d %d\n", my_strcmp(des, src), strcmp(des, src));
    return(0);
}
OUTPUT
161 -95
strcmp compares two character strings (str1 and str2) using the standard EBCDIC collating sequence. The return value has the same relationship to 0 as str1 has to str2. If two strings are equal up to the point at which one terminates (that is, contains a null character), the longer string is considered greater.
Consequences are unlikely to happen in strcmp, but the issue is the same. The strnxxx function family tries to prevent reading or writing memory that was not acquired. The disadvantage of using the strn functions is an extra compare and decrement operation on the counter. In a few words: strncmp is safer than strcmp, but it is slower too.
strcmp compares both strings until the null character of either string is reached, whereas strncmp compares at most num characters of both strings.
That added n in strncmp() is not a magic wand that makes unsafe code safe. It doesn't guard against null pointers, uninitialized pointers, uninitialized arrays, an incorrect value of n, or just passing incorrect data.
char is signed in many implementations, and your strcmp implementation considers char values < 0 to be smaller than those greater than 0. Perhaps you want to compare the unsigned values instead.
const unsigned char *str1 = (const unsigned char *) s1;
const unsigned char *str2 = (const unsigned char *) s2;
Here's what the standard says about strcmp (note the part about unsigned char):
The sign of a non-zero return value shall be determined by the sign of the difference between the values of the first pair of bytes (both interpreted as type unsigned char) that differ in the strings being compared.
Your code is taking the difference of the bytes as char, which if signed differs from the spec.
Instead:
return (unsigned char)(*str1) - (unsigned char)(*str2);
Here are some test cases for the original code (my_strcmp), the currently accepted answer of dasblinkenlight (my_strcmp1), and this answer (my_strcmp2). Only my_strcmp2 passes the tests.
#include <string.h>
#include <stdio.h>
int my_strcmp(const char *s1, const char *s2) {
    const signed char *str1 = (const signed char*)(s1);
    const signed char *str2 = (const signed char*)(s2);
    while ((*str1 == *str2) && *str1)
    {
        str1++;
        str2++;
    }
    return (*str1 - *str2);
}
int my_strcmp1(const char *s1, const char *s2) {
    const signed char *str1 = (const signed char*)(s1);
    const signed char *str2 = (const signed char*)(s2);
    while ((*str1 == *str2) && *str1)
    {
        str1++;
        str2++;
    }
    return (signed char)(*str1 - *str2);
}
int my_strcmp2(const char *s1, const char *s2) {
    const signed char *str1 = (const signed char*)(s1);
    const signed char *str2 = (const signed char*)(s2);
    while ((*str1 == *str2) && *str1)
    {
        str1++;
        str2++;
    }
    return (unsigned char)(*str1) - (unsigned char)(*str2);
}
int sgn(int a) {
    return a > 0 ? 1 : a < 0 ? -1 : 0;
}
#define TEST(sc, a, b) do { \
    if (sgn(sc(a, b)) != sgn(strcmp(a, b))) { \
        printf("%s(%s, %s) = %d, want %d\n", #sc, a, b, sc(a, b), strcmp((const char*)a, (const char*)b)); \
        fail = 1; \
    } } while(0)
int main(int argc, char *argv[]) {
    struct {
        const char *a;
        const char *b;
    }cases[] = {
        {"abc", "abc"},
        {"\x01", "\xff"},
        {"\xff", "\x01"},
        {"abc", "abd"},
        {"", ""},
    };
    int fail = 0;
    for (int i = 0; i < sizeof(cases) / sizeof(cases[0]); i++) {
        TEST(my_strcmp, cases[i].a, cases[i].b);
        TEST(my_strcmp1, cases[i].a, cases[i].b);
        TEST(my_strcmp2, cases[i].a, cases[i].b);
    }
    return fail;
}
(Note: I put some explicit signed in the implementations so that the code can be tested on compilers with unsigned char. Also, sorry about the macro; this was a quick hack to test!)