
Why strcmp returns int but not char?

Tags: c++, c

As far as I know, the only difference between variable types such as char and int is the amount of memory they occupy; I assume they play no role in determining what the value they hold represents. With that in mind, here is what I have seen for strcmp:

The strcmp function compares the string s1 against s2, returning a value that has the same sign as the difference between the first differing pair of characters (interpreted as unsigned char objects, then promoted to int).

My question is: why is the result promoted to int? Since chars are being compared, I would expect their difference to fit into a char in every case. So isn't promoting the result to int simply appending a bunch of 0s to the result? Why is this done?

asked May 27 '15 by Utku

People also ask

What does strcmp() return?

The return value from strcmp is 0 if the two strings are equal, less than 0 if str1 compares less than str2 , and greater than 0 if str1 compares greater than str2 .
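For instance, a small C sketch of how the sign is normally tested (the strings here are just placeholders):

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* only the sign of strcmp's result is meaningful, not its magnitude */
    if (strcmp("apple", "banana") < 0)
        printf("\"apple\" sorts before \"banana\"\n");
    if (strcmp("apple", "apple") == 0)
        printf("the two strings are equal\n");
    return 0;
}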

Why does strcmp return zero?

This describes MATLAB's strcmp, which works the opposite way from C's: comparing two different character vectors returns 0 because s1 and s2 are not equal, while comparing two equal character vectors returns 1. In C, strcmp returns 0 precisely when the two strings are equal.

Can strcmp return null?

If you rely on strcmp for safe string comparisons, both parameters must be strings; otherwise the result is extremely unpredictable, and you may get an unexpected 0, or return values of NULL, -2, 2, 3 and -3. (That behaviour is PHP's strcmp; C's strcmp always returns an int and never NULL.)

How does strcmp work?

The strcmp() function compares two strings str1 and str2. If the two strings are the same, strcmp() returns 0; otherwise it returns a non-zero value. The function compares the strings character by character using the characters' ASCII values.
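As a rough illustration (my_strcmp is a made-up name for this sketch, not the library routine, and real implementations are more optimized), a character-by-character comparison might look like this:

/* compare byte by byte, stopping at the first difference or at '\0' */
int my_strcmp(const char *s1, const char *s2)
{
    while (*s1 != '\0' && *s1 == *s2) {
        s1++;
        s2++;
    }
    /* the first differing characters, read as unsigned char and promoted to int */
    return (unsigned char)*s1 - (unsigned char)*s2;
}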


2 Answers

char may or may not be signed. strcmp must return a signed type, so that it can be negative if the difference is negative.

More generally, int is preferred for passing and returning simple numerical values, since it's defined as the "natural" size for such values and, on some platforms, is more efficient to deal with than smaller types.
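To illustrate the signedness point (diff and squeezed are just illustrative names): if the result were stored back into a char on a platform where plain char is unsigned, a negative difference would turn into a large positive value and the caller would read the comparison the wrong way round.

#include <stdio.h>

int main(void)
{
    /* the difference 'a' - 'b' is -1 when computed as int */
    int diff = (unsigned char)'a' - (unsigned char)'b';

    /* squeezed into an unsigned char, the sign is lost and the value becomes 255 */
    unsigned char squeezed = (unsigned char)diff;

    printf("as int: %d, as unsigned char: %d\n", diff, squeezed);
    return 0;
}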

answered Sep 28 '22 by Mike Seymour

Of course, despite the overflow possibility others have mentioned, it only needs to be able to return, say, -1, 0, or 1, which would easily fit in a signed char. The real historical reason is that in the original version of C in the 1970s, functions could not return a char: any attempt to do so ended up returning an int.
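For context, the overflow alluded to above is that the raw difference of two unsigned char values spans -255 to 255, which does not fit in a signed char; a quick sketch (the byte values are arbitrary):

#include <stdio.h>

int main(void)
{
    /* 1 - 255 = -254, representable as an int but not as a signed char */
    int diff = (unsigned char)0x01 - (unsigned char)0xFF;
    printf("%d\n", diff);   /* prints -254 */
    return 0;
}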

In these early compilers, int was also the default type: in many situations, including the return type of main in the example below, you could declare something as int without actually writing the int keyword. So it made sense to define any function that didn't specifically need a different type as returning an int.

Even now, returning a char simply sign-extends the value into the int-sized return register (r0 on the PDP-11, eax on x86) anyway. Treating the result as a char would bring no performance benefit, whereas allowing it to be the actual difference, rather than forcing it to -1 or 1, did bring a small one. axiac's answer also makes the good point that the result would have had to be promoted back to int anyway for the comparison operator. The reason for these promotions is also historical, incidentally: it meant the compiler did not have to implement separate operators for every possible combination of char and int, especially since the comparison instructions on many processors only work on int operands.
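A quick way to see that promotion on a modern compiler (the printed sizes assume a typical platform with a 1-byte char and a 4-byte int):

#include <stdio.h>

int main(void)
{
    char a = 'a', b = 'b';

    /* both operands of the subtraction are promoted to int,
       so a - b already has type int */
    printf("sizeof(char) = %zu, sizeof(a - b) = %zu\n",
           sizeof(char), sizeof(a - b));   /* typically prints 1 and 4 */
    return 0;
}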


Proof: if I build a test program on Unix V6 for the PDP-11, the char return type is silently ignored and an integer value outside char's range is returned:

char foo() {
    return 257;              /* 257 does not fit in a char */
}

main() {
    printf("%d\n", foo());   /* prints 257: the char return type is not enforced */
    return 0;
}

# cc foo.c
# a.out
257
answered Sep 28 '22 by Random832