Why doesn't the compiler detect out-of-bounds in string constant initialization?

I read this question and its answer in a book. But I didn't understand the book's justification.

Will the following code compile?

int main()
{
   char str[5] = "fast enough";
   return 0;
}

And the answer was:

Yes. The compiler never detects the error if the bounds of an array are exceeded.

I couldn't get it.

Can anybody please explain this?

Pale Blue Dot asked Nov 04 '09 17:11


3 Answers

In the C++ standard, 8.5.2/2 Character arrays says:

There shall not be more initializers than there are array elements.

In the C99 standard, 6.7.8/2 Initialization says:

No initializer shall attempt to provide a value for an object not contained within the entity being initialized

C90 6.5.7 Initializers says something similar.

However, note that for C (both C90 and C99) the '\0' terminating character will be put in the array if there is room. It's not an error if the terminator will not fit (C99 6.7.8/14: "Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array").

On the other hand, the C++ standard has an example that indicates an error should be diagnosed if there's not room for the terminating character.

In either case, this should be diagnosed as an error in all compilers:

char str[5] = "fast enough";

Maybe pre-ANSI compilers weren't so strict, but any reasonably modern compiler should diagnose this.

Michael Burr answered Nov 13 '22 05:11


Your book must be pretty old, because gcc puts out a warning even without -Wall turned on:

$ gcc c.c
c.c: In function `main':
c.c:6: warning: initializer-string for array of chars is too long

If we slightly update the program:

#include <stdio.h>

int main(int argc, char **argv)
{

        char str[5] = "1234567890";
        printf("%s\n", str);
        return 0;
}

We can see that gcc seems to truncate the string to the length you've specified. I'm assuming that there happens to be a '\0' where str[5] would be (one past the last element of the array), because otherwise we should see garbage after the 5. Maybe gcc implicitly makes str an array of length 6 and automatically sticks the '\0' in there; I'm not sure.

$ gcc c.c && ./a.exe
c.c: In function `main':
c.c:6: warning: initializer-string for array of chars is too long
12345

Mark Rushakoff answered Nov 13 '22 05:11


The answer to the question that you quoted is incorrect. The correct answer is "No. The code will not compile", assuming a formally correct C compiler (as opposed to quirks of some specific compiler).

The C language does not allow using an excessively long string literal to initialize a character array of a specific size. The only flexibility allowed by the language here is the terminating \0 character. If the array is too short to accommodate the terminating \0, the terminating \0 is silently dropped. But the actual literal string characters cannot be dropped. If the literal is too long, it is a constraint violation and the compiler must issue a diagnostic message.

char s1[5] = "abc"; /* OK */
char s2[5] = "abcd"; /* OK */
char s3[5] = "abcde"; /* OK, zero at the end is dropped (ERROR in C++) */
char s4[5] = "abcdef"; /* ERROR, initializer is too long (ERROR in C++ as well) */

Whoever wrote your "book" did not know what they were talking about (at least on this specific subject). What they state in the answer is flat out incorrect.

Note: Supplying excessively long string initializers is illegal in C89/90, C99 and C++. However, C++ is even more restrictive in this regard. C++ prohibits dropping the terminating \0 character, while C allows dropping it, as described above.

AnT answered Nov 13 '22 06:11