
What language standards allow ignoring null terminators on fixed size arrays?

We are transitioning C code into C++.
I noticed that the following code is well defined in C:

int main(void) {
  // The size is valid: the terminating '\0' is ignored.
  char str[3] = "abc";
}

The Array initialization reference page states:

"If the size of the array is known, it may be one less than the size of the string literal, in which case the terminating null character is ignored."

However, if I build the same code as C++, I get the following error:

error: initializer-string for array of chars is too long [-fpermissive]
    char  str[3]="abc";

I'm hoping someone can expound on this.

Questions:
Is the code example valid in all C language standards?
Is it invalid in all C++ language standards?
Is there a reason that it is valid in one language but not the other?

Asked by Trevor Hickey, Jun 16 '16 14:06



2 Answers

What you see here is a difference between C and C++ in the rules for initializing a character array from a string literal. In C11 §6.7.9/14 we have:

An array of character type may be initialized by a character string literal or UTF-8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

emphasis mine

So as long as the array is large enough for the string excluding the null terminator, the initialization is valid. So

char str[3] = "abc";

is valid C. In C++14, however, the rule that governs this, found in [dcl.init.string]/2, states

There shall not be more initializers than there are array elements.

and goes on to show that the following code is an error:

char cv[4] = "asdf"; // error

So in C++ the array must have enough storage for the entire string literal, including the null terminator.
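
For code being moved from C to C++, the following is a minimal sketch of declarations that avoid relying on the C-only rule. It is written as C, and these particular forms should also be accepted by a C++ compiler. The names deduced, sized, raw and the helper raw_matches are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Size deduced from the literal: 3 characters plus the terminator = 4 bytes. */
static const char deduced[] = "abc";

/* Explicit size that leaves room for the terminator. */
static const char sized[4] = "abc";

/* Exactly 3 bytes and no terminator, spelled element by element.
   This is not a string, so its length must always be passed explicitly. */
static const char raw[3] = {'a', 'b', 'c'};

/* Hypothetical helper: returns 1 if s starts with the 3 bytes in raw.
   strncmp reads at most sizeof raw bytes, so raw is never overrun. */
int raw_matches(const char *s) {
    return strncmp(s, raw, sizeof raw) == 0;
}
```

The element-by-element form keeps the original 3-byte layout, which may matter if the array mirrors a fixed-width field in a file or wire format. The other two forms cost one extra byte but give a real string.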

Answered by NathanOliver, Nov 03 '22 00:11


Is the code example valid in all C language standards?

Note that only one ISO standard is in effect at a time; C2011 supersedes C99, which superseded C89.

I believe it should be valid under any one of those standards, though.

Is it invalid in all C++ language standards?

Same as above, just change "valid" to "invalid".

Is there a reason that it is valid in one language but not the other?

Most likely, it was left valid in C so as not to break any legacy code that relied on the behavior. C++ came along about a decade or so after C and tried to address some of C's shortcomings, and this was one of the holes that got plugged.
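
One way to see why this can be considered a hole: in C the resulting array is not a string, so passing it to functions such as strlen or printf("%s") would read past its end. The sketch below is C only (the first declaration is rejected by C++) and shows one possible way to handle such a field with a bounded search. The names tag, padded and field_length are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Valid C only: there is no room for the terminator, so it is dropped. */
static const char tag[3] = "abc";

/* Larger array: the terminator fits and the remaining bytes are zeroed. */
static const char padded[8] = "abc";

/* Hypothetical helper: length of the text in a fixed-size field that may
   or may not contain a terminator. It never reads more than cap bytes. */
size_t field_length(const char *field, size_t cap) {
    const char *nul = memchr(field, '\0', cap);
    return nul ? (size_t)(nul - field) : cap;
}
```

Both field_length(tag, sizeof tag) and field_length(padded, sizeof padded) yield 3, while calling strlen(tag) would be undefined behavior.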

Many modern programming languages are iterations and improvements on earlier languages; C is B with a type system, C++ is C with OO support and better type safety, Java and C# are C++ with less undefined behavior, etc.

Answered by John Bode, Nov 03 '22 01:11