Is "char[]" a proper type?

Yesterday, I was surprised to come across some code that seemed to treat char[] as being a type:

typedef std::unique_ptr<char[]> CharPtr;

Previously, I would have written something like:

typedef std::unique_ptr<char, CharDeleter> CharPtr;
// Custom definition of CharDeleter omitted

After some research, I discovered that the char[] syntax works because std::unique_ptr provides a partial specialization for array types (for example, it automatically invokes delete[] on the array without requiring a custom deleter).
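As a minimal sketch of that specialization in use (make_buffer is a hypothetical helper invented for this example, not part of the original code), the array form selects std::default_delete<char[]> and offers operator[]:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>

// Array form: alongside the primary template std::unique_ptr<T>, the library
// provides a partial specialization std::unique_ptr<T[]>.
typedef std::unique_ptr<char[]> CharPtr;

// The default deleter of the array form is std::default_delete<char[]>,
// which calls delete[] rather than delete.
static_assert(std::is_same<CharPtr::deleter_type,
                           std::default_delete<char[]>>::value,
              "the array form uses the array deleter");

// Hypothetical helper: copies a C string into a freshly allocated array.
CharPtr make_buffer(const char* text) {
    const std::size_t size = std::strlen(text) + 1;  // include the terminator
    CharPtr buffer(new char[size]);
    std::memcpy(buffer.get(), text, size);
    return buffer;  // ownership moves to the caller; delete[] runs on destruction
}
```

A caller can index the result directly, for example make_buffer("abc")[0], because the array specialization provides operator[] instead of operator* and operator->.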

But what does char[] actually mean in C++?

I've seen syntax like:

const char a[] = "Constant string"; // Example 1

char *p = new char[5]; // Example 2

bool foo(char param[10]); // Example 3

This is how I interpret these examples:

Example 1 allocates a static array (on the stack), and the empty brackets are valid because the true size of the string is known at compile time (i.e. the compiler is basically handling the length for us behind the scenes).

Example 2 dynamically allocates 5 contiguous characters with the first character being stored at the address stored in p.

Example 3 defines a function that takes an array of size 10 as a parameter. (Behind the scenes the compiler treats the array like a pointer) -- e.g. it is an error to have:

void foo(char test[5]) {}
void foo(char * test) {}

because the function signatures are ambiguous to the compiler.

I feel like I understand the array/pointer differences and similarities. My confusion likely stems from my lack of experience with building/reading C++ templates.

I know that a template specialization basically allows a customized template (based on a particular template) to be used depending on the template type parameters. Is char[] simply a syntax that is available for template specialization (invoking a particular specialization)?

Also, what is the proper name for array "types" like char[]?

CRN asked Mar 20 '15 17:03


3 Answers

What does char[] actually mean in C++?

Let's find out:

[C++11: 8.3.4/1]: In a declaration T D where D has the form

   D1 [ constant-expression(opt) ] attribute-specifier-seq(opt)

and the type of the identifier in the declaration T D1 is “derived-declarator-type-list T”, then the type of the identifier of D is an array type; if the type of the identifier of D contains the auto type-specifier, the program is ill-formed. T is called the array element type; this type shall not be a reference type, the (possibly cv-qualified) type void, a function type or an abstract class type. If the constant-expression (5.19) is present, it shall be an integral constant expression and its value shall be greater than zero. The constant expression specifies the bound of (number of elements in) the array. If the value of the constant expression is N, the array has N elements numbered 0 to N-1, and the type of the identifier of D is “derived-declarator-type-list array of N T”. An object of array type contains a contiguously allocated non-empty set of N subobjects of type T. Except as noted below, if the constant expression is omitted, the type of the identifier of D is “derived-declarator-type-list array of unknown bound of T”, an incomplete object type. The type “derived-declarator-type-list array of N T” is a different type from the type “derived-declarator-type-list array of unknown bound of T”, see 3.9. [..]

As you point out, these "arrays of unknown bounds" are being used through a std::unique_ptr specialisation.

Regarding example 1, although it's surprisingly unclear in [C++11: 8.5.5], char[] with initialiser is a special case that is not covered by the above text: a is in fact a const char[16]. So, yes, "the compiler is basically handling the length for us behind the scenes".
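This can be checked at compile time. The following is a small sketch (element_count_of_a is a hypothetical helper added only for illustration) showing that the declared type of a carries the deduced bound:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

const char a[] = "Constant string";  // 15 characters plus the terminating '\0'

// The bound is deduced from the initialiser, so the declared type is complete.
static_assert(std::is_same<decltype(a), const char[16]>::value,
              "a is a const char[16], not an array of unknown bound");
static_assert(sizeof(a) == 16, "sizeof sees the whole array");

// Hypothetical helper: reports the number of elements in a.
std::size_t element_count_of_a() {
    assert(a[sizeof(a) - 1] == '\0');  // the last element is the terminator
    return sizeof(a) / sizeof(a[0]);
}
```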


Example 3 defines a function that takes an array of size 10 as a parameter. (Behind the scenes the compiler treats the array like a pointer)

Almost. In fact there's nothing "behind-the-scenes" about it: the conversion is in the brochure. It's front and centre, explicit and standardised.

So:

-- e.g. it is an error to have:

void foo(char test[5]) {}
void foo(char * test) {}

because the function signatures are ambiguous to the compiler.

In fact it is an error not through "ambiguity", but because you literally defined the same function twice.
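A short sketch of that adjustment, assuming a C++11 compiler (param_is_pointer is a hypothetical helper for this example). Declaring the function with one spelling and defining it with the other is fine, because both spell the same function type:

```cpp
#include <cassert>
#include <type_traits>

// The array parameter is adjusted to a pointer, so these function types
// are identical.
static_assert(std::is_same<void(char[5]), void(char*)>::value,
              "char[5] parameter is adjusted to char*");
static_assert(std::is_same<void(char[]), void(char*)>::value,
              "char[] parameter is adjusted to char*");

void foo(char test[5]);                  // declaration
void foo(char* test) { test[0] = 'x'; }  // definition of the same function

// Hypothetical helper: inside the body, the parameter really is a char*.
bool param_is_pointer(char param[10]) {
    return std::is_same<decltype(param), char*>::value;
}
```

Giving both spellings a body in the same translation unit is then a redefinition error, not an overload ambiguity.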

Lightness Races in Orbit answered Oct 04 '22 08:10


char[] is a type, but a type that you cannot have an instance of. It is an incomplete object type, somewhat like struct foo;.

This means that templates can consume char[] as a type if they choose to. They cannot create a variable of type char[], but they can interact with the type.
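As a rough illustration of a template interacting with the type without instantiating it, here is a sketch in the style of what std::unique_ptr does (Describe and Kind are hypothetical names made up for this example):

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// The standard type traits accept char[] like any other type.
static_assert(std::is_array<char[]>::value, "char[] is an array type");
static_assert(std::is_same<std::remove_extent<char[]>::type, char>::value,
              "its element type is char");
static_assert(!std::is_same<char[], char[10]>::value,
              "char[] and char[10] are different types");

enum class Kind { NotArray, UnknownBound, KnownBound };

// Hypothetical trait: the primary template handles non-array types.
template <typename T>
struct Describe { static constexpr Kind kind = Kind::NotArray; };

// Partial specialization selected for arrays of unknown bound, such as char[].
template <typename T>
struct Describe<T[]> { static constexpr Kind kind = Kind::UnknownBound; };

// Partial specialization selected for arrays of known bound, such as char[10].
template <typename T, std::size_t N>
struct Describe<T[N]> { static constexpr Kind kind = Kind::KnownBound; };
```

Writing Describe<char[]> only names the type; nothing here ever declares an object of type char[].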

Now, there is a bunch of "magic" behavior attached to arrays, inherited from C. As a function parameter, char[] becomes char* (as does char[33]!).

As a local variable, char x[]="foo"; or char y[]={'a','b','c'}; becomes an array of fixed size. Here, char[] means "auto-size the array".

In a sense, these are both quirks of parameter and variable declarations rather than quirks of the type itself. The type you end up with doesn't look all that much like the type you wrote.

There is also a bunch of strangeness involving type decay -- a variable of type char[3] like char x[3]; will decay to char* at the drop of a hat. This, much like auto-sizing arrays, is basically a legacy from C.
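A small sketch of that decay (decays_to_first_element is a hypothetical helper for this example): the array keeps its full type and size until it is used where a pointer is expected.

```cpp
#include <cassert>
#include <type_traits>

static_assert(std::is_same<std::decay<char[3]>::type, char*>::value,
              "std::decay models the array-to-pointer conversion");

// Hypothetical helper: shows where the conversion happens implicitly.
bool decays_to_first_element() {
    char x[3] = {'a', 'b', 'c'};
    static_assert(sizeof(x) == 3, "the array itself still has its full size");

    char* p = x;  // implicit array-to-pointer conversion
    auto q = x;   // auto deduces char*, not char[3]
    static_assert(std::is_same<decltype(q), char*>::value,
                  "auto drops the array type");

    return p == &x[0] && q == p;  // both point at the first element
}
```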

All of this is explicitly described in the standard, but because it differs significantly from most "regular" types it acts like magic.

After all, any sufficiently obtuse feature of the standard is indistinguishable from magic.

Yakk - Adam Nevraumont answered Oct 04 '22 06:10


Yes, char[] denotes the compound type "array of unknown bound of char". It is an incomplete type, but one that can be completed later:

extern char a[];    // "a" has incomplete type at point of declaration

char a[10];         // Now "a" has complete type.
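
The same idea as a self-contained sketch, using a different variable name (buffer, TypeBefore, TypeAfter and buffer_size are hypothetical names for this example) and checking the type at both points:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

extern char buffer[];                 // incomplete type: sizeof(buffer) is ill-formed here
typedef decltype(buffer) TypeBefore;  // names the type "array of unknown bound of char"

char buffer[10];                      // the definition completes the type
typedef decltype(buffer) TypeAfter;

static_assert(std::is_same<TypeBefore, char[]>::value,
              "before the definition the type is char[]");
static_assert(std::is_same<TypeAfter, char[10]>::value,
              "after the definition the type is char[10]");

// Hypothetical helper: sizeof is valid once the type is complete.
std::size_t buffer_size() { return sizeof(buffer); }
```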

Kerrek SB answered Oct 04 '22 06:10