
When and how are VLAs evaluated in sizeof expressions?

The C Standard has this language:

6.5.3.4 The sizeof and _Alignof operators

Semantics

  1. The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. The result is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.

It is unclear to me what the Standard means by "If the type of the operand is a variable length array type, the operand is evaluated":

  • If the type of the operand is a variable length array type, evaluating the operand does not seem to serve any purpose: the size can be determined from the definition of the type, since 6.7.6.2 Array declarators stipulates that "The size of each instance of a variable length array type does not change during its lifetime."
  • On the other hand, if the operand is the parenthesized name of a variable length array type, as in sizeof(char[foo()]), the size expression must be evaluated at runtime to compute the size, but the language of the Standard does not seem to cover this case (what is the type of a type name?).
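To make the two cases concrete, here is a minimal sketch contrasting a non-VLA operand with a parenthesized VLA type name. The function names are hypothetical and not part of the question; the behavior shown matches what gcc and clang do for sizeof(char[foo()]) in the test program in this question.

```c
#include <assert.h>
#include <stddef.h>

/* Non-VLA operand: the operand has type int, so i++ is not evaluated.
   Returns the final value of i. */
int non_vla_operand_side_effect(void) {
    int i = 0;
    size_t sz = sizeof(i++);        /* not evaluated: i stays 0 */
    (void)sz;
    return i;
}

/* VLA type name: the size expression n++ has to be evaluated at run time.
   Stores the computed size in *out_size and returns the final value of n. */
int vla_type_name_side_effect(size_t *out_size) {
    int n = 3;
    *out_size = sizeof(char[n++]);  /* evaluated: size is 3, n becomes 4 */
    return n;
}
```

The first function leaves i unchanged, while the second increments n as a side effect of computing the size.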

Should the language of the C Standard be amended for clarification?

Here is a test program to illustrate the behavior on some specific cases of VLAs:

#include <stdio.h>

static int N = 0;
int foo(void) { return ++N; }

int main(void) {
    typedef char S[foo()];      // foo() is called
    printf("typedef char S[foo()];\t");                             printf("N=%d\n", N);
    printf("sizeof(S)=%d\t\t", (int)sizeof(S));                     printf("N=%d\n", N);

    typedef char U[foo()];      // foo() is called
    printf("typedef char U[foo()];\t");                             printf("N=%d\n", N);
    printf("sizeof(U)=%d\t\t", (int)sizeof(U));                     printf("N=%d\n", N);

    S s1;
    printf("S s1;\t\t\t");                                          printf("N=%d\n", N);
    printf("sizeof(s1)=%d\t\t", (int)sizeof(s1));                   printf("N=%d\n", N);

    S s2;
    printf("S s2;\t\t\t");                                          printf("N=%d\n", N);
    printf("sizeof(s2)=%d\t\t", (int)sizeof(s2));                   printf("N=%d\n", N);

    U u1;
    printf("U u1;\t\t\t");                                          printf("N=%d\n", N);
    printf("sizeof(u1)=%d\t\t", (int)sizeof(u1));                   printf("N=%d\n", N);

    U *pu1 = &u1;
    printf("U *pu1 = &u1;\t\t");                                    printf("N=%d\n", N);
    printf("sizeof(*pu1)=%d\t\t", (int)sizeof(*pu1));               printf("N=%d\n", N);

    U *pu2 = NULL;
    printf("U *pu2 = NULL;\t\t");                                   printf("N=%d\n", N);
    // sizeof(*pu2) does not evaluate *pu2, contrary to the Standard specification
    printf("sizeof(*pu2)=%d\t\t", (int)sizeof(*pu2));               printf("N=%d\n", N);

    char x2[foo()][foo()];      // foo() is called twice
    printf("char x2[foo()][foo()];\t");                             printf("N=%d\n", N);
    printf("sizeof(x2)=%d\t\t", (int)sizeof(x2));                   printf("N=%d\n", N);
    printf("sizeof(x2[0])=%d\t\t", (int)sizeof(x2[0]));             printf("N=%d\n", N);

    // sizeof(char[foo()]) evaluates foo()
    printf("sizeof(char[foo()])=%d\t", (int)sizeof(char[foo()]));   printf("N=%d\n", N);
    return 0;
}

Output (both clang and gcc):

typedef char S[foo()];  N=1
sizeof(S)=1             N=1
typedef char U[foo()];  N=2
sizeof(U)=2             N=2
S s1;                   N=2
sizeof(s1)=1            N=2
S s2;                   N=2
sizeof(s2)=1            N=2
U u1;                   N=2
sizeof(u1)=2            N=2
U *pu1 = &u1;           N=2
sizeof(*pu1)=2          N=2
U *pu2 = NULL;          N=2
sizeof(*pu2)=2          N=2
char x2[foo()][foo()];  N=4
sizeof(x2)=12           N=4
sizeof(x2[0])=4         N=4
sizeof(char[foo()])=5   N=5
chqrlie asked Jul 21 '20 18:07


1 Answer

"If the type of the operand is a variable length array type, it does not seem to serve any purpose to evaluate the argument as the size can be determined from the definition of the type, as it is stipulated in 6.7.6.2 Array declarators that 'The size of each instance of a variable length array type does not change during its lifetime.'"

But that size is not known until the array is instantiated at runtime. An evaluation of some sort has to be performed at runtime. What exactly that evaluation needs to be is not specified.
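A small sketch of that point, with a hypothetical function name: the size is computed when the declaration is reached at runtime, and changing the variable used as the bound afterwards does not change what sizeof reports.

```c
#include <stddef.h>

/* Returns the element count of a VLA declared with bound n,
   measured after n itself has been modified. n must be > 0. */
size_t vla_size_after_bound_changes(int n) {
    int a[n];                        /* size is fixed here, at run time */
    n *= 10;                         /* later changes to n do not affect a */
    return sizeof a / sizeof a[0];   /* still the original bound */
}
```

Calling vla_size_after_bound_changes(3) yields 3, not 30, so sizeof a has to rely on a value recorded when a was declared.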

"Should the language of the C Standard be amended for clarification?"

I think so, yes. I consider the following idiom to be incredibly useful for dynamically allocating 2D arrays where the number of rows and columns isn't known until runtime:

int rows, cols;
...
T (*arr)[cols] = malloc( sizeof *arr * rows );

However, as the Standard is currently worded, this (most likely) invokes undefined behavior, because *arr is evaluated at runtime while arr is still uninitialized (and most likely invalid) at that point. You shouldn't need to dereference arr to get the size of the array type, but unfortunately the language in the Standard isn't that granular. I'd like to see language similar to "If the type of the operand is a variable length array type, the operand is evaluated for the purpose of obtaining the array size alone".
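Until the wording changes, one way to sidestep the question is to apply sizeof to a type name instead of to *arr, so that only the size expression is evaluated and no pointer is dereferenced. This is a sketch under assumptions: T is taken to be int, and alloc_matrix and fill_and_sum are hypothetical names used only for illustration.

```c
#include <stdlib.h>

/* Allocates storage for a rows x cols matrix of int in one block.
   sizeof is applied to the type name int[cols], so only cols is evaluated. */
void *alloc_matrix(size_t rows, size_t cols) {
    return malloc(rows * sizeof(int[cols]));
}

/* Fills the matrix with 0, 1, 2, ... in row-major order and returns the sum,
   or -1 if the allocation fails. */
long fill_and_sum(size_t rows, size_t cols) {
    int (*arr)[cols] = alloc_matrix(rows, cols);
    if (arr == NULL)
        return -1;

    long sum = 0;
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            arr[r][c] = (int)(r * cols + c);
            sum += arr[r][c];
        }
    }

    free(arr);
    return sum;
}
```

The matrix is still indexed as arr[r][c]; only the size computation differs from the sizeof *arr form. The sketch does not check rows * sizeof(int[cols]) for overflow, which real code handling untrusted dimensions would need to do.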

John Bode answered Oct 11 '22 21:10