Unexpected optimization of strlen when aliasing 2-d array

Here is my code:

#include <string.h>
#include <stdio.h>

typedef char BUF[8];

typedef struct
{
    BUF b[23];
} S;

S s;

int main()
{
    int n;

    memcpy(&s, "1234567812345678", 17);

    n = strlen((char *)&s.b) / sizeof(BUF);
    printf("%d\n", n);

    n = strlen((char *)&s) / sizeof(BUF);
    printf("%d\n", n);
}

Using gcc 8.3.0 or 8.2.1 with any optimization level except -O0, this outputs 0 2 when I was expecting 2 2. The compiler decided that the strlen is bounded to b[0] and therefore can never equal or exceed the value being divided by.

Is this a bug in my code or a bug in the compiler?

This isn't spelled out in the standard clearly, but I thought the mainstream interpretation of pointer provenance was that for any object X, the code (char *)&X should generate a pointer that can iterate over the whole of X -- this concept should hold even if X happens to have sub-arrays as internal structure.

(Bonus question, is there a gcc flag to turn off this specific optimization?)

asked Nov 08 '19 02:11 by M.M


1 Answer

I checked this and it reproduced with -O1 on gcc 8.3, so I opened the list of gcc optimization flags and started experimenting with them one by one. It turned out that disabling only sparse conditional constant propagation with -fno-tree-ccp made the problem disappear (luckily, since I had planned to test pairs of flags if testing them one by one gave no result).

Then I switched to -O2 but kept the -fno-tree-ccp flag. The problem reproduced again. I said "OK" and started testing the additional -O2 flags. Again a single flag was enough: additionally disabling Value Range Propagation with -fno-tree-vrp restored the intended 2 2 output. I then removed the first -fno-tree-ccp flag, but the problem came back. So for -O2 you can specify -O2 -fno-tree-ccp -fno-tree-vrp to make your program work as expected.

I kept these flags and then switched to -O3. The problem did not reproduce.

So both of these optimization techniques in gcc 8.3 lead to this strange behaviour (maybe they share something internally):

  • Sparse conditional constant propagation on trees
  • Value Range Propagation on trees

I'm not expert enough in this area to explain what is happening there and why; maybe someone else can. But you can certainly specify the -fno-tree-ccp -fno-tree-vrp flags to disable these optimization techniques so that your code works as expected.


Edit

As @KamilCuk noted in the question comments, -fno-builtin-strlen leads to the intended behaviour too, so most probably there is a compiler bug in the combination of the built-in strlen and another optimization, one that is meant to cut off dead code, statically determine possible expression values, and propagate constants through a program. My guess was that the compiler mistakenly treated whatever determines the string length in its strlen implementation (maybe in combination with integer division and/or two-dimensional arrays) as dead code, and either cut it off or calculated it as 0 at compile time. So I decided to play with the code a little to check these theories and eliminate other possible "participants" in the bug. I arrived at this minimal example of the behaviour, which confirmed my thoughts:

#include <stdio.h>
#include <string.h>

int main()
{
    // "7" is the inner array size; you can put any other number here
    char b[23][7]; // local variable, no structs, no typedefs
    memcpy(&b[0][0], "12345678123456781234", 21);

    printf("%zu\n", strlen(&b[0][0]) / 8); // divisor greater than 7
    printf("%zu\n", strlen(&b[0][0]) / 7); // divisor equal to 7
    printf("%zu\n", strlen(&b[0][0]) / 6); // divisor less than 7
    printf("%zu\n", strlen(&b[0][0]));     // without division
}

Output:

0
0
3
20

I think we can consider this a bug in gcc.

I think -fno-builtin-strlen is the better solution to the problem, as it works on its own at all optimization levels, and the built-in strlen seems to be a less powerful optimization technique, especially if your program doesn't use strlen() a lot. Still, -fno-tree-ccp -fno-tree-vrp is also an option.

answered Oct 17 '22 23:10 by Oliort UA