
Initializer list initialization of a member struct bitfield element causing bugs in IAR ARM

I have the following class structure in IAR:

#include <vector>

class A
{
public:
    A(){}
    virtual ~A() {};
    virtual void load() {};
};


class C
{
public:
    C()
    {
        //C does other stuff, not relevant
    }
};

class D;

class B : public A
{
public:
    B() : invert(false) {};
    virtual ~B() {};
    void load()
    {
        //Irrelevant stuff done here
    }
private:
    C member_c;
    std::vector<D*> vector_of_d;
    struct {
        bool var_1:1;
        bool var_2:1;
        bool var_3:1;
        bool var_4:1;
        bool invert:1;
    };
};

I am running into bugs with the assembly generated to initialize B: the compiler seems to confuse the location of the vtable pointer with the location of the anonymous struct's bit-field. When it sets the invert bit to false, it goes to the first word of the object (the vtable pointer) and clears a bit in that address. When I later call load(), it follows the now-invalid vtable pointer, finds a null pointer, and blindly follows that as well. Things fall apart from there.

Here is an example of the code that would invoke this problem:

void load_A(A* to_be_loaded){
    if(to_be_loaded) to_be_loaded->load();
}

int main(){
   load_A(new B());
}

Now the big question: have I accidentally introduced undefined behavior somewhere? This code is being ported from GCC-ARM, where it worked fine, but it now causes hard faults when compiled with IAR. My two theories are:

  • It's a compiler bug (I know, it's never a compiler bug)
  • It's non-standard behavior that GCC handled as an extension

As far as I can tell, there shouldn't be anything wrong with using a member initializer list to initialize a field in an anonymous struct. I do recognize that anonymous structs are a compiler extension, but they are documented in both IAR and GCC. Either way, IAR gives me no warning or error and generates clearly broken assembly.

Here is the assembly that IAR generated for the B constructor:

1 |    B() : invert(false) {};
2 |B::B():
3 |_ZN6BC1Ev:
4 |    0x80645e8: 0xb510         PUSH      {R4, LR}
5 |    0x80645ea: 0x4604         MOV       R4, R0
6 |    B() : invert(false) {};
7 |    0x80645ec: 0xf007 0xfb20  BL        A::subobject A() ; 0x806bc30
8 |    0x80645f0: 0x4807         LDR.N     R0, [PC, #0x1c]         ; 0x8088808 (134776840)
9 |    0x80645f2: 0x6020         STR       R0, [R4]
10|    0x80645f4: 0xf104 0x0018  ADD.W     R0, R4, #24             ; 0x18
11|    0x80645f8: 0xf00a 0xfadd  BL        C::C()              ; 0x806ebb6
12|    0x80645fc: 0xf104 0x001c  ADD.W     R0, R4, #28             ; 0x1c
13|    0x8064600: 0xf00e 0xff2e  BL        std::vector<D *>::vector() ; 0x8073460
14|    0x8064604: 0x7820         LDRB      R0, [R4]
15|    0x8064606: 0xf000 0x00ef  AND.W     R0, R0, #239            ; 0xef
16|    0x806460a: 0x7020         STRB      R0, [R4]
17|    B() : invert(false) {};
18|    0x806460c: 0x4620         MOV       R0, R4
19|    0x806460e: 0xbd10         POP       {R4, PC}
20|    0x8064610: 0x08088808     DC32      0x8088808 (134776840)

On line 14, we load the byte that R4 points to, and R4 holds the base address of our object. No offset is applied, so the load reads the first thing in the object, which is the vtable pointer. The code then proceeds as if it had the bit-field: it clears one bit on line 15 and stores the result back where it came from on line 16.
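The read-modify-write sequence on lines 14 to 16 can be reproduced in plain C++ to see where the 0xef constant comes from. This is a minimal sketch, assuming the one-bit fields are allocated starting from the least significant bit (which is consistent with the mask in the listing), so that invert, the fifth field, sits at bit index 4. The helper names clear_mask, clear_bit and self_check are hypothetical and do not come from the original code.

```cpp
#include <cassert>
#include <cstdint>

// Mask that clears exactly one bit of a byte and keeps the other seven.
inline std::uint8_t clear_mask(unsigned bit_index) {
    return static_cast<std::uint8_t>(~(1u << bit_index));
}

// Byte-wide read-modify-write, mirroring the LDRB / AND.W / STRB sequence.
inline void clear_bit(std::uint8_t* byte, unsigned bit_index) {
    *byte = static_cast<std::uint8_t>(*byte & clear_mask(bit_index));
}

inline void self_check() {
    // Assumption: invert is the fifth one-bit field, so bit index 4.
    assert(clear_mask(4) == 0xEF);

    std::uint8_t flags = 0xFF;   // all five flags (and padding bits) set
    clear_bit(&flags, 4);        // same effect as "invert = false"
    assert(flags == 0xEF);
}
```

The operation itself is correct in both listings; the only difference is which byte it is applied to. Applied at offset 0, the same mask modifies the low byte of the vtable pointer instead of the flags byte.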

For reference, if we change the constructor of B to assign in its body instead of using the initializer list (shown below), it works as expected:

class B : public A
{
public:
    B(){ invert = false; };
    virtual ~B() {};
    void load()
    {
        //Irrelevant stuff done here
    }
private:
    C member_c;
    std::vector<D*> vector_of_d;
    struct {
        bool var_1:1;
        bool var_2:1;
        bool var_3:1;
        bool var_4:1;
        bool invert:1;
    };
};

The generated assembly is as follows. Note the offset used in the LDRB and STRB instructions on lines 14 and 16; this is the proper offset to access the bit-field in the object.

1 |    B(){ invert = false; };
2 |B::B():
3 |_ZN6BC1Ev:
4 |    0x80645e8: 0xb510         PUSH      {R4, LR}
5 |    0x80645ea: 0x4604         MOV       R4, R0
6 |    B(){ invert = false; };
7 |    0x80645ec: 0xf007 0xfb20  BL        A::subobject A() ; 0x806bc30
8 |    0x80645f0: 0x4807         LDR.N     R0, [PC, #0x20]         ; 0x8088808 (134776840)
9 |    0x80645f2: 0x6020         STR       R0, [R4]
10|    0x80645f4: 0xf104 0x0018  ADD.W     R0, R4, #24             ; 0x18
11|    0x80645f8: 0xf00a 0xfadd  BL        C::C()              ; 0x806ebb6
12|    0x80645fc: 0xf104 0x001c  ADD.W     R0, R4, #28             ; 0x1c
13|    0x8064600: 0xf00e 0xff2e  BL        std::vector<D *>::vector() ; 0x8073460
14|    0x8064604: 0x7820         LDRB      R0, [R4, #0x2c]
15|    0x8064606: 0xf000 0x00ef  AND.W     R0, R0, #239            ; 0xef
16|    0x806460a: 0x7020         STRB      R0, [R4, #0x2c]
17|    B(){ invert = false; };
18|    0x806460c: 0x4620         MOV       R0, R4
19|    0x806460e: 0xbd10         POP       {R4, PC}
20|    0x8064610: 0x08088808     DC32      0x8088808 (134776840)

Side note: there is a slight change on line 8, but that is probably due to a change in offsets.

Does anyone have any insight as to what could be causing this?

Lyle Cheatham asked Nov 07 '22 14:11


1 Answer

This is a compiler bug. According to my investigations, it triggers in at least EWARM 7.80.1 and 8.11.2, but not in EWARM 8.20.1. It triggers at all optimization levels, and I can't think of a workaround other than the one mentioned in the question.
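For completeness, here is a reduced, self-contained sketch of the workaround from the question: the bit-field is assigned in the constructor body and not named in the member initializer list. It makes a few assumptions that are not in the original code: load() returns an int tag so that virtual dispatch can be checked, member_c is left out, and is_inverted() and check_workaround() are hypothetical helpers added only for the check. Passing on a host compiler says nothing about a particular EWARM release; it only illustrates the pattern.

```cpp
#include <cassert>
#include <vector>

class A {
public:
    A() {}
    virtual ~A() {}
    virtual int load() { return 0; }   // assumption: returns a tag, not void
};

class D;  // only used through pointers, as in the question

class B : public A {
public:
    // Workaround: assign in the constructor body and keep the bit-field
    // out of the member initializer list.
    B() { invert = false; }
    virtual ~B() {}
    virtual int load() { return 1; }
    bool is_inverted() const { return invert; }  // hypothetical accessor

private:
    std::vector<D*> vector_of_d;
    struct {                 // anonymous struct: a compiler extension
        bool var_1 : 1;
        bool var_2 : 1;
        bool var_3 : 1;
        bool var_4 : 1;
        bool invert : 1;
    };
};

// Returns the tag of the dynamically selected load(), or -1 for a null pointer.
int load_A(A* to_be_loaded) {
    return to_be_loaded ? to_be_loaded->load() : -1;
}

// If the vtable pointer were corrupted, the virtual call would not reach B::load().
void check_workaround() {
    B b;
    assert(load_A(&b) == 1);
    assert(!b.is_inverted());
}
```

Calling check_workaround() on the target after construction is a cheap way to confirm that the vtable pointer survived B's constructor.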

Johan answered Nov 14 '22 22:11