memory layout hack

Question

I have been following a course on YouTube, and it was talking about how some programmers use their knowledge of how memory is laid out to do clever things. One of the examples in the lecture was something like this:

#include <stdio.h>
void makeArray();
void printArray();
int main(){
        makeArray();
        printArray();
        return 0;
}
void makeArray(){
    int array[10];
    int i;
    for(i=0;i<10;i++)
        array[i]=i;
}
void printArray(){
    int array[10];
    int i;  
    for(i=0;i<10;i++)
        printf("%d\n",array[i]);
}

The idea is that as long as the two functions have the same activation record size on the stack, it will work and print the numbers from 0 to 9. But it actually prints something like this:

134520820
-1079626712
0
1
2
3
4
5
6
7

There are always those two values at the beginning. Can anyone explain that? I am using gcc on Linux.

the exact lecture url starting at 5:15

paxdiablo · Accepted Answer

I'm sorry but there's absolutely nothing clever about that piece of code and people who use it are very foolish.

Addendum:

Or, sometimes, just sometimes, very clever. Having watched the video linked to in the question update, this wasn't some rogue code monkey breaking the rules. This guy understood what he was doing quite well.

It requires a deep understanding of the underlying code generated and can easily break (as mentioned and seen here) if your environment changes (like compilers, architectures and so on).

But, provided you have that knowledge, you can probably get away with it. It's not something I'd suggest to anyone other than a veteran but I can see it having its place in very limited situations and, to be honest, I've no doubt occasionally been somewhat more ... pragmatic ... than I should have been in my own career :-)

Now back to your regular programming ...

It's non-portable between architectures, compilers, releases of compilers, and probably even optimisation levels within the same release of a compiler, as well as being undefined behaviour (reading uninitialised variables).

Your best bet if you want to understand it is to examine the assembler code output by the compiler.

But your best bet overall is to just forget about it and code to the standard.

To see how fragile the trick is, this transcript shows how the same gcc can behave differently at different optimisation levels:

pax> gcc -o qq qq.c ; ./qq
0
1
2
3
4
5
6
7
8
9

pax> gcc -O3 -o qq qq.c ; ./qq
1628373048
1629343944
1629097166
2280872
2281480
0
0
0
1629542238
1629542245

At gcc's high optimisation level (what I like to call its insane optimisation level), this is the makeArray function. It's basically figured out that the array is not used and therefore optimised its initialisation out of existence.

_makeArray:
        pushl   %ebp            ; stack frame setup
        movl    %esp, %ebp

                                ; heavily optimised function

        popl    %ebp            ; stack frame tear-down

        ret                     ; and return

I'm actually slightly surprised that gcc even left the function stub in there at all.

Update: as Nicholas Knight points out in a comment, the function remains since it must be visible to the linker - making the function static results in gcc removing the stub as well.

If you check the assembler code at optimisation level 0 below, it gives a clue (it's not the actual reason - see below). Examine the following code and you'll see that the stack frame setup is different for the two functions despite the fact that they have exactly the same parameters passed in and the same local variables:

subl    $48, %esp     ; in makeArray
subl    $56, %esp     ; in printArray

This is because printArray allocates some extra space to store the address of the printf format string and the address of the array element, four bytes each, which accounts for the eight bytes (two 32-bit values) difference.

That's the most likely explanation for your array in printArray() being off by two values.

Here's the two functions at optimisation level 0 for your enjoyment :-)

_makeArray:
        pushl   %ebp                     ; stack frame setup
        movl    %esp, %ebp
        subl    $48, %esp
        movl    $0, -4(%ebp)             ; i = 0
        jmp     L4                       ; start loop
L5:
        movl    -4(%ebp), %edx
        movl    -4(%ebp), %eax
        movl    %eax, -44(%ebp,%edx,4)   ; array[i] = i
        addl    $1, -4(%ebp)             ; i++
L4:
        cmpl    $9, -4(%ebp)             ; for all i up to and including 9
        jle     L5                       ; continue loop
        leave
        ret
        .section .rdata,"dr"
LC0:
        .ascii "%d\12\0"                 ; format string for printf
        .text

_printArray:
        pushl   %ebp                     ; stack frame setup
        movl    %esp, %ebp
        subl    $56, %esp
        movl    $0, -4(%ebp)             ; i = 0
        jmp     L8                       ; start loop
L9:
        movl    -4(%ebp), %eax           ; get i
        movl    -44(%ebp,%eax,4), %eax   ; get array[i]
        movl    %eax, 4(%esp)            ; store array[i] for printf
        movl    $LC0, (%esp)             ; store format string
        call    _printf                  ; make the call
        addl    $1, -4(%ebp)             ; i++
L8:
        cmpl    $9, -4(%ebp)             ; for all i up to and including 9
        jle     L9                       ; continue loop
        leave
        ret

Update: As Roddy points out in a comment, that's not the cause of your specific problem since, in this case, the array is actually at the same position in memory (%ebp-44 with %ebp being the same across the two calls). What I was trying to point out was that two functions with the same argument list and the same local variables do not necessarily end up with the same stack frame layout.

All it would take would be for printArray to swap the location of its local variables (including any temporaries not explicitly created by the developer) around and you would have this problem.

Tags: c, memory-management, operating-system, gcc

Ahmed Kotb
