Alloca implementation

How does one implement alloca() using inline x86 assembler in languages like D, C, and C++? I want to create a slightly modified version of it, but first I need to know how the standard version is implemented. Reading the disassembly from compilers doesn't help because they perform so many optimizations, and I just want the canonical form.

Edit: I guess the hard part is that I want this to have normal function call syntax, i.e. using a naked function or something similar, so that it looks like the normal alloca().

Edit #2: Ah, what the heck, you can assume that we're not omitting the frame pointer.

asked Apr 03 '09 16:04

dsimcha

2 Answers

Implementing alloca actually requires compiler assistance. A few people here are saying it's as easy as:

sub esp, <size>

which is unfortunately only half of the picture. Yes, that would "allocate space on the stack", but there are a couple of gotchas:

1. If the compiler has emitted code that references other variables relative to esp instead of ebp (typical if you compile without a frame pointer), then those references need to be adjusted. Even with frame pointers, compilers sometimes do this.
2. More importantly, by definition, space allocated with alloca must be "freed" when the function exits.

The big one is point #2, because you need the compiler to emit code that symmetrically adds <size> to esp at every exit point of the function.

The most likely case is the compiler offers some intrinsics which allow library writers to ask the compiler for the help needed.

EDIT:

In fact, in glibc (GNU's implementation of libc), the implementation of alloca is simply this:

#ifdef __GNUC__
# define __alloca(size) __builtin_alloca (size)
#endif /* GCC.  */
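To make the role of the builtin concrete, here is a minimal sketch of calling it from ordinary C. It assumes GCC or Clang (both provide __builtin_alloca); the names my_alloca and is_palindrome are made up for this illustration and are not part of glibc.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: GCC or Clang. my_alloca is a hypothetical wrapper name
   chosen for this sketch; glibc's own macro is __alloca. */
#define my_alloca(size) __builtin_alloca(size)

/* Hypothetical helper: builds a reversed copy of s in stack memory
   and compares it with the original. */
static int is_palindrome(const char *s)
{
    size_t n = strlen(s);
    /* The compiler knows about this allocation, so it adjusts the
       stack pointer itself and releases the block on return. */
    char *rev = (char *)my_alloca(n + 1);

    for (size_t i = 0; i < n; i++)
        rev[i] = s[n - 1 - i];
    rev[n] = '\0';

    return strcmp(s, rev) == 0;
}
```

Calling is_palindrome("level") returns 1 and is_palindrome("hello") returns 0. Because the compiler sees the allocation, it can keep its own bookkeeping of the stack pointer consistent, which hand-written inline assembly cannot guarantee.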

EDIT:

After thinking about it, I believe the minimum requirement would be for the compiler to always use a frame pointer in any function that uses alloca, regardless of optimization settings. This would allow all locals to be referenced safely through ebp, and the frame cleanup would be handled by restoring esp from the frame pointer.
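One way to observe that cleanup from C is to check that repeated calls to a function using the builtin hand back the same stack address, which suggests the block is released each time the function returns. This is only a sketch: it assumes GCC or Clang, the helper names are hypothetical, and the "same address" result is typical compiler behaviour rather than something the language guarantees.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC or Clang, which provide __builtin_alloca and the
   noinline attribute. Both function names are hypothetical. */
__attribute__((noinline))
static void record_alloca_address(size_t n, uintptr_t *out)
{
    volatile char *block = (volatile char *)__builtin_alloca(n);

    block[0] = 1;        /* touch the block so it is not optimized away */
    block[n - 1] = 1;

    /* The address is only compared by the caller, never dereferenced:
       the block is dead as soon as this function returns. */
    *out = (uintptr_t)block;
}

/* Returns 1 if every call saw the same block address, 0 otherwise. */
static int stack_is_restored(void)
{
    uintptr_t first = 0, addr = 0;

    for (int i = 0; i < 1000; i++) {
        record_alloca_address(4096, &addr);
        if (i == 0)
            first = addr;
        else if (addr != first)
            return 0;    /* stack pointer drifted between calls */
    }
    return 1;
}
```

If the space were not released on return, 1000 calls of 4096 bytes each would consume roughly 4 MB of stack and the recorded addresses would keep changing.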

EDIT:

So I did some experimenting with things like this:

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

#define __alloca(p, N) \
    do { \
        __asm__ __volatile__( \
        "sub %1, %%esp \n" \
        "mov %%esp, %0  \n" \
         : "=m"(p) \
         : "i"(N) \
         : "esp"); \
    } while(0)

int func() {
    char *p;
    __alloca(p, 100);
    memset(p, 0, 100);
    strcpy(p, "hello world\n");
    printf("%s\n", p);
}

int main() {
    func();
}

which unfortunately does not work correctly. After analyzing the assembly output by gcc, it appears that optimizations get in the way. The problem seems to be that, since the compiler's optimizer is entirely unaware of my inline assembly, it has a habit of doing things in an unexpected order and still references things via esp.

Here's the resultant ASM:

8048454: push   ebp
8048455: mov    ebp,esp
8048457: sub    esp,0x28
804845a: sub    esp,0x64                      ; <- this and the line below are our "alloc"
804845d: mov    DWORD PTR [ebp-0x4],esp
8048460: mov    eax,DWORD PTR [ebp-0x4]
8048463: mov    DWORD PTR [esp+0x8],0x64      ; <- whoops! compiler still referencing via esp
804846b: mov    DWORD PTR [esp+0x4],0x0       ; <- whoops! compiler still referencing via esp
8048473: mov    DWORD PTR [esp],eax           ; <- whoops! compiler still referencing via esp
8048476: call   8048338 <memset@plt>
804847b: mov    eax,DWORD PTR [ebp-0x4]
804847e: mov    DWORD PTR [esp+0x8],0xd       ; <- whoops! compiler still referencing via esp
8048486: mov    DWORD PTR [esp+0x4],0x80485a8 ; <- whoops! compiler still referencing via esp
804848e: mov    DWORD PTR [esp],eax           ; <- whoops! compiler still referencing via esp
8048491: call   8048358 <memcpy@plt>
8048496: mov    eax,DWORD PTR [ebp-0x4]
8048499: mov    DWORD PTR [esp],eax           ; <- whoops! compiler still referencing via esp
804849c: call   8048368 <puts@plt>
80484a1: leave
80484a2: ret

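As you can see, it isn't so simple: because p equals esp, every argument store through esp lands inside the block we just "allocated" (the store before the call to puts, for instance, overwrites its first four bytes). Unfortunately, I stand by my original assertion that you need compiler assistance.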

answered Sep 27 '22 17:09

Evan Teran

It would be tricky to do this. In fact, unless you have enough control over the compiler's code generation, it cannot be done entirely safely. Your routine would have to manipulate the stack such that, when it returned, everything was cleaned up but the stack pointer remained positioned so that the block of memory stayed in place.

The problem is that unless you can inform the compiler that the stack pointer has been modified across your function call, it may well decide that it can continue to refer to other locals (or whatever) through the stack pointer, but the offsets will be incorrect.

answered Sep 27 '22 17:09

Michael Burr

Tags: c, memory-management, assembly, alloca, inline-assembly