Is it possible to identify whether an address reference belongs to static/heap/stack in the process address space

Question

We have a mechanism that monitors load and store instructions and captures the addresses they reference. I'd like to classify each address according to whether it belongs to the stack, the heap, or the region where static variables are allocated. Is there a way to do this classification programmatically?

My initial thought was to do a malloc() with a small memory request (1 byte?) as soon as the process starts running, so that I could capture the "base address" (or starting address) of the heap. That way, I can distinguish the statically allocated variables from the rest. For the references that do not belong to the static region (that is, heap and stack), how could I tell them apart?

Some small tests show that the following simple code (run in Linux 3.18/x86-64 compiled with gcc 4.8.4)

#include <stdio.h>
#include <stdlib.h>

int x;

int foo (void)
{
    int s;
    int *h = malloc (sizeof(int));

    printf ("x = %p, *s = %p, h = %p\n", &x, &s, h);
}

int main (int argc, char *argv[])
{
    foo();
    return 0;
}

shows some randomization of the address space (not for the static variables, but for the remaining part: heap and stack), which adds some uncertainty. Still, maybe there is a way to find the limits of these regions of the address space.

kfx · Accepted Answer

There is no standard C API for this, which means that all possible solutions are going to be based on platform-specific hacks. Also, this answer limits itself to single-threaded applications.

How to recognize a stack address?

The stack is a contiguous memory region. Therefore all you need to know are two numbers: the top of the stack and the bottom of the stack. The top of the stack is bounded by the stack frame of the current function. However, since the size of the current stack frame cannot be accessed from C code, it's difficult to tell exactly where the current frame ends. The trick here is to call one more function from the current one and use an address in the called function's stack frame as the boundary value for stack_top.

Learning the bottom of the stack is simpler: its value stays constant during the execution of the program, and is bounded by the stack frame of the entry-point function (main() in C programs). Therefore taking the address of some local variable in the main() function is a sufficient approximation.

One more caveat is that the x86 stack grows downward, which means that the top of the stack has a smaller address than the bottom. This code sums it up:

#include <stdbool.h>

void *stack_bottom;

bool IS_IN_STACK(void *x) __attribute__((noinline));
bool IS_IN_STACK(void *x) {
    void *stack_top = &stack_top;
    return x <= stack_bottom && x >= stack_top;
}

int main (int argc, char *argv[]) {
   int x;
   stack_bottom = &x;
   ...

How to recognize an address of a static variable?

The logic is even simpler here. Static variables are allocated in a memory region starting with a fixed, platform-specific address. Usually this region precedes all other regions in memory. The only thing that has to be learned therefore is the end address of this static memory region.

Luckily, the GNU linker provides the symbols end, edata and etext, which denote the end of the .bss, .data and .text segments respectively. Static variables are allocated either in the .bss or the .data segment, therefore this check should be sufficient on most platforms:

#define IS_STATIC(x) ((void*)(x) <= (void*)&end || (void*)(x) <= (void*)&edata)

This macro checks both edata and end to avoid making assumptions about which of .bss and .data comes first in memory.
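One limitation of the macro above is that it has no lower bound, so it also accepts code addresses and NULL. The following is a minimal sketch of a bounded check, not a definitive implementation. It assumes the linker symbols described in end(3) and assumes that the data sections sit between etext and end, which holds for the default Linux layout but is not guaranteed. The name is_static_address is a hypothetical helper introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Linker-provided symbols (see end(3)); only their addresses are meaningful. */
extern char etext, end;

/* Hypothetical helper: true if p lies in [&etext, &end). Under the default
   layout (an assumption) this range covers read-only data, .data and .bss. */
static bool is_static_address(const void *p)
{
    /* Compare as integers: relational operators on unrelated pointers
       are not well defined in C. */
    uintptr_t a = (uintptr_t)p;
    return a >= (uintptr_t)&etext && a < (uintptr_t)&end;
}
```

Calling is_static_address(&x) for a global x should return true, while the address of a local variable or a malloc() result should fall outside the range.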

Heap addresses.

Heap variables are typically allocated at addresses directly following the addresses in the .data and .bss regions. However, sometimes heap addresses may belong to non-contiguous memory ranges. Therefore the best you can do here is to read the Linux process files to find out the memory mappings, as suggested in the other answer. Alternatively, just check if both IS_IN_STACK and IS_STATIC return false.
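For the part of the heap that does directly follow .bss, a rough range check is possible. The sketch below assumes a Linux/glibc-style environment where the allocator grows the heap with brk, and it only covers that region: large allocations are commonly served by mmap and will fall outside it. The name in_brk_heap is a hypothetical helper, not an existing API.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() in <unistd.h> on glibc */
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

extern char end;   /* linker symbol: first address past .bss, see end(3) */

/* Hypothetical helper: true if p lies in [&end, current program break). */
static bool in_brk_heap(const void *p)
{
    uintptr_t a  = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)&end;
    uintptr_t hi = (uintptr_t)sbrk(0);   /* sbrk(0) only queries the break */
    return a >= lo && a < hi;
}
```

Because the break moves as the heap grows, the upper bound is queried on every call instead of being cached at startup.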

The complete program using these macros:

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

int x;
extern int end, edata;

void *stack_bottom;

bool IS_IN_STACK(void *x) __attribute__((noinline));
bool IS_IN_STACK(void *x) {
    void *stack_top = &stack_top;
    return x <= stack_bottom && x >= stack_top;
}

#define IS_STATIC(x) ((void*)(x) <= (void*)&end || (void*)(x) <= (void*)&edata)

int foo (void)
{
    int s;
    int *h = malloc (sizeof(int));

    printf ("x = %p, *s = %p, h = %p\n", &x, &s, h);
    // prints 0 1 0
    printf ("%d %d %d\n", IS_IN_STACK(&x), IS_IN_STACK(&s), IS_IN_STACK(h));
    // prints 1 0 0
    printf ("%d %d %d\n", IS_STATIC(&x), IS_STATIC(&s), IS_STATIC(h));
}

int main (int argc, char *argv[])
{
    int x;
    stack_bottom = &x;
    foo();
    return 0;
}

Oleg Andriyanov · Answer

I guess that in order to get the correct result you should parse the /proc/<pid>/maps file on Linux. Sample contents:

# cat maps
00400000-00407000 r-xp 00000000 fc:02 1837717           /sbin/getty
00606000-00607000 r--p 00006000 fc:02 1837717           /sbin/getty
00607000-00608000 rw-p 00007000 fc:02 1837717           /sbin/getty
00608000-0060a000 rw-p 00000000 00:00 0 
0252e000-0254f000 rw-p 00000000 00:00 0                 [heap]
7f3ca601f000-7f3ca6833000 r--p 00000000 fc:02 2105304   /usr/lib/locale/locale-archive
...
7f3ca7656000-7f3ca7657000 r--p 00022000 fc:02 1711858   /lib/x86_64-linux-gnu/ld-2.19.so
7f3ca7657000-7f3ca7658000 rw-p 00023000 fc:02 1711858   /lib/x86_64-linux-gnu/ld-2.19.so
7f3ca7658000-7f3ca7659000 rw-p 00000000 00:00 0 
7fffbbcf2000-7fffbbd13000 rw-p 00000000 00:00 0         [stack]
7fffbbdfc000-7fffbbdfe000 r-xp 00000000 00:00 0         [vdso]
7fffbbdfe000-7fffbbe00000 r--p 00000000 00:00 0         [vvar]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

Refer to proc(5).
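As a rough illustration of this approach, the sketch below scans /proc/self/maps for the mapping that contains a given address and looks at the [stack] and [heap] labels. It is Linux-only, and the enum and the function name classify_by_maps are hypothetical names chosen for this example. It also assumes a single-threaded process, since only the main thread's stack carries the [stack] label.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
enum region { REGION_UNKNOWN, REGION_STACK, REGION_HEAP, REGION_OTHER };

/* Find the mapping in /proc/self/maps that contains addr (Linux only). */
static enum region classify_by_maps(const void *addr)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return REGION_UNKNOWN;

    uintptr_t a = (uintptr_t)addr;
    enum region result = REGION_UNKNOWN;
    char line[4096 + 128];   /* room for a long path plus the fixed fields */

    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long start, stop;
        /* Each line begins with "start-end" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &start, &stop) != 2)
            continue;
        if (a < start || a >= stop)
            continue;
        if (strstr(line, "[stack]") != NULL)
            result = REGION_STACK;
        else if (strstr(line, "[heap]") != NULL)
            result = REGION_HEAP;
        else
            result = REGION_OTHER;   /* file-backed or anonymous mapping */
        break;
    }
    fclose(f);
    return result;
}
```

Re-reading the file on every query is slow, so a monitoring tool would more likely parse it once, cache the ranges, and refresh them when the mappings change.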

Tags:

c

heap-memory

memory-address

Harald
