
What are "shadow bytes" in AddressSanitizer and how should I interpret them?

I am debugging a C program and am gravely confused about the lower half of the AddressSanitizer outputs when it finds problems. Let's use this for example:

==33184==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000005 at pc 0x55f312fe2509 bp 0x7ffc99f5f5c0 sp 0x7ffc99f5f5b0
WRITE of size 1 at 0x602000000005 thread T0
    #0 0x55f312fe2508 in main /home/user/c/friends/main.c:20
    #1 0x7fa5ea0e9b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #2 0x55f312fe21c9 in _start (/home/user/c/friends/cmake-build-debug/friends+0x11c9)

0x602000000005 is located 11 bytes to the left of 5-byte region [0x602000000010,0x602000000015)
allocated by thread T0 here:
    #0 0x7fa5eb2b8b40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40)
    #1 0x55f312fe23f4 in main /home/user/c/friends/main.c:18
    #2 0x7fa5ea0e9b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

SUMMARY: AddressSanitizer: heap-buffer-overflow /home/user/c/friends/main.c:20 in main

  0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000:[fa]fa 05 fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==33184==ABORTING

Everything above this line, I understand: SUMMARY: AddressSanitizer: heap-buffer-overflow /home/user/c/friends/main.c:20 in main

My question involves the data presented below that line. I read this answer but it did not answer my question. The memory dump shown by ASAN looks like this:

  0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000:[fa]fa 05 fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  1. What is the line with the arrow trying to tell me? My assumption is that the 05 which appears between the fas refers to the "5-byte region" in the message "0x602000000005 is located 11 bytes to the left of 5-byte region". However, I am still confused because the legend says that fa means "heap left redzone," yet it appears both to the left and to the right of the 05. Why are there no "heap right redzones"?

  2. In this example, ASAN says that the program went 11 bytes out of the 5-byte region, yet it shows far more fas than that.

  3. Is there any proper, detailed documentation which actually explains what these terms "heap left redzone", "stack mid redzone", "Global redzone", etc mean? I've not been able to find any.

  4. What is a "Shadow byte/address" in this context?

the_endian asked May 08 '20 07:05



1 Answer

From the AddressSanitizerAlgorithm page on GitHub (which is also linked from the LLVM AddressSanitizer page):

The virtual address space is divided into 2 disjoint classes:

  • Main application memory (Mem): this memory is used by the regular application code.
  • Shadow memory (Shadow): this memory contains the shadow values (or metadata). There is a correspondence between the shadow and the main application memory. Poisoning a byte in the main memory means writing some special value into the corresponding shadow memory.

So "shadow bytes" are metadata describing the state of your program's addressable memory.

If we look at the asan output:

Shadow byte legend (one shadow byte represents 8 application bytes):

it tells us that the hexdump is of the shadow memory which describes the state of your program's "real" memory. What states does it track?

  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  ...

So if a whole 8-byte group is addressable, the shadow byte that tracks (or shadows) it has the value 00. If the group is only partly addressable, the shadow byte is 01..07, which is the number of addressable bytes at the start of the group.
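To make the meaning of these values concrete, here is a minimal sketch of the per-access check as described on the AddressSanitizerAlgorithm page. The helper name access_is_poisoned is hypothetical and this is not ASan's actual source; it also assumes the access does not straddle two 8-byte groups, so only one shadow byte needs to be consulted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of ASan: returns true if an access of
 * `size` bytes (1..8) starting at `addr` would be reported, given the
 * shadow byte that covers addr's 8-byte group.
 * Assumption: the access stays inside that one group. */
static bool access_is_poisoned(uint64_t addr, size_t size, int8_t shadow)
{
    assert(size >= 1 && size <= 8);
    if (shadow == 0)
        return false; /* 00: all 8 bytes are addressable */
    /* 01..07: only the first `shadow` bytes are addressable.
     * Redzone values such as fa are negative when read as a signed
     * byte (0xfa is -6), so they fail this comparison for every access. */
    return (int)(addr & 7) + (int)size > (int)shadow;
}
```

With the values from the report, a 1-byte write to 0x602000000005 under shadow value fa is flagged, while 1-byte accesses to 0x602000000010 through 0x602000000014 under shadow value 05 are allowed.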

The value the hex dump is pointing you to is fa, or "Heap left redzone" - presumably this is some kind of guard region around heap allocations to detect overruns.

From the same link:

The run-time library replaces the malloc and free functions. The memory around malloc-ed regions (red zones) is poisoned
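As a rough illustration of that poisoning, the toy model below fills a small shadow array with the redzone value and then marks a single allocation as addressable. It is a sketch under stated assumptions, not how the ASan allocator is implemented: model_shadow, GRANULE and HEAP_REDZONE_MAGIC are hypothetical names, and only the value fa is taken from the legend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE 8               /* application bytes per shadow byte */
#define HEAP_REDZONE_MAGIC 0xfa /* "Heap left redzone" in the legend */

/* Toy model (hypothetical, not ASan code): describe an arena of
 * shadow_len * GRANULE application bytes that holds one allocation of
 * `size` bytes at byte offset `offset`; everything else is redzone. */
static void model_shadow(uint8_t *shadow, size_t shadow_len,
                         size_t offset, size_t size)
{
    assert(offset % GRANULE == 0);
    assert(offset + size <= shadow_len * GRANULE);

    /* Poison the whole arena first. */
    memset(shadow, HEAP_REDZONE_MAGIC, shadow_len);

    /* Fully addressable groups get 00. */
    size_t first = offset / GRANULE;
    size_t full = size / GRANULE;
    memset(shadow + first, 0x00, full);

    /* A trailing partial group stores its count of addressable bytes. */
    if (size % GRANULE != 0)
        shadow[first + full] = (uint8_t)(size % GRANULE);
}
```

Modelling a 5-byte allocation at offset 16 of a 48-byte arena produces the shadow bytes fa fa 05 fa fa fa, the same pattern as the start of the arrowed row in the report.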

More broadly, this description (in program addresses)

0x602000000005 is located 11 bytes to the left of 5-byte region
  [0x602000000010,0x602000000015)

matches the shadow map shown:

=>0x0c047fff8000:[fa]fa 05 fa ...

Assuming natural alignment,

  • shadow byte 0x0c047fff8000 describes (or, again, shadows) program addresses 0x602000000000..0x602000000007 which includes the address you accessed
  • the next shadow byte at 0x0c047fff8001 describes program addresses 0x602000000008..0x60200000000F
  • both of those have value fa, meaning "Heap left redzone"
  • the next shadow byte at 0x0c047fff8002 describes program addresses 0x602000000010..0x602000000017 and has value 05, meaning that only the first 5 bytes are addressable. These are the 5 bytes of your heap allocation.

All of this is consistent with the part of the error you did understand.
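The correspondence between program addresses and shadow addresses can be checked with a little arithmetic. The sketch below assumes the default mapping for 64-bit Linux on x86-64 given on the AddressSanitizerAlgorithm page, Shadow = (Mem >> 3) + 0x7fff8000; other platforms use different offsets, and the function and macro names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants: the default mapping for x86-64 Linux. */
#define SHADOW_SCALE 3
#define SHADOW_OFFSET 0x7fff8000ULL

/* Shadow address of the byte that describes application address `addr`. */
static uint64_t mem_to_shadow(uint64_t addr)
{
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET;
}

/* First of the 8 application addresses described by shadow address `shadow`. */
static uint64_t shadow_to_mem(uint64_t shadow)
{
    assert(shadow >= SHADOW_OFFSET);
    return (shadow - SHADOW_OFFSET) << SHADOW_SCALE;
}
```

Under that assumption, mem_to_shadow(0x602000000005) is 0x0c047fff8000, the bracketed byte in the arrowed row, and shadow_to_mem(0x0c047fff8002) is 0x602000000010, the start of the 5-byte region.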

  1. However, I am still confused because the legend says that fa means "heap left redzone," yet it appears to the right of the 05 and to the left of it. Why are there no "heap right redzones?"

    I don't know what the directionality really means, here. Heaps typically grow in one direction initially (traditionally up as the stack grows down), but can be fragmented, released, coalesced and re-allocated. Is the gutter between two allocations "right," or "left," or both, or neither? All we need to know is that it's a poisoned heap region that was never allocated to the user.

    Maybe it should just be "Heap redzone", if there is no orientation corresponding to the stack left/mid/right values.

  2. In this example, ASAN says that the program went 11 bytes out of the 5-byte region, yet it shows far more fas than that.

    Each fa represents eight bytes, as the legend says. So if you'd accessed anything from nine to sixteen bytes before the allocation, it would have shown up in the same shadow byte. If you'd accessed one to eight bytes before, it would have shown up in the next shadow byte (right before the 05).

    The rest of the fas are just a map of the surrounding area, which doesn't appear helpful in this case but might be in others.

  3. Is there any proper, detailed documentation which actually explains what these terms "heap left redzone", "stack mid redzone", "Global redzone", etc mean?

    No idea. They seem to follow fairly naturally from the use case, though: hitting a red zone means you accessed an address you shouldn't have. You can always just read the code, e.g. asan_internal.h defines the kAsanHeapLeftRedzoneMagic value, and asan_allocator.cpp poisons shadow bytes with it.

  4. What is a "Shadow byte/address" in this context?

    Just for completeness, a shadow byte is a byte that shadows a group of eight normally-accessible program bytes and tracks some information about them useful to the sanitizer.

    A shadow address is the address of a shadow byte.
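The arithmetic behind point 2 can also be written down. The sketch below computes how many shadow bytes before the region's own shadow byte an access lands, given how far to the left of the region it is. The helper name shadow_bytes_left_of is hypothetical; the only assumption is the 8-to-1 ratio stated in the legend.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE_SHIFT 3 /* one shadow byte per 2^3 = 8 application bytes */

/* Hypothetical helper: number of shadow bytes between the shadow byte of
 * the accessed address and the shadow byte of the region's first address,
 * for an access `distance` bytes to the left of `region_start`. */
static uint64_t shadow_bytes_left_of(uint64_t region_start, uint64_t distance)
{
    assert(distance >= 1 && distance <= region_start);
    uint64_t accessed = region_start - distance;
    return (region_start >> GRANULE_SHIFT) - (accessed >> GRANULE_SHIFT);
}
```

For the region starting at 0x602000000010, a distance of 11 gives 2, which is why the bracketed fa sits two positions before the 05. Distances 1 to 8 give 1, and distances 9 to 16 give 2.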

Useless answered Oct 23 '22 19:10