What are common values for uninitialized memory for debugging?

A long time ago I learned about filling unused / uninitialized memory with 0xDEADBEEF so that if I ever see that value in a debugger or a crash report, I know I'm looking at uninitialized memory. I saw from a crash report that iOS uses 0xBBADBEEF.

What other creative values have people used? Do any particular values have any kind of specific benefit?

The most obvious benefit of values that turn into words is that, at least for most people, words in their own language stick out easily, whereas a strictly numeric value is less likely to.

But maybe there are other reasons to pick particular numbers? For example, an odd address can crash some processors (such as the 68000) on certain memory accesses, so it's probably better to pick 0x0BADBEEF over 0xBADBEEF0. Are there any other values (maybe processor-specific) that have a concrete benefit when used for uninitialized memory?

asked Aug 26 '16 05:08 by gman



2 Answers

Generally speaking, you want a value which is unlikely to happen to "work" when interpreted as either an integer, a pointer, or a string. So, here are a few constraints:

  • Don't use a value that's a multiple of the smallest "usual" alignment on your target architecture. For x86, that's 4 (bytes), so no values that are divisible by 4. This ensures that if the value is interpreted as a pointer, it'll be obviously incorrect. If you're on a non-x86 architecture, you might even be able to use a value that will cause an alignment trap if used as a pointer.

  • Don't use a value which could reasonably be a small (positive or negative) integer. Your typical "int" variable in a C program never gets larger than 1,000 or so, so don't use small numbers as your empty data fill.

  • Don't use a value which is composed entirely of valid ASCII characters. Make sure there's at least one byte in there with the high bit set. These days, you'd want to make sure they weren't valid UTF-8 or possibly UTF-16 values, either.

  • Don't have any zero bytes in the value. There are too many cases where this would work out to be "helpful" to keeping the program from crashing - terminating a string, giving a non-int field a reasonable-looking value, etc.

  • Don't use a single-byte (or two-byte) value repeated over and over. Having a full-word-length pattern can make it easier to determine how your wild pointer ended up pointing where it is, or at least narrow down which operations offset it from the start of the pattern.

  • Don't use a value that maps to a valid address for a "typical" process. If the highest bits are set, it'll typically take a whole lot of malloc() before your process grows large enough to make that a valid address.

Perhaps unsurprisingly, patterns like 0xDEADBEEF meet basically all of these requirements.
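
As a rough sketch, the rules above can be checked mechanically for a 32-bit candidate. The function name and the "small integer" cutoff of 1000 are my own assumptions for illustration, the top-bit test assumes 32-bit pointers, and the UTF-8 / UTF-16 rule is not checked at all:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not from any library: returns true if the 32-bit
 * pattern v passes the rules of thumb listed above. The cutoff of 1000
 * for "small" integers is an arbitrary assumption. */
bool looks_like_good_poison(uint32_t v)
{
    const uint32_t small = 1000u;
    bool has_high_bit_byte = false;

    for (int i = 0; i < 4; i++) {
        uint8_t byte = (uint8_t)(v >> (8 * i));
        if (byte == 0x00)
            return false;              /* a zero byte can terminate a string */
        if (byte & 0x80)
            has_high_bit_byte = true;  /* at least one non-ASCII byte */
    }
    if (!has_high_bit_byte)
        return false;                  /* would read as plain ASCII text */
    if (v % 4 == 0)
        return false;                  /* looks like an aligned pointer */
    if (v <= small || v > UINT32_MAX - small)
        return false;                  /* small positive or negative int */
    if ((v >> 16) == (v & 0xFFFFu))
        return false;                  /* one- or two-byte repeating pattern */
    if (!(v & 0x80000000u))
        return false;                  /* top bit clear: may be a mapped address */
    return true;
}
```

With these checks, 0xDEADBEEF and 0xBAADF00D pass, while 0xBADBEEF0 (divisible by 4), 0x0BADBEEF (top bit clear) and 0xCDCDCDCD (a single byte repeated) are rejected.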

answered Nov 15 '22 05:11 by Mark Bessey


One technical term for values like this is "poison value".

Hex numbers that form English words are called Hexspeak. Wikipedia's Hexspeak article pretty much answers this question, cataloguing many known constants in use for various things, including several that are used as poison values / canaries / sanity checks, as well as other uses like error codes or IPv6 addresses.


I seem to recall some variation of 0xBADF00D (maybe with a repeated letter, like your 2nd example).

There's also 0xDEADC0DE. (Googling for where I've seen this used turned up the Wikipedia article linked above.)


Other English words in hex I've seen: Java .class files use 0xCAFEBABE as the magic number (first 4 bytes of the file). As a play on this, I guess, the Jikes RVM uses 0xDEADBABE as a sanity check constant.

Apparently Java wasn't the first user of 0xCAFEBABE. Wikipedia says "It was originally created by NeXTSTEP developers as a reference to the baristas at Peet's Coffee & Tea", and was used by the people developing Java before they thought of the name "Java". So it didn't come out of Java -> coffee (if anything the other way around), it's just plain old non-feminist tech culture. :(


re: update: Choosing a good value. For a poison value (not an error code), you want all the bytes to be different and not 0x00 or 0xFF, since those are probably the most likely values for an errant single-byte store. This applies especially for things like stack canaries (to detect buffer overruns), or other cases where detecting that it didn't get overwritten is important.
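
To illustrate the canary idea, here's a minimal sketch that fills a guard region with a word-length pattern and later reports where it was first damaged. The helper names are hypothetical, and I'm assuming the pattern bytes DE AD BE EF, which are all different and none of them 0x00 or 0xFF:

```c
#include <stddef.h>
#include <stdint.h>

/* Stored byte by byte so a hex dump reads DE AD BE EF on any endianness. */
static const uint8_t poison_pattern[4] = { 0xDE, 0xAD, 0xBE, 0xEF };

/* Hypothetical helper: fill len bytes at p with the repeating pattern. */
void poison_fill(void *p, size_t len)
{
    uint8_t *bytes = p;
    for (size_t i = 0; i < len; i++)
        bytes[i] = poison_pattern[i % 4];
}

/* Hypothetical helper: return the offset of the first byte that no longer
 * matches the pattern, or len if the region is intact. */
size_t poison_first_damage(const void *p, size_t len)
{
    const uint8_t *bytes = p;
    for (size_t i = 0; i < len; i++)
        if (bytes[i] != poison_pattern[i % 4])
            return i;
    return len;
}
```

Because no pattern byte is 0x00 or 0xFF, a stray single-byte store of either value is always detected. A stray store that happens to write the same byte the pattern already has at that offset would still go unnoticed.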

Your speculation about picking an odd value makes a lot of sense. Not being a valid memory address in the virtual memory layout of typical processes is a big advantage. Failing noisily as early as possible is optimal for debugging. Anyway, this probably means that having the high bit set is a good idea, so 0x0... is probably not a good idea.

answered Nov 15 '22 05:11 by Peter Cordes