Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

format string vulnerability - printf

Why does this print the value of the memory address at 0x08480110? I'm not sure why there are 5 %08x arguments - where does that take you up the stack?

address = 0x08480110
address (encoded as 32 bit le string): "\x10\x01\x48\x08"
printf ("\x10\x01\x48\x08_%08x.%08x.%08x.%08x.%08x|%s|");

This example is taken from page 11 of this paper http://crypto.stanford.edu/cs155/papers/formatstring-1.2.pdf

like image 283
Vikas Yendluri Avatar asked Apr 15 '11 06:04

Vikas Yendluri


People also ask

What causes format string vulnerability?

Originally thought harmless, format string exploits can be used to crash a program or to execute harmful code. The problem stems from the use of unchecked user input as the format string parameter in certain C functions that perform formatting, such as printf() .

What is the most common cause of format string attacks?

Most format string attacks exploit the C programming language. But other programming languages, such as Python, can also be vulnerable to format string attacks if its format() functions are not properly configured.

What is format string vulnerabilities in cybersecurity?

A format string vulnerability is a bug where user input is passed as the format argument to printf , scanf , or another function in that family. The format argument has many different specifies which could allow an attacker to leak data if they control the format argument to printf .

What can be the impact of a format string vulnerability?

This could lead to overwriting sensitive variables such as passwords and username of the program and lead to a complete compromise. Attackers can also make use of this vulnerability to perform arbitrary writes in the dtors table of the program and make the program execute malicious code.


1 Answers

I think that the paper provides its printf() examples in a somewhat confusing way because the examples use string literals for format strings, and those don't generally permit the type of vulnerability being described. The format string vulnerability as described here depends on the format string being provided by user input.

So the example:

printf ("\x10\x01\x48\x08_%08x.%08x.%08x.%08x.%08x|%s|");

Might better be presented as:

/* 
 * in a real program, some user input source would be copied 
 * into the `outstring` buffer 
 */
char outstring[80] = "\x10\x01\x48\x08_%08x.%08x.%08x.%08x.%08x|%s|";

printf(outstring);

Since the outstring array is an automatic, the compiler will likely put it on the stack. After copying the user input to the outstring array, it'll look like the following as 'words' on the stack (assuming little endian):

outstring[0c]               // etc...
outstring[08] 0x30252e78    // from "x.%0"
outstring[04] 0x3830255f    // from "_%08"
outstring[00] 0x08480110    // from the ""\x10\x01\x48\x08"

The compiler will put other items on the stack as it sees fit (other local variables, saved registers, whatever).

When the printf() call is about to be made, the stack might look like:

outstring[0c]               // etc...
outstring[08] 0x30252e78    // from "x.%0"
outstring[04] 0x3830255f    // from "_%08"
outstring[00] 0x08480110    // from the ""\x10\x01\x48\x08"
var1
var2
saved ECX
saved EDI

Note that I'm completely making those entries up - each compiler will use the stack in different ways (so a format string vulnerability has to be custom crafted for a particular exact scenario. In other words, you won't always use 5 dummy format specifiers like in this example - as the attacker you'd need to figure out how many dummies the particular vulnerability would need.

Now to call printf(), the argument (the address of outstring) is pushed on to the stack and printf() is called, so the argument area of the stack looks like:

outstring[0c]               // etc...
outstring[08] 0x30252e78    // from "x.%0"
outstring[04] 0x3830255f    // from "_%08"
outstring[00] 0x08480110    // from the ""\x10\x01\x48\x08"
var1
var2
var3
saved ECX
saved EDI
&outstring   // the one real argument to `printf()`

However, printf doesn't really know anything about how many arguments have been placed on the stack for it - it goes by the format specifiers it finds in the format string (the one argument it's 'sure' to get). So printf() gets the format string argument and starts processing it. When it gets to the 1st "%08x" that will correspond to the 'saved EDI' in my example, then next "%08x" will print the saved ECX' and so on. So the "%08x" format specifiers are just eating up data on the stack until it gets back to the string the attacker was able to input. Determining how many of those are needed is something an attacker would do by a kind of trial and error (probably by a test run that has a whole slew of "%08x" formats until he can 'see' where the format string starts).

Anyway, when printf() gets to processing the "%s" format specifier, it has consumed all the stack entries up to where the outstring buffer resides. The "%s" specifier treats its stack entry as a pointer, and the string that the user has put into that buffer has been carefully crafted to have a binary representation of 0x08480110, so printf() will print out whatever is at that address as an ASCIIZ string.

like image 80
Michael Burr Avatar answered Sep 17 '22 23:09

Michael Burr