Self Modifying Code [C++]

Tags:

c++

self-modifying

I was reading a codebreakers journal article on self-modifying code and there was this code snippet:

void Demo(int (*_printf) (const char *,...))
{ 
      _printf("Hello, OSIX!\n"); 
      return; 
} 
int main(int argc, char* argv[]) 
{ 
  char buff[1000]; 
  int (*_printf) (const char *,...); 
  int (*_main) (int, char **); 
  void (*_Demo) (int (*) (const char *,...)); 
  _printf=printf; 
  int func_len = (unsigned int) _main - (unsigned int) _Demo; 
  for (int a=0; a<func_len; a++) 
    buff[a] = ((char *) _Demo)[a]; 
  _Demo = (void (*) (int (*) (const char *,...))) &buff[0]; 
  _Demo(_printf); 
  return 0; 
}

This code supposedly executes Demo() on the stack. I understand most of the code, but the part where they assign 'func_len' confuses me. As far as I can tell, they're subtracting one random pointer address from another random pointer address.

Someone care to explain?

542

asked Apr 26 '11 06:04

Gogeta70

2 Answers

The code is relying on knowledge of the layout of functions from the compiler - which may not be reliable with other compilers.

The func_len line, once corrected to include the - that was originally missing, determines the length of the function Demo by subtracting the address in _Demo (which is supposed to contain the start address of Demo()) from the address in _main (which is supposed to contain the start address of main()). The result is presumed to be the length of the function Demo, which is then copied byte-wise into the buffer buff. The address of buff is then coerced into a function pointer and the function is called through it. However, since neither _Demo nor _main is actually initialized, the code is buggy in the extreme. Also, it is not clear that an unsigned int is big enough to hold pointers accurately; the cast should probably be to uintptr_t from <stdint.h> or <inttypes.h>.
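As a rough sketch of what the corrected length calculation and copy could look like: the version below takes the addresses of real functions instead of reading uninitialized pointers, uses uintptr_t for the arithmetic, and bounds-checks the copy. The names DemoEnd, code_distance, guess_demo_length and copy_bytes are hypothetical and not from the article, and the sketch still assumes the compiler places DemoEnd() directly after Demo(), which nothing guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Same shape as Demo() in the question.
void Demo(int (*print)(const char *, ...)) {
    print("Hello, OSIX!\n");
}

// Hypothetical marker function. We only hope the compiler emits it right
// after Demo(); no language rule says it must.
void DemoEnd() {}

// Byte distance between two code addresses, or 0 if they are not in the
// expected order (in which case the layout assumption has failed).
std::size_t code_distance(std::uintptr_t begin, std::uintptr_t end) {
    return end > begin ? static_cast<std::size_t>(end - begin) : 0;
}

// Guess at the length of Demo(). The pointer-to-integer mapping used here
// is implementation-defined, so treat the result as a guess.
std::size_t guess_demo_length() {
    auto begin = reinterpret_cast<std::uintptr_t>(&Demo);
    auto end = reinterpret_cast<std::uintptr_t>(&DemoEnd);
    return code_distance(begin, end);
}

// Bounded byte-wise copy, standing in for the unchecked loop into buff[1000].
bool copy_bytes(const unsigned char *src, std::size_t len,
                unsigned char *dst, std::size_t capacity) {
    assert(src != nullptr && dst != nullptr);
    if (len == 0 || len > capacity) return false;  // refuse empty or oversized copies
    std::memcpy(dst, src, len);
    return true;
}
```

A result of 0 from guess_demo_length() means the two functions were not laid out in the assumed order, and a false return from copy_bytes() means the guessed length would not fit in the buffer; in either case the trick should be abandoned rather than forced.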

This works if the bugs are fixed, if the assumptions about the code layout are correct, if the code is position-independent code, and if there are no protections against executing data space. It is unreliable, non-portable and not recommended. But it does illustrate, if it works, that code and data are very similar.
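To make the "protections against executing data space" condition concrete, here is a minimal sketch, assuming a POSIX system that provides mmap, mprotect and MAP_ANONYMOUS. On such systems a stack buffer like buff is normally not executable, so the bytes would have to live in memory that the OS has explicitly agreed to mark executable. The helper names make_executable_copy and release_executable_copy are hypothetical, and the sketch deliberately stops short of jumping into the copied bytes.

```cpp
#include <cstddef>
#include <cstring>
#include <sys/mman.h>  // POSIX only: mmap, mprotect, munmap

// Hypothetical helper: copy `len` bytes into a fresh anonymous mapping, then
// ask the OS to make that mapping readable and executable.
// Returns nullptr if the arguments are unusable or the OS refuses either step.
void *make_executable_copy(const void *bytes, std::size_t len) {
    if (bytes == nullptr || len == 0) return nullptr;

    // Step 1: writable, non-executable memory to copy into.
    void *region = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) return nullptr;

    std::memcpy(region, bytes, len);

    // Step 2: drop write permission and request execute permission.
    // A hardened system may reject this call.
    if (mprotect(region, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(region, len);
        return nullptr;
    }
    return region;
}

// Unmap a region previously returned by make_executable_copy().
void release_executable_copy(void *region, std::size_t len) {
    if (region != nullptr) munmap(region, len);
}
```

Even when this succeeds, calling into the region is only meaningful if the copied bytes are complete, position-independent machine code for the current CPU, which is exactly the set of assumptions listed above.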

I remember pulling a similar stunt between two processes, copying a function from one program into shared memory, and then having the other program execute that function from shared memory. It was about a quarter of a century ago, but the technique was similar and 'worked' for the machine it was tried on. I've never needed to use the technique since, thank goodness!

100

answered Sep 27 '22 23:09

Jonathan Leffler

This code uses uninitialized variables _main and _Demo, so it cannot work in general. Even if they meant something different, they probably assumed some specific ordering of functions in memory.

My opinion: don't trust this article.

answered Sep 27 '22 21:09

Yakov Galka
