How to compile and execute from memory directly?

Tags:

c++

linux

Is it possible to compile a C++ (or the like) program without generating the executable file but writing it and executing it directly from memory?

For example with GCC and clang, something that has a similar effect to:

c++ hello.cpp -o hello.x && ./hello.x $@ && rm -f hello.x

In the command line.

But without the burden of writing an executable to disk to immediately load/rerun it.

(If possible, the procedure should not use disk space, or at least not space in the current directory, which might be read-only.)

411

asked Dec 03 '12 19:12

alfC

1 Answer

Possible? Not the way you seem to wish. The task has two parts:

1) How to get the binary into memory

When we specify /dev/stdout as the output file on Linux, we can pipe the result into our program x0, which reads an executable from stdin and executes it:

  gcc -pipe YourFiles1.cpp YourFile2.cpp -o/dev/stdout -Wall | ./x0

In x0 we can just read from stdin until reaching the end of the file:

#include <stdlib.h>  // realloc
#include <unistd.h>  // read, STDIN_FILENO

/* memexec() is defined further down in this answer */
int main(int argc, char **argv) {
    size_t ntotal = 0;
    char *buf = NULL;
    while (true) {
        /* grow the buffer dynamically since we do not know how many bytes to read */
        buf = (char *)realloc(buf, ntotal + 4096);
        ssize_t nread = read(STDIN_FILENO, buf + ntotal, 4096);
        /* 0 means end of file, a negative value means error */
        if (nread <= 0) break;
        ntotal += nread;
    }
    return memexec(buf, ntotal, argv);
}

It would also be possible for x0 to execute the compiler directly and read its output. This question has been answered here: Redirecting exec output to a buffer or file
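As a rough illustration of x0 running the compiler itself, here is a minimal sketch that captures a child process's standard output in memory. It uses popen(3) rather than a hand-written fork/pipe/exec; the helper name run_and_capture is an assumption made for this example and does not appear in the linked answer.

```cpp
#include <cstdio>     // popen, pclose, fread
#include <stdexcept>  // std::runtime_error
#include <string>

// Hypothetical helper (not from the original answer): runs `command` through
// the shell and returns everything the command wrote to its stdout.
std::string run_and_capture(const std::string& command) {
    FILE* child = popen(command.c_str(), "r");
    if (child == nullptr) {
        throw std::runtime_error("popen failed");
    }
    std::string output;
    char chunk[4096];
    size_t n;
    // fread returns 0 at end of stream or on error, which ends the loop.
    while ((n = fread(chunk, 1, sizeof chunk, child)) > 0) {
        output.append(chunk, n);
    }
    pclose(child);
    return output;
}
```

The returned bytes could then be handed to memexec(). Whether the compiler is actually able to write its output to a pipe is a separate question; see the caveat that follows.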

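Caveat: I just found that, for a reason I have not tracked down, this does not work when I use a pipe (|) but does work when I use x0 < foo. One possible cause is that the linker expects a seekable output file, which a pipe is not.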

Note: If you are willing to modify your compiler, or if you use a JIT such as LLVM, clang, or similar frameworks, you could generate executable code directly. For the rest of this discussion, however, I assume you want to use an existing compiler.

Note: Execution via temporary file

Other programs such as UPX achieve similar behavior by executing a temporary file. This is easier and more portable than the approach outlined below. On systems where /tmp is mapped to a RAM disk, for example typical servers, the temporary file will be memory based anyway.

#include <fcntl.h>     // open, O_RDONLY
#include <stdio.h>     // perror
#include <stdlib.h>    // mkstemp
#include <sys/stat.h>  // chmod, S_IRUSR, S_IXUSR
#include <unistd.h>    // write, close, unlink, fexecve (GNU extension)

int memexec(void *exe, size_t exe_size, char *const argv[]) {
    /* random temporary file name in /tmp */
    char name[] = "/tmp/fooXXXXXX";
    /* creates temporary file, returns writable file descriptor */
    int fd_wr = mkstemp(name);
    /* makes file executable and read-only */
    chmod(name, S_IRUSR | S_IXUSR);
    /* creates read-only file descriptor before deleting the file */
    int fd_ro = open(name, O_RDONLY);
    /* removes file from file system; the kernel keeps the content until all fds are closed */
    unlink(name);
    /* writes executable to file */
    write(fd_wr, exe, exe_size);
    /* fexecve will not work as long as there is an open writable file descriptor */
    close(fd_wr);
    char *const newenviron[] = { NULL };
    fexecve(fd_ro, argv, newenviron);
    /* only reached if fexecve failed */
    perror("failed");
    return -1;
}

Caveat: Error handling is left out for clarity's sake. Includes are kept minimal for the sake of brevity.
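Whether /tmp really is memory based on a given machine can be checked at run time. The following is a minimal sketch, assuming Linux and the kernel header linux/magic.h; the helper name is_tmpfs is made up for this example.

```cpp
#include <sys/vfs.h>      // statfs
#include <linux/magic.h>  // TMPFS_MAGIC

// Hypothetical helper: returns true if `path` lives on a tmpfs
// (RAM-backed) file system, false otherwise or if statfs fails.
bool is_tmpfs(const char* path) {
    struct statfs info;
    if (statfs(path, &info) != 0) {
        return false;  // path missing or not accessible
    }
    return info.f_type == TMPFS_MAGIC;
}
```

A caller could test is_tmpfs("/tmp") and, if it returns false, fall back to another directory that is known to be RAM backed on that system.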

Note: By combining main() and memexec() into a single function and using splice(2) to copy directly between stdin and fd_wr, the program could be significantly optimized.

2) Execution directly from memory

One does not simply load and execute an ELF binary from memory. Some preparation, mostly related to dynamic linking, has to happen. There is a lot of material explaining the various steps of the ELF linking process, and studying it makes me believe that it is theoretically possible. See for example this closely related question on SO; however, there does not seem to be a working solution.

Update: UserModeExec seems to come very close.

Writing a working implementation would be very time consuming, and would surely raise some interesting questions in its own right. I like to believe this is by design: for most applications it is strongly undesirable to (accidentally) execute their input data, because that allows code injection.

What exactly happens when an ELF is executed? Normally the kernel receives a file name and then creates a process, loads and maps the different sections of the executable into memory, performs a lot of sanity checks, and marks it as executable before passing control and a file name on to the run-time linker ld-linux.so (part of libc). The run-time linker takes care of relocating functions, handling additional libraries, setting up global objects, and jumping to the executable's entry point. As I understand it, this heavy lifting is done by dl_main() (implemented in libc/elf/rtld.c).
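To make the first of these steps more concrete, here is a minimal sketch that inspects an ELF image already held in memory: it checks the magic bytes, counts the loadable segments that would have to be mapped, and reports the requested run-time linker, if any. It assumes a 64-bit image in host byte order; the names ElfSummary and summarize_elf64 are invented for this example. It performs no mapping or relocation and is nowhere near a loader.

```cpp
#include <elf.h>    // Elf64_Ehdr, Elf64_Phdr, PT_LOAD, PT_INTERP, ELFMAG
#include <cstddef>
#include <cstdint>
#include <cstring>  // memcmp, memcpy, strnlen
#include <string>

// Hypothetical summary type for this sketch.
struct ElfSummary {
    bool valid = false;       // looks like a 64-bit ELF image
    int load_segments = 0;    // number of PT_LOAD program headers
    std::string interpreter;  // PT_INTERP path, empty for static binaries
    std::uint64_t entry = 0;  // entry point address from the ELF header
};

ElfSummary summarize_elf64(const void* image, std::size_t size) {
    ElfSummary out;
    const unsigned char* base = static_cast<const unsigned char*>(image);
    Elf64_Ehdr eh;
    if (size < sizeof eh) return out;
    std::memcpy(&eh, base, sizeof eh);
    if (std::memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0) return out;
    if (eh.e_ident[EI_CLASS] != ELFCLASS64) return out;
    if (eh.e_phnum > 0) {
        // The program header table must lie entirely inside the buffer.
        if (eh.e_phentsize != sizeof(Elf64_Phdr)) return out;
        if (eh.e_phoff > size) return out;
        if (eh.e_phnum > (size - eh.e_phoff) / sizeof(Elf64_Phdr)) return out;
    }
    for (std::size_t i = 0; i < eh.e_phnum; ++i) {
        Elf64_Phdr ph;
        std::memcpy(&ph, base + eh.e_phoff + i * sizeof ph, sizeof ph);
        if (ph.p_type == PT_LOAD) {
            ++out.load_segments;  // a loader would have to map this segment
        } else if (ph.p_type == PT_INTERP && ph.p_offset < size &&
                   ph.p_filesz <= size - ph.p_offset) {
            const char* s = reinterpret_cast<const char*>(base + ph.p_offset);
            out.interpreter.assign(s, ::strnlen(s, ph.p_filesz));
        }
    }
    out.entry = eh.e_entry;
    out.valid = true;
    return out;
}
```

A non-empty interpreter field is the sign that dynamic linking is involved, which is the part a user-space loader would have to reproduce.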

Even fexecve is implemented using a file in /proc and it is this need for a file name that leads us to reimplement parts of this linking process.

Libraries

UserModeExec
libelf -- read, modify, create ELF files
eresi -- play with elfes
OSKit (seems like a dead project though)

Reading

http://www.linuxjournal.com/article/1060?page=0,0 -- introduction
http://wiki.osdev.org/ELF -- good overview
http://s.eresi-project.org/inc/articles/elf-rtld.txt -- more detailed Linux-specific explanation
http://www.codeproject.com/Articles/33340/Code-Injection-into-Running-Linux-Application -- how to get to hello world
http://www.acsu.buffalo.edu/~charngda/elf.html -- nice reference of ELF structure
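Linkers and Loaders by John Levine -- deeper explanation of linking

Related Questions at SO

Linux user-space ELF loader
ELF Dynamic loader symbol lookup ordering
load-time ELF relocation
How do global variables get initialized by the elf loader

So it seems possible; you decide whether it is also practical.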

18 revs, 2 users 95%
