
How to compile and execute from memory directly?

Tags:

c++

linux

Is it possible to compile a C++ (or the like) program without generating the executable file but writing it and executing it directly from memory?

For example with GCC and clang, something that has a similar effect to:

c++ hello.cpp -o hello.x && ./hello.x $@ && rm -f hello.x 

In the command line.

But without the burden of writing an executable to disk to immediately load/rerun it.

(If possible, the procedure should not use disk space at all, or at least not space in the current directory, which might be read-only.)

alfC asked Dec 03 '12 19:12




1 Answer

Possible? Not the way you seem to wish. The task has two parts:

1) How to get the binary into memory

On Linux we can specify /dev/stdout as the output file and pipe the result into a program x0 that reads an executable from stdin and executes it:

  gcc -pipe YourFile1.cpp YourFile2.cpp -o/dev/stdout -Wall | ./x0

In x0 we can just read from stdin until reaching the end of the file:

#include <stdlib.h>  // realloc
#include <unistd.h>  // read, STDIN_FILENO

/* memexec() is defined further down */
int main(int argc, char ** argv)
{
    size_t ntotal = 0;
    char * buf = 0;
    while (true)
    {
        /* increasing buffer size dynamically since we do not know how many bytes to read */
        buf = (char*)realloc(buf, ntotal + 4096);
        ssize_t nread = read(STDIN_FILENO, buf + ntotal, 4096);
        /* stop at end of file (0) or on a read error (-1) */
        if (nread <= 0) break;
        ntotal += nread;
    }
    return memexec(buf, ntotal, argv);
}

It would also be possible for x0 to execute the compiler directly and read its output. This question has been answered here: Redirecting exec output to a buffer or file
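A rough sketch of that idea, assuming a POSIX system: fork, redirect the child's stdout into a pipe, exec the command and collect whatever it writes. The helper name run_and_capture is made up for this illustration, and error handling is kept to a minimum. Whether the compiler is happy to write its output into a pipe is a separate matter.

```cpp
#include <string>
#include <sys/wait.h>  // waitpid
#include <unistd.h>    // pipe, fork, dup2, execvp, read, close

// Hypothetical helper: runs argv[0] with its stdout redirected into a pipe
// and returns everything the child wrote. Returns an empty string if the
// pipe or the fork could not be created.
std::string run_and_capture(char* const argv[]) {
    int fds[2];
    if (pipe(fds) != 0) return {};

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return {};
    }
    if (pid == 0) {
        // Child: make the write end of the pipe its stdout, then exec.
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);  // only reached if execvp failed
    }

    // Parent: close the write end so read() sees EOF once the child exits.
    close(fds[1]);
    std::string out;
    char chunk[4096];
    ssize_t n;
    while ((n = read(fds[0], chunk, sizeof chunk)) > 0) {
        out.append(chunk, static_cast<std::size_t>(n));
    }
    close(fds[0]);

    int status = 0;
    waitpid(pid, &status, 0);
    return out;
}
```

With the compiler as the command, the returned string would hold the bytes that x0 otherwise reads from stdin.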

Caveat: I just figured out that, for some strange reason, this does not work when I use a pipe (|) but does work when I use x0 < foo.

Note: If you are willing to modify your compiler, or you use a JIT framework such as LLVM/clang, you could generate executable code directly. For the rest of this discussion, however, I assume you want to use an existing compiler.

Note: Execution via temporary file

Other programs such as UPX achieve a similar behavior by executing a temporary file. This is easier and more portable than the approach outlined below. On systems where /tmp is mapped to a RAM disk (typical servers, for example), the temporary file will be memory based anyway.

#include <cstring>    // size_t
#include <fcntl.h>    // open, O_RDONLY
#include <stdio.h>    // perror
#include <stdlib.h>   // mkstemp
#include <sys/stat.h> // chmod, S_IRUSR, S_IXUSR
#include <unistd.h>   // write, close, unlink, fexecve

int memexec(void * exe, size_t exe_size, char * const argv[])
{
    /* random temporary file name in /tmp */
    char name[15] = "/tmp/fooXXXXXX";
    /* creates temporary file, returns writable file descriptor */
    int fd_wr = mkstemp(name);
    /* makes file executable and read-only */
    chmod(name, S_IRUSR | S_IXUSR);
    /* creates read-only file descriptor before deleting the file */
    int fd_ro = open(name, O_RDONLY);
    /* removes file from file system, kernel keeps the content until all fds are closed */
    unlink(name);
    /* writes executable to file */
    write(fd_wr, exe, exe_size);
    /* fexecve fails (ETXTBSY) as long as there is an open writable file descriptor */
    close(fd_wr);
    char * const newenviron[] = { NULL };
    fexecve(fd_ro, argv, newenviron);
    /* only reached if fexecve failed */
    perror("failed");
    return -1;
}

Caveat: Error handling is left out for clarity's sake, and the includes are kept to a minimum for brevity.
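On newer Linux systems (kernel 3.17 or later with glibc 2.27 or later), memfd_create(2) provides an anonymous file that lives in RAM and never gets a name in any directory, so /tmp is not needed at all. The following is a minimal sketch of the same idea as memexec() under that assumption. The names make_memfd and memexec_memfd are mine, not part of any library, and this swaps the temporary file for a memfd rather than reproducing the code above.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE      // memfd_create and fexecve are GNU/Linux extensions
#endif
#include <sys/mman.h>    // memfd_create, MFD_CLOEXEC
#include <unistd.h>      // write, close, fexecve
#include <cstddef>       // std::size_t
#include <cstdio>        // perror

// Hypothetical helper: copies `size` bytes into an anonymous in-memory file
// and returns its descriptor, or -1 on error.
int make_memfd(const void* data, std::size_t size) {
    int fd = memfd_create("memexec", MFD_CLOEXEC);
    if (fd < 0) return -1;

    const char* p = static_cast<const char*>(data);
    std::size_t done = 0;
    while (done < size) {
        ssize_t n = write(fd, p + done, size - done);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        done += static_cast<std::size_t>(n);
    }
    return fd;
}

// Hypothetical helper: replaces the current process with the ELF image held
// in `exe`. Returns -1 only if something failed; on success it never returns.
int memexec_memfd(const void* exe, std::size_t size,
                  char* const argv[], char* const envp[]) {
    int fd = make_memfd(exe, size);
    if (fd < 0) return -1;
    fexecve(fd, argv, envp);
    perror("fexecve");   // only reached if fexecve failed
    close(fd);
    return -1;
}
```

MFD_CLOEXEC is fine for ELF binaries; for a script with a #! line the interpreter would no longer find the descriptor, so the flag would have to be dropped in that case.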

Note: By combining main() and memexec() into a single function and using splice(2) to copy directly between stdin and fd_wr, the program could be significantly optimized.
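A sketch of what the splice(2) part could look like, assuming stdin really is a pipe (splice needs at least one end to be a pipe, so it would not apply to x0 < foo). The helper name splice_all is hypothetical, and the 64 KiB chunk size is an arbitrary choice.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE    // splice is a Linux-specific extension
#endif
#include <fcntl.h>     // splice, SPLICE_F_MOVE, open
#include <unistd.h>    // ssize_t, close

// Hypothetical helper: moves everything readable from the pipe `pipe_in`
// into `fd_out` without copying through a user-space buffer.
// Returns the number of bytes moved, or -1 on error.
ssize_t splice_all(int pipe_in, int fd_out) {
    ssize_t total = 0;
    for (;;) {
        ssize_t n = splice(pipe_in, nullptr, fd_out, nullptr,
                           1 << 16, SPLICE_F_MOVE);
        if (n == 0) return total;  // write end closed: end of input
        if (n < 0) return -1;      // e.g. EINVAL if neither end is a pipe
        total += n;
    }
}
```

In the combined program this call would replace both the realloc/read loop and the write(fd_wr, ...) step, as in splice_all(STDIN_FILENO, fd_wr).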

2) Execution directly from memory

One does not simply load and execute an ELF binary from memory. Some preparation, mostly related to dynamic linking, has to happen. There is a lot of material explaining the various steps of the ELF linking process, and studying it makes me believe that it is theoretically possible. See for example this closely related question on SO; however, there does not seem to be a working solution.

Update: UserModeExec seems to come very close.

Writing a working implementation would be very time consuming and would surely raise some interesting questions in its own right. I like to believe this is by design: for most applications it is strongly undesirable to (accidentally) execute their input data, because that allows code injection.

What happens exactly when an ELF is executed? Normally the kernel receives a file name and then creates a process, loads and maps the different sections of the executable into memory, performs a lot of sanity checks and marks it as executable before passing control and a file name back to the run-time linker ld-linux.so (part of libc). The run-time linker then takes care of relocating functions, handling additional libraries, setting up global objects and jumping to the executable's entry point. As I understand it, this heavy lifting is done by dl_main() (implemented in libc/elf/rtld.c).
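To make the very first of those steps a bit more concrete, here is a small sketch that inspects an ELF image already sitting in a buffer: it checks the magic bytes, reads the entry point and looks for a PT_INTERP program header, which names the run-time linker. It uses the structures from <elf.h>; the names ElfSummary and inspect_elf64 are invented for this example, and it assumes a 64-bit ELF in the host's byte order. This is nowhere near a loader, only the sanity-check part.

```cpp
#include <elf.h>     // Elf64_Ehdr, Elf64_Phdr, ELFMAG, PT_INTERP
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uint64_t
#include <cstring>   // std::memcpy, std::memcmp
#include <string>

// Hypothetical summary of the few header fields looked at here.
struct ElfSummary {
    bool valid = false;       // ELF magic present and 64-bit class
    std::uint64_t entry = 0;  // e_entry: address the loader finally jumps to
    std::string interpreter;  // PT_INTERP path; empty for static binaries
};

ElfSummary inspect_elf64(const unsigned char* image, std::size_t size) {
    ElfSummary s;
    if (size < sizeof(Elf64_Ehdr)) return s;

    Elf64_Ehdr eh;
    std::memcpy(&eh, image, sizeof eh);
    if (std::memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0) return s;
    if (eh.e_ident[EI_CLASS] != ELFCLASS64) return s;

    s.valid = true;
    s.entry = eh.e_entry;
    if (eh.e_phoff > size) return s;  // program header table out of bounds

    // Walk the program headers and pick up the interpreter path, if any.
    for (unsigned i = 0; i < eh.e_phnum; ++i) {
        std::size_t off = eh.e_phoff + std::size_t(i) * eh.e_phentsize;
        if (off > size || size - off < sizeof(Elf64_Phdr)) break;

        Elf64_Phdr ph;
        std::memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type == PT_INTERP && ph.p_offset < size &&
            ph.p_filesz <= size - ph.p_offset) {
            s.interpreter.assign(
                reinterpret_cast<const char*>(image + ph.p_offset),
                ph.p_filesz);
            // The stored path is NUL-terminated; drop the terminator.
            while (!s.interpreter.empty() && s.interpreter.back() == '\0') {
                s.interpreter.pop_back();
            }
        }
    }
    return s;
}
```

A non-empty interpreter field is exactly the case where the dynamic-linking work described above has to be redone by hand.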

Even fexecve is implemented using a file in /proc, and it is this need for a file name that leads us to reimplement parts of this linking process.
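The /proc trick can be sketched in a few lines. This is an illustration of the idea, not glibc's actual code, and the function names are hypothetical. According to the fexecve(3) man page, glibc 2.27 and later use execveat(2) when the kernel supports it and rely on /proc only as a fallback.

```cpp
#include <string>
#include <unistd.h>  // execve

// Hypothetical helper: the name under which the kernel exposes an open
// descriptor of the calling process.
std::string proc_path_for_fd(int fd) {
    return "/proc/self/fd/" + std::to_string(fd);
}

// Hypothetical helper: executes whatever `fd` refers to by going through its
// /proc name. Returns -1 only if execve failed (for instance when /proc is
// not mounted); on success it never returns.
int exec_via_proc(int fd, char* const argv[], char* const envp[]) {
    const std::string path = proc_path_for_fd(fd);
    execve(path.c_str(), argv, envp);
    return -1;
}
```

Either way the kernel is still handed something it can open as a file, which is why a buffer alone is not enough.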

Libraries

  • UserModeExec
  • libelf -- read, modify, create ELF files
  • eresi -- play with elfes
  • OSKit (seems like a dead project though)

Reading

  • http://www.linuxjournal.com/article/1060?page=0,0 -- introduction
  • http://wiki.osdev.org/ELF -- good overview
  • http://s.eresi-project.org/inc/articles/elf-rtld.txt -- more detailed Linux-specific explanation
  • http://www.codeproject.com/Articles/33340/Code-Injection-into-Running-Linux-Application -- how to get to hello world
  • http://www.acsu.buffalo.edu/~charngda/elf.html -- nice reference of ELF structure
  • Linkers and Loaders by John Levine -- deeper explanation of linking

Related Questions at SO

  • Linux user-space ELF loader
  • ELF Dynamic loader symbol lookup ordering
  • load-time ELF relocation
  • How do global variables get initialized by the elf loader

So it seems possible; you decide whether it is also practical.

18 revs, 2 users 95% answered Oct 14 '22 06:10