Is it possible to compile a C++ (or the like) program without generating the executable file but writing it and executing it directly from memory?
For example with GCC and clang, something that has a similar effect to the following on the command line:

    c++ hello.cpp -o hello.x && ./hello.x $@ && rm -f hello.x
But without the burden of writing an executable to disk only to load and run it immediately.
(If possible, the procedure should not use disk space at all, or at least not space in the current directory, which might be read-only.)
In a general sense, compiling means converting source code into executable code. During compilation the syntax is checked and the source is translated (into bytecode for Java, into machine code for C++). Executing simply runs the resulting code and displays its output.
Compilation is needed because a computer cannot run source code directly; it only understands machine code. Source code is a human-readable format that the system cannot execute as is.
Compiled languages are converted directly into machine code that the processor can execute. As a result, they tend to be faster and more efficient to execute than interpreted languages. They also give the developer more control over hardware aspects, like memory management and CPU usage.
Possible? Not the way you seem to wish. The task has two parts: getting the compiler's output into memory, and executing it from there.
When we specify /dev/stdout as the output file in Linux, we can then pipe into our program x0, which reads an executable from stdin and executes it:

    gcc -pipe YourFiles1.cpp YourFile2.cpp -o/dev/stdout -Wall | ./x0
In x0 we can just read from stdin until reaching the end of the file:

    #include <stdlib.h>  // realloc
    #include <unistd.h>  // read, STDIN_FILENO

    int memexec(void *exe, size_t exe_size, char *const argv[]);

    int main(int argc, char **argv) {
        size_t ntotal = 0;
        char *buf = 0;
        while (true) {
            /* increasing buffer size dynamically since we do not know how many bytes to read */
            buf = (char *)realloc(buf, ntotal + 4096);
            ssize_t nread = read(STDIN_FILENO, buf + ntotal, 4096);
            if (nread <= 0) break; /* 0 means end of file, <0 means error */
            ntotal += nread;
        }
        return memexec(buf, ntotal, argv);
    }
It would also be possible for x0 to execute the compiler directly and read its output. This question has been answered here: Redirecting exec output to a buffer or file
Caveat: I just figured out that for some strange reason this does not work when I use a pipe (|), but it works when I use x0 < foo.
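As a rough illustration of x0 running a command itself and collecting its output in a buffer, here is a minimal sketch using popen(3). The helper name read_command_output is hypothetical, and the sketch only shows the capturing step; whether the compiler will write a usable executable to a pipe is a separate matter, as the caveat above suggests.

```cpp
#include <cassert>
#include <stdio.h>   // popen, pclose, fread
#include <string>

// Hypothetical helper: runs `cmd` through the shell and returns everything
// the command wrote to its standard output (binary-safe).
std::string read_command_output(const char *cmd) {
    assert(cmd != nullptr);
    std::string out;
    FILE *pipe = popen(cmd, "r");
    if (!pipe) return out;             // could not start the command
    char chunk[4096];
    size_t n;
    while ((n = fread(chunk, 1, sizeof chunk, pipe)) > 0)
        out.append(chunk, n);          // grow the buffer as data arrives
    pclose(pipe);
    return out;
}
```

The returned string's data() and size() could then be handed to a function like memexec().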
Note: If you are willing to modify your compiler or you do JIT like LLVM, clang and other frameworks you could directly generate executable code. However for the rest of this discussion I assume you want to use an existing compiler.
Other programs such as UPX achieve a similar behavior by executing a temporary file. This is easier and more portable than the approach outlined below. On systems where /tmp is mapped to a RAM disk, for example typical servers, the temporary file will be memory based anyway.
    #include <cstring>    // size_t
    #include <fcntl.h>    // open, O_RDONLY
    #include <stdio.h>    // perror
    #include <stdlib.h>   // mkstemp
    #include <sys/stat.h> // chmod, S_IRUSR, S_IXUSR
    #include <unistd.h>   // write, close, unlink, fexecve

    int memexec(void *exe, size_t exe_size, char *const argv[]) {
        /* random temporary file name in /tmp */
        char name[15] = "/tmp/fooXXXXXX";
        /* creates temporary file, returns writeable file descriptor */
        int fd_wr = mkstemp(name);
        /* makes file executable and readonly */
        chmod(name, S_IRUSR | S_IXUSR);
        /* creates read-only file descriptor before deleting the file */
        int fd_ro = open(name, O_RDONLY);
        /* removes file from file system, kernel buffers content in memory until all fd closed */
        unlink(name);
        /* writes executable to file */
        write(fd_wr, exe, exe_size);
        /* fexecve will not work as long as there is an open writeable file descriptor */
        close(fd_wr);
        char *const newenviron[] = { NULL };
        fexecve(fd_ro, argv, newenviron);
        /* only reached if fexecve failed */
        perror("failed");
        return -1;
    }
Caveat: Error handling is left out for clarity's sake, and the code is kept short for brevity.
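For comparison, here is a sketch of a variant that does not depend on /tmp being a RAM disk. It swaps the temporary file for memfd_create(2), a Linux-specific call that is not part of the approach above and that I am assuming is available (reasonably recent kernel and glibc). The names image_to_memfd and memexec_memfd are hypothetical.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE      // memfd_create and environ are GNU/Linux extensions
#endif
#include <cassert>
#include <sys/mman.h>    // memfd_create
#include <unistd.h>      // write, close, fexecve, environ

// Hypothetical helper: copies the image into an anonymous, memory-backed
// file and returns its descriptor, or -1 on failure.
int image_to_memfd(const void *exe, size_t exe_size) {
    assert(exe != nullptr);
    int fd = memfd_create("memexec", 0);   // the name is only a debugging label
    if (fd < 0) return -1;
    const char *p = static_cast<const char *>(exe);
    size_t done = 0;
    while (done < exe_size) {              // write() may be partial
        ssize_t n = write(fd, p + done, exe_size - done);
        if (n <= 0) { close(fd); return -1; }
        done += static_cast<size_t>(n);
    }
    return fd;
}

// Replaces the current process with the image; returns only on failure.
int memexec_memfd(const void *exe, size_t exe_size, char *const argv[]) {
    int fd = image_to_memfd(exe, exe_size);
    if (fd < 0) return -1;
    fexecve(fd, argv, environ);
    close(fd);
    return -1;
}
```

No file name ever appears in the file system here, so a read-only current directory or a disk-backed /tmp should not matter.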
Note: By combining main() and memexec() into a single function and using splice(2) for copying directly between stdin and fd_wr, the program could be significantly optimized.
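The splice(2) idea could look roughly like the following sketch. It assumes Linux and that the input descriptor is a pipe (splice requires at least one end to be a pipe, so it would not apply to x0 < foo). The helper name splice_all is hypothetical.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   // splice is a Linux extension
#endif
#include <cassert>
#include <fcntl.h>    // splice
#include <unistd.h>   // ssize_t, pipe, read, write, close

// Hypothetical helper: moves all bytes from pipe_fd to out_fd inside the
// kernel, without copying them through a user-space buffer.
// Returns the number of bytes moved, or -1 on error.
ssize_t splice_all(int pipe_fd, int out_fd) {
    assert(pipe_fd >= 0 && out_fd >= 0);
    ssize_t total = 0;
    for (;;) {
        ssize_t n = splice(pipe_fd, nullptr, out_fd, nullptr, 1 << 16, 0);
        if (n < 0) return -1;   // error, errno is set
        if (n == 0) break;      // end of input
        total += n;
    }
    return total;
}
```

In a combined function this would be called as splice_all(STDIN_FILENO, fd_wr) in place of the read loop and the write() call.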
One does not simply load and execute an ELF binary from memory. Some preparation, mostly related to dynamic linking, has to happen. There is a lot of material explaining the various steps of the ELF linking process, and studying it makes me believe that it is theoretically possible. See for example this closely related question on SO; however, there does not seem to be a working solution.
Update: UserModeExec seems to come very close.
Writing a working implementation would be very time consuming, and would surely raise some interesting questions in its own right. I like to believe this is by design: for most applications it is strongly undesirable to (accidentally) execute their input data, because that allows code injection.
What happens exactly when an ELF is executed? Normally the kernel receives a file name and then creates a process, loads and maps the different sections of the executable into memory, performs a lot of sanity checks and marks it as executable before passing control and a file name back to the run-time linker ld-linux.so (part of libc). The latter takes care of relocating functions, handling additional libraries, setting up global objects and jumping to the executable's entry point. As I understand it, this heavy lifting is done by dl_main() (implemented in libc/elf/rtld.c).
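To give a flavor of the sanity checks mentioned above, here is a small sketch that inspects the start of an in-memory buffer using the structures from <elf.h>. It covers only a tiny fraction of what the kernel and ld-linux.so actually verify, it handles 64-bit images only, and the helper name looks_like_elf64 is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>   // memcmp, memcpy, memset
#include <elf.h>     // Elf64_Ehdr, ELFMAG, SELFMAG, EI_CLASS, ET_EXEC, ET_DYN

// Hypothetical helper: true if buf starts with a plausible 64-bit ELF header
// for an executable (ET_EXEC) or position-independent (ET_DYN) image.
bool looks_like_elf64(const void *buf, size_t size) {
    assert(buf != nullptr);
    if (size < sizeof(Elf64_Ehdr)) return false;     // too short for a header
    Elf64_Ehdr hdr;
    std::memcpy(&hdr, buf, sizeof hdr);               // avoid unaligned access
    if (std::memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0) return false;
    if (hdr.e_ident[EI_CLASS] != ELFCLASS64) return false;
    return hdr.e_type == ET_EXEC || hdr.e_type == ET_DYN;
}
```

A real loader would go on to walk the program headers (e_phoff, e_phnum), map the PT_LOAD segments and locate PT_INTERP, which is where most of the work lies.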
Even fexecve is implemented using a file in /proc, and it is this need for a file name that leads us to reimplement parts of this linking process.
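The /proc dependency can be illustrated with a sketch of how an fexecve-like call can be expressed as a plain execve on a path. This is not glibc's source; as far as I know, newer glibc versions try execveat(2) first and only fall back to a path of this form. The names proc_path_for_fd and exec_via_proc are hypothetical.

```cpp
#include <cassert>
#include <cstdio>    // snprintf
#include <string>
#include <unistd.h>  // execve

// Hypothetical helper: the /proc path that names an open descriptor of the
// calling process.
std::string proc_path_for_fd(int fd) {
    assert(fd >= 0);
    char path[64];
    std::snprintf(path, sizeof path, "/proc/self/fd/%d", fd);
    return path;
}

// Sketch of the fallback: exec by name through /proc.
// Returns only on failure (for example when /proc is not mounted).
int exec_via_proc(int fd, char *const argv[], char *const envp[]) {
    return execve(proc_path_for_fd(fd).c_str(), argv, envp);
}
```

Either way the kernel is handed something that resolves to a file, which is why a raw memory buffer alone is not enough.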
Libraries
Reading
Related Questions at SO
So it seems possible; you decide whether it is also practical.