Background: I am looking at developing a package manager similar to portage in Gentoo Linux (I may end up forking portage). For those who know little about Gentoo, it is a source-based distro, which means that all packages are compiled from source code. Currently it is possible to compile a program into object files and then link those into an executable:
$ gcc -c a.c -o a.o
$ gcc -c b.c -o b.o
$ gcc a.o b.o -o executable
The improvements I would like to make to portage are the following.
Reasoning: I am an Arch Linux user who loves the idea of a source-based distribution but cannot be bothered with the enormous task of keeping my system up to date. I also do most of my work on a laptop with a small hard drive, which is why I want to de-compile/un-link the executable back to object files rather than just keep the object files around, since they take up a large amount of space. It would also likely decrease the overall compile time of the system, as the need to re-compile most of the source code would be greatly reduced. It would also allow an easy way to change the USE flags on a package without the need to completely re-compile.
Question: Is it possible to compile object files into an executable and then de-compile it back into object files? An example of this is below.
$ gcc -c a.c -o a.o
$ gcc -c b.c -o b.o
$ gcc a.o b.o -o executable
and then
$ SomeCommand executable
output << a.o b.o
If this is not currently possible, would it be doable to modify a version of GNU's linker, ld, to log the changes it makes when linking object files, so as to intentionally make the program "reverse engineerable"?
Edit: Another use for this would be to separate a single object file from the executable of a large project, swap it for a new one, and re-link. This would reduce the overhead of re-linking a large project built from many files when only one of them has changed, and would allow incremental compilation at the binary level.
No, this is not possible. A large amount of the linker's work is replacing symbolic references (valid for any combination of object files being linked together) with numeric offsets (valid only for the particular way the linker decided to lay out that particular combination of object files, that particular time). Once the references are "baked" in this way, they cannot be recovered.
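To make the "baking" concrete, here is a minimal sketch in Python. It is a toy model, not ELF and not how ld is implemented: the object layout (a `symbols` table plus a `code` list containing `("ref", name)` placeholders) and the `link` function are invented for illustration only.

```python
# Toy model of static linking. The data layout and names are hypothetical,
# chosen only to show what information disappears at link time.

def link(objects):
    """Concatenate toy object files and resolve symbolic references."""
    # Pass 1: pick a layout and record where each symbol ends up.
    addresses = {}
    base = 0
    for obj in objects:
        for name, offset in obj["symbols"].items():
            addresses[name] = base + offset
        base += len(obj["code"])

    # Pass 2: replace every ("ref", name) placeholder with a plain number.
    image = []
    for obj in objects:
        for word in obj["code"]:
            if isinstance(word, tuple):
                image.append(addresses[word[1]])
            else:
                image.append(word)
    return image


a_o = {"symbols": {"main": 0}, "code": [10, ("ref", "helper"), 11]}
b_o = {"symbols": {"helper": 1}, "code": [20, 21, 22]}

print(link([a_o, b_o]))  # [10, 4, 11, 20, 21, 22]
print(link([b_o, a_o]))  # [20, 21, 22, 10, 1, 11]
```

In the first image nothing marks the `4` as "the address of helper" rather than an ordinary constant, and nothing records where a.o ends and b.o begins. The second call shows that the number itself depends on the layout chosen for that particular link.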
It might be doable if you alter/configure ld to keep the sections for each object file apart and also keep the relocation table for each object file in the executable. You also have to make sure ld stores the object file names in the executable if you want to get the original file names back.
Basically, a linker could just join the object files together and then apply the relocations; if the relocations are invertible, you should be able to reverse the process.
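As a rough illustration of that idea, the sketch below is a toy model in Python, not real ld behaviour. The object layout and the function names `link_keeping_relocs` and `unlink` are hypothetical. It assumes the linker records three extra things: where each input file was placed, which symbols it defined, and which words were patched. GNU ld does have an `--emit-relocs` option that leaves relocation sections in a fully linked executable, but whether that alone is enough to recover the original object files is not something this sketch demonstrates.

```python
# Toy "reversible" linker. All names and the data layout are hypothetical.

def link_keeping_relocs(objects):
    """Link toy objects, but keep per-file boundaries and relocation records."""
    addresses, base, sections = {}, 0, []
    for obj in objects:
        for name, offset in obj["symbols"].items():
            addresses[name] = base + offset
        # Remember file name, placement, size and the symbols it defined.
        sections.append((obj["name"], base, len(obj["code"]), dict(obj["symbols"])))
        base += len(obj["code"])

    image, relocs = [], []
    for obj in objects:
        for word in obj["code"]:
            if isinstance(word, tuple):
                # Record which position was patched and for which symbol.
                relocs.append((len(image), word[1]))
                image.append(addresses[word[1]])
            else:
                image.append(word)
    return {"image": image, "relocs": relocs, "sections": sections}


def unlink(exe):
    """Undo link_keeping_relocs using the recorded metadata."""
    words = list(exe["image"])
    for position, symbol in exe["relocs"]:
        words[position] = ("ref", symbol)  # restore the symbolic reference
    return [
        {"name": name, "symbols": symbols, "code": words[start:start + length]}
        for name, start, length, symbols in exe["sections"]
    ]


a_o = {"name": "a.o", "symbols": {"main": 0}, "code": [10, ("ref", "helper"), 11]}
b_o = {"name": "b.o", "symbols": {"helper": 1}, "code": [20, 21, 22]}

exe = link_keeping_relocs([a_o, b_o])
print(unlink(exe) == [a_o, b_o])  # True
```

The round trip only works because every patch here is a plain overwrite that the recorded metadata fully describes. Real relocation types can add to or combine with existing bytes, and a real linker may merge or discard sections, so invertibility would have to be checked for each case.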