I'm currently looking for a way to add data to an already compiled ELF executable, i.e. embedding a file into the executable without recompiling it.
I could easily do that by using cat myexe mydata > myexe_with_mydata
, but I couldn't access the data from the executable because I don't know the size of the original executable.
Does anyone have an idea of how I could implement this ? I thought of adding a section to the executable or using a special marker (0xBADBEEFC0FFEE
for example) to detect the beginning of the data in the executable, but I do not know if there is a more beautiful way to do it.
Thanks in advance.
ELF files are used by two tools: the linker and the loader. A linker combines multiple ELF files into an executable or a library and a loader loads the executable ELF file in the memory of the process.
In computing, the Executable and Linkable Format (ELF, formerly named Extensible Linking Format), is a common standard file format for executable files, object code, shared libraries, and core dumps.
out could be the binary that will be running in the Computer/board and the elf is the file that gives information about the binary. For example it may provide information at which adress of the binary each variable is located.
You could add the file to the elf file as a special section with objcopy(1):
objcopy --add-section sname=file oldelf newelf
will add the file to oldelf and write the results to newelf (oldelf won't be modified)
You can then use libbfd to read the elf file and extract the section by name, or just roll your own code that reads the section table and finds you section. Make sure to use a section name that doesn't collide with anything the system is expecting -- as long as your name doesn't start with a .
, you should be fine.
I've created a small library called elfdataembed which provides a simple interface for extracting/referencing sections embedded using objcopy
. This allows you to pass the offset/size to another tool, or reference it directly from the runtime using file descriptors. Hopefully this will help someone in the future.
It's worth mentioning this approach is more efficient than compiling to a symbol, as it allows external tools to reference the data without needing to be extracted, and it also doesn't require the entire binary to be loaded into memory in order to extract/reference it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With