How to add (and use) binary data to compiled executable?

There are several questions dealing with some aspects of this problem, but none seems to answer it wholly. The whole problem can be summarized as follows:

  • You have an already compiled executable (presumably built in anticipation of this technique).
  • You want to add arbitrarily sized binary data to it (not necessarily done by the executable itself, which would be another nasty problem to deal with).
  • You want the already compiled executable to be able to access this added binary data.

My particular use case is an interpreter: I would like to let the user produce a single-file executable out of an interpreter binary and the code they supply (the interpreter binary being the executable that would have to be patched with the user-supplied code as binary data).

A similar case is self-extracting archives, where a program (the archiving utility, such as zip) can construct an executable that contains a pre-built decompressor (the already compiled executable) and user-supplied data (the contents of the archive). Obviously no compiler or linker is involved in this process (thanks, Mathias, for the note and for pointing out 7-zip).

From existing questions, one path to a solution emerges along the following lines:

appending data to an exe - This deals with adding arbitrary data to arbitrary executables, without covering how to actually access it (basically, a simple append usually works; this is also true of Unix's ELF format).
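The append step could look roughly like the following sketch. The trailer layout (payload, then an 8-byte little-endian length, then an 8-byte magic string) and the names `append_payload` and `TRAILER_MAGIC` are assumptions made for illustration; they do not come from any of the linked questions.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical trailer layout (an assumption of this sketch, not a standard):
 *   [original executable][payload][8-byte LE payload length][8-byte magic] */
#define TRAILER_MAGIC "PAYLOAD1" /* exactly 8 bytes, chosen for illustration */

/* Copy stub_path to out_path, then append the payload and the trailer.
 * Returns 0 on success, -1 on any I/O error. */
static int append_payload(const char *stub_path, const char *out_path,
                          const void *payload, uint64_t len)
{
    unsigned char le[8];
    char buf[4096];
    size_t n;
    int ok = 1;

    FILE *in = fopen(stub_path, "rb");
    if (!in) return -1;
    FILE *out = fopen(out_path, "wb");
    if (!out) { fclose(in); return -1; }

    /* Copy the compiled executable unchanged. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
        if (fwrite(buf, 1, n, out) != n) { ok = 0; break; }
    if (ferror(in)) ok = 0;

    /* Encode the length byte by byte so the result does not depend on
     * the endianness of the machine doing the packing. */
    for (int i = 0; i < 8; i++)
        le[i] = (unsigned char)(len >> (8 * i));

    if (ok && len > 0 && fwrite(payload, 1, (size_t)len, out) != len) ok = 0;
    if (ok && fwrite(le, 1, 8, out) != 8) ok = 0;
    if (ok && fwrite(TRAILER_MAGIC, 1, 8, out) != 8) ok = 0;

    fclose(in);
    if (fclose(out) != 0) ok = 0;
    return ok ? 0 : -1;
}
```

This needs only the C standard library, so the packing tool does not depend on a compiler or linker being installed. On Unix-like systems the output file would still have to be marked executable afterwards, which is outside standard C.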

Finding current executable's path without /proc/self/exe - Together with the above, this would give a file name to use for opening the exe in order to access the added data. There are many more questions of this kind, but none focuses specifically on getting a path suitable for actually opening the binary as a file (a goal which, taken alone, might (?) be easier to accomplish: strictly speaking you don't even need the path, just the binary opened for reading).

There may also be other, probably more elegant ways around this problem than padding the binary and opening the file to read it in. For example, could the executable be built so that it becomes fairly trivial to patch it later with arbitrarily sized data, so that the data appears "within" it, in some proper data segment? (I couldn't really find anything on this; for fixed-size data it should be trivial, though, unless the executable carries some hash of itself.)

Can this be done reasonably well with as little deviation from standard C as possible? Even more or less cross-platform (at least from a maintenance standpoint)? Note that it would be preferable if the program that adds the binary data didn't rely on compiler tools to do it (which the user might not have), but solutions necessitating those might also be useful.

Note the already-compiled-executable criterion (the first point in the list above), which requires a completely different approach from the solutions described in questions like C/C++ with GCC: Statically add resource files to executable/library or SDL embed image inside program executable, which ask about embedding data at compile time.

Additional notes:

The obvious approach outlined above and suggested in some comments, namely to just append to the binary and use that, has the following problems:

  • Opening the currently running program's binary doesn't seem to be trivial (opening the executable for reading is, but finding the path to pass to the file open call is not, at least not in a reasonably cross-platform manner).
  • The method of acquiring the path may provide an attack surface that probably wouldn't exist otherwise. A potential attacker could trick the program into reading binary data supplied by the attacker instead of the data the executable actually carries, exposing any vulnerability that might reside in the parser of the data.
Jubatian asked Sep 07 '15 11:09


1 Answer

It depends on how you want other systems to see your binary.

Digitally signed in Windows

The exe format allows for verifying that the file has not been modified since publishing. This would allow you to:

  1. Compile your file
  2. Add your data packet
  3. Sign your file and publish it.

The advantage of following this scheme is that "everybody" agrees your file has not been modified since signing.

The easiest way to achieve this scheme is to use a resource. Windows resources can be added post-linking. They are protected by the Authenticode digital signature, and your program can extract the resource data from itself.

It used to be possible to enlarge the signature to include binary data, and some binaries did keep data in the signature section. Unfortunately, this was used maliciously and has since been banned. Some details are in an MSDN blog post.
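Reading such a resource back at run time might look like the sketch below. It assumes the data was stored as a raw-data resource (type RT_RCDATA, numeric value 10) under a name of your choosing; the function name `load_embedded` is hypothetical, and the non-Windows branch exists only so the sketch compiles elsewhere. The post-link step of adding the resource is not shown; the Win32 `BeginUpdateResource` / `UpdateResource` / `EndUpdateResource` family is one way that could be done.

```c
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Return a pointer to the raw-data resource called `name` inside the
 * running executable and store its size in *out_len.
 * Returns NULL (and size 0) if it is missing or the platform is not Windows.
 * The returned memory belongs to the loaded module and must not be freed. */
static const void *load_embedded(const char *name, size_t *out_len)
{
    *out_len = 0;
#ifdef _WIN32
    /* NULL module handle means "the executable of the current process". */
    HRSRC res = FindResourceA(NULL, name, MAKEINTRESOURCEA(10)); /* 10 == RT_RCDATA */
    if (!res) return NULL;
    HGLOBAL handle = LoadResource(NULL, res);
    if (!handle) return NULL;
    *out_len = (size_t)SizeofResource(NULL, res);
    return LockResource(handle);
#else
    (void)name; /* no resource section outside Windows in this sketch */
    return NULL;
#endif
}
```

Because the loader maps resources together with the image, this route needs neither the executable's path nor a file handle.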

Breaking the signature

If re-signing is not an option, then the result would be treated as insecure. It is worth noting here that appended data is insecure and can be modified without people being able to tell, but so can the code in your binary.

Appending data to a binary breaks the digital signature, which also means the end user can't tell whether the code has been modified.

As a result, any self-protection you add to your code to ensure the data blob is still intact would not prevent your code from being modified to remove the check.

Running module

On Windows, GetModuleFileName allows the path of the running executable to be found.

Linux offers /proc/self/exe (or /proc/<pid>/exe).

Other Unix systems do not seem to have a method that is reliable.
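The two platform-specific lookups can be wrapped behind one function, as in this sketch. The name `self_exe_path` is hypothetical, and every platform other than Windows and Linux simply reports failure here, reflecting the lack of a reliable common method.

```c
#define _POSIX_C_SOURCE 200809L /* expose readlink() under strict -std=c11 */
#include <stddef.h>
#if defined(_WIN32)
#include <windows.h>
#elif defined(__linux__)
#include <unistd.h>
#endif

/* Store the NUL-terminated path of the running executable in buf.
 * Returns 0 on success, -1 on failure, truncation, or unsupported platform. */
static int self_exe_path(char *buf, size_t size)
{
    if (buf == NULL || size == 0) return -1;
#if defined(_WIN32)
    /* Returns the number of characters copied; a value equal to the
     * buffer size means the path was truncated. */
    DWORD n = GetModuleFileNameA(NULL, buf, (DWORD)size);
    return (n == 0 || n >= size) ? -1 : 0;
#elif defined(__linux__)
    /* readlink() does not NUL-terminate; a completely filled buffer
     * may mean truncation, so it is treated as an error. */
    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n < 0 || (size_t)n >= size - 1) return -1;
    buf[n] = '\0';
    return 0;
#else
    return -1; /* no reliable generic method assumed */
#endif
}
```

On Linux the path is not strictly needed: opening "/proc/self/exe" directly for reading gives access to the running binary, which also avoids trusting a path that might have been tampered with.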

Data reading

The approach of the zip format is to have a directory written at the end of the file. A reader finds the directory at the end of the file and then works backwards from it to the start of the data. The advantage here is that the data blob is signposted from the end of the file rather than from its natural start, so it can be located however long the executable in front of it is.
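A much simplified version of that idea is sketched below. It is not the zip format: it assumes a hypothetical fixed 16-byte trailer at the very end of the file, made of an 8-byte little-endian payload length followed by the 8-byte magic "PAYLOAD1", with the payload stored immediately before the trailer. The names `read_payload`, `TRAILER_MAGIC` and `TRAILER_SIZE` are illustrative only.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TRAILER_MAGIC "PAYLOAD1" /* hypothetical 8-byte marker */
#define TRAILER_SIZE  16         /* 8-byte LE length + 8-byte magic */

/* Read the payload that precedes the trailer at the end of `path`.
 * Returns a malloc'd buffer (caller frees) and sets *out_len,
 * or returns NULL if the file has no valid trailer or on I/O error. */
static void *read_payload(const char *path, size_t *out_len)
{
    unsigned char tr[TRAILER_SIZE];
    uint64_t len = 0;
    long file_size;
    void *data = NULL;
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;

    /* Start from the end of the file: that is where the signpost lives. */
    if (fseek(f, 0, SEEK_END) != 0) goto done;
    file_size = ftell(f);
    if (file_size < TRAILER_SIZE) goto done;
    if (fseek(f, file_size - TRAILER_SIZE, SEEK_SET) != 0) goto done;
    if (fread(tr, 1, TRAILER_SIZE, f) != TRAILER_SIZE) goto done;
    if (memcmp(tr + 8, TRAILER_MAGIC, 8) != 0) goto done;

    /* Decode the little-endian length and reject impossible values. */
    for (int i = 7; i >= 0; i--)
        len = (len << 8) | tr[i];
    if (len > (uint64_t)(file_size - TRAILER_SIZE)) goto done;

    /* Step backwards from the trailer to the start of the payload. */
    if (fseek(f, file_size - TRAILER_SIZE - (long)len, SEEK_SET) != 0) goto done;
    data = malloc(len ? (size_t)len : 1);
    if (data && fread(data, 1, (size_t)len, f) != (size_t)len) {
        free(data);
        data = NULL;
    }
    if (data) *out_len = (size_t)len;
done:
    fclose(f);
    return data;
}
```

The length is validated against the file size before it is used, since the trailer is exactly the kind of attacker-controllable input discussed above. Using `long` offsets limits this sketch to files that `ftell` can describe on the target platform.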

mksteve answered Sep 19 '22 08:09