
Embedding binary data in a script efficiently

I have seen some installation files for Unix-like systems (huge ones, such as install.sh for Matlab or Mathematica) that must embed quite a lot of binary data, such as icons, sound, and graphics, in the script itself. I am wondering how this is done, since it could be useful for simplifying the file structure.

I am particularly interested in doing this with Python and/or Bash.

Existing methods that I know of in Python:

  1. Just use a byte string: x = b'\x23\xa3\xef' ... This is terribly inefficient: each byte written as \xNN takes four characters of source, so a 100 KB wav file takes close to half a MB.
  2. base64: better than option 1, but it still enlarges the data by a factor of 4/3.
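
The overhead of the two options above can be checked with a small sketch. The helper names and the sample data are hypothetical, chosen only for illustration; the sample stands in for a real binary file.

```python
import base64


def hex_escaped_literal(data: bytes) -> str:
    """Render data as a bytes literal in which every byte is written as \\xNN."""
    return "b'" + "".join(f"\\x{byte:02x}" for byte in data) + "'"


def base64_text(data: bytes) -> str:
    """Render data as base64 text (4 output characters per 3 input bytes)."""
    return base64.b64encode(data).decode("ascii")


# Hypothetical stand-in for a binary file: 100 KiB covering every byte value.
sample = bytes(range(256)) * 400

hex_ratio = len(hex_escaped_literal(sample)) / len(sample)
b64_ratio = len(base64_text(sample)) / len(sample)
print(f"\\xNN literal: {hex_ratio:.2f}x, base64: {b64_ratio:.2f}x")
```

Under these assumptions the escaped literal comes out at roughly four times the raw size and base64 at roughly 4/3, which matches the figures in the list.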

I am wondering if there are other (better) ways to do this?

qed asked Oct 22 '25 15:10


2 Answers

You can use base64 + compression (using bz2 for instance) if that suits your data (e.g., if you're not embedding already compressed data).

For instance, to create your data (say your data consist of 100 null bytes followed by 200 bytes with value 0x01):

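>>> import base64, bz2
>>> # Python 3: compress first, then base64-encode the compressed bytes
>>> base64.b64encode(bz2.compress(b'\x00' * 100 + b'\x01' * 200))
b'QlpoOTFBWSZTWcl9Q1UAAABBBGAAQAAEACAAIZpoM00SrccXckU4UJDJfUNV'

And to use it (in your script) to write the data to a file:

import base64
import bz2

# Payload generated above: bz2-compressed data, base64-encoded
data = b'QlpoOTFBWSZTWcl9Q1UAAABBBGAAQAAEACAAIZpoM00SrccXckU4UJDJfUNV'
# Binary mode ('wb') is required: the decompressed payload is bytes, not text
with open('/tmp/testfile', 'wb') as fdesc:
    fdesc.write(bz2.decompress(base64.b64decode(data)))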
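
As a rough check of the caveat about already compressed data, here is a minimal sketch that round-trips two payloads and compares sizes. The names pack and unpack are hypothetical helpers, and os.urandom is only an assumed stand-in for data that is already compressed.

```python
import base64
import bz2
import os


def pack(raw: bytes) -> bytes:
    """Compress with bz2, then base64-encode so the result is plain ASCII."""
    return base64.b64encode(bz2.compress(raw))


def unpack(text: bytes) -> bytes:
    """Reverse pack(): base64-decode, then decompress."""
    return bz2.decompress(base64.b64decode(text))


redundant = b"\x00" * 100 + b"\x01" * 200   # the 300-byte example data
random_like = os.urandom(300)               # stand-in for already compressed data

for label, raw in (("redundant", redundant), ("random", random_like)):
    packed = pack(raw)
    assert unpack(packed) == raw            # round trip must be lossless
    print(f"{label}: {len(raw)} bytes raw -> {len(packed)} characters embedded")
```

The redundant payload should shrink well below its raw size, while the random one should end up larger than it started, since bz2 cannot shrink it and base64 still adds its 4/3 overhead.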

Pierre answered Oct 25 '25 06:10


Here's a quick and dirty way. Create the following script called myInstaller:

#!/bin/bash

dd if="$0" of=payload bs=1 skip=54

exit

Then append your binary to the script, and make it executable:

cat myBinary >> myInstaller
chmod +x myInstaller

When you run the script, dd copies the binary portion to the new file named by of= (payload here). The payload could be a tar file or anything else, so you can do additional processing (unarchiving, setting execute permissions, etc.) after the dd command. Just adjust the number in skip= to the total length in bytes of the script before the binary data starts; for the script above that is 54, counting the newline after exit.
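
Keeping the skip value in sync by hand is error-prone, so here is a minimal Python sketch that generates the stub with the right number. STUB_TEMPLATE, build_stub, and build_installer are hypothetical names introduced for illustration; the stub text mirrors the script above, and the slicing at the end only simulates what dd would copy.

```python
STUB_TEMPLATE = (
    "#!/bin/bash\n"
    "\n"
    'dd if="$0" of={out} bs=1 skip={skip}\n'
    "\n"
    "exit\n"
)


def build_stub(out: str = "payload") -> bytes:
    """Return the stub with skip= set to the stub's own length in bytes."""
    skip = 0
    while True:
        stub = STUB_TEMPLATE.format(out=out, skip=skip).encode("ascii")
        if len(stub) == skip:
            return stub
        skip = len(stub)  # retry: the digit count of skip affects the length


def build_installer(payload: bytes, out: str = "payload") -> bytes:
    """Concatenate stub and payload, like `cat myBinary >> myInstaller`."""
    return build_stub(out) + payload


stub = build_stub()
installer = build_installer(b"\x00\x01binary data\xff")
print(len(stub))               # stub length, which equals its skip value
print(installer[len(stub):])   # the bytes dd would copy into the output file
```

To use the result, write the returned bytes to a file opened in binary mode and make it executable, as with chmod +x above.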

Ivan X answered Oct 25 '25 04:10