 

Embedding binary into elf with objcopy may cause alignment issues?

There have been a number of posts on Stack Overflow and elsewhere detailing how to embed binary blobs into ELF binaries, with "Embedding binary blobs using gcc mingw" and "C/C++ with GCC: Statically add resource files to executable/library" being the most complete answers.

But there's a possible issue which no one mentions. Here's a quick foo.txt converted to foo.o:

$ objdump -x foo.o 
foo.o:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         0000000d  00000000  00000000  00000034  2**0
                  CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
00000000 l    d  .data  00000000 .data
0000000d g       .data  00000000 _binary_foo_txt_end
0000000d g       *ABS*  00000000 _binary_foo_txt_size
00000000 g       .data  00000000 _binary_foo_txt_start

Now, I don't really grok all this output; is there documentation for this stuff? I guess most of it is obvious enough: "g" is global, "l" is local, and so on.

What stands out is the alignment of the .data section, shown as 2**0, i.e. 1 byte. Does that mean what I think it means? That is, when it comes to linking, the linker will go "ah yeah, wherever..."
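To make the Algn column concrete: as far as I understand it, 2**n means the section's address must be a multiple of 2 to the power n bytes, so 2**0 places no constraint at all. Here is a minimal C sketch of that check; the function name is_aligned_pow2 is made up for illustration and is not part of any toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if p is a multiple of 2**n bytes, mirroring objdump's
   "Algn 2**n" notation. n = 0 means 1-byte alignment, which every
   address satisfies. (Hypothetical helper, for illustration only.) */
int is_aligned_pow2(const void *p, unsigned n)
{
    uintptr_t mask;

    /* Shifting by the full width of the type would be undefined. */
    assert(n < sizeof(uintptr_t) * 8);

    mask = ((uintptr_t)1 << n) - 1;
    return ((uintptr_t)p & mask) == 0;
}
```

Calling it on the blob's start address with n = 2 at run time would tell you whether the linker happened to place the data on a 4-byte boundary.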

If you embed char data or are working on an x86 then you'll never notice. But if you embed int data or, as I'm doing, 16 and 32 bit data on an ARM, then you could get an alignment trap at any point.
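One way to sidestep the trap regardless of where the blob lands is to never dereference a wider pointer into it and to copy the bytes out instead. The following is only a sketch of that workaround, not a fix for the alignment itself; load_u32_unaligned is a hypothetical name, and the value is read in the host's byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value from a possibly misaligned address.
   memcpy has no alignment requirement, so this avoids the fault that
   *(const uint32_t *)p can raise on cores that trap unaligned access.
   (Hypothetical helper, for illustration only.) */
uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);   /* byte-wise copy, host byte order */
    return v;
}
```

This costs a few extra instructions per access, so it suits occasional reads better than tight loops over large tables.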

My gut feeling is that this means that either objcopy needs another option to specify alignment of the binary blob, or it's broken and you shouldn't use this method at all.

carveone asked Feb 22 '23 21:02



2 Answers

To answer my own question, I'd assert that objcopy is broken in this instance. I believe that using assembly with GNU as is likely the best way to go here. Unfortunately I'm now Linux machine-less so I can't test this properly, but I'll put this answer here in case someone finds it or wants to check:

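.section ".rodata"
.balign 4 /* always 4 bytes; plain .align 4 means 4 or 2**4 depending on arch! */

.global _binary_file_bin_start
.type _binary_file_bin_start, %object /* %object, because @ starts a comment on ARM */
_binary_file_bin_start:
.incbin "file.bin"

.balign 4 /* pads the end, so end - start can exceed the file size */
.global _binary_file_bin_end
_binary_file_bin_end: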

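The leading underscores are the traditional way to annoy yourself with C/asm interoperability. Toolchains that prepend an underscore to C symbols, such as the MS/Borland compilers under 32-bit Windows, expect the C code to refer to these names without it, so from the C side the underscores vanish.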
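On the C side, the two labels can be declared as arrays and subtracted to get the blob's extent. Because the end label is itself aligned, that extent is the file size rounded up to the alignment, so a 13-byte file like the foo.txt in the question would span 16 bytes with 4-byte alignment. A sketch, assuming a toolchain that does not prepend underscores to C symbols; padded_size and blob_span are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Labels from the assembly above, declared as arrays so the names decay
   to addresses. They only resolve if that object file is linked in. */
extern const unsigned char _binary_file_bin_start[];
extern const unsigned char _binary_file_bin_end[];

/* Round n up to the next multiple of align (a power of two). This is the
   span you get between an aligned start label and an aligned end label. */
size_t padded_size(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Distance in bytes between two labels in the same section. */
size_t blob_span(const unsigned char *start, const unsigned char *end)
{
    return (size_t)(end - start);
}
```

With the object linked in, blob_span(_binary_file_bin_start, _binary_file_bin_end) gives the padded length. If the exact file size matters, one option is to place a second, unaligned label directly after the .incbin line.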

carveone answered Mar 05 '23 18:03



Create a linker script "lscript.ld":

MEMORY
{
   memory : ORIGIN = 0x00000000, LENGTH = 0x80000000
}

SECTIONS
{
.data (ALIGN(4)) : {
   *(.data)
   *(.data.*)
   __data_end = .;
} > memory

.text (ALIGN(4)) : {
   *(.text)
   *(.text.*)
   __text_end = .;
} > memory

_end = .;
}

Link your file:

gcc  -Wl,-T -Wl,lscript.ld -o linked_foo.elf foo.o

Find all the extraneous stuff added in linking:

objdump -x linked_foo.elf

Run objcopy again to remove the extra stuff, repeating --remove-section for each unwanted section:

objcopy --remove-section ".init_array" \
        --strip-all \
        --keep-symbol "_binary_foo_txt_start" \
        --keep-symbol "_binary_foo_txt_end" \
        --keep-symbol "_binary_foo_txt_size" \
        linked_foo.elf final_foo.elf

That gets you an ELF file with 2**2 alignment.

JulesC answered Mar 05 '23 18:03
