Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Determine target ISA extensions of binary file in Linux (library or executable)

People also ask

What are binary files in Linux?

Any file on a Linux system that isn't a text file is considered a binary file--from system commands and libraries to image files and compiled programs.

What are binaries and libraries?

Binary files include image files, sound files, executable (i.e., runnable) programs and compressed data files. typically done by a linker. In computer science, a library is a collection of subroutines or classes used to develop software. Libraries contain code and data that provide services to independent programs.


The unix.linux file command is great for this. It can generally detect the target architecture and operating system for a given binary (and has been maintained on and off since 1973. wow!)

Of course, if you're not running under unix/linux - you're a bit stuck. I'm currently trying to find a java based port that I can call at runtime.. but no such luck.

The unix file command gives information like this:

hex: ELF 32-bit LSB executable, ARM, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.4.17, not stripped

More detailed information about the details of the architecture are hinted at with the (unix) objdump -f <fileName> command which returns:

architecture: arm, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x0000876c

This executable was compiled by a gcc cross compiler (compiled on an i86 machine for the ARM processor as a target)


I decide to add one more solution for any, who got here: personally in my case the information provided by the file and objdump wasn't enough, and the grep isn't much of a help -- I resolve my case through the readelf -a -W.

Note, that this gives you pretty much info. The arch related information resides in the very beginning and the very end. Here's an example:

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x83f8
  Start of program headers:          52 (bytes into file)
  Start of section headers:          2388 (bytes into file)
  Flags:                             0x5000202, has entry point, Version5 EABI, soft-float ABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         8
  Size of section headers:           40 (bytes)
  Number of section headers:         31
  Section header string table index: 28
...
Displaying notes found at file offset 0x00000148 with length 0x00000020:
  Owner                 Data size   Description
  GNU                  0x00000010   NT_GNU_ABI_TAG (ABI version tag)
    OS: Linux, ABI: 2.6.16
Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "7-A"
  Tag_CPU_arch: v7
  Tag_CPU_arch_profile: Application
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-2
  Tag_FP_arch: VFPv3
  Tag_Advanced_SIMD_arch: NEONv1
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_rounding: Needed
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Needed
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_align_preserved: 8-byte, except leaf SP
  Tag_ABI_enum_size: int
  Tag_ABI_HardFP_use: SP and DP
  Tag_CPU_unaligned_access: v6

I think you need a tool that checks every instruction, to determine exactly which set it belongs to. Is there even an offical name for the specific set of instructions implemented by the C3 processor? If not, it's even hairier.

A quick'n'dirty variant might be to do a raw search in the file, if you can determine the bit pattern of the disallowed instructions. Just test for them directly, could be done by a simple objdump | grep chain, for instance.


To answer the ambiguity of whether a Via C3 is a i686 class processor: It's not, it's an i586 class processor.

Cyrix never produced a true 686 class processor, despite their marketing claims with the 6x86MX and MII parts. Among other missing instructions, two important ones they didn't have were CMPXCHG8b and CPUID, which were required to run Windows XP and beyond.

National Semiconductor, AMD and VIA have all produced CPU designs based on the Cyrix 5x86/6x86 core (NxP MediaGX, AMD Geode, VIA C3/C7, VIA Corefusion, etc.) which have resulted in oddball designs where you have a 586 class processor with SSE1/2/3 instruction sets.

My recommendation if you come across any of the CPUs listed above and it's not for a vintage computer project (ie. Windows 98SE and prior) then run screaming away from it. You'll be stuck on slow i386/486 Linux or have to recompile all of your software with Cyrix specific optimizations.


Expanding upon @Hi-Angel's answer I found an easy way to check the bit width of a static library:

readelf -a -W libsomefile.a | grep Class: | sort | uniq

Where libsomefile.a is my static library. Should work for other ELF files as well.