
Endianness swap without ntohs

Tags: c++, endianness

I am writing an ELF analyzer, but I'm having some trouble converting endianness properly. I have functions to determine the endianness of the analyzer and the endianness of the object file.

Basically, there are four possible scenarios:

  1. A big endian compiled analyzer run on a big endian object file
    • nothing needs to be converted
  2. A big endian compiled analyzer run on a little endian object file
    • the byte order needs to be swapped, but ntohs()/ntohl() and htons()/htonl() are both no-ops on a big endian machine, so they won't swap the byte order. This is the problem.
  3. A little endian compiled analyzer run on a big endian object file
    • the byte order needs to be swapped, so htons() can be used to swap it
  4. A little endian compiled analyzer run on a little endian object file
    • nothing needs to be converted

Is there a function I can use to explicitly swap byte order/change endianness, since ntohs/l() and htons/l() take the host's endianness into account and sometimes don't convert? Or do I need to find/write my own swap byte order function?

xdumaine asked Apr 26 '12 21:04




2 Answers

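On Linux, the header endian.h provides several conversion functions that convert between the host byte order and an explicitly named byte order (big or little endian), so they swap whenever the two differ, regardless of which one the host uses: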

uint16_t htobe16(uint16_t host_16bits);
uint16_t htole16(uint16_t host_16bits);
uint16_t be16toh(uint16_t big_endian_16bits);
uint16_t le16toh(uint16_t little_endian_16bits);

uint32_t htobe32(uint32_t host_32bits);
uint32_t htole32(uint32_t host_32bits);
uint32_t be32toh(uint32_t big_endian_32bits);
uint32_t le32toh(uint32_t little_endian_32bits);

uint64_t htobe64(uint64_t host_64bits);
uint64_t htole64(uint64_t host_64bits);
uint64_t be64toh(uint64_t big_endian_64bits);
uint64_t le64toh(uint64_t little_endian_64bits);
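
As a rough sketch of how these could be applied to an ELF reader: the helpers below copy the raw bytes of a field out of the file buffer and then pick be*toh or le*toh depending on the file's byte order. The names read_u16, read_u32 and the file_is_big_endian flag are made up for this illustration; the flag is assumed to come from whatever function already detects the object file's endianness. endian.h is a glibc/Linux header, not standard C++.

```cpp
#include <endian.h>   // glibc/Linux specific, not part of standard C++
#include <cstdint>
#include <cstring>

// Hypothetical helper: decode a 16-bit field from raw file bytes.
inline std::uint16_t read_u16(const unsigned char* p, bool file_is_big_endian) {
    std::uint16_t raw;
    std::memcpy(&raw, p, sizeof raw);   // copy instead of casting the pointer
    return file_is_big_endian ? be16toh(raw) : le16toh(raw);
}

// Hypothetical helper: decode a 32-bit field from raw file bytes.
inline std::uint32_t read_u32(const unsigned char* p, bool file_is_big_endian) {
    std::uint32_t raw;
    std::memcpy(&raw, p, sizeof raw);
    return file_is_big_endian ? be32toh(raw) : le32toh(raw);
}
```

Because the conversion is always "file order to host order", the analyzer never has to ask what the host's own endianness is.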

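A less reliable alternative (added in an edit): you can use a union to access the individual bytes in any order. It's quite convenient, but reading a union member other than the one last written is undefined behaviour in standard C++ (it is allowed in C, and GCC documents it as a supported extension):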

union {
    short number;                        // the value as a whole
    unsigned char bytes[sizeof(short)];  // the same storage, one byte at a time
};
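
If endian.h is not available, an unconditional swap is also short to write by hand with shifts and masks, which avoids the union entirely. This is only a sketch and not part of the original answer: swap16, swap32 and file_to_host32 are illustrative names, and the two boolean flags are assumed to come from the analyzer's existing endianness-detection functions.

```cpp
#include <cstdint>

// Reverse the two bytes of a 16-bit value.
constexpr std::uint16_t swap16(std::uint16_t v) {
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// Reverse the four bytes of a 32-bit value.
constexpr std::uint32_t swap32(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

// The four scenarios from the question collapse into one rule:
// swap only when the file's byte order differs from the host's.
constexpr std::uint32_t file_to_host32(std::uint32_t v,
                                       bool file_is_big_endian,
                                       bool host_is_big_endian) {
    return file_is_big_endian == host_is_big_endian ? v : swap32(v);
}
```

Unlike ntohl() and htonl(), swap32 always swaps, so it behaves the same on big endian and little endian hosts.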
Rafał Rawicki answered Oct 12 '22 12:10


I think it's worth mentioning the article "The Byte Order Fallacy" here, by Rob Pike (one of Go's authors).

If you do things right, i.e. you do not assume anything about your platform's byte order, then it will just work. All you need to care about is whether the ELF file is in little endian or big endian format.

From the article:

Let's say your data stream has a little-endian-encoded 32-bit integer. Here's how to extract it (assuming unsigned bytes):

i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);

If it's big-endian, here's how to extract it:

i = (data[3]<<0) | (data[2]<<8) | (data[1]<<16) | (data[0]<<24);

And just let the compiler worry about optimizing the heck out of it.
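
Wrapped into functions, the same idea might look like the sketch below. The names load_le32 and load_be32 are illustrative and not from the article. The explicit casts are an addition: an unsigned char is promoted to a signed int before shifting, and a byte of 0x80 or above shifted left by 24 does not fit in a 32-bit int, so the sketch widens each byte to std::uint32_t first.

```cpp
#include <cstdint>

// Decode a little-endian 32-bit value: d[0] is the least significant byte.
inline std::uint32_t load_le32(const unsigned char* d) {
    return  static_cast<std::uint32_t>(d[0])        |
           (static_cast<std::uint32_t>(d[1]) << 8)  |
           (static_cast<std::uint32_t>(d[2]) << 16) |
           (static_cast<std::uint32_t>(d[3]) << 24);
}

// Decode a big-endian 32-bit value: d[0] is the most significant byte.
inline std::uint32_t load_be32(const unsigned char* d) {
    return  static_cast<std::uint32_t>(d[3])        |
           (static_cast<std::uint32_t>(d[2]) << 8)  |
           (static_cast<std::uint32_t>(d[1]) << 16) |
           (static_cast<std::uint32_t>(d[0]) << 24);
}
```

Neither function mentions the host's byte order, so the same source is correct on a big endian and a little endian analyzer build; the caller only chooses between them based on the file's format.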

Matthieu M. answered Oct 12 '22 10:10