
Difference between Big Endian and Little Endian byte order

What is the difference between Big Endian and Little Endian byte order?

Both of these seem to be related to Unicode and UTF16. Where exactly do we use this?

asked Mar 31 '09 15:03 by web dunia


People also ask

What is little-endian byte order?

Little Endian Byte Order: The least significant byte (the "little end") of the data is placed at the byte with the lowest address. The rest of the data is placed in order in the next three bytes in memory. In these definitions, the data, a 32-bit pattern, is regarded as a 32-bit unsigned integer.

What is better big-endian or little-endian?

The benefit of little endianness is that a variable can be read as any length using the same address. One benefit of big-endian is that you can read 16-bit and 32-bit values as most humans do; from left to right.
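A small sketch of the "same address, any length" point, using a Python byte string to stand in for memory (offset 0 plays the role of the variable's address). The value 42 and the variable names are chosen only for illustration, and the property holds only when the value fits in the narrower width.

```python
# Store the value 42 as a 4-byte integer in both byte orders.
value = 42
stored_le = value.to_bytes(4, byteorder="little")  # bytes: 2a 00 00 00
stored_be = value.to_bytes(4, byteorder="big")     # bytes: 00 00 00 2a

# Little-endian: reading 1, 2, or 4 bytes from offset 0 yields the same value.
le_8 = int.from_bytes(stored_le[:1], byteorder="little")
le_16 = int.from_bytes(stored_le[:2], byteorder="little")
le_32 = int.from_bytes(stored_le[:4], byteorder="little")

# Big-endian: a narrower read from offset 0 sees only the high-order zero bytes.
be_8 = int.from_bytes(stored_be[:1], byteorder="big")
be_32 = int.from_bytes(stored_be[:4], byteorder="big")

print(le_8, le_16, le_32)  # 42 42 42
print(be_8, be_32)         # 0 42
```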

What is big little byte order?

Big endian machine: Stores data big-end first. When looking at multiple bytes, the first byte (lowest address) is the most significant. Little endian machine: Stores data little-end first. When looking at multiple bytes, the first byte (lowest address) is the least significant.

What are the different data types and sizes in the little-endian and big-endian?

Little and big endian are two ways of storing multibyte data types (int, float, etc.). On little endian machines, the least significant byte of the binary representation is stored first, at the lowest address. On big endian machines, the most significant byte is stored first.


2 Answers

Big-Endian (BE) / Little-Endian (LE) are two ways to organize multi-byte words. For example, when using two bytes to represent a character in UTF-16, there are two ways to represent the character 0x1234 as a string of bytes (0x00-0xFF):

Byte Index:      0  1
---------------------
Big-Endian:     12 34
Little-Endian:  34 12
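The same two layouts can be reproduced in Python. This is only a sketch using the built-in `int.to_bytes` and `int.from_bytes`; the variable names are illustrative, and `bytes.hex` with a separator needs Python 3.8 or later.

```python
# Serialize the 16-bit value 0x1234 in both byte orders.
value = 0x1234

big = value.to_bytes(2, byteorder="big")
little = value.to_bytes(2, byteorder="little")

print(big.hex(" "))     # 12 34
print(little.hex(" "))  # 34 12

# Reading the bytes back only works if the reader assumes the same order.
round_trip = int.from_bytes(little, byteorder="little")

# Interpreting big-endian bytes as little-endian swaps the two bytes.
swapped = int.from_bytes(big, byteorder="little")
print(hex(swapped))     # 0x3412
```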

To indicate whether a text uses UTF-16BE or UTF-16LE, the specification recommends prepending a Byte Order Mark (BOM), the character U+FEFF, to the string. So, if the first two bytes of a UTF-16 encoded text file are FE FF, the encoding is UTF-16BE. If they are FF FE, it is UTF-16LE.
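A minimal sketch of that BOM check in Python. The function name `detect_utf16_byte_order` is hypothetical; the two BOM constants come from the standard `codecs` module. It deliberately ignores other encodings, so it would misreport a UTF-32LE file, whose BOM also starts with FF FE.

```python
import codecs


def detect_utf16_byte_order(data: bytes) -> str:
    """Hypothetical helper: guess the UTF-16 byte order from a leading BOM."""
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "UTF-16BE"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "UTF-16LE"
    return "unknown"  # no BOM: the byte order must be known some other way


print(detect_utf16_byte_order(b"\xfe\xff\x00E"))  # UTF-16BE
print(detect_utf16_byte_order(b"\xff\xfeE\x00"))  # UTF-16LE
print(detect_utf16_byte_order(b"Example"))        # unknown
```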

A visual example: The word "Example" in different encodings (UTF-16 with BOM):

Byte Index:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
------------------------------------------------------------
ASCII:       45 78 61 6d 70 6c 65
UTF-16BE:    FE FF 00 45 00 78 00 61 00 6d 00 70 00 6c 00 65
UTF-16LE:    FF FE 45 00 78 00 61 00 6d 00 70 00 6c 00 65 00
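The byte dump above can be regenerated with Python's built-in codecs. This is a sketch: the BOM is prepended by hand because the `utf-16-be` and `utf-16-le` codecs do not write one, and the hex output is lowercase.

```python
import codecs

text = "Example"

ascii_bytes = text.encode("ascii")
utf16_be = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
utf16_le = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

print(ascii_bytes.hex(" "))  # 45 78 61 6d 70 6c 65
print(utf16_be.hex(" "))     # fe ff 00 45 00 78 00 61 00 6d 00 70 00 6c 00 65
print(utf16_le.hex(" "))     # ff fe 45 00 78 00 61 00 6d 00 70 00 6c 00 65 00

# The generic "utf-16" codec reads the BOM to pick the byte order, then drops it.
decoded_be = utf16_be.decode("utf-16")
decoded_le = utf16_le.decode("utf-16")
```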

For further information, please read the Wikipedia pages on Endianness and UTF-16.

answered Sep 19 '22 09:09 by Ferdinand Beyer


Ferdinand's answer (like the others) is correct, but incomplete.

Big Endian (BE) / Little Endian (LE) have nothing to do with UTF-16 or UTF-32. They existed way before Unicode, and affect how the bytes of numbers get stored in the computer's memory. They depend on the processor.

If you have a number with the value 0x12345678 then in memory it will be represented as 12 34 56 78 (BE) or 78 56 34 12 (LE).
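To see this for 0x12345678, here is a short sketch with Python's standard `struct` module. The format codes `>I`, `<I` and `=I` pack a 32-bit unsigned integer in big-endian, little-endian and the machine's native byte order, and `sys.byteorder` reports which one the current processor uses.

```python
import struct
import sys

value = 0x12345678

big = struct.pack(">I", value)     # explicit big-endian
little = struct.pack("<I", value)  # explicit little-endian
native = struct.pack("=I", value)  # whatever order this machine uses

print(big.hex(" "))     # 12 34 56 78
print(little.hex(" "))  # 78 56 34 12
print(sys.byteorder)    # "little" on x86 machines; depends on the processor
```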

UTF-16 and UTF-32 code units happen to occupy 2 and 4 bytes respectively, so their byte order follows the same ordering that any number follows on that platform.

answered Sep 21 '22 09:09 by Mihai Nita