What is the difference between Big Endian and Little Endian byte order? Both of these seem to be related to Unicode and UTF-16. Where exactly do we use this?


Difference between Big Endian and little Endian Byte order

2 Answers

Big-Endian (BE) and Little-Endian (LE) are two ways to order the bytes of a multi-byte word. For example, when UTF-16 uses two bytes to represent a character, there are two ways to write the character with the value 0x1234 as a sequence of bytes (each in the range 0x00-0xFF):

Byte Index:      0  1
---------------------
Big-Endian:     12 34
Little-Endian:  34 12

To let a reader decide whether a text uses UTF-16BE or UTF-16LE, the specification recommends prepending a Byte Order Mark (BOM), the character U+FEFF, to the string. So if the first two bytes of a UTF-16 encoded text file are FE FF, the encoding is UTF-16BE; if they are FF FE, it is UTF-16LE.
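The BOM check described above can be sketched in Python as follows. The function name `detect_utf16_byte_order` is made up for this illustration; the BOM constants come from the standard `codecs` module. This is a minimal sketch that only looks at the first two bytes and assumes the data is UTF-16 at all (it does not try to tell a UTF-16LE BOM apart from a UTF-32LE one, which starts with the same two bytes).

```python
import codecs


def detect_utf16_byte_order(data: bytes) -> str:
    """Guess the UTF-16 byte order from a leading BOM (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "UTF-16BE"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "UTF-16LE"
    # No BOM: the byte order has to be known from somewhere else.
    return "unknown"


print(detect_utf16_byte_order(b"\xfe\xff\x00E"))  # BOM FE FF, then "E"
print(detect_utf16_byte_order(b"\xff\xfeE\x00"))  # BOM FF FE, then "E"
```

If no BOM is present, the function deliberately returns "unknown" rather than guessing.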

A visual example: The word "Example" in different encodings (UTF-16 with BOM):

Byte Index:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
------------------------------------------------------------
ASCII:       45 78 61 6d 70 6c 65
UTF-16BE:    FE FF 00 45 00 78 00 61 00 6d 00 70 00 6c 00 65
UTF-16LE:    FF FE 45 00 78 00 61 00 6d 00 70 00 6c 00 65 00
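The same byte sequences can be reproduced with Python's built-in codecs, as a rough check of the table. The helper `hex_bytes` is a name invented here for printing; the `utf-16-be` and `utf-16-le` codecs do not write a BOM themselves, so this sketch prepends it explicitly.

```python
import codecs


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated two-digit hex values (illustrative helper)."""
    return " ".join(f"{b:02x}" for b in data)


word = "Example"

ascii_bytes = word.encode("ascii")
# The explicit-endianness codecs emit no BOM, so it is added by hand here.
utf16_be = codecs.BOM_UTF16_BE + word.encode("utf-16-be")
utf16_le = codecs.BOM_UTF16_LE + word.encode("utf-16-le")

print("ASCII:   ", hex_bytes(ascii_bytes))
print("UTF-16BE:", hex_bytes(utf16_be))
print("UTF-16LE:", hex_bytes(utf16_le))
```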

For further information, please read the Wikipedia pages on Endianness and UTF-16.

195

answered Sep 19 '22 09:09

Ferdinand Beyer

Ferdinand's answer (like the others) is correct, but incomplete.

Big Endian (BE) and Little Endian (LE) are not specific to UTF-16 or UTF-32. They existed long before Unicode, and they determine how the bytes of multi-byte numbers get stored in the computer's memory. Which one is used depends on the processor.

If you have a number with the value 0x12345678 then in memory it will be represented as 12 34 56 78 (BE) or 78 56 34 12 (LE).
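As a small illustration of that layout, Python's `int.to_bytes` can produce both orderings for the same number regardless of the machine it runs on. The variable names are chosen for this sketch only.

```python
value = 0x12345678

# Serialize the same 32-bit number with each byte order.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

print(" ".join(f"{b:02x}" for b in big))     # most significant byte first
print(" ".join(f"{b:02x}" for b in little))  # least significant byte first

# Reading the bytes back with the matching byte order recovers the number.
restored = int.from_bytes(little, byteorder="little")
```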

UTF-16 and UTF-32 happen to use 2-byte and 4-byte code units respectively, so the bytes of each code unit follow the same ordering that any number follows on that platform.
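A rough way to see this from Python, assuming CPython's standard codecs: `sys.byteorder` reports the platform's native order, and the plain `utf-16` codec (no BE/LE suffix) writes a BOM followed by code units in that native order. The variable names are illustrative.

```python
import codecs
import sys

# Pick the explicit codec that matches this machine's native byte order.
native_codec = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
encoded_native = "E".encode(native_codec)

# The generic "utf-16" codec emits a native-order BOM, then native-order data.
encoded_with_bom = "E".encode("utf-16")

print(sys.byteorder)
print(encoded_with_bom[:2] == codecs.BOM_UTF16)   # BOM in native order
print(encoded_with_bom[2:] == encoded_native)     # data in native order

# The 2-byte code unit is just the number 0x45 stored in native order.
code_unit = int.from_bytes(encoded_native, byteorder=sys.byteorder)
```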

answered Sep 21 '22 09:09

Mihai Nita


Tags:

unicode

utf-16

endianness
