Difference between different hex types/representations in Python

Working my way through some Python code, I'm noticing there are a number of different representations for hexadecimal values. For example, if I choose a number like so:

xx = '\x03\xff'

Then the following command (a version of which I'm using to convert little endian to big endian)

yy = hex(struct.unpack('>H', xx)[0])

will return:

'0x3ff'

However, this command

zz = xx.encode('hex')

will return:

'03ff'

Finally, printing just the value out will return this

'\x03\xff'

From the looks of it there are three different types of hex then.

'\xFF'
'0xFF'
'FF'

What's the difference?

Bonus points if someone could suggest a better way of converting a little endian to a big endian number. The above method for yy won't work for numbers larger than two bytes, obstinately enough, and I'm working with some hex strings that are 16 bytes long (including values that don't correspond to an ASCII/integer value).

asked Oct 29 '12 14:10

stephenfin

2 Answers

Anything using \x is a string escape code, which happens to use hex notation; other escape codes include \n for newlines, \' for a literal quote, etc. A Python 2 string is a sequence of bytes, and you can specify literal values outside the ASCII printable range using such escapes. When Python echoes a string value back at you in the interpreter, or you print the result of a repr() call on a string, Python will use such escapes to represent any byte that cannot be printed as an ASCII character:

>>> chr(65)
'A'
>>> chr(11)
'\x0b'
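The same point as a small sketch. It assumes Python 3, where the byte string from the question is written with a b prefix; the variable names are only illustrative. The \x escapes are source-code notation, and the resulting object simply holds two raw bytes.

```python
# Assumption: Python 3, so byte strings carry a b prefix.
xx = b'\x03\xff'

# The escapes are only notation; the object holds two bytes.
length = len(xx)            # 2
values = list(xx)           # [3, 255]

# repr() shows a printable byte as itself and any other byte as an escape.
printable = repr(b'\x41')   # the text b'A'
escaped = repr(b'\x0b')     # the text b'\x0b'

print(length, values, printable, escaped)
```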

The hex() function returns a very specific string representation, as does .encode('hex'); the difference is that hex() takes an integer and includes the 0x prefix, while .encode('hex') takes a string and emits two hex digits per byte without a prefix. There are two more ways to produce such string representations: the '%x' and '%X' string formatters, which use lowercase and uppercase letters respectively.

>>> hex(11)
'0xb'
>>> '\x0b'.encode('hex')
'0b'
>>> '%x' % (11,)
'b'
>>> '%X' % (11,)
'B'
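The session above is Python 2; .encode('hex') does not exist on Python 3. As a hedged sketch of what the equivalents look like there (assuming Python 3.5 or later for bytes.hex()), the same representations can be produced like this:

```python
import binascii

# Assumption: Python 3.5+; data is the two-byte value from the question.
data = b'\x03\xff'

with_prefix = hex(1023)                              # '0x3ff'
hexlified = binascii.hexlify(data).decode('ascii')   # '03ff'
via_method = data.hex()                              # '03ff'
lower = format(1023, 'x')                            # '3ff'
upper = format(1023, 'X')                            # '3FF'
padded = format(1023, '04x')                         # '03ff'

# Going the other way: hex digits back to raw bytes.
round_trip = bytes.fromhex(via_method)               # b'\x03\xff'

print(with_prefix, hexlified, via_method, lower, upper, padded, round_trip)
```

Note that hex() and format() work on integers, while hexlify() and bytes.hex() work on byte strings, which is why only the latter keep the leading zero.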

These are all string representations though (a series of ASCII characters), and have the same relation to the original data as str(number) has to integer data; you have changed the type and are further away from your goal of changing the byte ordering.

Changing a piece of binary information from little-endian to big-endian requires that you know the size of that piece of information. If all you have are short integers, then you need to flip every two bytes around, but if you have normal (long) integers, then you have 4 bytes per value and you need to reverse each group of 4 bytes.
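To illustrate why the item size matters, here is a minimal sketch, assuming Python 3 byte strings. The helper name swap_byte_order is hypothetical, not something from the standard library; it reverses the bytes inside every fixed-width item of a buffer.

```python
def swap_byte_order(data, width):
    """Reverse the bytes inside every `width`-byte item of `data`.

    Hypothetical helper for illustration; assumes Python 3 bytes.
    """
    if len(data) % width != 0:
        raise ValueError("data length must be a multiple of width")
    return b''.join(
        data[i:i + width][::-1] for i in range(0, len(data), width)
    )


raw = b'\x03\xff\x00\x01'

# The same four bytes give different results depending on the item size.
as_shorts = swap_byte_order(raw, 2)   # b'\xff\x03\x01\x00'
as_longs = swap_byte_order(raw, 4)    # b'\x01\x00\xff\x03'

print(as_shorts, as_longs)
```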

Using the struct module is, I think, an excellent approach because you have to specify the value type. The following would interpret xx as a big-endian unsigned short int, then pack it back to a binary representation as a little-endian unsigned short int:

>>> import struct
>>> xx = '\x03\xff'
>>> struct.pack('<H', *struct.unpack('>H', xx))
'\xff\x03'
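For the 16-byte values mentioned in the question, struct has no single format code that wide. A sketch of three possible approaches follows, assuming Python 3 and assuming each 16-byte string is one little-endian integer; the sample value is made up for illustration.

```python
import struct

# Assumption: a made-up 16-byte little-endian value (bytes 00 01 ... 0f).
value = bytes(range(16))

# Option 1: a single value of known size can simply be reversed.
reversed_bytes = value[::-1]

# Option 2: go through an integer, which makes the byte order explicit.
number = int.from_bytes(value, byteorder='little')
big_endian = number.to_bytes(16, byteorder='big')

# Option 3: struct with two 64-bit halves. In little-endian data the
# low half comes first, so the halves are swapped when packing.
low, high = struct.unpack('<QQ', value)
via_struct = struct.pack('>QQ', high, low)

print(big_endian.hex())
```

All three give the same bytes here; the integer route is arguably the clearest when the value is also needed as a number.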

answered Sep 20 '22 20:09

Martijn Pieters

'\xFF' represents the string containing the single character (byte) with value 255, i.e. 0xFF. That is outside the ASCII range, which ends at 127 (0x7F).

E.g.: print '\x41' gives 'A' (because this is the character with ASCII code 0x41, which is 65 in decimal).
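To make the arithmetic behind the escape explicit, a short sketch (Python 3 assumed; the variable names are illustrative): the digits after \x are hexadecimal, so \x41 is 65 in decimal.

```python
# The two digits after \x are hexadecimal.
code = 0x41                        # 65 in decimal
letter = chr(code)                 # 'A'
same = '\x41' == 'A' == chr(65)    # True

# ASCII stops at 0x7f (127), so 0xff (255) is not an ASCII code.
ff_is_ascii = 0xff <= 0x7f         # False

print(code, letter, same, ff_is_ascii)
```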

The xx.encode('hex') and hex(struct.unpack('>H', xx)[0]) calls just give a human-readable hexadecimal representation of the byte values the string xx contains. This means that the resulting string consists only of the characters a to f and 0 to 9 (plus the 0x prefix in the case of hex()).

answered Sep 19 '22 20:09

Vortexfive
