I am using Python to convert some files to a binary format, but I've run into an odd snag.
import struct
s = struct.Struct('Bffffff')
print s.size
28
Obviously the expected size would be 25, but it appears to be interpreting the first byte (B) as a 4-byte integer of some kind. It will also write out a 4-byte integer instead of a byte.
A work-around exists, namely separating the B out into a separate struct, like so:
import struct
s1 = struct.Struct('B')
s2 = struct.Struct('ffffff')
print s1.size + s2.size
25
Is there any explanation for this behavior?
From the docs
Padding is only automatically added between successive structure members. No padding is added at the beginning or the end of the encoded struct.
If you test the sizes individually:
>>> import struct
>>> s1 = struct.Struct('B')
>>> print s1.size
1
>>> s1 = struct.Struct('f')
>>> print s1.size
4
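So when you add the sizes of the two separate structs, you get 1 + 24 = 25. In the combined struct, however, B takes 1 byte and each f has to start on a 4-byte boundary, so 3 padding bytes are inserted after B and the total is 4 + 24 = 28.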
Consider this example
>>> s1 = struct.Struct('Bf')
>>> print s1.size
8
Again, B is 1 byte followed by 3 bytes of padding, and f is 4 bytes, so the total comes to 8, which is as expected.
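To see where the padding comes from, here is a small sketch (Python 3 syntax; the helper name padding_bytes is hypothetical, and the exact numbers depend on the platform's native alignment) that compares native and standard sizes for a few formats:

```python
import struct

def padding_bytes(fmt):
    """Return how many pad bytes native alignment adds to a prefix-free format."""
    # '@' = native size and alignment, '=' = standard size, no alignment
    return struct.calcsize('@' + fmt) - struct.calcsize('=' + fmt)

for fmt in ('B', 'f', 'Bf', 'fB', 'Bffffff', 'ffffffB'):
    print(fmt, struct.calcsize('@' + fmt), padding_bytes(fmt))
```

On a typical platform where a float is 4 bytes and 4-byte aligned, 'Bf' should report 3 pad bytes while 'fB' reports none, because padding is only inserted between members and never at the end. So moving the B to the end of the format is another way to get 25 bytes, at the cost of changing the field order.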
As mentioned in the docs, to avoid the padding you have to use non-native size and alignment:
>>> s1 = struct.Struct('!Bf')
>>> print s1.size
5
No padding is added when using non-native size and alignment, e.g. with ‘<’, ‘>’, ‘=’, and ‘!’.
Unless you specify a prefix character for byte order and alignment, struct uses native byte order, size, and alignment (@), which causes padding.
By explicitly specifying byte order, you can get what you want:
>>> struct.Struct('!Bffffff').size # network byte order
25
>>> struct.Struct('=Bffffff').size # native byte order, no alignment.
25
>>> struct.Struct('>Bffffff').size # big endian
25
>>> struct.Struct('<Bffffff').size # little endian
25
>>> struct.Struct('@Bffffff').size # native byte order, alignment. (+ native size)
28
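For the original goal of writing records to a binary file, here is a minimal sketch (Python 3 syntax). It assumes each record is one unsigned byte followed by six floats in little-endian order; the names RECORD, tag, and values and the field values themselves are made up for illustration.

```python
import io
import struct

# Explicit byte order ('<'), so standard sizes are used and no padding is added
RECORD = struct.Struct('<Bffffff')

tag = 7
values = (1.0, 2.0, 0.5, -4.0, 8.25, 0.0)  # exactly representable as 32-bit floats

buf = io.BytesIO()  # stands in for a file opened with open(path, 'wb')
buf.write(RECORD.pack(tag, *values))

data = buf.getvalue()
print(len(data))  # 25

# Reading the record back with the same format recovers the fields
unpacked = RECORD.unpack(data)
print(unpacked[0], unpacked[1:])
```

Picking one explicit byte order and using the same format string for both writing and reading keeps the file layout identical across machines, which native mode (@) does not guarantee.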