I am trying to read one short and one long from a binary file using Python's struct module.
But the following reports 16:
print(struct.calcsize("hl")) # o/p 16
This seems wrong: it should be 2 bytes for the short and 8 bytes for the long, 10 in total. I am not sure whether I am using the struct module the wrong way.
When I print the size of each type separately, I get:
print(struct.calcsize("h")) # o/p 2
print(struct.calcsize("l")) # o/p 8
Is there a way to force Python to keep the exact size of each data type, without extra bytes?
struct.calcsize(format) returns the size, in bytes, of the struct (and hence of the bytes object produced by struct.pack) that corresponds to the given format string.
The struct module in Python converts native Python values such as numbers and byte strings into a bytes object and back. This means that binary data laid out as C structs can be parsed in Python.
struct.pack() converts a sequence of values into a bytes object. The caller specifies a format string, which fixes both the type and the order of the values to convert.
Under the default struct alignment rules, 16 is the correct answer. Each field is aligned to match its size, so you end up with two bytes for the short, then six bytes of padding (to reach the next address aligned to a multiple of eight bytes), then eight bytes for the long.
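The layout rule above can be sketched in a few lines of pure Python. This is only an illustration: natural_layout is a hypothetical helper, not part of the struct module, and the field sizes (2-byte short, 8-byte long) are assumed to match a typical 64-bit Linux or macOS build.

```python
def natural_layout(sizes):
    """Return (offsets, total_size) for fields aligned to their own size.

    Each field starts at the next offset that is a multiple of its size.
    No trailing padding is added, which matches what struct.calcsize
    reports in native mode.
    """
    offsets = []
    position = 0
    for size in sizes:
        padding = (-position) % size  # bytes needed to reach a multiple of size
        position += padding
        offsets.append(position)
        position += size
    return offsets, position


# Assumed sizes: short = 2 bytes, long = 8 bytes.
print(natural_layout([2, 8]))  # ([0, 8], 16): six pad bytes before the long
print(natural_layout([8, 2]))  # ([0, 8], 10): no padding when the long comes first
```

The second call shows why field order matters: with the larger field first, the short already lands on a suitable offset.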
You can use a byte order prefix (any of them other than @ disables padding), but they also disable machine native sizes (so struct.calcsize("=l") will be a fixed 4 bytes on all systems, and struct.calcsize("=hl") will be 6 bytes on all systems, not 10, even on systems with 8-byte longs).
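A quick check of the standard sizes, as a sketch using only struct.calcsize. These values come from the standard-size table in the struct documentation, so they should not depend on the platform.

```python
import struct

# Every prefix except "@" selects standard sizes and disables padding.
for prefix in ("=", "<", ">", "!"):
    assert struct.calcsize(prefix + "h") == 2
    assert struct.calcsize(prefix + "l") == 4   # standard long is 4 bytes
    assert struct.calcsize(prefix + "q") == 8   # standard long long is 8 bytes
    assert struct.calcsize(prefix + "hl") == 6
    assert struct.calcsize(prefix + "hq") == 10

standard_sizes = {fmt: struct.calcsize(fmt) for fmt in ("=h", "=l", "=q", "=hl", "=hq")}
print(standard_sizes)
```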
If you want to compute struct sizes for arbitrary structures using machine native types with non-default padding rules, you'll need to go to the ctypes module, define your ctypes.Structure subclass with the desired _pack_ setting, then use ctypes.sizeof to check the size, e.g.:
from ctypes import Structure, c_long, c_short, sizeof

class HL(Structure):
    _pack_ = 1  # Disables padding for field alignment
    # Defines (unnamed) fields, a short followed by long
    _fields_ = [("", c_short),
                ("", c_long)]

print(sizeof(HL))
which outputs 10, as desired.
This could be factored out as a utility function if needed (this is a simplified example that doesn't handle all struct format codes, but you can expand it if needed):
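from ctypes import (Structure, sizeof, c_char, c_byte, c_bool, c_short,
                    c_ushort, c_int, c_uint, c_long, c_ulong, c_longlong,
                    c_ulonglong, c_ssize_t, c_size_t, c_float, c_double)

# Maps each supported struct format code to the matching ctypes type
FMT_TO_TYPE = dict(zip("cb?hHiIlLqQnNfd",
                       (c_char, c_byte, c_bool, c_short, c_ushort, c_int, c_uint,
                        c_long, c_ulong, c_longlong, c_ulonglong,
                        c_ssize_t, c_size_t, c_float, c_double)))

def calcsize(fmt, pack=None):
    '''Compute size of a format string with arbitrary padding (defaults to native)'''
    class _(Structure):
        if pack is not None:
            _pack_ = pack
        _fields_ = [("", FMT_TO_TYPE[c]) for c in fmt]
    return sizeof(_)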
which, once defined, lets you compute sizes padded or unpadded like so:
>>> calcsize("hl") # Defaults to native "natural" alignment padding
16
>>> calcsize("hl", 1) # pack=1 means no alignment padding between members
10
This is what the doc says:
By default, the result of packing a given C struct includes pad bytes in order to maintain proper alignment for the C types involved; similarly, alignment is taken into account when unpacking. This behavior is chosen so that the bytes of a packed struct correspond exactly to the layout in memory of the corresponding C struct. To handle platform-independent data formats or omit implicit pad bytes, use standard size and alignment instead of native size and alignment.
Changing it from native to standard is pretty easy: you just put the prefix = before the format characters.
print(struct.calcsize("=hl"))
EDIT
Since some default sizes change when you switch from native to standard mode, you have two options:
Keep native mode but swap the format characters: struct.calcsize("lh"). In C, the order of the members inside a struct matters. Each type is aligned to its own size, so an 8-byte long must start at an offset that is a multiple of 8, while a short only needs a multiple of 2. With the long first, no padding is needed and the size is 10 bytes (on a platform where long is 8 bytes).
Use standard mode with the format character for an 8-byte integer: struct.calcsize("=hq"), which is 10 bytes on every platform.
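To tie the second option back to the original goal of reading a short and a long from a binary file, here is a minimal sketch. It assumes the file stores a 2-byte short followed by an 8-byte integer in little-endian order with no padding; RECORD and read_record are illustrative names, and the in-memory buffer stands in for a file opened with open(path, "rb").

```python
import io
import struct

# Assumption: 2-byte short, then 8-byte integer, little-endian, no padding.
RECORD = struct.Struct("<hq")


def read_record(stream):
    """Read one (short, long) pair from a binary stream."""
    data = stream.read(RECORD.size)
    if len(data) != RECORD.size:
        raise EOFError("incomplete record")
    return RECORD.unpack(data)


# Stand-in for a real file: an in-memory buffer holding one packed record.
buffer = io.BytesIO(RECORD.pack(-3, 2**40))
print(RECORD.size)          # 10
print(read_record(buffer))  # (-3, 1099511627776)
```

If the file was written by a C program dumping a native struct, the padded native format ("hl") may be the right one instead; which applies depends on how the file was produced.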