I have this little problem that has been bugging me for the past hour or so.
import struct
string = b'-'
t = struct.pack(">h%ds" % len(string), len(string), string)
print(t)
The result of this pack is b'\x00\x01-'.
The problem I'm having is that I can't figure out how to unpack the result b'\x00\x01-' so that I get just b'-' back. Yes, I know I could simply strip the leading bytes, but my real case is a bit more complicated; I tried to simplify it here. Hopefully someone can assist me. :)
Normally you wouldn't use struct.pack to put a length header and the value together. Instead you would just do struct.pack(">h", len(data)), send that over the line (for example in a network protocol) and then send the data. No need to create a new bytes buffer.
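As a rough sketch of the sending side described above, assuming a connected stream socket: the helper name send_message is made up for illustration and is not part of the original answer.

```python
import socket
import struct

def send_message(sock, data):
    """Send a 2-byte big-endian length header, then the payload itself.

    send_message is a hypothetical helper name used only for this sketch.
    """
    # ">h" is a signed 16-bit integer, so the payload can be at most 32767 bytes.
    header = struct.pack(">h", len(data))
    sock.sendall(header)  # the header goes out first
    sock.sendall(data)    # then the payload, without building a combined buffer
```

Calling send_message(sock, b'-') puts the same three bytes on the wire as the single struct.pack call in the question, b'\x00\x01-'.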
In your case, you could simply do:
dataLength, = struct.unpack(">h", t[:2])
data = t[2:2+dataLength]
but as I said, in a socket-based application, for instance, it would look like this (where receive(n) stands for whatever function reads exactly n bytes from the connection):
header = receive(2)
dataLength, = struct.unpack(">h", header)
data = receive(dataLength)
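The receive function above is left undefined. One possible way to fill it in, assuming a blocking stream socket, is sketched here; recv_exact and recv_message are illustrative names, not something from the original answer. The loop is needed because socket.recv may return fewer bytes than requested.

```python
import socket
import struct

def recv_exact(sock, n):
    """Read exactly n bytes from sock, looping because recv may return less."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty result means the peer closed the connection.
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def recv_message(sock):
    """Read a 2-byte big-endian length header, then that many payload bytes."""
    (data_length,) = struct.unpack(">h", recv_exact(sock, 2))
    return recv_exact(sock, data_length)
```

With this in place, recv_message(sock) plays the role of the three lines above and returns only the payload.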
import struct
string = b'-'
fmt = ">h%ds" % len(string)
Here you are packing both the length and the string:
t = struct.pack(fmt, len(string), string)
print(repr(t))
# b'\x00\x01-'
So when you unpack, you should expect to get two values back, i.e., the length and the string:
length, string2 = struct.unpack(fmt, t)
print(repr(string2))
# b'-'
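The unpacking above reuses fmt, which was built from the length of the original string. When only the packed bytes are available, one option is to read the header first and then use it to read the payload. This is a minimal sketch that assumes the layout is known to be a ">h" length followed by the string; unpack_length_prefixed is a made-up helper name.

```python
import struct

def unpack_length_prefixed(packed, header_fmt=">h"):
    """Return the payload from a buffer laid out as <length header><payload>.

    Hypothetical helper: it assumes the header format is known in advance.
    """
    header_size = struct.calcsize(header_fmt)
    # Read only the header; unpack_from tolerates extra bytes after it.
    (length,) = struct.unpack_from(header_fmt, packed)
    if length < 0 or header_size + length > len(packed):
        raise ValueError("declared length does not fit the buffer")
    # Read `length` bytes starting right after the header.
    (payload,) = struct.unpack_from("%ds" % length, packed, header_size)
    return payload

print(unpack_length_prefixed(b"\x00\x01-"))  # b'-'
```

struct.unpack_from is used instead of struct.unpack because it does not require the buffer size to match the format exactly.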
In general, if you don't know how the string was packed, then there is no sure-fire way to recover the data. You'd just have to guess!
If you know the data is composed of the length of the string, and then the string itself, then you could try trial-and-error:
import struct
string = b'-'
fmt = ">h%ds" % len(string)
t = struct.pack(fmt, len(string), string)
print(repr(t))
# Try each byte order and each integer type as a candidate length header.
for endian in ('>', '<'):
    for fmt, size in (('b', 1), ('B', 1), ('h', 2), ('H', 2), ('i', 4), ('I', 4),
                      ('l', 4), ('L', 4), ('q', 8), ('Q', 8)):
        fmt = endian + fmt
        try:
            length, = struct.unpack(fmt, t[:size])
        except struct.error:
            pass
        else:
            # Keep the candidate only if header plus string fits t exactly.
            fmt = fmt + '{0}s'.format(length)
            try:
                length, string2 = struct.unpack(fmt, t)
            except struct.error:
                pass
            else:
                print(fmt, length, string2)
# >h1s 1 b'-'
# >H1s 1 b'-'
It might be possible to compose an ambiguous string t which has multiple valid unpackings, which would lead to different string2 values, however. I'm not sure.