
Python struct.unpack not working

Tags: python

I'm trying to run this:

def ReadWord(fid,fmt,Addr):
    fid.seek(Addr)
    s = fid.readline(2)
    s = unpack(fmt + 'h', s)
    if(type(s) == tuple):
        return s[0]
    else:
        return s    

with:

len(s) = 2
len(fmt) = 1
calcsize(fmt) = 0
calcsize(fmt + 'h') = 2

However, Python returns:

struct.error: unpack requires a string argument of length 4

According to the Python struct.unpack documentation:

The string must contain exactly the amount of data required by the format (len(string) must equal calcsize(fmt)).

So if the length of my string is 2 and calcsize(fmt + 'h') is also 2, why does Python say "unpack requires a string argument of length 4"?

EDIT :

Thanks for all your answers. Here is the full code:

http://qtwork.tudelft.nl/gitdata/users/guen/qtlabanalysis/analysis_modules/general/lecroy.py

As you can see in the read_timetrace function, fmt is set to '<' or '>' in an if...else statement. Printing it confirms that.

You should also know that I'm working on Windows x64 (for work).

EDIT2

Here's the full traceback, sorry for the mistake.

Traceback (most recent call last):
  File "C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Include\readLecroyTRCFile.py", line 139, in <module>
    read_timetrace("C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Traces\KL.ES.001.001.trc")
  File "C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Include\readLecroyTRCFile.py", line 60, in read_timetrace
    WAVE_ARRAY_1        = ReadLong(fid, fmt, aWAVE_ARRAY_1)
  File "C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Include\readLecroyTRCFile.py", line 100, in ReadLong
    s = unpack(fmt + 'l', s)
struct.error: unpack requires a string argument of length 4
[Finished in 0.2s]

EDIT3:

I replaced readline with read and added:

print "len(s) ", len(s)
print "len(fmt) ", len(fmt)
print "calcsize(fmt) ", calcsize(fmt)
print "calcsize(fmt + 'h') ", calcsize(fmt + 'h')
print "fmt ", fmt

to the ReadLong function.

Here's the new output and traceback:

len(s)  4
len(fmt)  1
calcsize(fmt)  0
calcsize(fmt + 'h')  2
fmt  <
len(s)  4
len(fmt)  1
calcsize(fmt)  0
calcsize(fmt + 'h')  2
fmt  <
len(s)  4
len(fmt)  1
calcsize(fmt)  0
calcsize(fmt + 'h')  2
fmt  <
len(s)  1
len(fmt)  1
calcsize(fmt)  0
calcsize(fmt + 'h')  2
fmt  <
Traceback (most recent call last):
  File "C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Include\readLecroyTRCFile.py", line 143, in <module>
    read_timetrace("C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Traces\KL.ES.001.001.trc")
  File "C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Include\readLecroyTRCFile.py", line 60, in read_timetrace
    WAVE_ARRAY_1        = ReadLong(fid, fmt, aWAVE_ARRAY_1)
  File "C:\Users\maxime.vast\Desktop\Test Campaign Template\Test Suite\Include\readLecroyTRCFile.py", line 104, in ReadLong
    s = unpack(fmt + 'l', s)
struct.error: unpack requires a string argument of length 4
[Finished in 0.2s]
Maxime VAST asked Sep 11 '15 09:09

2 Answers

FWIW, you should be using read(2), not readline(2). And if the fmt string really is '>' you should not be getting that error. Here's a short demo that performs as expected.

from struct import unpack

fname = 'qbytes'

#Create a file of all byte values
with open(fname, 'wb') as f:
    f.write(bytearray(range(256)))

def ReadWord(fid, fmt, addr):
    fid.seek(addr)
    s = fid.read(2)
    s = unpack(fmt + 'h', s)
    return s[0]

fid = open(fname, 'rb')

for i in range(16):
    addr = i
    n = 256*i + i+1
    #Interpret file data as big-endian
    print i, ReadWord(fid, '>', addr), n

fid.close()

output

0 1 1
1 258 258
2 515 515
3 772 772
4 1029 1029
5 1286 1286
6 1543 1543
7 1800 1800
8 2057 2057
9 2314 2314
10 2571 2571
11 2828 2828
12 3085 3085
13 3342 3342
14 3599 3599
15 3856 3856

BTW, struct.unpack() always returns a tuple, even if the return value is a single item.
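To illustrate the tuple point, here is a minimal sketch. It is written for Python 3, where the buffer is a bytes object, and the variable names are only illustrative:

```python
from struct import unpack

# A format with a single item still yields a 1-tuple
single = unpack('>h', b'\x01\x02')
# A format with two items yields a 2-tuple
pair = unpack('>hh', b'\x01\x02\x03\x04')

print(single)     # (258,)
print(pair)       # (258, 772)
print(single[0])  # 258
```

So the type(s) == tuple check in the original ReadWord is always true, and indexing with [0] is enough.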


Using readline(2) on a binary file can give unexpected results. The test file in the above code contains a (Linux-style) newline byte, \x0a. So if you change s = fid.read(2) to s = fid.readline(2), everything works fine at first, but at address 10 it crashes because readline stops at that newline and returns only a single byte:

from struct import unpack

fname = 'qbytes'

#Create a file of all byte values
with open(fname, 'wb') as f:
    f.write(bytearray(range(256)))

def ReadWord(fid, fmt, addr):
    fid.seek(addr)
    s = fid.readline(2)
    print repr(s),
    s = unpack(fmt + 'h', s)
    return s[0]

with open(fname, 'rb') as fid:
    for i in range(16):
        addr = i
        n = 256*i + i+1
        #Interpret file data as big-endian
        print i, ReadWord(fid, '>', addr), n

output

0 '\x00\x01' 1 1
1 '\x01\x02' 258 258
2 '\x02\x03' 515 515
3 '\x03\x04' 772 772
4 '\x04\x05' 1029 1029
5 '\x05\x06' 1286 1286
6 '\x06\x07' 1543 1543
7 '\x07\x08' 1800 1800
8 '\x08\t' 2057 2057
9 '\t\n' 2314 2314
10 '\n'
Traceback (most recent call last):
  File "./qtest.py", line 30, in <module>
    print i, ReadWord(fid, '>', addr), n
  File "./qtest.py", line 22, in ReadWord
    s = unpack(fmt + 'h', s)
struct.error: unpack requires a string argument of length 2
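The same effect can be reproduced without touching the disk. This is only a sketch for Python 3, using an in-memory io.BytesIO as a stand-in for the test file; the variable names are illustrative:

```python
from io import BytesIO

# Same 256 byte values as the test file, held in memory
data = BytesIO(bytes(range(256)))

data.seek(10)
short_line = data.readline(2)  # stops right after the newline byte
data.seek(10)
full_read = data.read(2)       # ignores newlines, returns both bytes

print(repr(short_line))  # b'\n'
print(repr(full_read))   # b'\n\x0b'
```

readline treats its argument as an upper bound and returns early at a newline, whereas read only returns fewer bytes when it reaches the end of the data.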

postscript

You have several functions in your code that do almost the same thing. That breaks the DRY principle: Don't Repeat Yourself. Here's one way to fix that, using partial function application. See the functools docs for more info.

from functools import partial
from struct import unpack

def ReadNumber(fid, datalen=1, fmt='>', conv='b', addr=0):
    fid.seek(addr)
    s = fid.read(datalen)
    # Fail early with a clear message if the read came up short
    if len(s) != datalen:
        raise IOError('Read %d bytes but expected %d at %d' % (len(s), datalen, addr))
    return unpack(fmt+conv, s)[0]

ReadByte = partial(ReadNumber, datalen=1, conv='b')
ReadWord = partial(ReadNumber, datalen=2, conv='h')
ReadLong = partial(ReadNumber, datalen=4, conv='l')
ReadFloat = partial(ReadNumber, datalen=4, conv='f')
ReadDouble = partial(ReadNumber, datalen=8, conv='d')

You need to use keywords to call these new functions. E.g.,

ReadLong(fid, fmt='>', addr=addr)

True, that's slightly more long-winded, but it makes the code a little more readable.
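As a possible variation, the read length could be derived from the format with struct.calcsize instead of being passed separately. The sketch below assumes Python 3 and an explicit byte-order prefix ('<' or '>'), so that sizes are the standard ones; the names read_number, read_word and read_long are hypothetical, and io.BytesIO stands in for the real file:

```python
from functools import partial
from io import BytesIO
from struct import calcsize, unpack

def read_number(fid, conv, fmt='>', addr=0):
    """Read one value with format code `conv` at byte offset `addr`."""
    full_fmt = fmt + conv
    size = calcsize(full_fmt)  # number of bytes this format needs
    fid.seek(addr)
    s = fid.read(size)
    if len(s) != size:
        raise IOError('Read %d bytes but expected %d at %d' % (len(s), size, addr))
    return unpack(full_fmt, s)[0]

# Hypothetical helpers built the same way as above
read_word = partial(read_number, conv='h')
read_long = partial(read_number, conv='l')

fid = BytesIO(bytes(range(256)))
word = read_word(fid, fmt='>', addr=1)   # bytes 01 02, big-endian
long_ = read_long(fid, fmt='<', addr=0)  # bytes 00 01 02 03, little-endian
print(word, long_)  # 258 50462976
```

With this layout the length and the format code cannot drift apart, and a short read near the end of the file raises IOError instead of a struct.error.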

PM 2Ring answered Sep 21 '22 09:09



The length of the format string is unimportant on its own. What matters is which format codes it contains: some codes consume one byte, others as many as eight. So the number of bytes that s must contain depends entirely on the format.

For example:

>>> struct.unpack('b', 'A')
(65,)
>>> struct.unpack('L', 'A')

Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    struct.unpack('L', 'A')
error: unpack requires a string argument of length 4
>>> struct.unpack('L', 'AAAA')
(1094795585,)
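The sizes behind these examples can be checked with struct.calcsize. The following is a small sketch for Python 3. It uses the '<' prefix, which selects standard sizes; without a prefix, native sizes apply and codes such as 'l' and 'L' can differ between platforms (4 bytes on Windows, often 8 on 64-bit Linux):

```python
from struct import calcsize

# With an explicit byte-order prefix the sizes are platform-independent
sizes = {code: calcsize('<' + code) for code in 'bhlfd'}
print(sizes)          # {'b': 1, 'h': 2, 'l': 4, 'f': 4, 'd': 8}

# The prefix by itself consumes no bytes
print(calcsize('<'))  # 0
```

This matches the numbers in the question: calcsize(fmt) is 0 for '<', calcsize(fmt + 'h') is 2, and a format of fmt + 'l' needs 4 bytes.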

If fmt is really > as you say, then it should work fine:

>>> struct.unpack('>h', 'AA')
(16705,)

So I assume that when the error appears, fmt is not just >, but something else that would consume an additional 2 bytes. Try printing fmt before the unpack.

poke answered Sep 25 '22 09:09
