I am attempting to speed up a binary file parser I wrote last year by doing the parsing/data accumulation in numpy. numpy's ability to define customized data structures and slurp data from a binary file into them looks like what I need, except some of the fields in these files are unsigned integers of "nonstandard" length (e.g. 6 bytes). Since I am using Python 2.7, I made my own emulated version of int.from_bytes to handle these fields, but if there is any way to read these fields to integers natively in numpy, that would obviously be much faster and preferable.
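For reference, one way to emulate `int.from_bytes` for big-endian unsigned fields on Python 2.7 is to left-pad the field to 8 bytes and unpack with `struct` — a minimal sketch (the function name is mine, and it assumes fields of at most 8 bytes):

```python
import struct

def int_from_bytes_be(data):
    # Left-pad the big-endian field out to 8 bytes, then unpack it
    # as an unsigned 64-bit integer ('>Q'); handles widths up to 8.
    return struct.unpack('>Q', b'\x00' * (8 - len(data)) + data)[0]

print(int_from_bytes_be(b'\x01\x00\x00\x00\x00\x00'))  # 2**40 == 1099511627776
```

This is per-field Python-level work, so it is exactly the kind of inner loop the vectorised answer below avoids.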
NumPy doesn't support integers of arbitrary byte length, and emulating them with ctypes bitfields would be more trouble than it's worth.
I'd suggest using vectorised slicing to widen each 6-byte field into the next-larger standard integer size (8 bytes):
import numpy as np

# 18-byte buffer: three 6-byte big-endian records (Python 2 str is bytes)
buf = "000000111111222222"

# View the raw bytes as an array of big-endian 8-bit integers
a = np.ndarray(len(buf), np.dtype('>i1'), buf)

# One zero-filled big-endian 64-bit slot per 6-byte record
e = np.zeros(len(buf) / 6, np.dtype('>i8'))

# Copy each 2-byte word of every record into the low 6 bytes of its
# 64-bit slot; word 0 of each slot stays zero
for i in range(3):
    e.view(dtype='>i2')[i + 1::4] = a.view(dtype='>i2')[i::3]

[hex(x) for x in e]
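The same idea ports to Python 3 with only small adjustments: the buffer must be `bytes`, the division must be integer (`//`), and `np.frombuffer` is a convenient way to get the byte view (my substitution here; the `np.ndarray(..., buffer)` form also works). A sketch with three assumed sample records:

```python
import numpy as np

# Assumed sample buffer: three big-endian 6-byte unsigned integers
# with values 1, 0x1000000, and 0x123456789abc.
buf = (b'\x00\x00\x00\x00\x00\x01'
       b'\x00\x00\x01\x00\x00\x00'
       b'\x12\x34\x56\x78\x9a\xbc')

a = np.frombuffer(buf, dtype='>i1')      # raw bytes
e = np.zeros(len(buf) // 6, dtype='>i8')  # zeroed 64-bit slots

# Move each 2-byte word into the low 6 bytes of its 64-bit slot
for i in range(3):
    e.view('>i2')[i + 1::4] = a.view('>i2')[i::3]

print([hex(int(x)) for x in e])  # ['0x1', '0x1000000', '0x123456789abc']
```

Because the copies are whole-array slices, the Python loop runs only three times regardless of how many records the buffer holds.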