 

Storage size of array.array in a file

Tags: python, file-io

I'm using array.array to store many fixed-size numerical records in binary format in a large file, which I would like to process in parallel, in chunks. I write the data with, e.g., array.array('l', range(20)).tofile(fout). How can I calculate the offsets to use with seek to ensure that I chunk at record boundaries?

AatG asked Dec 19 '25 19:12


1 Answer

Let's take an array object:

>>> import array
>>> a = array.array('l', range(20))

The size of each element, in bytes (this depends on the typecode and the platform):

>>> a.itemsize
4
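
Note that the 4 above is what this particular machine reports. The 'l' typecode maps to the platform's C long, which is commonly 8 bytes on 64-bit Linux and macOS builds, so offsets are best computed from itemsize rather than hard-coded. A small sketch (written for Python 3) that queries the size for a few typecodes:

```python
import array

# itemsize is determined by the typecode and the platform's C types,
# so ask for it instead of assuming a value.
sizes = {code: array.array(code).itemsize for code in "ilq"}
for code, size in sizes.items():
    print(code, size)
```

If the file has to be read on a different machine, a typecode such as 'q' (C long long) is less likely to change size between platforms than 'l'.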

Write it out:

>>> f = open('array.dat', "wb")
>>> a.tofile(f)
>>> f.close()

Sanity check:

>>> import os
>>> os.stat('array.dat').st_size
80L
>>> len(a) * a.itemsize
80

So the file has the expected number of bytes. Now read back the value at index 7, say:

>>> f = open('array.dat', 'rb')
>>> f.seek(7 * a.itemsize)
>>> raw = f.read(a.itemsize)
>>> import struct
>>> struct.unpack(a.typecode, raw)
(7,)
>>> f.close()
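
The same arithmetic extends to chunking. If a record is a fixed number of items, every multiple of items_per_record * itemsize is a record boundary. The following is only a sketch for Python 3: the helper names chunk_ranges and read_chunk, the record layout of 2 items per record, and the chunk size of 4 records are all assumptions made for illustration and are not part of the question.

```python
import array
import os
import tempfile


def chunk_ranges(path, typecode, items_per_record, records_per_chunk):
    """Yield (byte_offset, n_records) pairs aligned to record boundaries."""
    record_bytes = array.array(typecode).itemsize * items_per_record
    total_bytes = os.path.getsize(path)
    if total_bytes % record_bytes:
        raise ValueError("file size is not a whole number of records")
    total_records = total_bytes // record_bytes
    for first in range(0, total_records, records_per_chunk):
        n_records = min(records_per_chunk, total_records - first)
        yield first * record_bytes, n_records


def read_chunk(path, typecode, items_per_record, offset, n_records):
    """Seek to a record boundary and read n_records records into an array."""
    chunk = array.array(typecode)
    with open(path, "rb") as f:
        f.seek(offset)
        chunk.fromfile(f, n_records * items_per_record)
    return chunk


# Demo: 20 items written as in the question, treated as 10 records of 2 items.
path = os.path.join(tempfile.mkdtemp(), "array.dat")
with open(path, "wb") as fout:
    array.array("l", range(20)).tofile(fout)

ranges = list(chunk_ranges(path, "l", items_per_record=2, records_per_chunk=4))
chunks = [read_chunk(path, "l", 2, offset, n) for offset, n in ranges]
```

Each (offset, n_records) pair is independent of the others, so the pairs can be handed to separate workers, each of which opens the file itself and reads only its own range.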

Clear?

Tim Peters answered Dec 22 '25 10:12


