
Write (large amount of) zeros into a binary file

This might be a stupid question, but I'm unable to find a proper answer to it. I want to store (don't ask why) a (2000, 2000, 2000) array of zeros on disk in binary format. The traditional approach would be:

with open('myfile', 'wb') as f:
    # A file opened in binary mode needs bytes, not str, in Python 3.
    f.write(b'\0' * 4 * 2000 * 2000 * 2000)  # 4 bytes per float32 element

But that means building a very large string in memory first (about 32 GB), which is not necessary at all. I know of two other options:

  • Iterate over the elements and store one byte at a time (extremely slow)

  • Create a NumPy array and flush it to disk (as memory-hungry as the string creation in the example above)

I was hoping to find something like write(char, ntimes) (as it exists in C and other languages) that copies char to disk ntimes at C speed rather than at Python-loop speed, without having to create such a big array in memory.

Imanol Luengo asked Oct 30 '22 01:10


2 Answers

I don't know why you are making such a fuss about "Python's loop speed", but writing it like this

with open('myfile', 'wb') as f:
    for i in range(2000 * 2000):
        f.write(b'\0' * 4 * 2000)  # 2000 float32 values = 8000 zero bytes

will tell the OS to write 8000 zero bytes per call. After write returns, the next loop iteration calls it again.

It might be that the loop is executed slightly slower than it would be in C, but that definitely won't make a difference.
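
As a rough sketch of the same idea with one reusable buffer: the helper name write_zeros and the 1 MiB chunk size below are my own assumptions for illustration, not part of any library. The buffer is allocated once, so memory use stays at chunk_size regardless of the file size. The demo uses a small size so it runs quickly; the real case would pass 4 * 2000 * 2000 * 2000.

```python
import os
import tempfile


def write_zeros(path, total_bytes, chunk_size=1 << 20):
    """Write total_bytes zero bytes to path, reusing a single buffer.

    write_zeros and the default chunk_size are hypothetical choices
    made for this sketch.
    """
    chunk = bytes(chunk_size)  # zero-filled buffer, allocated once
    full_chunks, remainder = divmod(total_bytes, chunk_size)
    with open(path, "wb") as f:
        for _ in range(full_chunks):
            f.write(chunk)
        if remainder:
            # Final partial chunk so the size is exact.
            f.write(chunk[:remainder])


# Small demonstration: a (100, 100, 100) float32 array's worth of zeros.
demo_path = os.path.join(tempfile.mkdtemp(), "zeros.bin")
write_zeros(demo_path, 4 * 100 * 100 * 100)
```

A larger chunk_size means fewer write calls; somewhere in the range of a few hundred KiB to a few MiB is usually plenty, but that is worth measuring on your own system.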

If it is OK to have a sparse file, you can also seek to the position matching the desired file size and then truncate the file there.
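
A minimal sketch of the seek-and-truncate approach, assuming a filesystem that supports sparse files. The helper name make_zero_file is hypothetical. Python's documentation notes that when truncate extends a file, the contents of the new area depend on the platform, although on most systems it is zero-filled.

```python
import os
import tempfile


def make_zero_file(path, total_bytes):
    """Create a file of total_bytes without writing the data itself.

    make_zero_file is a hypothetical helper for this sketch.
    """
    with open(path, "wb") as f:
        f.seek(total_bytes)  # move past the (empty) end of the file
        f.truncate()         # resize the file to the current position


# Small demonstration size; the real case would be 4 * 2000 * 2000 * 2000.
sparse_path = os.path.join(tempfile.mkdtemp(), "sparse.bin")
make_zero_file(sparse_path, 4 * 100 * 100 * 100)
```

The reported file size matches the request, but on a sparse-capable filesystem almost no disk blocks are allocated until something is actually written.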

glglgl answered Nov 15 '22 07:11


This would be a valid answer to fill a file from Python using numpy's memmap:

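import numpy as np

filename = 'myfile'  # same file name as in the question
shape = (2000, 2000, 2000)  # or just (2000 * 2000 * 2000,)
fp = np.memmap(filename, dtype='float32', mode='w+', shape=shape)
fp[...] = 0
fp.flush()  # make sure the changes are written to disk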

Imanol Luengo answered Nov 15 '22 05:11