
Writing binary data to middle of a sparse file

I need to assemble a binary file from pieces that arrive in random order (yes, it's a P2P project).

def write(filename, offset, data):
    f = open(filename, "ab")
    f.seek(offset)
    f.write(data)
    f.close()

Say I do a 32KB write(f, o, d) at offset 1MB into the file, and then another 32KB write(f, o, d) at offset 0.

I end up with a file 64KB (65,536 bytes) in length, i.e. the gap of zeros between 32KB and 1MB is truncated/disappears.

I am aware this may appear an incredibly stupid question, but I cannot seem to figure it out from the file.open(..) modes

Advice gratefully received.

*** UPDATE

My method to write P2P pieces ended up as follows (for those who may glean some value from it)

def writePiece(self, filename, pieceindex, bytes, ipsrc, ipdst, ts): 
    file = open(filename,"r+b")
    if pieceindex not in self.piecemap[ipdst]:
        little = struct.pack('<'+'B'*len(bytes), *bytes) 
        # Seek to offset based on piece index 
        file.seek(pieceindex * self.piecesize)
        file.write(little)
        file.flush()
        self.procLog.info("Wrote (%d) bytes of piece (%d) to %s" % (len(bytes), pieceindex, filename))

    # Remember we have this piece now in case duplicates arrive 
    self.piecemap[ipdst][pieceindex] = True
    file.close()

Note: I also addressed some endian issues using struct.pack which plagued me for a while.
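For readers who want to try the idea in isolation, here is a minimal, self-contained sketch of the same approach: seek to piece index times piece size, write the payload, and skip duplicates. It assumes Python 3, and the names `write_piece`, `demo_pieces`, `PIECE_SIZE` and the `seen` set are hypothetical stand-ins for the method and `piecemap` above, not the original class. It also assumes each piece arrives as a sequence of integers in the range 0-255, in which case `bytes(values)` produces the same output as packing every value with the `B` format.

```python
import os
import tempfile

PIECE_SIZE = 4  # hypothetical piece size, kept tiny so the result is easy to inspect


def write_piece(path, piece_index, values, seen, piece_size=PIECE_SIZE):
    """Write one piece at its offset unless it was already stored.

    Returns True if the piece was written, False if it was a duplicate.
    """
    if piece_index in seen:
        return False
    payload = bytes(values)  # assumes values are ints in range(256)
    # "r+b" updates in place without truncating; fall back to "w+b" to create the file.
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        f.seek(piece_index * piece_size)
        f.write(payload)
    seen.add(piece_index)
    return True


def demo_pieces():
    """Write pieces out of order, including one duplicate, and return the outcome."""
    seen = set()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "payload.bin")
        results = [
            write_piece(path, 2, [9, 10, 11, 12], seen),  # last piece arrives first
            write_piece(path, 0, [1, 2, 3, 4], seen),     # first piece arrives second
            write_piece(path, 2, [0, 0, 0, 0], seen),     # duplicate, skipped
        ]
        with open(path, "rb") as f:
            content = f.read()
    return results, content
```

Calling `demo_pieces()` returns the per-call results and the file contents; piece 1 never arrives, so its four bytes read back as zeros.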

For anyone wondering, the project I am working on is to analyse BT messages captured directly off the wire.

asked Aug 04 '10 16:08 by codeasone



2 Answers

>>> import os
>>> filename = 'tempfile'
>>> def write(filename,data,offset):
...     try:
...         f = open(filename,'r+b')
...     except IOError:
...         f = open(filename,'wb')
...     f.seek(offset)
...     f.write(data)
...     f.close()
...
>>> write(filename,b'1' * (1024*32),1024*1024)
>>> write(filename,b'1' * (1024*32),0)
>>> os.path.getsize(filename)
1081344
answered Nov 15 '22 05:11 by MattH



You opened the file in append ("a") mode. All writes are going to the end of the file, irrespective of the calls to seek().
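The difference can be checked directly. The sketch below (Python 3 assumed; `write_at` and `demo` are hypothetical helper names) repeats the two writes from the question, once in "ab" mode and once in "r+b" mode, and reports the resulting file sizes.

```python
import os
import tempfile


def write_at(path, offset, data, mode):
    """Open path with the given mode, seek to offset, and write data."""
    with open(path, mode) as f:
        f.seek(offset)
        f.write(data)


def demo():
    """Return the file size after two out-of-order writes, for each open mode."""
    chunk = b"1" * (32 * 1024)
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        # Append mode: every write lands at the end of the file, so seek() has no effect.
        appended = os.path.join(tmp, "appended.bin")
        write_at(appended, 1024 * 1024, chunk, "ab")
        write_at(appended, 0, chunk, "ab")
        sizes["ab"] = os.path.getsize(appended)

        # Update mode: the file must already exist, and writes honour the seek position.
        updated = os.path.join(tmp, "updated.bin")
        open(updated, "wb").close()  # create an empty file so "r+b" can open it
        write_at(updated, 1024 * 1024, chunk, "r+b")
        write_at(updated, 0, chunk, "r+b")
        sizes["r+b"] = os.path.getsize(updated)
    return sizes
```

With append mode the file should end up at 65,536 bytes, the two chunks back to back, while with "r+b" it should be 1,081,344 bytes, i.e. 1MB plus the 32KB chunk written at that offset.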

answered Nov 15 '22 05:11 by Marcelo Cantos
