
Python - Small Change to a Huge File

Tags: python

This is a theoretical question as I don't have an actual problem, but I got to wondering ...

If I had a huge file, say many gigabytes long, and I wanted to change a single byte whose offset I knew, how could I do this efficiently? Is there a way to write only that single byte, without rewriting the entire file?

I'm not seeing anything in the Python file API that would let me write to a particular offset in a file.

fthinker asked Mar 16 '12 15:03


3 Answers

As long as you don't need to insert or delete bytes, you can open the file in "r+" mode, use the seek method to position the file object at the byte to change, and write out one byte.
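A minimal sketch of that approach, assuming Python 3, where the file has to be opened as "r+b" so that a raw byte can be written. The helper name patch_byte and the throwaway demo file are made up for illustration.

```python
import os
import tempfile


def patch_byte(path, offset, value):
    """Overwrite the single byte at `offset` with `value` (0-255), in place."""
    # "r+b": read/write, binary, and the existing contents are not truncated.
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(bytes([value]))


# Demonstration on a small temporary file (a stand-in for the huge one).
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with open(demo_path, "wb") as f:
    f.write(b"hello world")

patch_byte(demo_path, 0, ord("J"))

with open(demo_path, "rb") as f:
    patched = f.read()
os.remove(demo_path)

print(patched)  # b'Jello world'
```

The file length stays the same: the byte at the offset is replaced and nothing after it moves.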

It may be more efficient to use the lower-level os.open, os.lseek, os.read, and os.write operations, which do not do any application-level buffering.
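For comparison, here is a sketch of the same single-byte change using the os-level calls. The helper name patch_byte_lowlevel is hypothetical, and the O_BINARY lookup is an assumption added so the snippet also behaves on Windows, where that flag exists.

```python
import os
import tempfile


def patch_byte_lowlevel(path, offset, value):
    """Overwrite one byte using unbuffered OS-level file descriptor calls."""
    # O_BINARY only exists on Windows; fall back to 0 elsewhere.
    flags = os.O_RDWR | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    try:
        os.lseek(fd, offset, os.SEEK_SET)       # absolute position
        return os.write(fd, bytes([value]))     # number of bytes written
    finally:
        os.close(fd)


# Demonstration on a small temporary file.
tmp_fd, demo_path = tempfile.mkstemp()
os.close(tmp_fd)
with open(demo_path, "wb") as f:
    f.write(b"0123456789")

written = patch_byte_lowlevel(demo_path, 5, ord("X"))

with open(demo_path, "rb") as f:
    result = f.read()
os.remove(demo_path)

print(written, result)  # 1 b'01234X6789'
```

Whether this is measurably faster than the buffered file object for a one-byte write is something to check on your own system rather than assume.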

If you do need to insert or delete bytes, sorry, you're out of luck: there is no way to do that without rewriting the entire file (from the point of the first insertion or deletion). This is a limitation of the POSIX (and AFAIK also Windows) low-level file APIs, not of Python specifically.
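To make the cost concrete, here is one possible sketch of an insertion that rewrites everything from the insertion point onward. The function name insert_bytes and the chunk_size parameter are invented for this example; it shifts the tail of the file toward the end in chunks, working backwards so that no data is overwritten before it has been copied.

```python
import os
import tempfile


def insert_bytes(path, offset, data, chunk_size=1 << 20):
    """Insert `data` at `offset` by shifting the rest of the file forward."""
    shift = len(data)
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()                      # current end of file
        # Copy the tail from the back so chunks never clobber uncopied data.
        while pos > offset:
            start = max(offset, pos - chunk_size)
            f.seek(start)
            chunk = f.read(pos - start)
            f.seek(start + shift)
            f.write(chunk)
            pos = start
        # The gap at `offset` is now free to receive the new bytes.
        f.seek(offset)
        f.write(data)


# Demonstration; a tiny chunk size is used to exercise the loop.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with open(demo_path, "wb") as f:
    f.write(b"hello world")

insert_bytes(demo_path, 5, b", big", chunk_size=4)

with open(demo_path, "rb") as f:
    inserted = f.read()
os.remove(demo_path)

print(inserted)  # b'hello, big world'
```

Every byte after the insertion point gets read and written once, so for a multi-gigabyte file an insertion near the start costs about as much as copying the whole file.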

zwol answered Oct 04 '22 21:10


You can seek() to a position and write a single byte. It will overwrite what's there, rather than inserting.

ForeverConfused answered Oct 04 '22 21:10


Seek to that position in the file and write a single byte. File objects in Python have a seek method that takes an integer offset, measured from a reference point selected by the whence argument:

seek(offset[, whence])

The whence argument is optional and defaults to 0 (absolute file positioning); other values are 1 (seek relative to the current position) and 2 (seek relative to the file's end).
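A short sketch of the three whence values. An in-memory io.BytesIO is used here as a stand-in for a real file opened in binary mode (an assumption made to keep the example self-contained); os.SEEK_SET, os.SEEK_CUR and os.SEEK_END are the named constants for 0, 1 and 2.

```python
import io
import os

buf = io.BytesIO(b"0123456789")

buf.seek(3, os.SEEK_SET)    # whence=0: 3 bytes from the start
pos_abs = buf.tell()        # 3

buf.seek(2, os.SEEK_CUR)    # whence=1: 2 bytes past the current position
pos_cur = buf.tell()        # 5

buf.seek(-1, os.SEEK_END)   # whence=2: 1 byte before the end
pos_end = buf.tell()        # 9

buf.write(b"X")             # overwrites the last byte, does not insert
result = buf.getvalue()

print(pos_abs, pos_cur, pos_end, result)  # 3 5 9 b'012345678X'
```

Note that in Python 3, files opened in text mode only allow nonzero offsets relative to the start, so binary mode is the safer choice for byte-level patching.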

Hunter McMillen answered Oct 02 '22 21:10