
Replacing a line in a file without rewriting the entire file (in PHP)

Let's say I have a modestly sized text file (~850 KB, 10,000+ lines), and I want to replace a particular line (or several lines) scattered throughout the file.

The methods I know of all involve rewriting the whole file. What I currently do is read through the entire file line by line, writing to a .tmp file, and once I am done, I rename() the .tmp file over the original source file.
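For reference, the temp-file approach described above might look roughly like the sketch below. The function name replaceLinesViaTemp() and the shape of the $replacements array are assumptions made for illustration; they are not from the original post.

```php
<?php
// Hypothetical helper (not from the original post).
// $replacements maps 1-based line numbers to the new line text,
// or to null when the line should be removed.
function replaceLinesViaTemp(string $path, array $replacements): void
{
    $tmpPath = $path . '.tmp';
    $in = fopen($path, 'rb');
    $out = fopen($tmpPath, 'wb');
    if ($in === false || $out === false) {
        throw new RuntimeException("Cannot open $path or $tmpPath");
    }

    $lineNo = 0;
    while (($line = fgets($in)) !== false) {
        $lineNo++;
        if (array_key_exists($lineNo, $replacements)) {
            $new = $replacements[$lineNo];
            if ($new === null) {
                continue; // drop this line entirely
            }
            $line = $new . "\n"; // substitute the replacement text
        }
        fwrite($out, $line); // every other line is copied unchanged
    }

    fclose($in);
    fclose($out);

    // Swap the rewritten copy into place over the original file.
    rename($tmpPath, $path);
}
```

A call such as replaceLinesViaTemp('data.txt', [42 => 'new text', 100 => null]) would replace line 42 and delete line 100 (the file name is made up). Every byte of the file is still copied once, which is the cost the question is about.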

It works, but it is slow. And of course, as the file grows, so will execution times.

Is there another way (using PHP) to get the job done without having to rewrite the entire file every time a line or two needs to be replaced or removed?

Thanks! I looked around and could not find an answer to this on Stack Overflow.

Kovo asked Jan 17 '23 00:01


2 Answers

If the replacement is EXACTLY the same size as the original line, then you can simply fwrite() at that location in the file and all's well. But if it's a different length (shorter OR longer), you will have to rewrite the portion of the file that comes AFTER the replacement.
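A minimal sketch of both cases, assuming you already know the byte offset and byte length of the line being replaced. The helper name replaceAtOffset() and its parameters are hypothetical, and the tail of the file is held in memory here for simplicity, which may not suit very large files.

```php
<?php
// Hypothetical helper (not from the original answer).
// $offset    : byte position where the old line starts
// $oldLength : length of the old line in bytes, including its newline
// $newLine   : replacement text, including its newline
function replaceAtOffset(string $path, int $offset, int $oldLength, string $newLine): void
{
    $fh = fopen($path, 'r+b'); // read/write, no truncation
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }

    if (strlen($newLine) === $oldLength) {
        // Same size: overwrite in place, nothing else moves.
        fseek($fh, $offset);
        fwrite($fh, $newLine);
    } else {
        // Different size: everything after the old line must be rewritten.
        fseek($fh, $offset + $oldLength);
        $tail = stream_get_contents($fh); // assumption: the tail fits in memory

        fseek($fh, $offset);
        fwrite($fh, $newLine . $tail);

        // Cut off leftover bytes when the file got shorter.
        ftruncate($fh, $offset + strlen($newLine) + strlen($tail));
    }

    fclose($fh);
}
```

Only the part of the file from the replaced line onward is touched, so a replacement near the end of the file is cheap while one near the start costs about as much as a full rewrite.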

There is no way around this. Shorter 'new' lines will leave a gap in the file, and longer 'new' lines would overwrite the first part of the next line.
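To see both failure modes, here is a small demonstration that writes a replacement of the wrong length straight over the old line. naiveOverwrite() is a made-up helper that works on a throwaway temp file; the sample strings are invented for the demo.

```php
<?php
// Hypothetical demo helper: writes $content to a temp file, overwrites
// bytes starting at $offset with $newLine, and returns the resulting content.
function naiveOverwrite(string $content, int $offset, string $newLine): string
{
    $path = tempnam(sys_get_temp_dir(), 'demo');
    file_put_contents($path, $content);

    $fh = fopen($path, 'r+b');
    fseek($fh, $offset);
    fwrite($fh, $newLine); // overwrites bytes in place, never inserts
    fclose($fh);

    $result = file_get_contents($path);
    unlink($path);
    return $result;
}

// Shorter replacement: the remains of the old line are left behind.
var_dump(naiveOverwrite("aaa\nbbbbbb\nccc\n", 4, "XX\n"));
// string(15) "aaa\nXX\nbbb\nccc\n"  (shown with \n escapes)

// Longer replacement: the start of the next line is overwritten.
var_dump(naiveOverwrite("aaa\nbb\ncccccc\n", 4, "XXXX\n"));
// string(14) "aaa\nXXXX\ncccc\n"  (two of the six c's are gone)
```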

Basically, you're asking if it's possible to insert a piece of wood in the middle of another board without having to move the original board around.

Marc B answered Jan 26 '23 00:01


You can't, because of the way files are stored on common filesystems. A file always takes up one or more 'blocks' of disk space, where a block is, for example, 4096 bytes in size. A file that holds one byte of data will still occupy one whole block (consuming 4096 bytes of available disk space), while a file of 4097 bytes will occupy two blocks (taking up 8192 bytes).
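The arithmetic can be sketched as follows. allocatedBytes() is a hypothetical helper, and the 4096-byte block size is just the example figure used above; real filesystems vary.

```php
<?php
// Hypothetical helper: bytes of disk space taken by a file of $fileSize
// bytes, assuming whole blocks of $blockSize bytes are always allocated.
function allocatedBytes(int $fileSize, int $blockSize = 4096): int
{
    // Round the size up to a whole number of blocks.
    $blocks = intdiv($fileSize + $blockSize - 1, $blockSize);
    return $blocks * $blockSize;
}

echo allocatedBytes(1), "\n";    // 4096 (one block)
echo allocatedBytes(4096), "\n"; // 4096 (exactly one block)
echo allocatedBytes(4097), "\n"; // 8192 (two blocks)
```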

If you remove one byte from a file, there will be a gap of one byte inside one of the blocks it occupies, and a block cannot be stored on disk with a gap in it. You have to shift all the following bytes one byte toward the beginning of the file, which affects the current block and every block after it.

The other way around, adding bytes in the middle of a block, has the same problem: you'll have one or more bytes that no longer fit in the 4096 bytes, so they have to shift into the next block, and so on, until the end of the file (and its last block) has been reached.

The only place where you can have unoccupied bytes in a block is at the end of the last block of a file.

CodeCaster answered Jan 25 '23 22:01