
What's the best (most efficient) way to search for content in a file and change it with PHP? [duplicate]

Tags: file, php, io

I have a file that I'm reading with PHP. I want to look for some lines that start with some white space and then some key words I'm looking for (for example, "project_name:") and then change other parts of that line.

Currently, the way I handle this is to read the entire file into a string variable, manipulate that string and then write the whole thing back to the file, fully replacing the entire file (via fopen( filepath, "wb" ) and fwrite()), but this feels inefficient. Is there a better way?

Don Rhummy asked Oct 22 '22 08:10


1 Answer

Update: After finishing my function I had time to benchmark it. I used a 1 GB file for testing, but the results were unsatisfying :|

Yes, the peak memory allocation is significantly smaller:

  • standard solution: 1.86 GB
  • custom solution: 653 KB (4096-byte buffer size)

But compared to the following solution there is only a slight performance boost:

ini_set('memory_limit', -1);

file_put_contents(
    'test.txt',
    str_replace('the', 'teh', file_get_contents('test.txt'))
);

The script above took ~16 seconds; the custom solution took ~13 seconds.

Summary: The custom solution is slightly faster on large files and consumes much less memory(!!!).

Also, if you want to run this in a web server environment, the custom solution is the better choice: many concurrent scripts that each load the whole file would likely consume all of the system's available memory.
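
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how time and peak memory could be collected. It is not the script used for the numbers above; the measure() helper and the small throwaway input file are assumptions made for illustration.

```php
<?php
// Hypothetical helper, not part of the original answer: runs $fn once and
// reports wall-clock time plus the peak memory PHP has allocated so far.
function measure(callable $fn): array {
    $start = microtime(true);
    $fn();
    return [
        'seconds'    => microtime(true) - $start,
        // true = real memory allocated from the system, in bytes
        'peak_bytes' => memory_get_peak_usage(true),
    ];
}

// Small throwaway input so the sketch runs anywhere; the answer used a 1 GB file.
$file = tempnam(sys_get_temp_dir(), 'bench');
file_put_contents($file, str_repeat("the quick brown fox\n", 1000));

$result = measure(function () use ($file) {
    file_put_contents(
        $file,
        str_replace('the', 'teh', file_get_contents($file))
    );
});

printf("%.3f s, peak %d bytes\n", $result['seconds'], $result['peak_bytes']);
```

Because memory_get_peak_usage() reports the peak for the whole process, each variant should be measured in its own PHP process, otherwise the larger peak hides the smaller one.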


Original Answer:

The only thing that comes to mind is to read the file in chunks that match the file system's block size and write the content, or the modified content, to a temporary file. Once processing has finished, you use rename() to overwrite the original file.

This would reduce the memory peak and should be significantly faster if the file is really large.

Note: On a Linux system you can get the file system block size using:

sudo dumpe2fs /dev/yourdev | grep 'Block size'

I got 4096 (bytes).
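
If you would rather not shell out, PHP's stat() also exposes a block size. The sketch below assumes that the 'blksize' field (the preferred I/O block size, which may differ from what dumpe2fs reports) is good enough as a buffer size; io_block_size() is a hypothetical helper name.

```php
<?php
// stat() reports the preferred I/O block size for a file as 'blksize'.
// On systems that do not support it (e.g. Windows) the value is -1,
// so fall back to 4096, the value measured above.
function io_block_size(string $path, int $fallback = 4096): int {
    $info = @stat($path);
    if ($info === false || $info['blksize'] <= 0) {
        return $fallback;
    }
    return $info['blksize'];
}

echo io_block_size(__FILE__), "\n";
```

The result could then be passed as the buffer size argument of the function below.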

Here is the function:

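function freplace($search, $replace, $filename, $buffersize = 4096) {

    $len = strlen($search);
    if($len === 0) {
        die('search string must not be empty');
    }

    $fd1 = fopen($filename, 'r');
    if(!is_resource($fd1)) {
        die('error opening file');
    }

    // the tempfile can be anywhere, as long as it is on the same partition
    // as the original, so we create it in the original's directory
    $tmpfile = tempnam(dirname($filename), 'freplace');
    $fd2 = fopen($tmpfile, 'w');
    if(!is_resource($fd2)) {
        die('error opening temporary file');
    }

    // we carry up to len(search) - 1 unprocessed chars from the end of the
    // buffer into the next loop: this is the maximum number of chars of the
    // search string that can sit on the border between two buffers
    $tmp = '';
    while(!feof($fd1)) {
        // prepend the rest from the last loop
        $buffer = $tmp . fread($fd1, $buffersize);
        // find the end of the last complete match, scanning left to right
        $end = 0;
        while(($pos = strpos($buffer, $search, $end)) !== false) {
            $end = $pos + $len;
        }
        // cut after the last match, keeping at most len(search) - 1 chars
        $cut = max($end, strlen($buffer) - ($len - 1));
        // hold back the rest unreplaced, so nothing gets replaced twice
        $tmp = (string) substr($buffer, $cut);
        // replace in the part before the cut and write it
        fwrite($fd2, str_replace($search, $replace, substr($buffer, 0, $cut)));
    }

    // the rest is shorter than the search string, so it cannot contain a match
    if($tmp !== '') {
        fwrite($fd2, $tmp);
    }

    fclose($fd1);
    fclose($fd2);
    rename($tmpfile, $filename);
}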

Call it like this:

freplace('foo', 'bar', 'test.txt');
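
For the line-oriented case from the question (lines that start with white space followed by a keyword such as "project_name:"), the same temp-file-plus-rename() idea can be applied line by line. This is only a sketch under assumptions: set_key_value() is a hypothetical helper, and "change other parts of that line" is taken to mean replacing everything after the colon.

```php
<?php
// Hypothetical helper: rewrites the value on every line that starts with
// optional spaces/tabs followed by "$key:", streaming one line at a time.
function set_key_value(string $filename, string $key, string $value): void {
    $in = fopen($filename, 'r');
    if ($in === false) {
        throw new RuntimeException("cannot open $filename");
    }
    // same directory as the original so rename() stays on one file system
    $tmpfile = tempnam(dirname($filename), 'tmp');
    $out = fopen($tmpfile, 'w');

    // group 1 = indentation plus "key:"; the rest of the line (without the
    // line ending) is what gets replaced
    $pattern = '/^([ \t]*' . preg_quote($key, '/') . ':)[^\r\n]*/';
    while (($line = fgets($in)) !== false) {
        // a callback avoids $1 or \1 in $value being read as a backreference
        $line = preg_replace_callback($pattern, function ($m) use ($value) {
            return $m[1] . ' ' . $value;
        }, $line);
        fwrite($out, $line);
    }

    fclose($in);
    fclose($out);
    rename($tmpfile, $filename);
}

// Usage on a throwaway file:
$file = tempnam(sys_get_temp_dir(), 'cfg');
file_put_contents($file, "title: x\n  project_name: old\nother: y\n");
set_key_value($file, 'project_name', 'new');
echo file_get_contents($file);
```

Only one line is held in memory at a time, so the peak stays small as long as individual lines are not huge. Note that tempnam() creates the file with restrictive permissions, so the original file's mode may need to be restored after the rename().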
14 revs answered Oct 27 '22 09:10