
Removing a line from the middle of a large file in PHP

Tags:

file

php

I am trying to remove a line from the middle of a large file (> 20 MB). I know the byte position in the file where the line to be removed starts.

Here is what I currently have.

/**
 * Removes a line at a position from the file
 * @param  [int] $position  The position at the start of the line to be removed
 */
public function removeLineAt($position)
{
    // "r+" is the documented read/write mode ("rw+" is not a valid mode string)
    $fp = fopen($this->filepath, "r+");
    fseek($fp, $position);

    $nextLinePosition = $this->getNextLine($position, $fp);
    $lengthRemoved = $position - $nextLinePosition;
    $fpTemp = fopen('php://temp', "r+");

    // Copy the bottom half (starting at line below the line to be removed)
    stream_copy_to_stream($fp, $fpTemp, -1, $nextLinePosition);

    // Seek to the start of the line to be removed
    fseek($fp, $position);
    rewind($fpTemp);

    // Copy the bottom half over the line to be removed
    stream_copy_to_stream($fpTemp, $fp);        

    fclose($fpTemp);
    fclose($fp);
}

The code above does remove the line from the file. However, the copied tail is shorter than the region it overwrites, and the file keeps its original length, so the last bytes of the original file are left in place and the tail end appears twice.

For example, if the original file was:

  1. a
  2. b
  3. c
  4. d
  5. e

then the file after removing line 3 may look like:

  1. a
  2. b
  3. d
  4. e
  5. e
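The duplication can be reproduced in isolation. The following is a minimal sketch, assuming a five-line file of one-character lines held in php://memory; the byte offsets 4 and 6 are specific to this illustrative data and are not taken from the real file.

```php
<?php
// Illustrative data: five one-character lines, 10 bytes in total.
$fp = fopen('php://memory', 'r+');
fwrite($fp, "a\nb\nc\nd\ne\n");

// The line "c\n" starts at byte 4; the next line starts at byte 6.
$position = 4;
$nextLinePosition = 6;

// Read the tail that follows the line to be removed ("d\ne\n", 4 bytes).
fseek($fp, $nextLinePosition);
$tail = stream_get_contents($fp);

// Overwrite from the start of the removed line. Only 4 of the 6 bytes
// after $position are overwritten, so the last 2 bytes ("e\n") survive.
fseek($fp, $position);
fwrite($fp, $tail);

rewind($fp);
$result = stream_get_contents($fp);
fclose($fp);

echo json_encode($result), PHP_EOL; // prints "a\nb\nd\ne\ne\n"
```

The stream is still 10 bytes long after the overwrite, which is why the final "e" line shows up twice.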

I have thought about trimming the end of the main file by the $lengthRemoved amount, but I can't think of an easy way to do that either.

Any suggestions?

  • Note: The file has > 200,000 lines, sometimes > 300,000. Loading the entire file into an array (in memory) seems pretty inefficient, which is why I tried the approach above, but I ran into this one issue.

For others who are looking for an answer, here is the final function I came up with thanks to your help! Modify it to fit your needs.

/**
 * Removes a line at a position from the file
 * @param  [int] $position  The position at the start of the line to be removed
 */
public function removeLineAt($position)
{
    // "r+" is the documented read/write mode ("rw+" is not a valid mode string)
    $fp = fopen($this->filepath, "r+");
    fseek($fp, $position);

    $nextLinePosition = $this->getNextLine($position, $fp);
    // Number of bytes occupied by the line being removed
    $lengthRemoved = $nextLinePosition - $position;
    $fpTemp = fopen('php://temp', "r+");

    // Copy the bottom half (starting at line below the line to be removed)
    stream_copy_to_stream($fp, $fpTemp, -1, $nextLinePosition);

    // Shrink the file by the length of the removed line
    $newFileSize = ($this->totalBytesInFile($fp) - $lengthRemoved);
    ftruncate($fp, $newFileSize);

    // Seek to the start of the line to be removed
    fseek($fp, $position);
    rewind($fpTemp);

    // Copy the bottom half over the line to be removed
    stream_copy_to_stream($fpTemp, $fp);        

    fclose($fpTemp);
    fclose($fp);
}
Asked by user4775085 on Sep 03 '15 at 16:09


2 Answers

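Since your file is very large, you may want to use the sed command via exec(), if your PHP installation allows that function. Note that sed addresses lines by number, not by byte position.

exec("sed -i '3d' fileName.txt");

Here 3 is the number of the line to delete, and -i makes sed edit the file in place rather than print the result to standard output (GNU sed syntax; BSD/macOS sed expects -i '').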
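If the line number or file name comes from a variable, the command is better assembled with escaping and the exit code checked. The following is only a sketch: buildSedDeleteCommand() and deleteLineWithSed() are hypothetical helper names, and the -i flag assumes GNU sed.

```php
<?php
/**
 * Builds a sed command that deletes one line in place.
 * Assumes GNU sed; BSD/macOS sed expects `-i ''` instead of `-i`.
 */
function buildSedDeleteCommand(int $lineNumber, string $path): string
{
    if ($lineNumber < 1) {
        throw new InvalidArgumentException('Line numbers start at 1.');
    }
    // escapeshellarg() guards against spaces and shell metacharacters.
    return 'sed -i ' . escapeshellarg($lineNumber . 'd') . ' ' . escapeshellarg($path);
}

/**
 * Runs the command and reports whether sed exited with status 0.
 */
function deleteLineWithSed(int $lineNumber, string $path): bool
{
    exec(buildSedDeleteCommand($lineNumber, $path), $output, $exitCode);
    return $exitCode === 0;
}
```

A call such as deleteLineWithSed(3, 'fileName.txt') would then stand in for the one-liner above. exec() must not be listed in disable_functions for this to work.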

Answered by cmorrissey on Oct 02 '22 at 12:10


I think you are pretty close to a solution.

I would stick with your idea of trimming $lengthRemoved bytes from the end of the file, and suggest calling ftruncate($handle, $size) before fclose(), where $size is the size to truncate to (size = originalFilesize - lengthRemoved).

http://www.php.net/manual/en/function.ftruncate.php
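As a rough, self-contained sketch of this approach (not the asker's class method): the standalone function below takes the path as a parameter and uses fgets() in place of the getNextLine() helper, both of which are assumptions made for illustration. It truncates at the start of the removed line and then appends the buffered tail, which ends at the same size as originalFilesize - lengthRemoved.

```php
<?php
/**
 * Removes the line that starts at byte offset $position.
 * Hypothetical standalone variant; fgets() stands in for getNextLine().
 */
function removeLineAt(string $path, int $position): void
{
    $fp = fopen($path, 'r+');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }

    // Reading one line from $position leaves the pointer at the next line.
    fseek($fp, $position);
    if (fgets($fp) === false) {
        fclose($fp);
        return; // Nothing to remove at or beyond the end of the file.
    }

    // Buffer everything after the removed line; php://temp keeps small
    // data in memory and switches to a temporary file for larger data.
    $tmp = fopen('php://temp', 'r+');
    stream_copy_to_stream($fp, $tmp);

    // Cut the file at the start of the removed line, then append the tail.
    ftruncate($fp, $position);
    fseek($fp, $position);
    rewind($tmp);
    stream_copy_to_stream($tmp, $fp);

    fclose($tmp);
    fclose($fp);
}

// Usage with illustrative data: remove "c\n", which starts at byte 4.
$path = tempnam(sys_get_temp_dir(), 'rm');
file_put_contents($path, "a\nb\nc\nd\ne\n");
removeLineAt($path, 4);
$contents = file_get_contents($path);
unlink($path);
```

Truncating before the tail is written back means the file never contains stale bytes past its new end, so no separate size calculation is needed in this variant.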

Answered by Jens A. Koch on Oct 02 '22 at 14:10