Getting one line in a huge file with PHP

Tags:

php

How can I get a particular line from a 3 GB text file? The lines are delimited by \n, and I need to be able to get any line on demand.

How can this be done? Only one line needs to be returned, and I would prefer not to use any system calls.

Note: The same question has been asked elsewhere about how to do this in bash. I would like to compare that with the PHP equivalent.

Update: Every line is the same length throughout the file.

JavaRocky asked May 08 '10 12:05



2 Answers

Without keeping some sort of index to the file, you would need to read all of it until you've encountered x number of \n characters. I see that nickf has just posted some way of doing that, so I won't repeat it.

To do this repeatedly in an efficient manner, you will need to build an index. Scan the file once and store the file positions of certain (or all) line numbers; you can then use fseek to jump to the right location.
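As a rough illustration of the index idea for files whose lines are not all the same length, here is a minimal sketch. The function names buildLineIndex and getLineWithIndex and the $step parameter are hypothetical and not part of the original answer. It records the byte offset of every $step-th line in one pass, then seeks to the nearest recorded offset and reads forward. Line numbers are 0-based, and the sketch assumes a PHP build whose integers can hold offsets beyond 2 GB.

```php
<?php
// Hypothetical helpers for illustration only.

// Scan the file once and record the byte offset at which every
// $step-th line starts. Keys of the result are 0-based line numbers.
function buildLineIndex(string $fileName, int $step = 1000): array
{
    $handle = fopen($fileName, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $fileName");
    }
    $index = [];
    $lineNumber = 0;
    while (true) {
        $offset = ftell($handle);       // position where the next line starts
        if (fgets($handle) === false) {
            break;                      // end of file
        }
        if ($lineNumber % $step === 0) {
            $index[$lineNumber] = $offset;
        }
        $lineNumber++;
    }
    fclose($handle);
    return $index;
}

// Seek to the closest indexed line at or before $lineNumber, then read
// forward. Returns the line without its trailing \n, or null if absent.
function getLineWithIndex(string $fileName, array $index, int $step, int $lineNumber): ?string
{
    if ($lineNumber < 0) {
        return null;
    }
    $base = intdiv($lineNumber, $step) * $step;
    if (!isset($index[$base])) {
        return null;
    }
    $handle = fopen($fileName, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $fileName");
    }
    fseek($handle, $index[$base]);
    $line = null;
    for ($i = $base; $i <= $lineNumber; $i++) {
        $line = fgets($handle);
        if ($line === false) {
            $line = null;               // file ended before the requested line
            break;
        }
    }
    fclose($handle);
    return $line === null ? null : rtrim($line, "\n");
}
```

A larger $step keeps the index small at the cost of reading up to $step - 1 extra lines per lookup; a $step of 1 stores every line's offset. In practice the index would be built once and saved somewhere, rather than rebuilt on every request.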

Edit: if each line is the same length, you do not need the index.

$myfile = fopen($fileName, "r");
fseek($myfile, $lineLength * $lineNumber); // jump straight to the start of the line
$line = fgets($myfile);                    // read up to and including the \n
fclose($myfile);

The line number is 0-based in this example, so if you count lines from 1 you need to subtract one first. The line length includes the \n character.
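A slightly more defensive variant of the same idea might look like the sketch below. The function name getFixedLengthLine is hypothetical. It takes a 1-based line number, measures the line length from the first line instead of requiring it as a parameter, and returns null when the requested line is past the end of the file. It assumes every line, including the last, ends with \n, and that PHP is a build whose integers can represent offsets into a 3 GB file.

```php
<?php
// Hypothetical helper for illustration only.
// Assumes all lines have the same byte length and end with "\n".
function getFixedLengthLine(string $fileName, int $lineNumber): ?string
{
    $handle = fopen($fileName, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $fileName");
    }

    // Measure the line length (including the "\n") from the first line.
    $first = fgets($handle);
    if ($first === false) {
        fclose($handle);
        return null;                    // empty file
    }
    $lineLength = strlen($first);

    // Convert the 1-based line number to a byte offset.
    $offset = $lineLength * ($lineNumber - 1);
    $size = fstat($handle)['size'];

    $line = null;
    if ($lineNumber >= 1 && $offset < $size && fseek($handle, $offset) === 0) {
        $read = fgets($handle);
        $line = $read === false ? null : rtrim($read, "\n");
    }
    fclose($handle);
    return $line;
}
```

Reading the first line on every call costs one extra small read; if the line length is already known, passing it in avoids that.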

Thorarin answered Oct 19 '22 23:10



There is little discussion of the problem, and no mention is made of how the 'one line' should be referenced (by number, by some value within it, etc.), so what follows is just a guess at what you want.

If you're not averse to using an object (it might be 'too high level', perhaps) and wish to reference the line by offset, then SplFileObject (available as of PHP 5.1.0) could be used. See the following basic example:

$file = new SplFileObject('myreallyhugefile.dat');
$file->seek(123456789); // seek() is zero-based, so this is line 123456790
echo $file->current(); // or simply, echo $file

That particular method (seek) has to scan through the file line by line. However, if, as you say, all the lines are the same length, you can instead use fseek to jump straight to the right byte offset, which is much, much faster.

$line_length = 1024; // each line is 1 KB, including the \n
$file->fseek($line_length * 1234567); // skip 1234567 whole lines' worth of bytes
echo $file->current(); // echo line 1234568
salathe answered Oct 19 '22 23:10
