Reading large text files efficiently

I have a couple of large files (11 MB and 54 MB) that I need to read before the rest of my script can run. Currently I'm reading each file and storing it in an array like so:

$pricelist = array();
$fp = fopen($DIR.'datafeeds/pricelist.csv','r');
while (($line = fgetcsv($fp, 0, ",")) !== FALSE) { 
    if ($line) { 
        $pricelist[$line[2]] = $line;
    }
}
fclose($fp);

...but I keep getting memory-limit errors from my web host. How can I read the files more efficiently?

I don't need to store everything: I already have the keyword, which exactly matches the array key $line[2], and I need to read just that one line.

eozzy asked Mar 26 '15 20:03


1 Answer

If you know the key, why not filter by it while reading, so that only the matching line is kept? You can also call memory_get_usage() before and after filling your $pricelist array to see how much memory it takes.

echo memory_get_usage() . "\n";
$yourKey = 'some_key';
$pricelist = array();
$fp = fopen($DIR.'datafeeds/pricelist.csv','r');
while (($line = fgetcsv($fp, 0, ",")) !== FALSE) { 
    if (isset($line[2]) && $line[2] === $yourKey) {
        $pricelist[$line[2]] = $line;
        break; // stop reading once the matching line is found
        /* If the key can appear on multiple lines, remove the break above
        and store each line in a separate array element:
        $pricelist[$line[2]][] = $line;
        */
    }
}
fclose($fp);
echo memory_get_usage() . "\n";
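
The same idea can be wrapped in a small reusable function. This is only a sketch, not part of the original answer: the name findCsvRow and its parameters are hypothetical, and it assumes the key sits in the third column (index 2), as in the question. It also checks that fopen() succeeded and closes the handle even when it returns early.

```php
<?php
/**
 * Hypothetical helper: scan a CSV file row by row and return the first row
 * whose column $column equals $key, or null if no row matches.
 * Only one row is held in memory at a time.
 */
function findCsvRow(string $path, string $key, int $column = 2): ?array
{
    $fp = fopen($path, 'r');
    if ($fp === false) {
        throw new RuntimeException("Unable to open $path");
    }

    try {
        // Enclosure and escape are passed explicitly; these are the
        // long-standing default values of fgetcsv().
        while (($line = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
            // Blank lines come back as [null], so isset() skips them.
            if (isset($line[$column]) && $line[$column] === $key) {
                return $line;
            }
        }
        return null;
    } finally {
        // Runs on both the early return and the fall-through.
        fclose($fp);
    }
}
```

With the question's file this might be called as findCsvRow($DIR.'datafeeds/pricelist.csv', 'some_key'). The file is still read from the start each time, so for many lookups per request an index or a database table would probably serve better.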
Ugur answered Sep 23 '22 19:09