
Why is CodeIgniter exhausting the allowed memory size?

I am getting a memory-exhausted error in a script that should not be accumulating any memory!

The application runs on Windows 8 Server / IIS / PHP 5.5 / CodeIgniter / MS SQL Server.

The error is as follows:

[23-May-2014 10:56:57 America/New_York] PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 1992 bytes) in C:\inetpub\wwwroot\application\models\DW_import.php on line 112

[23-May-2014 11:07:34 America/New_York] PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 2438 bytes) in C:\inetpub\wwwroot\application\models\DW_import.php on line 113

The script looks for several different CSV files in a directory and imports them into a database. Keep in mind the import files are huge, some up to 4 GB. As far as I can see, there are no variables that continually aggregate data and could lead to this issue. The script being run (this controller has only a model, no view) is as follows:

DW_import.php

<?php

class dw_import extends CI_Model {

    public function import(){

        global $file,$errLogFile,$logFile,$tableName, $fieldList, $file, $count, $line, $query;

        $this->load->database(); // init db connection

        // map file types to database tables
        $fileToDBArr = array(
            'Customers' => 'customer',
            'Customers_Historical' => 'customer_historical',
            'Orders' => 'order',
            'Customer_AR_Aggs' => 'customer_ar_aging_agg'
        );

        // extend timeout of this script
        ini_set('max_execution_time', 3600);

        // error handler to log errors and continue processing
        function myErrorHandler($errno,$errstr,$errfile,$errline){

            global $file,$errLogFile,$logFile,$tableName, $fieldList, $file, $count, $line, $query;
            // error - store in DB
            //echo "<br>[$errno $errstr $errfile $errline $tableName $file $count] $errLogFile<br>";
            $err = "#$errno $errstr $errfile on line $errline :: Table $tableName File $file Row# $count Headers: $fieldList Data: $line";
            echo $err;
            file_put_contents($errLogFile,$err,FILE_APPEND);
        };

        set_error_handler("myErrorHandler");

        // set temp error log file
        $errLogFile = "C:/Data_Updates/logs/general." . date('YmdHis') . ".errLog";

        // loop thru file types
        foreach($fileToDBArr as $fileType=>$table){
            // get the files for this import type
            $fileArr = glob('C:/Data_Updates/'.$fileType.'.*');
            sort($fileArr,SORT_STRING); // sort so earlier files (by date in file name) will process first

            // loop thru files found
            foreach($fileArr as $file){
                // set log file paths specific to this import file
                $errLogFile = str_replace('Data_Updates/','Data_Updates/logs/',$file) . "." . date('YmdHis') . ".errLog";
                $logFile = str_replace('Data_Updates/','Data_Updates/logs/',$file) . "." . date('YmdHis') . ".log";

                file_put_contents($logFile,"---BEGIN---",FILE_APPEND); // log

                // lets get the file type and translate it into a table name
                preg_match('/C:\/Data_Updates\/([^\.]+)/',$file,$matches);
                $fileType = $matches[1];
                $tableName = $fileToDBArr[$fileType];


                // lets get the first row as a field list
                $fp = fopen($file,'r');
                //$fieldList = str_replace('"','',fgets($fp));

                // counters to track status
                $count = 0;
                $startPoint = 0;

                // see if continuation, set startPoint to last row imported from file
                $query = "SELECT max(import_line) as maxline FROM $tableName WHERE import_file = '" . addslashes($file) . "'";
                $result = $this->db->query($query);

                foreach($result->result() as $row) $startPoint = $row->maxline+1; // set the startPoint if this is continuation

                file_put_contents($logFile,"\nstartPoint $startPoint",FILE_APPEND); // log      

                // loop thru file lines
                while (!feof($fp)) {
                    $line = fgets($fp);
                    // reformat those pesky dates from m/d/y to y-m-d
                    $line = preg_replace('/, ?(\d{1,2})\/(\d{1,2})\/(\d{4})/',',${3}-${1}-${2}',$line);

                    if(!$count){
                        // header row - set aside to use for column headers on insert statements
                        $fieldList = str_replace('"','',$line);
                        file_put_contents($logFile,"\nHeaders: $fieldList",FILE_APPEND); // log
                    } elseif($count >= $startPoint && trim($line)) {

                        // data row - insert into DB
                        $lineArr = str_getcsv($line); // turn this CSV line into an array
                        // build the insert query
                        $query = "INSERT INTO $tableName ($fieldList,import_date,import_file,import_line)
                        VALUES (";
                        foreach($lineArr as $k=>$v) $query .= ($v !== '') ? "'".addslashes(utf8_encode($v))."'," : " NULL,";
                        $query .= "now(),'" . addslashes($file). "',$count)
                        ON DUPLICATE KEY UPDATE ";
                        foreach(explode(',',$fieldList) as $k=>$v) $query .= "\n$v=" . (($lineArr[$k] !== '') ? "\"" . addslashes(utf8_encode($lineArr[$k])) . "\"" : "NULL") . ", ";
                        $query .= "import_date = now(),import_file='" . addslashes($file) . "',import_line = $count ";


                        if(!$this->db->query($query)) {
                            trigger_error('db error ' . $this->db->_error_number() . ' ' . $this->db->_error_message());
                            $status = 'error ';
                        } else {
                            $status = 'success ';   
                        };

                        file_put_contents($logFile,"row: $count status: $status data: $line",FILE_APPEND); // log

                    } else {
                        // skipped - this row was already imported from this file
                        // removed log to speed up
                        file_put_contents($logFile,"row: $count status: SKIPPED data: $line",FILE_APPEND); // log
                    }; // if $count
                    $count++;
                }; // while $fp
                fclose($fp);

                // file complete - move file to archive
                rename($file,str_replace('Data_Updates/','Data_Updates/archive/',$file));
                file_put_contents($logFile,"-- END --",FILE_APPEND); // log
            }; // each $fileArr

        }; // each $globArr

    } // end import function
} // end class 

?>
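As an aside, the date-rewriting regex on line 97 can be sanity-checked in isolation. This is a hypothetical standalone snippet (the sample input line is made up, not from my actual data files):

```php
<?php
// Standalone check of the date-rewriting regex used in the import loop:
// it converts ",m/d/yyyy" CSV fields into ",yyyy-m-d" so the database
// will accept them as dates.
$line = '"Acme Corp", 5/23/2014,100.00';
$out  = preg_replace('/, ?(\d{1,2})\/(\d{1,2})\/(\d{4})/', ',${3}-${1}-${2}', $line);
echo $out; // "Acme Corp",2014-5-23,100.00
```

Note the rewritten month and day are not zero-padded; MySQL tolerates `2014-5-23`, but stricter databases may not.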

Any help would be appreciated!

******** EDIT


Based on recommendations from several folks, I have made some changes. These changes only affect the "data row - insert into DB" section of the loop logic. You can see the added logging of memory_get_peak_usage(), plus calls to unset() and clearstatcache(). Below the code is some of the log data:

                        file_put_contents($logFile,memory_get_peak_usage() . " line 1 \n\r",FILE_APPEND); 
                        // data row - insert into DB
                        if(isset($lineArr)) unset($lineArr); 
                        file_put_contents($logFile,memory_get_peak_usage() . " line 1.1 \n\r",FILE_APPEND);
                        $lineArr = str_getcsv($line); // turn this CSV line into an array
                        // build the insert query
                        file_put_contents($logFile,memory_get_peak_usage() . " line 2 lineArr size: " . strlen(implode(',',$lineArr)) . "\n\r",FILE_APPEND);
                        if(isset($query)) unset($query);  
                        file_put_contents($logFile,memory_get_peak_usage() . " line 2.1 lineArr size: " . strlen(implode(',',$lineArr)) . "\n\r",FILE_APPEND);
                        $query = "INSERT INTO $tableName ($fieldList,import_date,import_file,import_line)
                        VALUES (";
                        file_put_contents($logFile,memory_get_peak_usage() . " line 2.2 lineArr size: " . strlen(implode(',',$lineArr)) . "\n\r",FILE_APPEND);
                        foreach($lineArr as $k=>$v) $query .= ($v !== '') ? "'".addslashes(utf8_encode($v))."'," : " NULL,";
                        $query .= "now(),'" . addslashes($file). "',$count)
                        ON DUPLICATE KEY UPDATE ";
                        file_put_contents($logFile,memory_get_peak_usage() . " line 2.3 lineArr size: " . strlen(implode(',',$lineArr)) . "\n\r",FILE_APPEND);

                        foreach(explode(',',$fieldList) as $k=>$v) $query .= "\n$v=" . (($lineArr[$k] !== '') ? "\"" . addslashes(utf8_encode($lineArr[$k])) . "\"" : "NULL") . ", ";
                        file_put_contents($logFile,memory_get_peak_usage() . " line 2.4 lineArr size: " . strlen(implode(',',$lineArr)) . "\n\r",FILE_APPEND);
                        $query .= "import_date = now(),import_file='" . addslashes($file) . "',import_line = $count ";
                        file_put_contents($logFile,memory_get_peak_usage() . " line 3 query size: " . strlen($query) . "\n\r",FILE_APPEND);

                        if(!$this->db->query($query)) {
                            trigger_error('db error ' . $this->db->_error_number() . ' ' . $this->db->_error_message());
                            $status = 'error ';
                        } else {
                            $status = 'success ';   
                        };

                        clearstatcache();

Log data (the leftmost number is the result of the memory_get_peak_usage() call):

2724960 line 1.1 
2724960 line 2 lineArr size: 194
2724960 line 2.1 lineArr size: 194
2724960 line 2.2 lineArr size: 194
2724960 line 2.3 lineArr size: 194
2727392 line 2.4 lineArr size: 194
2727392 line 3 query size: 2346

2727392 line 1 
2727392 line 1.1 
2727392 line 2 lineArr size: 194
2727392 line 2.1 lineArr size: 194
2727392 line 2.2 lineArr size: 194
2727392 line 2.3 lineArr size: 194
2729944 line 2.4 lineArr size: 194
2729944 line 3 query size: 2346

2729944 line 1 
2729944 line 1.1 
2729944 line 2 lineArr size: 194
2729944 line 2.1 lineArr size: 194
2729944 line 2.2 lineArr size: 194
2729944 line 2.3 lineArr size: 194
2732448 line 2.4 lineArr size: 194
2732448 line 3 query size: 2346

2732448 line 1.1 
2732448 line 2 lineArr size: 194
2732448 line 2.1 lineArr size: 194
2732448 line 2.2 lineArr size: 194
2732448 line 2.3 lineArr size: 194
2735088 line 2.4 lineArr size: 194
2735088 line 3 query size: 2346

NOTE that the memory is still growing between line 2.3 and line 2.4, which corresponds to the following line of code:

foreach(explode(',',$fieldList) as $k=>$v) $query .= "\n$v=" . (($lineArr[$k] !== '') ? "\"" . addslashes(utf8_encode($lineArr[$k])) . "\"" : "NULL") . ", ";

Any ideas?

asked May 23 '14 by mwex501
1 Answer

Found the answer:

$this->load->database(); // init db connection, already in code
$this->db->save_queries = false; // ADD THIS LINE TO SOLVE ISSUE

This is a lovely under-documented setting in CodeIgniter. CI saves every query it runs by default, including data associated with insert/update queries. With the massive number of inserts run during this import process, that saved-query log grew until it exhausted memory. Telling CI not to save queries solved the problem.
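The leak pattern is easy to reproduce outside CodeIgniter. This is a minimal simulation of the behavior, not CI's actual source (the class and property names here are invented for illustration, except `save_queries`, which mirrors the CI flag):

```php
<?php
// Minimal simulation of a query-log leak (hypothetical class, not CI source).
// A driver that appends every statement to an internal array grows without
// bound over a large import, exactly the failure mode described above.
class FakeDriver {
    public $save_queries = true;
    private $queries = array(); // grows forever while save_queries is true

    public function query($sql) {
        if ($this->save_queries) {
            $this->queries[] = $sql; // every statement string is retained
        }
        // ... actually execute $sql against the DB here ...
        return true;
    }

    public function retainedCount() {
        return count($this->queries);
    }
}

$db = new FakeDriver();
$db->save_queries = false; // the fix: stop retaining query strings
for ($i = 0; $i < 100000; $i++) {
    $db->query("INSERT INTO t VALUES ($i)");
}
echo $db->retainedCount(); // 0 — nothing retained, memory stays flat
```

With `save_queries` left at its default of `true`, the same loop would retain 100,000 query strings.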

What threw me off was that memory_get_peak_usage() was reporting the memory usage increasing before the insert query was run, not during it (PHP bug?).
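In hindsight this is probably not a PHP bug: memory_get_peak_usage() reports a high-water mark, not current usage, so the increase shows up at whichever statement first pushes allocation past the previous peak (here the string concatenation), even though the retained memory belonged to the query log. A quick standalone demonstration:

```php
<?php
// memory_get_peak_usage() is a high-water mark, not current usage.
$before = memory_get_peak_usage();

$s = str_repeat('x', 1024 * 1024); // ~1 MB temporary string raises the peak
$peakAfterAlloc = memory_get_peak_usage();

unset($s); // current usage drops...
$peakAfterFree = memory_get_peak_usage(); // ...but the reported peak does not

echo ($peakAfterAlloc > $before) ? "peak rose\n" : "";
echo ($peakAfterFree >= $peakAfterAlloc) ? "peak never falls\n" : "";
```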

As a final reality check, I removed all the other optimization recommendations (unset, clearstatcache, etc) and verified that they had no positive impact on the memory issue.

answered Oct 12 '22 by mwex501