 

Using PHP & MySQL, how do I free memory?

Requirement:

We have two similar tables on two servers. Table1 has unique key columns A, B, C, and we are inserting Table1 rows into Table2, which has unique key columns B, C, D.

Table1 has approximately 5 million rows, and only about 0.3 million rows will actually be inserted into Table2 because of its different unique key constraints.

The requirement is to fetch all rows from Table1 and insert each one into Table2 if no matching record exists there; when a record matches, increment the count and update the 'cron_modified_date' column in Table2.

This setup runs PHP 5.5 and MySQL 5.7, and the DB server has 6 GB of RAM.

While executing the script below, processing gets very slow after about 2 million records. RAM is not freed, after some time the script has consumed all of it, and then the script stops processing altogether.

As you can see, I am resetting the variables and closing the DB connection as well, but that is not freeing up the DB server's RAM. After some reading I learned that PHP garbage collection may need to be invoked manually to free up resources, but that is not freeing the RAM either.

What am I doing wrong here, and how do I process millions of records using PHP and MySQL?

Is there any other way to free up RAM while the script is executing, so that the script can complete?
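For reference, here is a minimal sketch of how the memory growth per batch can be tracked (memory_get_usage() and memory_get_peak_usage() are built-in PHP functions; the loop is a simplified stand-in for the full script below):

// Log PHP-side memory after each batch to see whether the growth is
// in the PHP process or on the DB server.
$recordsPerIteration = 50000;
for ($batch = 0; $batch < 10; $batch++) {
    // ... fetch and insert one batch of $recordsPerIteration rows here ...
    printf("batch %d: current %.1f MB, peak %.1f MB\n",
        $batch,
        memory_get_usage(true) / 1048576,
        memory_get_peak_usage(true) / 1048576);
    gc_collect_cycles(); // frees PHP cycles only; does nothing for the MySQL server
}

Here is the script: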

/* Fetch records count for batch insert */

$queryCount = "SELECT count(*) as totalRecords FROM TABLE1 WHERE created_date >= '2018-02-10'";
$rowsCount = $GLOBALS['db']->execRaw($queryCount)->fetchAll();

$recordsPerIteration = 50000;
$totalCount = $rowsCount[0]['totalRecords'];
$start = 0;

gc_disable();
if ($totalCount > 0) {
    while ($totalCount > 0) {
        $query = "SELECT * FROM TABLE1
                  WHERE created_date >= '2018-02-10'
                  ORDER BY suggestion_id DESC
                  LIMIT ".$start.",".$recordsPerIteration;

        print "sql is $query";

        $getAllRows = $GLOBALS['db']->execRaw($query)->fetchAll();
        $GLOBALS['db']->queryString = null;
        $GLOBALS['db']->close();

        foreach ($getAllRows as $getRow) {
            $insertRow = " INSERT INTO TABLE2 (
                               Name,
                               Company,
                               ProductName,
                               Status,
                               cron_modified_date)
                           VALUES (
                               ".$GLOBALS['db_ab']->quote($getRow['Name']).",
                               ".$GLOBALS['db_ab']->quote($getRow['Company']).",
                               ".$GLOBALS['db_ab']->quote($getRow['ProductName']).",
                               ".$getRow['Status'].",
                               ".$GLOBALS['db_ab']->quote($getRow['created_date'])."
                           )
                           ON DUPLICATE KEY UPDATE count = (count + 1), cron_modified_date = '".$getRow['created_date']."'";

            $GLOBALS['db_ab']->execRaw($insertRow);
            $GLOBALS['db_ab']->queryString = null;
            $getRow = null;
            $insertRow = null;
            $GLOBALS['db_ab']->close();
        }
        gc_enable();
        $totalCount = $totalCount - $recordsPerIteration;
        $start += $recordsPerIteration;
        $getAllRows = null;
        gc_collect_cycles();
    }
}

Solution


After the suggestions provided by @ABelikov and a few hit-and-trial methods... finally, the code below works perfectly fine and frees up the RAM after every 50K-record insert.

Below are the major findings:

  • Free up the DB connection variables after every major operation that involves large amounts of data, and reconnect the DB so that the DB buffers are flushed out (see the sketch after this list).
  • Club the insert statements together and execute them in one go. Don't execute single-record inserts in a loop.
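
A minimal sketch of the reconnect pattern from the first point, assuming plain PDO in place of the db wrapper used in the scripts (the DSN and credentials are placeholders):

    // Drop the old connection and open a fresh one between batches so
    // the client-side buffers are released along with it.
    function reconnect() {
        // Placeholder DSN/credentials, not the real config
        $pdo = new PDO('mysql:host=localhost;dbname=test', 'dbuser', 'dbpass');
        $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        return $pdo;
    }

    $db = reconnect();
    // ... fetch one batch with $db ...
    $db = null;        // drop the connection and its buffers
    $db = reconnect(); // fresh connection for the next batch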

Thanks guys for the valuable suggestions and for helping out.

    /* Fetch records count for batch insert */

    $queryCount = "SELECT count(*) as totalRecords FROM TABLE1 WHERE created_date >= '2018-02-10'";
    $rowsCount = $GLOBALS['db']->execRaw($queryCount)->fetchAll();

    $recordsPerIteration = 50000;
    $totalCount = $rowsCount[0]['totalRecords'];
    $start = 0;

    if ($totalCount > 0) {
        while ($totalCount > 0) {
            $query = "SELECT * FROM TABLE1
                      WHERE created_date >= '2018-02-10'
                      ORDER BY suggestion_id DESC
                      LIMIT ".$start.",".$recordsPerIteration;

            print "sql is $query";

            $getAllRows = $GLOBALS['db']->execRaw($query)->fetchAll();
            $GLOBALS['db']->queryString = null;
            $GLOBALS['db']->close();

            $insertRow = " INSERT INTO TABLE2 (
                               Name,
                               Company,
                               ProductName,
                               Status,
                               cron_modified_date)
                           VALUES ";

            foreach ($getAllRows as $getRow) {
                $insertRow .= "(".$GLOBALS['db_ab']->quote($getRow['Name']).",
                                ".$GLOBALS['db_ab']->quote($getRow['Company']).",
                                ".$GLOBALS['db_ab']->quote($getRow['ProductName']).",
                                ".$getRow['Status'].",
                                ".$GLOBALS['db_ab']->quote($getRow['created_date'])."),";
            }

            $insertRow = rtrim($insertRow, ','); // Remove last ','
            // $getRow still holds the last fetched row, so its created_date
            // is applied to every duplicate in this batch
            $insertRow .= " ON DUPLICATE KEY UPDATE count = (count + 1), cron_modified_date = '".$getRow['created_date']."'";

            $GLOBALS['db_ab']->execRaw($insertRow);

            // Flushing all data to free up RAM (the db wrappers are assumed
            // to reconnect on next use at the top of the loop)
            $GLOBALS['db_ab'] = null;
            $GLOBALS['db'] = null;
            $insertRow = null;

            $totalCount = $totalCount - $recordsPerIteration;
            $start += $recordsPerIteration;
            $getAllRows = null;
            print "\nRecords left to process: ".$totalCount."\n";
        }
    }
    
Asked Nov 08 '22 by EternalSunShine

1 Answer

1. Insert multiple rows solution

You can speed up your script by using multiple-row INSERT statements; see https://dev.mysql.com/doc/refman/5.5/en/insert.html

INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);

You need to keep only the VALUES part inside your foreach and move everything else out:

 $insertRow = " INSERT INTO TABLE2 (
                    Name,
                    Company,
                    ProductName,
                    Status,
                    cron_modified_date) VALUES ";
 foreach ($getAllRows as $getRow) {
     $insertRow .= "(".$GLOBALS['db_ab']->quote($getRow['Name']).",
                     ".$GLOBALS['db_ab']->quote($getRow['Company']).",
                     ".$GLOBALS['db_ab']->quote($getRow['ProductName']).",
                     ".$getRow['Status'].",
                     ".$GLOBALS['db_ab']->quote($getRow['created_date'])."),";
 }
 $insertRow = rtrim($insertRow, ','); // Remove last ','
 // $getRow here is the last row from the loop, so its created_date is
 // applied to every duplicate (see the note after this section)
 $insertRow .= " ON DUPLICATE KEY UPDATE count = (count + 1), cron_modified_date = '".$getRow['created_date']."'";
 $GLOBALS['db_ab']->execRaw($insertRow);
 $GLOBALS['db_ab']->queryString = null;
 $getRow = null;
 $insertRow = null;
 $GLOBALS['db_ab']->close();

That will help only if your foreach body usually runs more than once.
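
One subtlety in the batched version above (and in the accepted solution): because $getRow['created_date'] is interpolated after the loop, every duplicate row gets the last fetched row's date. If each duplicate should keep its own date, MySQL's VALUES() function refers to the value that particular row would have inserted, so the final clause could be built like this instead (a variant, not part of the original answer):

 // VALUES(cron_modified_date) evaluates, for each conflicting row, to
 // the value that row would have inserted, keeping per-row dates
 $insertRow .= " ON DUPLICATE KEY UPDATE count = (count + 1),
                 cron_modified_date = VALUES(cron_modified_date)";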

2. MySQL server-side solution

Try to use a TRANSACTION: https://dev.mysql.com/doc/refman/5.7/en/commit.html and http://php.net/manual/en/pdo.begintransaction.php

Just begin one at the start of the script and commit at the end. Depending on your server, it can really help. But be careful! It depends on your MySQL server's configuration settings, so it needs testing.
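
A minimal sketch of that idea, assuming a plain PDO handle in place of the db_ab wrapper used above (DSN and credentials are placeholders):

 // Wrap a batch of inserts in one transaction so the server commits
 // once per batch instead of once per statement.
 $pdo = new PDO('mysql:host=localhost;dbname=test', 'dbuser', 'dbpass');
 $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

 $pdo->beginTransaction();
 try {
     // ... run the batched INSERT ... ON DUPLICATE KEY UPDATE here ...
     $pdo->commit();
 } catch (Exception $e) {
     $pdo->rollBack(); // undo the whole batch on failure
     throw $e;
 }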

Answered Nov 14 '22 by ABelikov