Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Handle same function running and processing the same data at the same time

I have a php system that that allow customer to buy things (make an order) from our system using e-wallet (store credit).

here's the database example

**sales_order**
+--------+-------+----------+--------+--------------+-----------+
|order_id| price |product_id| status |already_refund|customer_id|
+--------+-------+----------+--------+--------------+-----------+
|   1    | 1000  |    1     |canceled|      1       |     2     |
|   2    | 2000  |    2     |pending |      0       |     2     |
|   3    | 3000  |    3     |complete|      0       |     1     | 
+--------+-------+----------+--------+--------------+-----------+

**ewallet**
+-----------+-------+
|customer_id|balance|
+-----------+-------+
|     1     | 43200 |
|     2     | 22500 |
|     3     | 78400 |
+-----------+-------+

table sales_order contain the order that customer made, the column already_refund is for a flag that canceled order already refunded.

I'm running a cron every 5 minutes to check if order with status pending can be canceled and after that it can refund the money to the customer ewallet

function checkPendingOrders(){
   $orders = $this->orderCollection->filter(['status'=>'pending']);
   foreach($orders as $order){
     //check if order is ready to be canceled
     $isCanceled = $this->isCanceled($order->getId());
     if($isCanceled === false) continue;
     if($order->getAlreadyRefund() == '0'){ // check if already refund
       $order->setAlredyRefund('1')->save();
       $this->refund($order->getId()); //refund the money to customer ewallet
     }
     $order->setStatus('canceled')->save();
   }
}

The problem the 2 different cron schedule can process the same data at the same time using this function and it will make the refund process can be called twice , so the customer will receive double refund amount. How can i handle this kind of problem, when a 2 same function running at the same time to process same data ? the if clause that i made can't handle this kind of issue

update

i've tried to use microtime in session as validation and lock the table row in MySQL, so at the beginning i set the variable to contain the microtime , than when i stored in a unique session generated by order_id , and then i add a condition to match the microtime value with the session before locking the Table Row and update my ewallet table

function checkPendingOrders(){
   $orders = $this->orderCollection->filter(['status'=>'pending']);
   foreach($orders as $order){
     //assign unique microtime to session
     $mt = round(microtime(true) * 1000);
     if(!isset($_SESSION['cancel'.$order->getId()])) $_SESSION['cancel'.$order->getId()] = $mt;
     //check if order is ready to be canceled
     $isCanceled = $this->isCanceled($order->getId());
     if($isCanceled === false) continue;
     if($order->getAlreadyRefund() == '0'){ // check if already refund
       $order->setAlreadyRefund('1')->save();
       //check if microtime is the same as the first one that running
       if($_SESSION['cancel'.$order->getId()] == $mt){
        //update using lock row
        $this->_dbConnection->beginTransaction(); 
        $sqlRaws[] =  "SELECT * FROM ewallet WHERE customer_id = ".$order->getCustomerId()." FOR UPDATE;";
        $sqlRaws[] =  "UPDATE ewallet SET balance =(balance+".$order->getPrice().") WHERE customer_id = ".$order->getCustomerId().";";
        foreach ($sqlRaws as $sqlRaw) {
          $this->_dbConnection->query($sqlRaw);
        }
        $this->_dbConnection->commit(); 

       }
     }
     unset($_SESSION['cancel'.$order->getId()]);
     $order->setStatus('canceled')->save();
   }
}

but the problem still persist when i'm doing a strees test, because there is a case when the same function process the same data at the same microtime and start mysql transaction at the same exact time

like image 969
Hunter Avatar asked Aug 21 '19 07:08

Hunter


3 Answers

@Rick James Answer is great as always, he just didn't tell you which data you need to lock.

First just let me comment on what you said

but the problem still persist when i'm doing a strees test,

Concurrency-aware applications are not tested by stress tests only because you are not controlling what is going to happen and you might be unlucky and the test results in the good results, while you still have a sneaky bug in your application - and trust me concurrency bugs are the worst :( -

You need to open 2 clients(DB sessions) and simulate the race condition by your hand, opening 2 connections in MySQL workbench is enough.

Let's do it, open 2 connections in your client (MySQL Workbench or phpMyAdmin) and execute these statements in this order, think of them as your PHP script running at the same time.

**sales_order**
+--------+-------+----------+--------+--------------+-----------+
|order_id| price |product_id| status |already_refund|customer_id|
+--------+-------+----------+--------+--------------+-----------+
|   1    | 1000  |    1     |canceled|      1       |     2     |
|   2    | 2000  |    2     |pending |      0       |     2     |
|   3    | 3000  |    3     |complete|      0       |     1     | 
+--------+-------+----------+--------+--------------+-----------+


(SESSION 1) > select * from sales_order where status = 'pending';
-- result 1 row (order_id 2)
(SESSION 2) > select * from sales_order where status = 'pending';
-- result 1 row (order_id 2)
/*
 >> BUG: Both sessions are reading that order 2 is pending and already_refund is 0

 your session 1 script is going to see that this guy needs to cancel
 and his already_refund column is 0 so it will increase his wallet with 2000
*/
(SESSION 1) > update sales_order set  status = 'canceled' , already_refund = 1
              where  order_id = 2
(SESSION 1) > update ewallet set balance = balance + 2000 where customer_id = 2
/*
 same with your session 2 script : it is going to see that this guy needs
 to cancel and his already_refund column is 0 so it will increase his 
 wallet with 2000
*/
(SESSION 2) > update sales_order set  status = 'canceled' , already_refund = 1
              where  order_id = 2
(SESSION 2) > update ewallet set balance = balance + 2000 where customer_id = 2

Now customer 2 will be happy because of this, and this case is what you asked the question for (imagine if 5 sessions could read the order before it is already_refund is updated to 1 by one of them, customer 2 will be super happy as he is getting 5 * 2000 )

me: Now take your time and think of this scenario, how do you think you can protect yourself from this ? .. ?

you: Locking as @Rick said

me: exactly!

you: ok, now I will go and lock the ewallet table

me : Noo, you need to lock sales_order so SESSION 2 can't read the data until SESSION1 finishes it's work, now let's change the scenario by applying the lock.

(SESSION 1) > START TRANSACTION;
-- MySQL > OK;
(SESSION 2) > START TRANSACTION;
-- MySQL > OK;
(SESSION 1) > select * from sales_order where status = 'pending' FOR UPDATE;
-- MySQL > OK result 1 row (order_id 2)
(SESSION 2) > select * from sales_order where status = 'pending' FOR UPDATE;
-- MySQL > WAAAAAAAAAAAAAAAIT ...... THE DATA IS LOCKED
/*
 now session 2 is waiting for the result of the select query .....

 and session 1 is going to see that this guy needs to cancel and his
 already_refund column is 0 so it will increase his  wallet with 2000
*/
(SESSION 1) > update sales_order set  status = 'canceled' , already_refund = 1
          where  order_id = 2
(SESSION 1) > update ewallet set balance = balance + 2000 where customer_id = 2;
(SESSION 2) >  :/  I am still waiting for the result of the select .....
(SESSION 1) > COMMIT;
-- MySQL > OK , now I will release the lock so any other session can read the data
-- MySQL > I will now execute the select statement of session 2
-- MySQL > the result of the select statement of session 2 is 0 rows
(SESSION 2) >  /* 0 rows ! no pending orders ! 
               Ok just end the transaction, there is nothing to do*/

Now you are happy not customer 2!

Note1:

SELECT * from sales_order where status = 'pending' FOR UPDATE applied in this code might not lock only pending orders as it uses a search condition on status column and not using a unique index

The MySQL manual stated

For locking reads (SELECT with FOR UPDATE or FOR SHARE), UPDATE, and DELETE statements, the locks that are taken depend on whether the statement uses a unique index with a unique search condition, or a range-type search condition.
.......

For other search conditions, and for non-unique indexes, InnoDB locks the index range scanned ...

(and this is one of the most things I hate about MySQL. I wish I lock only the rows returned by the select statement :( )

Note2

I don't know about your application, but if this cron mission is only to cancel the pending orders, then get rid of it and just start the cancellation process when the user cancels his order.

Also if the already_refund column is always updated to 1 along with the status column is updated to canceled then "a canceled order means he is also refunded", and get rid of the already_refund column, extra data = extra work and extra problems


MySQL documentation examples of locking reads scroll down to "Locking Read Examples"

like image 90
Accountant م Avatar answered Oct 09 '22 03:10

Accountant م


If the tables are not already ENGINE=InnoDB, switch tables to InnoDB. See http://mysql.rjweb.org/doc.php/myisam2innodb

Wrap any sequence of operations that needs to be 'atomic' in a "transaction":

START TRANSACTION;
...
COMMIT;

If you have supporting SELECTs in the transaction add FOR UPDATE:

SELECT ... FOR UPDATE;

this blocks other connections.

Check for errors after every SQL statement. If you get a "deadlock" of "wait timeout", the start the transaction over.

Rip out all the "microtime", LOCK TABLES, etc.

The clasic example of a "deadlock" is when one connection grabs two rows and another connection grabs the same rows, but in the opposite order. One of the transactions will be aborted by InnoDB, and anything it has done (inside the transaction) will be undone.

Another thing that can occur is when both connections grab the same rows in the same order. One continues running to completion, while the other is blocked until that completion. There is a default timeout of a generous 50 seconds before an error is given. Normally both go to completion (one after the other) and you are none the wiser.

like image 29
Rick James Avatar answered Oct 09 '22 02:10

Rick James


The microtime idea will add complexity to your code. The $order->getAlreadyRefund() could be getting a value from memory, so it is not a reliable source of truth.

However you can rely on a single update with the conditions that it only updates if the status is still 'pending' and already_refund is still 0. You will have an SQL statement like this:

UPDATE
  sales_order
SET
  status = 'canceled',
  already_refund = %d
where
  order_id = 1
  and status = 'pending'
  and already_refund = 0;

You just need to write a method for your model that will execute the above SQL called setCancelRefund() and you might have something simplier like this:

<?php

function checkPendingOrders() {
   $orders = $this->orderCollection->filter(['status'=>'pending']);

   foreach($orders as $order) {
     //check if order is ready to be canceled
     $isCanceled = $this->isCanceled($order->getId());
     if ($isCanceled === false) {
        continue;
     }

     if ($order->getAlreadyRefund() == '0') { // check if already refund

        // Your new method should do the following
        // UPDATE sales_order SET status = 'canceled', already_refund = 1 where order_id = %d and status = 'pending' and already_refund = 0; 
        $affected_rows = $order->setCancelRefund();        

        if ($affected_rows == 0) {
            continue;
        }

        $this->refund($order->getId()); //refund the money to customer ewallet
     }

   }
}
like image 29
jasonwubz Avatar answered Oct 09 '22 02:10

jasonwubz