Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Handle Queue race condition in PHP Symfony with MySQL Database

I've an application in Symfony that needs to send Emails/Notificatios from the App. Since the Email/Notifications sending process takes time, so I decided to put them in Queue and process the Queue periodically. Hence I can decrease the response time for the Requests involving the Email/Notification dispatch.

The Cron Job(a php script - Symfony route) to process the queue runs every 30 seconds, and checks if there are any unsent Emails/Notifications if found it gets all data from the Queue Table and starts sending them. When an Email/Notification is sent, the Queue Table row status flag is updated to show that it's sent.

Now, when there are more Emails in Queue which could take more than 30 seconds to send. Another Cron Job also start running and starts sending emails from the Queue. Hence resulting in duplicate Emails/Notifications dispatch.

My Table structure for Email Queue is as follows :

|-------------------------------------|
| id | email | body | status | sentat |
|-------------------------------------|

My Ideas to resolve this issue are as follows :

  1. Set a flag in Database that a Cron Job is running, and no other Cron Jobs should proceed if found the flag set.
  2. Update status as 'sent' for all records and then start sending Emails/Notifications.

So my question is, are there any efficient approach to process Queues? Is there any Symfony Bundle/Feature to do such specific task?

like image 798
Deepak Chaudhary Avatar asked Sep 06 '16 10:09

Deepak Chaudhary


3 Answers

So my question is, are there any efficient approach to process Queues? Is there any Symfony Bundle/Feature to do such specific task?

You can take enqueue-bundle plus doctrine dbal transport.

It already takes care of race conditions and other stuff.

like image 131
Maksim Kotlyar Avatar answered Nov 14 '22 16:11

Maksim Kotlyar


Regarding your suggestions:

  1. What if the cronjob process dies (for whatever reason) and cannot clean up the flag? A flag is not a good idea, I think. If you would like to follow this approach, you should not use a boolean, but rather either a process ID or a timestamp, so that you can check if the process is still alive or if it started a suspiciously long time ago without cleaning up.

  2. Same question: what if the process dies? You don’t want to mark the mails as sent before they are sent.

I guess I’d probably use two fields: one for marking a record as “sending in progress” (thus telling other processes to skip this record) and another one for marking it as “sending successfully completed”. I’d write a timestamp to both, so that I can (automatically or manually) find those records where the “sending in progress” is > X seconds in the past, which would be an indicator for a died process.

like image 23
BlueM Avatar answered Nov 14 '22 15:11

BlueM


You can use Database Transactions here. Rest will be handled by database locking mechanism and concurrency control. Generally whatever DML/DCL/DDL commands you are giving, are treated as isolated Transactions. In your Question, if 2nd cron job will read the rows(before 1st cron job will update it as sent) , it will find the email unsent, and try to send it again. and before 2nd cron job will update it as sent, if 3rd job will find it unsent, it will do same. So it can cause big problem for you.

whatever approach you will take, there will be Race Condition. So let the database allow to do it. there are many concurrency control methods you can refer.

BEGIN_TRANSACTION

/* Perform your actions here. N numbers of read/write */

END_TRANSACTION

Still there is one problem with this solution. You will find at one stage that, when number of read/write operation will increase, some inconsistency still remains.

Here comes isolation level of the database, It is the factor that will define how much 2 transactions are isolated from each other, and how to schedule them to run concurrently.

You can set isolation level as per your requirements. Remember that, concurrency is inversely proportional to isolation level. So analyse your Read/Write statements, figure out which level you need. Do not use higher level then that. I am giving some links, which may help you

http://www.ibm.com/developerworks/data/zones/informix/library/techarticle/db_isolevels.html

Difference between read commit and repeatable read

http://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-isolation-levels.htm

If you can post your database operations here. I can suggest you some possible isolation level

like image 1
Nikunj Acharya Avatar answered Nov 14 '22 16:11

Nikunj Acharya