
Using a distributed processing system for FOREGROUND requests in PHP

I'm familiar with php-resque and other job-processing systems for handling background jobs, but I don't think it'll do what I need.

In this case, I have an incoming web service request that needs to perform multiple (2-4) independent callouts to external systems and return a consolidated response to the client. Each callout might take 300-500ms, so I want them performed in parallel so that the entire process takes roughly 500ms total instead of the sum of the individual calls.

My problem with php-resque and similar systems is that waiting even one second before those callouts start issuing is too long, so I'm considering another approach.

What I'm thinking:

  1. Each individual callout is described and stored in a database with a given unique request ID.
  2. We kick off the jobs immediately as asynchronous PHP processes (aka "worker processes").
  3. Each worker writes its result back to the job record and indicates that it's complete.
  4. Meanwhile, we poll the job table every 50-100ms to check on the status of each job.
  5. When each is complete, we parse the results as necessary and return the response.

Of course, we'd implement a timeout for each request and the overall process...
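
To make the plan concrete, here's a rough sketch of the dispatch-and-poll side. Everything specific in it is a placeholder: the `jobs` table, the worker.php script, and the IDs are all hypothetical.

```php
<?php
// Rough sketch only. The `jobs` table (request_id, status, result),
// worker.php, and the IDs below are all hypothetical placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$requestId = uniqid('req_', true);
$jobIds = [101, 102, 103]; // one row per callout, pre-inserted for $requestId

// Step 2: kick off each job as a detached background PHP process.
foreach ($jobIds as $jobId) {
    exec('php worker.php ' . escapeshellarg((string) $jobId) . ' > /dev/null 2>&1 &');
}

// Step 4: poll the job table every 50-100ms, bounded by an overall timeout.
$deadline = microtime(true) + 2.0;
do {
    $stmt = $pdo->prepare(
        "SELECT COUNT(*) FROM jobs WHERE request_id = ? AND status <> 'complete'"
    );
    $stmt->execute([$requestId]);
    if ((int) $stmt->fetchColumn() === 0) {
        break; // every callout has written its result back
    }
    usleep(75000); // ~75ms between polls
} while (microtime(true) < $deadline);

// Step 5: collect whatever results arrived and build the response.
$stmt = $pdo->prepare('SELECT result FROM jobs WHERE request_id = ?');
$stmt->execute([$requestId]);
$results = $stmt->fetchAll(PDO::FETCH_COLUMN);
```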

Thoughts? Am I wrong? Could php-resque kick off multiple jobs in parallel virtually instantly?

asked Oct 04 '12 by rhuff

2 Answers

Your plan should work, but I think you could avoid all the database communication and even polling by using the PHP Process Control functions.

  1. Fork your main process once for each task you need to run in parallel. See: pcntl_fork
  2. Perform your tasks in those forked processes and let them exit normally.
  3. The parent process that initiated the tasks should wait for them all to complete by listening for their SIGCHLD signals as they exit; if any are still running after your chosen timeout, send them the SIGTERM signal to clean up. See: pcntl_sigtimedwait and posix_kill

You will have to use these functions in a PHP CLI script, though, because...

From the manual (http://www.php.net/manual/en/intro.pcntl.php):

Process Control should not be enabled within a web server environment and unexpected results may happen if any Process Control functions are used within a web server environment.

But your web server could easily exec() your CLI script, which will do all the hard work, return the status of those tasks, etc.
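
For illustration, here is a minimal CLI sketch along those lines. The do_callout() helper and the task names are hypothetical stand-ins for your external calls, and the timeout loop uses a non-blocking pcntl_waitpid() rather than pcntl_sigtimedwait(), which achieves the same effect:

```php
<?php
// parallel.php — run via the CLI only; pcntl is unsafe under a web SAPI.
// do_callout() is a hypothetical stand-in for one external call.
$tasks = ['serviceA', 'serviceB', 'serviceC'];
$children = [];

foreach ($tasks as $i => $task) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        fwrite(STDERR, "fork failed\n");
        exit(1);
    }
    if ($pid === 0) {
        // Child: do the work, persist the result, exit normally.
        file_put_contents("/tmp/result_$i", do_callout($task));
        exit(0);
    }
    $children[$pid] = $task; // parent tracks child PIDs
}

// Parent: reap children as they finish, up to a 1-second deadline.
$deadline = microtime(true) + 1.0;
while ($children && microtime(true) < $deadline) {
    $pid = pcntl_waitpid(-1, $status, WNOHANG);
    if ($pid > 0) {
        unset($children[$pid]); // this child completed in time
    } else {
        usleep(10000); // nothing exited yet; check again in 10ms
    }
}

// Anything still running has hit the timeout: terminate and reap it.
foreach ($children as $pid => $task) {
    posix_kill($pid, SIGTERM);
    pcntl_waitpid($pid, $status);
}
```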

answered Oct 26 '22 by jimp


If your external calls are simply HTTP requests, you could use curl's multi interface (the curl_multi_* functions) to issue them all in parallel. It seems to do exactly what you need.
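
A minimal sketch of that approach, with placeholder URLs and an assumed per-request timeout:

```php
<?php
// Run several HTTP callouts concurrently with curl's multi interface.
$urls = ['https://api.example.com/a', 'https://api.example.com/b']; // placeholders

$mh = curl_multi_init();
$handles = [];
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT_MS, 800); // per-request timeout
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

// Drive all transfers until every one has finished (or timed out).
do {
    $status = curl_multi_exec($mh, $active);
    if ($active) {
        curl_multi_select($mh, 0.1); // wait up to 100ms for socket activity
    }
} while ($active && $status === CURLM_OK);

$results = [];
foreach ($handles as $ch) {
    $results[] = curl_multi_getcontent($ch);
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);
```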

If it's something different, I can highly recommend Gearman.
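
For reference, a minimal client-side sketch with Gearman, assuming the pecl/gearman extension is installed and workers are already registered for a hypothetical "callout" function:

```php
<?php
// Queue several tasks and run them in parallel against gearmand.
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730); // default gearmand port

$results = [];
$client->setCompleteCallback(function ($task) use (&$results) {
    $results[] = $task->data(); // collect each worker's response
});

// "callout" is a hypothetical function name registered by your workers.
$client->addTask('callout', json_encode(['target' => 'serviceA']));
$client->addTask('callout', json_encode(['target' => 'serviceB']));
$client->runTasks(); // blocks until all queued tasks complete
```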

If you want to get your hands dirty and write your own daemon, I would suggest skipping the IPC functions and going for something a bit more high-level, such as ZeroMQ, perhaps with Supervisord to restart the PHP processes if they die. It's relatively hard to write long-running PHP processes, so you have to build this with the expectation that the worker scripts will die randomly, and be prepared to handle that gracefully.

answered Oct 26 '22 by Evert