 

Queuing long running tasks in a web application

A user can perform an action on our web app that takes anywhere from 100 ms to 10 seconds. I want to respond to the browser immediately and then show the results to the user once the task has finished processing. The action syncs data from a third party and is implemented as a class library (DLL).

The usual suggestion is to use a queue such as RabbitMQ or MSMQ, with a worker that writes the results to a database; the browser then polls that database through AJAX requests to check for updates.
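For reference, the queue, worker and polling arrangement described above can be sketched roughly as follows. This is only an illustration in Python, not the .NET implementation: an in-process `queue.Queue` stands in for RabbitMQ/MSMQ, a dictionary stands in for the results table, and `sync_third_party`, `submit` and `poll` are hypothetical names.

```python
import queue
import threading
import time
import uuid

jobs = queue.Queue()            # stand-in for RabbitMQ/MSMQ (assumption)
results = {}                    # stand-in for the results table (assumption)
results_lock = threading.Lock()


def sync_third_party(payload):
    """Hypothetical placeholder for the long-running sync in the class library."""
    time.sleep(0.05)
    return f"synced {payload}"


def worker():
    """Take jobs off the queue and store each outcome where pollers can see it."""
    while True:
        job_id, payload = jobs.get()
        outcome = sync_third_party(payload)
        with results_lock:
            results[job_id] = outcome
        jobs.task_done()


def submit(payload):
    """What the web request would do: enqueue and return a job id immediately."""
    job_id = str(uuid.uuid4())
    jobs.put((job_id, payload))
    return job_id


def poll(job_id):
    """What the AJAX poll would hit: None means the job has not finished yet."""
    with results_lock:
        return results.get(job_id)


threading.Thread(target=worker, daemon=True).start()

job_id = submit("account-42")
while (status := poll(job_id)) is None:
    time.sleep(0.01)            # the browser would wait between polls
print(status)
```

The latency cost is visible in the loop at the bottom: the result is only seen on the first poll after the worker finishes, so the polling interval adds directly to the perceived delay.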

However, the aim is to keep latency as close as possible to running the task synchronously, while still handling spikes in long-running tasks without affecting the rest of the website.

How should the backend be architected? In my mind, the process would be: start the task, run it with minimal latency, notify the end user that it has finished (ASAP), and finally display the results in the browser.
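To make those four steps concrete, here is a minimal push-style sketch in Python using `asyncio`. It is an assumption-laden illustration rather than a recommended design: `sync_third_party`, `handle_action` and the `notify` callback are hypothetical, and in a real app `notify` would push over something like a WebSocket instead of appending to a list.

```python
import asyncio


async def sync_third_party(payload):
    """Hypothetical placeholder for the long-running third-party sync."""
    await asyncio.sleep(0.05)
    return f"synced {payload}"


async def handle_action(payload, notify):
    """Start the task and hand control back to the caller straight away."""

    async def run():
        result = await sync_third_party(payload)   # step 2: run the task
        await notify(result)                       # step 3: notify as soon as it is done

    return asyncio.create_task(run())              # step 1: start the task


async def main():
    received = []

    async def notify(result):
        # Step 4: the browser would render the result here (assumption).
        received.append(result)

    task = await handle_action("account-42", notify)
    accepted_before_done = not task.done()         # the caller was not blocked
    await task
    return accepted_before_done, received


accepted_before_done, received = asyncio.run(main())
print(accepted_before_done, received)
```

Because the notification is sent from inside the task itself, there is no polling interval between the task finishing and the user being told.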

Long Running Task. Credits: Haishi. Source: http://haishibai.blogspot.co.uk/2012/12/dealing-with-long-running-jobs.html

Examples

The sitemap generator at http://www.xml-sitemaps.com/ uses chunked transfer encoding to send a <script> tag every second; each tag calls a JavaScript function that updates the page with the latest status.
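As a rough idea of what that technique looks like on the wire, the sketch below frames a series of `<script>` fragments as HTTP/1.1 chunks (hex length, CRLF, payload, CRLF, with a zero-length chunk to finish). It is not that site's code: `updateStatus` is a hypothetical page function, and a real server would write each chunk to the socket as it is produced rather than joining them.

```python
import time


def progress_fragments(steps, delay=0.0):
    """Yield one <script> fragment per progress update."""
    for done in range(1, steps + 1):
        time.sleep(delay)   # a real job would do work here instead of sleeping
        # updateStatus is a hypothetical function defined by the page.
        yield f"<script>updateStatus({done}, {steps});</script>\n"


def chunk(data: bytes) -> bytes:
    """Frame one HTTP/1.1 chunk: hex length, CRLF, payload, CRLF."""
    return f"{len(data):X}".encode("ascii") + b"\r\n" + data + b"\r\n"


def chunked_body(fragments):
    """Turn text fragments into a chunked response body."""
    for fragment in fragments:
        yield chunk(fragment.encode("utf-8"))
    yield b"0\r\n\r\n"      # zero-length chunk marks the end of the body


body = b"".join(chunked_body(progress_fragments(3)))
print(body.decode("utf-8"))
```

The browser executes each `<script>` fragment as it arrives, which is how the page updates without any further requests.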

The SSL certificate checker at https://www.ssllabs.com/ssltest/ seems to refresh the whole page with an updated status.

Marcus asked Aug 18 '14 22:08





2 Answers

This situation is relatively simple, and I would not recommend polling at all.

Consider a regular Ajax approach, where one part of the page refreshes without reloading the rest. That part (the Ajax part) is synchronous on its own, but asynchronous from the whole page's point of view, because it refreshes without reloading the whole page.

So, when the information needs to be calculated, the Ajax part of the page is submitted as a regular request. When request processing is done, that part of the page has the response right away and displays the results.

The advantage is that there is no polling overhead, and the results are displayed right away (ASAP, as you asked). Also, only one request does the work, instead of several polling requests that may miss the result.

Tengiz answered Sep 22 '22 20:09



Have you considered using WF4 (Windows Workflow Foundation 4) in conjunction with SignalR?

We use WF4 to handle back-end processing and it performs quite nicely. We store requests in a job request table; the workflow engine (a service we wrote that runs WF4 in the back end) picks up each request, processes the work, and then marks the job as completed.

SignalR can then be used to inform the client that the job is complete. Scaling is relatively easy (chuckling, as I know 'easy' is always fraught with details) because you can spin up more services to process requests. Each engine marks a request as being processed so the others don't pick it up.
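The "mark it as being processed so the others don't pick it up" step is the part that tends to hide the details. One common way to do it is a conditional update that only succeeds while the row is still pending. The sketch below shows that idea in Python with an in-memory SQLite database; the `job_request` schema, the status values and the `claim_next`/`complete` helpers are assumptions for illustration, not the actual schema described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_request ("
    " id INTEGER PRIMARY KEY,"
    " payload TEXT,"
    " status TEXT NOT NULL DEFAULT 'pending',"
    " claimed_by TEXT)"
)
conn.executemany("INSERT INTO job_request (payload) VALUES (?)", [("a",), ("b",)])
conn.commit()


def claim_next(conn, engine):
    """Claim one pending job for this engine, or return None if none are left."""
    while True:
        row = conn.execute(
            "SELECT id, payload FROM job_request"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status check in the WHERE clause makes the claim conditional:
        # if another engine got there first, no row is updated.
        cur = conn.execute(
            "UPDATE job_request SET status = 'processing', claimed_by = ?"
            " WHERE id = ? AND status = 'pending'",
            (engine, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row
        # Lost the race for this row; loop and try the next pending job.


def complete(conn, job_id):
    """Mark a job as completed; a push notification to the client would follow."""
    conn.execute("UPDATE job_request SET status = 'completed' WHERE id = ?", (job_id,))
    conn.commit()


first = claim_next(conn, "engine-1")
second = claim_next(conn, "engine-2")
third = claim_next(conn, "engine-3")
complete(conn, first[0])
print(first, second, third)
```

With several engines running against a shared database, each one calls `claim_next` in a loop, so adding capacity is a matter of starting more engines.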

I've used WF4 on large-scale projects where the services were load-balanced, and we were able to get very decent throughput.

Dave answered Sep 20 '22 20:09
