
Run long-running Express API process in separate thread in Node.js

I have an API call that takes about 5-10 minutes to process. I wrapped it in a setTimeout so that the API responds immediately with a status of queued.

Simple visual below

const doWork = (object) => { /* ... takes 5 minutes ... */ };

app.post('/longProcess', (req, res) => {
    setTimeout(() => doWork(req.body), 1000);
    res.send({ status: 'queued' });
});

This works for the first request, which gets an immediate response, but a second request is blocked until doWork finishes.

Instead of using setTimeout, what I'd really like to do is send the long process to a separate single thread that queues and processes these requests one by one.

Any suggestions?

Proximo asked Oct 17 '16

1 Answer

The problem

The problem is not that doWork() takes a lot of time, but that it apparently blocks your thread for its entire lifetime and doesn't give the event loop any chance to run.

Probable causes

This can be caused by several things, and I can only guess here since you didn't show the source of doWork() or even describe what it does and how. For example:

  • Your doWork() may use blocking operations like fs.readFileSync() or other functions with Sync in their name.
  • Your doWork() may have a for or while loop that spins for 5-10 minutes and blocks the event loop while doing so.
  • Your code does some serious number crunching that is not divided into steps that would let the event loop run in between.
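
To make that concrete, here is a purely hypothetical doWork() (the real one wasn't shown) that would reproduce the symptom exactly: a synchronous loop that holds the thread for its whole duration and never yields to the event loop, so no other request gets served until it returns.

const doWork = (object) => {
    // Purely synchronous: nothing in here ever yields back to Node,
    // so Express cannot handle any other request until this returns.
    const end = Date.now() + 5 * 60 * 1000; // pretend this is 5 minutes of real work
    let result = 0;
    while (Date.now() < end) {
        result += Math.sqrt(result + 1);
    }
    return result;
};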

In general, doWork() could take hours to run and still not keep other connections from being served for even a millisecond, as long as it doesn't block the main thread.

Solutions

Stop blocking the thread

The simplest fix is to avoid blocking function calls (those with the Sync suffix, or your own blocking functions), long-running loops, and heavy computations that are not divided into short steps.

For example:

  • Instead of using readFileSync(), use readFile()
  • Instead of long-running for/while loops, split the work into chunks and schedule each chunk with process.nextTick() or setImmediate() (see the sketch after this list)
  • Instead of very deep recursion (even where tail-call optimization would make it viable), use loops divided into parts with process.nextTick()
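
A minimal sketch of the chunking idea referred to above (doWorkChunked, totalIterations and chunkSize are names made up for this example). Each pass does a small slice of the work and then yields with setImmediate(), which hands control back to the event loop so pending requests can be served before the next chunk runs:

const doWorkChunked = (object, done) => {
    const totalIterations = 1e9; // stand-in for the real amount of work
    const chunkSize = 1e6;       // how much to do before yielding
    let i = 0;
    let result = 0;

    const runChunk = () => {
        const stop = Math.min(i + chunkSize, totalIterations);
        for (; i < stop; i++) {
            result += Math.sqrt(i); // CPU-bound step
        }
        if (i < totalIterations) {
            setImmediate(runChunk); // yield so other requests get served
        } else {
            done(null, result);     // whole job finished
        }
    };

    runChunk();
};

The route handler can then call doWorkChunked(req.body, callback) and still respond with { status: 'queued' } right away.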

If the above solutions cannot be applied (which I have no way of telling, since I know nothing about your doWork() function), then you may take another approach. There are some other things you can do.

Spawn a process

Another solution is to use child_process to spawn a separate process every time you start the long-running task. Your main process gets notified when the child finishes its job and can react accordingly, but it is not blocked while waiting. See: https://nodejs.org/api/child_process.html
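
A minimal sketch of that approach, assuming the heavy work is moved into its own script; worker.js is a hypothetical file name used only for this example:

// main process (the Express app)
const { fork } = require('child_process');

app.post('/longProcess', (req, res) => {
    const child = fork('./worker.js');   // child gets its own process, so this one stays responsive
    child.send(req.body);                // hand the payload to the child
    child.on('message', (result) => {
        // the child finished: store the result, notify the client, etc.
        console.log('long job done', result);
    });
    res.send({ status: 'queued' });      // respond immediately
});

// worker.js (hypothetical): receives the payload, runs the slow work, reports back
process.on('message', (payload) => {
    const result = doWork(payload);      // the original long-running function, moved into this file
    process.send(result);
    process.exit(0);
});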

Use a queue

You can also use a queue of pending jobs and let other processes work through it, with no effect on your main program, which then only schedules new tasks instead of doing them or waiting for them. Queues like that are usually backed by Redis, but they can also be built on CouchDB or MongoDB; all you need is some central registry of pending tasks from which your worker processes can take them. There are many modules for this in Node, for example:

  • http://automattic.github.io/kue/
  • https://www.npmjs.com/package/bull
  • https://www.npmjs.com/package/bee-queue
  • https://www.npmjs.com/package/node-taskman
  • https://www.npmjs.com/package/cluster-master
  • https://www.npmjs.com/package/agenda
  • https://www.npmjs.com/package/worker-farm

Check the documentation of those modules to see which one suits your needs best.
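
As an illustration only, here is a minimal sketch using bull; the queue name 'long-process', the Redis URL and the concurrency of 1 are choices made for this example, and it assumes a Redis instance is running locally:

const Queue = require('bull');                  // npm install bull
const workQueue = new Queue('long-process', 'redis://127.0.0.1:6379');

// Web process: only schedules jobs, never runs them itself.
app.post('/longProcess', async (req, res) => {
    await workQueue.add(req.body);              // enqueue the payload as a job
    res.send({ status: 'queued' });
});

// Worker (best started as a separate process): pulls jobs off the queue.
workQueue.process(1, async (job) => {
    return doWork(job.data);                    // the original long-running function
});

With the concurrency set to 1, the worker handles jobs one by one, which is the single queued worker the question asks for.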

rsp answered Oct 22 '22