
Polling a database versus triggering program from database?

I have a process wherein a program running in an application server must access a table in an Oracle database server whenever at least one row exists in this table. Each row of data relates to a client requesting some number crunching performed by the program. The program can only perform this number crunching serially (that is, for one client at a time rather than multiple clients in parallel).

Thus, the program needs to be informed of when data is available in the database for it to process. I could either

  1. have the program poll the database, or
  2. have the database trigger the program.

QUESTION 1: Is there any conventional wisdom why one approach might be better than the other?

QUESTION 2: I wonder whether programs have any issues running for months at a time. Would any process on the server stop or disrupt the program? If so, I don't know how I'd learn there was a problem other than from angry customers. Does anyone have experience running programs on a server for a long time without issues? Or, if the server does crash, is there a way to auto-start a program (e.g. a C language executable) after the server reboots, so that no human has to start it manually?

Any advice appreciated.

UPDATE 1: Client is waiting for results, but a couple seconds additional delay (from polling) isn't a deal breaker.

ggkmath asked Mar 14 '12 21:03


1 Answer

I would like to give a more generic answer...

There is no right answer that applies every time. Sometimes you need a trigger, and sometimes it is better to poll.

But… 9 times out of 10, polling is much more efficient, safer and faster than triggering.

It's really simple. A trigger needs to instantiate a program, of whatever nature, every time it fires. That is just not efficient most of the time. Some people will argue that this is required when response time is a factor, but even then, half of the time polling is better because:

1) Resources: With triggers and, say, 100 messages, you need resources for 100 threads. With 1 thread processing a packet of 100 messages, you need resources for 1 program.

2) Monitoring: A thread processing packets can constantly report the time consumed per packet of a defined size, clearly indicating how it is performing and when and how performance is being affected. Try that with a billion triggers jumping around…

3) Speed: Instantiating threads and allocating their resources is very expensive. And don't get me started if you are opening a transaction for each trigger. A simple program processing a packet of, say, 100 messages will always be much faster than firing 100 triggers…

4) Reaction time: With polling you cannot react to things in real time. So the only exception, where triggering is allowed, is when a user is waiting for the message to be processed. But then you need to be very careful, because if you have lots of clients doing the same thing at the same time, triggering might respond LATER than if you were doing fast polling.

My 2 cents. This has been learned the hard way.

Alex Vaz answered Nov 10 '22 00:11