
Doing DB Queries Versus Storing Items in a Collection?

I am trying to make a reminder system and I am using Quartz for my scheduling. I have come up with a few possible ways to do what I need, but I am not sure which one is best or how to test it.

Basically, I have a reminder system in which users can set reminders. It is like Google Calendar: you set the date and time of your event, and then you set a reminder by saying "remind me 15 minutes before".

So you could have an event on May 10th, 2011 at 10:59am and you could say "remind me 15 minutes before".

So that would be May 10th, 10:44am.

I will be in a hosted environment. (My site and the scheduling will be running in the same environment and even in the same solution, so the scheduling can't slow down users browsing my site by much.)

I am also using NHibernate and Fluent NHibernate to do the DB querying. I am using ASP.NET MVC 3 for my web site.

Option 1.

Do a database query every minute and get all reminders that should be sent out in that minute. This of course means a database query every minute, which is probably too intensive for a shared environment.

Option 2.

Do a database query every 5 minutes, grab all the reminders that should be sent in that 5 minute block, store them in a collection (so in memory), and then check every minute which ones need to be sent out.

This of course lessens the number of queries, but I am not sure if it will get extremely memory intensive.

Option 3

Same as Option 2 but send a query every 15 minutes and store in a collection.

This of course means a lot fewer database queries but more stored in memory.
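
To make Options 2 and 3 concrete, here is a minimal sketch in Python (a stand-in for the .NET stack in the question, chosen only for brevity). It assumes each fetched row is a hypothetical `(send_at, reminder_id)` pair; the `ReminderBuffer` class and its method names are illustrative and not part of Quartz or NHibernate. One query per window fills the buffer, and the per-minute check touches only memory.

```python
import heapq
from datetime import datetime, timedelta


class ReminderBuffer:
    """Holds one polling window's worth of reminders, ordered by send time."""

    def __init__(self):
        self._heap = []  # (send_at, reminder_id) tuples, soonest first

    def load(self, rows):
        """Add the rows fetched for the current window (one DB query per window)."""
        for send_at, reminder_id in rows:
            heapq.heappush(self._heap, (send_at, reminder_id))

    def pop_due(self, now):
        """Return the ids whose send time has arrived; no DB access needed."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[1])
        return due


# Hypothetical rows, as a 15-minute query issued at 10:30 might return them.
start = datetime(2011, 5, 10, 10, 30)
buffer = ReminderBuffer()
buffer.load([(start + timedelta(minutes=14), 1), (start + timedelta(minutes=2), 2)])

print(buffer.pop_due(start + timedelta(minutes=1)))   # []
print(buffer.pop_due(start + timedelta(minutes=2)))   # [2]
print(buffer.pop_due(start + timedelta(minutes=14)))  # [1]
```

In this sketch the memory held is proportional to the number of reminders that fall inside one window, so widening the window from 5 to 15 minutes trades fewer queries for a larger buffer.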

Option 4

Do a database query every 15 minutes and get all reminders in that block and fire them out immediately.

This means they won't be stored in memory very long and the number of queries is reduced. However, depending on when the user asked to be reminded, the email could arrive a lot earlier than requested.

For instance, they said remind me at 10:44am. I would have my scheduler start at 10:00am, and it would grab from 10:00am to 10:15am, then 10:15am to 10:30am, then 10:30am to 10:45am.

So that email would actually arrive 14 mins earlier than intended.
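
The arithmetic behind that 14 minute figure can be checked with a short Python sketch. It assumes the blocks are aligned to the top of the hour and that the block length divides 60 evenly; `block_start` and `minutes_early` are hypothetical helper names.

```python
from datetime import datetime, timedelta


def block_start(send_at, block_minutes=15):
    """Round a send time down to the start of its polling block."""
    minutes_into_block = send_at.minute % block_minutes
    return send_at.replace(second=0, microsecond=0) - timedelta(minutes=minutes_into_block)


def minutes_early(send_at, block_minutes=15):
    """How many minutes before the requested time the email would go out."""
    return (send_at - block_start(send_at, block_minutes)).total_seconds() / 60


requested = datetime(2011, 5, 10, 10, 44)
print(block_start(requested))    # 2011-05-10 10:30:00
print(minutes_early(requested))  # 14.0
```

With a block of N minutes, the worst case under this option is an email that goes out N - 1 minutes early.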

chobo2 asked May 10 '11 16:05


1 Answer

Here is how I would solve this problem.

  • At the DB Tier I would create a simple queue. This list of messages would also include a send time. When queried this list would have the next item at the top.

  • The message agent would query this list and act on the top item or sleep till the top item on the list comes due.

One advantage of this technique is that the agent does not apply any business rules about when it checks the queue. If you want it to wake up every minute (for example, to check whether there are new messages to send), you just make sure the queue always has an event every minute. This event could have a type that does not send a message: a "wake up" message has no targets. The agent will wake up and perform the check. More complicated scheduling rules are then easy, because you don't have to recode the agent; you just change which messages are put in the queue. For example, check every 10 mins when the system is in high use and every 20 mins when it is in low use, and stop checking during the nightly backup. This can all be done (and changed) without changing the code of your agent.
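
As a rough illustration of keeping those rules outside the agent, the Python sketch below generates the deliverTime values for the "wake up" entries that some producer would put in the queue. The time windows (backup from 02:00 to 03:00, high use from 09:00 to 17:00) and the names `RULES`, `interval_at`, and `wake_up_times` are assumptions made up for the example.

```python
from datetime import datetime, time, timedelta

# Hypothetical rules: (window start, window end, minutes between checks).
# None means "no checks in this window".
RULES = [
    (time(2, 0), time(3, 0), None),  # nightly backup: no wake-ups
    (time(9, 0), time(17, 0), 10),   # high use: every 10 minutes
]
DEFAULT_INTERVAL = 20                # low use: every 20 minutes


def interval_at(moment):
    """Minutes until the next check for this time of day, or None to skip."""
    for start, end, minutes in RULES:
        if start <= moment.time() < end:
            return minutes
    return DEFAULT_INTERVAL


def wake_up_times(start, end):
    """Yield the deliverTime of each wake-up entry to enqueue in [start, end)."""
    current = start
    while current < end:
        minutes = interval_at(current)
        if minutes is None:
            # Inside a no-check window: step ahead without enqueuing anything.
            current += timedelta(minutes=1)
            continue
        yield current
        current += timedelta(minutes=minutes)


day = datetime(2011, 5, 10)
times = wake_up_times(day.replace(hour=1, minute=30), day.replace(hour=3, minute=30))
print([t.strftime("%H:%M") for t in times])  # ['01:30', '01:50', '03:00', '03:20']
```

Changing the schedule means editing `RULES` in the producer; the agent that reads the queue stays untouched.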


A simple real world example

QueueTable
----------
ID int
deliverTime datetime
nagCount int
expireTime datetime
active bool
processed datetime (null)
' maybe some audit stuff...
' content of the message -- or external link
' etc

START: The agent makes a call like this

-- ascending order so the soonest deliverTime comes first
SELECT TOP 1 *
FROM QueueTable
WHERE active = 1 AND processed IS NULL
ORDER BY deliverTime ASC

The agent then checks the deliverTime:

  • If it has passed or falls within the next fuzzy boundary (1 sec?), it sends the message, sets processed to the current time in the DB, and loops back to START.

  • If it is in the future it sleeps till that deliverTime or sets an event to wake it up at that time (depends on platform).

I originally had processed as a boolean but if you use the null to equal not processed then it can double as an audit field.
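
The agent loop described above might look roughly like this in Python, with SQLite standing in for the real database. Several things here are assumptions for the sake of a runnable sketch: the trimmed-down table, storing times as ISO text, the `send` callback, and returning when the queue is empty (a real agent would keep waiting). The query sorts ascending so that the soonest item is the one returned.

```python
import sqlite3
import time
from datetime import datetime, timedelta

FUZZ = timedelta(seconds=1)  # the "fuzzy boundary" from the text


def next_item(conn):
    """Fetch the soonest active, unprocessed queue row, or None."""
    return conn.execute(
        "SELECT ID, deliverTime FROM QueueTable "
        "WHERE active = 1 AND processed IS NULL "
        "ORDER BY deliverTime ASC LIMIT 1"
    ).fetchone()


def run_agent(conn, send, now=datetime.now, sleep=time.sleep):
    """Send due items in order, sleeping until the next one comes due."""
    while True:
        row = next_item(conn)
        if row is None:
            return  # simplification: stop when nothing is pending
        item_id, deliver_time = row[0], datetime.fromisoformat(row[1])
        wait = deliver_time - now()
        if wait <= FUZZ:
            send(item_id)
            # processed doubles as the audit timestamp
            conn.execute(
                "UPDATE QueueTable SET processed = ? WHERE ID = ?",
                (now().isoformat(), item_id),
            )
            conn.commit()
        else:
            sleep(wait.total_seconds())


# Demo with an in-memory table and a fake clock so it finishes instantly.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE QueueTable (ID INTEGER PRIMARY KEY, deliverTime TEXT, "
    "active INTEGER, processed TEXT)"
)
base = datetime(2011, 5, 10, 10, 0)
conn.executemany(
    "INSERT INTO QueueTable (ID, deliverTime, active) VALUES (?, ?, 1)",
    [(1, (base + timedelta(minutes=44)).isoformat()),
     (2, (base + timedelta(minutes=5)).isoformat())],
)

clock = [base]


def fake_sleep(seconds):
    clock[0] += timedelta(seconds=seconds)


sent = []
run_agent(conn, sent.append, now=lambda: clock[0], sleep=fake_sleep)
print(sent)  # [2, 1]
```

Because the agent re-queries after every sleep, a message inserted while it sleeps is only noticed at the next wake-up, which is what the periodic wake-up entries are for.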


Example to check every 10 mins no matter what.

How this works: Because the results are sorted by time the soonest one will show up at the top. What we do is add in an item 10 mins from now into the result set. Thus the top item will never be more than 10 mins from the current time.

SELECT TOP 1 *
FROM (
    SELECT *
    FROM QueueTable
    WHERE active = 1 AND processed IS NULL
    UNION ALL
    -- marker row 10 minutes from now; active = 0 means no action
    SELECT NULL, DATEADD(minute, 10, GETDATE()), null, null, 0, null, ...
) AS q
ORDER BY deliverTime ASC

Note, the active column is being used as a flag here to show that no action will be performed. This record is just a marker to wake up the agent. This method can also adjust the interval depending on other rules (e.g. time of day, because at night you don't need to check as often).
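
Here is a small runnable version of the marker-row idea, again in Python with SQLite as a stand-in and a trimmed-down table (both assumptions for the example, as is the function name `next_item_or_wake_up`). The union is wrapped in a subquery so that the ordering and the one-row limit apply to the combined rows, marker included.

```python
import sqlite3
from datetime import datetime, timedelta


def next_item_or_wake_up(conn, now, max_wait_minutes=10):
    """Return the soonest pending row, or the marker row if nothing is due sooner."""
    wake_up_at = (now + timedelta(minutes=max_wait_minutes)).isoformat()
    return conn.execute(
        "SELECT ID, deliverTime, active FROM ("
        "  SELECT ID, deliverTime, active FROM QueueTable"
        "  WHERE active = 1 AND processed IS NULL"
        "  UNION ALL"
        "  SELECT NULL, ?, 0"  # marker row: active = 0 means 'do nothing'
        ") ORDER BY deliverTime ASC LIMIT 1",
        (wake_up_at,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE QueueTable (ID INTEGER PRIMARY KEY, deliverTime TEXT, "
    "active INTEGER, processed TEXT)"
)
now = datetime(2011, 5, 10, 10, 0)
conn.execute(
    "INSERT INTO QueueTable (ID, deliverTime, active) VALUES (1, ?, 1)",
    ((now + timedelta(minutes=44)).isoformat(),),
)

print(next_item_or_wake_up(conn, now))
# (None, '2011-05-10T10:10:00', 0)
print(next_item_or_wake_up(conn, now + timedelta(minutes=40)))
# (1, '2011-05-10T10:44:00', 1)
```

The agent can treat a row with active = 0 as "sleep until deliverTime, then query again", so it never sleeps longer than the chosen maximum.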

Hogan answered Sep 22 '22 23:09