
How to gracefully handle thousands of Quartz misfires?

We have an application that needs to

  1. nightly reprocess large amounts of data, and

  2. reprocess large amounts of data on demand.

In both of these cases, around 10,000 quartz jobs get spawned and then run. For the nightly run, we have one quartz cron job that spawns the 10,000 jobs, each of which does the work of processing the data for one item.

The issue we have is that we are running with around 30 threads, so naturally the quartz jobs misfire, and continue to misfire until everything is processed. The processing can take up to 6 hours. Each of these 10,000 jobs pertains to a specific domain object; the objects are completely independent and can be processed in parallel. Each job can take a variable amount of time (from half a second to a minute).

My question is:

  1. Is there a better way to do this?

  2. If not, what is the best way for us to schedule and set up our quartz jobs so that as little time as possible is spent thrashing and dealing with misfires?

A note about our architecture: We are running two clusters with three nodes apiece. The version of quartz is a bit old (2.0.1), and clustering is enabled in the quartz.properties file.

Brett McLain asked Dec 27 '13 16:12



3 Answers

"In both of these cases, around 10,000 quartz jobs get spawned"

There is no need to spawn new Quartz jobs. Quartz is a scheduler, not a task manager.

For the nightly reprocess, you need only that one Quartz cron job, and its sole responsibility is to invoke a service that manages and runs the 10,000 tasks. In the "on demand" scenario, Quartz shouldn't be involved at all. Just invoke that service directly.

How does the service manage 10,000 tasks?

Typically, when only one JVM is available, you'd just use an ExecutorService. Here, since you have 6 nodes at your disposal, you can use Hazelcast. Hazelcast is a Java library that lets you cluster your nodes so that they share resources efficiently with each other. It offers a straightforward way to distribute an ExecutorService across the cluster, called the Distributed Executor Service. It's as easy as creating a Hazelcast ExecutorService and submitting the tasks to the members. Here's an example from the documentation for invoking on a single member:

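import java.util.concurrent.Callable;
import java.util.concurrent.Future;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IExecutorService;
Callable<String> task = new Echo(input); // Echo is just some Callable; it must be serializable to travel to another node
HazelcastInstance hz = Hazelcast.newHazelcastInstance(); // starts (or joins) a cluster member in this JVM
IExecutorService executorService = hz.getExecutorService("default");
Future<String> future = executorService.submitToMember(task, member); // member: the target cluster Member, obtained elsewhere
String echoResult = future.get(); // blocks until the remote task has finished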
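
For comparison, the single-JVM variant mentioned above could look roughly like the following. This is only a minimal sketch using the standard java.util.concurrent API: the class name ReprocessService, the reprocess method and the pool size of 30 are assumptions made for illustration, not code from the question. The idea is that the one Quartz cron job (or the on-demand trigger) calls reprocessAll once, and the pool's internal queue holds the tasks that are waiting for a free thread, so nothing can misfire.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ReprocessService {
    private final int threads;

    public ReprocessService(int threads) {
        this.threads = threads;
    }

    // Hypothetical stand-in for the real per-object reprocessing work.
    static String reprocess(long domainObjectId) {
        return "done-" + domainObjectId;
    }

    // Runs one task per id on a bounded pool and waits for all of them.
    public List<String> reprocessAll(List<Long> ids) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (Long id : ids) {
                tasks.add(() -> reprocess(id));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task has completed; tasks beyond
            // the pool size simply wait in the pool's internal queue.
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Long> ids = new ArrayList<>();
        for (long i = 0; i < 10_000; i++) {
            ids.add(i);
        }
        List<String> results = new ReprocessService(30).reprocessAll(ids);
        System.out.println(results.size() + " objects reprocessed");
    }
}
```

With Hazelcast, the same shape should carry over: the tasks would be submitted to the distributed executor instead of the local pool.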
yair answered Nov 18 '22 14:11



I would do this with a message queue (RabbitMQ/ActiveMQ). The cron job (or whatever your on-demand trigger is) populates the queue with messages representing the 10,000 work instructions (i.e. the instruction to reprocess the data for a given domain object).

On each of your nodes you have a pool of executors that pull from the queue and carry out the work instructions. Each executor stays busy for as long as there are work items on the queue, so the overall processing finishes as quickly as the available threads allow.
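
The pull-based pattern described above can be sketched in a single JVM as follows. Note the assumptions: an in-process LinkedBlockingQueue stands in for the RabbitMQ/ActiveMQ broker, and the class name QueueWorkers, the reprocess method and the stop marker are all made up for illustration. With a real broker, the workers would be long-running consumers on each node rather than threads that exit.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class QueueWorkers {
    // Marker that tells a worker to stop; only needed in this in-process sketch.
    private static final long STOP = -1L;

    // Hypothetical stand-in for the real per-object reprocessing work.
    static void reprocess(long domainObjectId, Set<Long> done) {
        done.add(domainObjectId);
    }

    public static Set<Long> run(int jobs, int workers) throws InterruptedException {
        BlockingQueue<Long> queue = new LinkedBlockingQueue<>();
        Set<Long> done = ConcurrentHashMap.newKeySet();

        // Producer side: the cron job or on-demand trigger enqueues one
        // work instruction per domain object, then one stop marker per worker.
        for (long id = 0; id < jobs; id++) {
            queue.put(id);
        }
        for (int i = 0; i < workers; i++) {
            queue.put(STOP);
        }

        // Consumer side: each worker pulls the next instruction as soon as
        // it is free, so fast and slow jobs balance out across the pool.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        long id = queue.take();
                        if (id == STOP) {
                            return;
                        }
                        reprocess(id, done);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return done;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(10_000, 30).size() + " work instructions processed");
    }
}
```

Because workers pull work rather than having it scheduled for them, there is no trigger time to miss and therefore no misfire handling.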

azordi answered Nov 18 '22 14:11



The best way is to use a cluster of Quartz instances, which shares the jobs among the cluster nodes: http://quartz-scheduler.org/documentation/quartz-2.x/configuration/ConfigJDBCJobStoreClustering

Ali HAMDI answered Nov 18 '22 15:11
