We have a Spring Boot application with scheduled tasks.
We want to deploy the application on multiple servers, so there will be multiple instances of it.
How can we configure Spring to run scheduled tasks only on specified servers?
We can delay the first execution of a method by specifying an interval with the initialDelay attribute. We can also deploy multiple scheduler instances with the ShedLock library, which uses a lock in a shared database to ensure that only one instance runs a given task at a time.
Batch job scheduling is configured in two steps: enable scheduling with the @EnableScheduling annotation, then create a method annotated with @Scheduled and provide the recurrence details as a cron expression. The job execution logic goes inside this method.
In addition to the TaskExecutor abstraction, Spring 3.0 introduces a TaskScheduler with a variety of methods for scheduling tasks to run at some point in the future. The simplest method is the one named 'schedule' that takes a Runnable and Date only. That will cause the task to run once after the specified time.
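To make the "run once at a given time" idea concrete without pulling in Spring, here is a minimal sketch that uses the JDK's ScheduledExecutorService as a stand-in for TaskScheduler. It is not Spring's API; the class name OneShotSchedule and the helper scheduleOnce are made up for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class OneShotSchedule {

    /**
     * Runs the task once at (or shortly after) startTime.
     * A startTime in the past runs the task as soon as possible.
     * Hypothetical helper, JDK only; it mimics the "Runnable plus start time" idea.
     */
    public static void scheduleOnce(ScheduledExecutorService scheduler,
                                    Runnable task, Instant startTime) {
        long delayMillis = Math.max(0, Duration.between(Instant.now(), startTime).toMillis());
        scheduler.schedule(task, delayMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch done = new CountDownLatch(1);

        // Schedule a single run roughly 200 ms from now.
        scheduleOnce(scheduler, () -> {
            System.out.println("task ran");
            done.countDown();
        }, Instant.now().plusMillis(200));

        // Wait for the task (with a generous timeout), then stop the scheduler.
        done.await(5, TimeUnit.SECONDS);
        scheduler.shutdown();
    }
}
```

Note that every application instance that executes this code schedules its own run, which is exactly the duplication the question is about.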
This is a very broad topic, and there are many ways to achieve this.
You can configure your application to have multiple profiles. For example, add another profile, 'cron', and start your application with this profile on only one server. So if a production environment has three servers (S1, S2, S3), you could run S1 with the profiles prod and cron (-Dspring.profiles.active=prod,cron) and S2 and S3 with only the prod profile (-Dspring.profiles.active=prod).
In code, you can then put @Profile("cron") on the scheduler classes. This way they are executed only when the cron profile is active.
Use a distributed lock. If you have Zookeeper in your environment, you can use it to implement a distributed locking system.
You can use a database (e.g. MySQL) and write code that takes a lock on a table and adds an entry. Whichever instance gets the lock makes an entry in the database and executes the cron job. You need to put a check in your code so that execution proceeds only if getLock() is successful. MySQL has utilities like LOCK TABLES, which you could use to guard against concurrent reads/writes.
Personally, I would say option 2 (the distributed lock) is the best of all.
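The "proceed only if getLock() succeeds" check from the database option can be sketched as follows. This is an in-memory model, not real database code: a ConcurrentHashMap stands in for a lock table with a unique key on the job name, and the class and method names (LockTableSketch, releaseLock, runIfLocked) are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class LockTableSketch {

    // Stand-in for a lock table with a unique key on the job name. Assumption:
    // in a real setup this would be an INSERT that fails on a duplicate key.
    private final ConcurrentMap<String, String> lockTable = new ConcurrentHashMap<>();

    /** Returns true only for the first instance that registers the job name. */
    public boolean getLock(String jobName, String instanceId) {
        return lockTable.putIfAbsent(jobName, instanceId) == null;
    }

    /** Removes the entry, but only if this instance owns it. */
    public void releaseLock(String jobName, String instanceId) {
        lockTable.remove(jobName, instanceId);
    }

    /** Runs the job only if the lock was acquired; returns whether it ran. */
    public boolean runIfLocked(String jobName, String instanceId, Runnable job) {
        if (!getLock(jobName, instanceId)) {
            return false; // another instance holds the lock, so skip this run
        }
        try {
            job.run();
            return true;
        } finally {
            releaseLock(jobName, instanceId);
        }
    }
}
```

One caveat with this simple form: the lock is released as soon as the job finishes, so an instance whose timer fires slightly later could still run the job again. A real implementation would also need some expiry so that a crashed instance does not hold the lock forever.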
The ShedLock project was created specifically to achieve this.
Dependency -
<dependency>
    <groupId>net.javacrumbs.shedlock</groupId>
    <artifactId>shedlock-spring</artifactId>
</dependency>
Configuration -
@EnableScheduling
@EnableSchedulerLock(defaultLockAtMostFor = "PT30S")
Implementation -
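// Runs every 15 minutes (Spring cron: second minute hour day month weekday).
// Note: the *String attributes are from older ShedLock releases; newer ones
// (4.x onward, as far as I know) use lockAtLeastFor / lockAtMostFor instead.
@Scheduled(cron = "0 0/15 * * * ?")
@SchedulerLock(name = "AnyUniqueName",
        lockAtLeastForString = "PT5M", lockAtMostForString = "PT10M")
public void scheduledTask() {
    // ...
}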
This setup makes sure that at most one instance runs the scheduled task at any given time.
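To show what the two durations mean, here is a small plain-JDK model of a lock with "at most" and "at least" times. It is only a sketch of the semantics, not ShedLock's implementation; the class TimedLockSketch and its methods are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```java
import java.time.Duration;
import java.time.Instant;

public class TimedLockSketch {

    private Instant lockedAt = Instant.MIN;   // when the lock was last taken
    private Instant lockUntil = Instant.MIN;  // the lock is free once 'now' reaches this

    /** Tries to take the lock at 'now'; succeeds only if the previous lock has expired. */
    public synchronized boolean tryLock(Instant now, Duration lockAtMostFor) {
        if (now.isBefore(lockUntil)) {
            return false; // someone else still holds the lock
        }
        lockedAt = now;
        // Upper bound: even if the holder crashes and never unlocks,
        // the lock becomes free again after lockAtMostFor.
        lockUntil = now.plus(lockAtMostFor);
        return true;
    }

    /** Releases the lock, but never earlier than lockedAt + lockAtLeastFor. */
    public synchronized void unlock(Instant now, Duration lockAtLeastFor) {
        // Lower bound: a task that finishes quickly keeps the lock for a while,
        // so instances with slightly different clocks do not run it again.
        Instant earliestRelease = lockedAt.plus(lockAtLeastFor);
        lockUntil = now.isAfter(earliestRelease) ? now : earliestRelease;
    }
}
```

With PT5M and PT10M as in the example above, a task that finishes after two seconds still blocks other instances for five minutes, and a holder that dies without unlocking blocks them for at most ten.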
If you want only a specific instance to run the scheduler task, configure your scheduler to read a property and use it as a switch for scheduling, like this -
@ConditionalOnProperty(
        value = "scheduling.enabled", havingValue = "true", matchIfMissing = true
)
@Configuration
@EnableScheduling
@EnableSchedulerLock(defaultLockAtMostFor = "PT30S")
public class SchedulingConfig {
}
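Now set the property scheduling.enabled = true in the application.properties file of the instance that should run the scheduler. Because matchIfMissing = true, scheduling is also enabled when the property is absent, so set scheduling.enabled = false on every other instance (or change matchIfMissing to false).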
Follow this link for complete implementation.
One of the best options is to use the Quartz scheduler with clustering. It's simple, just add the dependency:
implementation("org.springframework.boot:spring-boot-starter-quartz")
Then configure the Quartz jobs with Spring (see tutorial).
Clustering configs in application.yaml:
spring:
  datasource: ... # define jdbc datasource
  quartz:
    job-store-type: jdbc # Database mode
    jdbc:
      initialize-schema: never # For clustering, do not initialize the table structure automatically
    properties:
      org.quartz:
        scheduler:
          instanceId: AUTO # AUTO generates the instance ID from hostname and timestamp. It can be any string, but it must be unique for every scheduler in the cluster (stored in QRTZ_SCHEDULER_STATE.INSTANCE_NAME)
          #instanceName: clusteredScheduler #quartzScheduler
        jobStore:
          class: org.quartz.impl.jdbcjobstore.JobStoreTX # Persistence configuration
          driverDelegateClass: org.quartz.impl.jdbcjobstore.StdJDBCDelegate # Delegate that handles database-specific SQL; the standard one works for most databases
          useProperties: true # The JDBC JobStore stores all JobDataMap values as strings, so they are kept as name-value pairs rather than serialized into BLOB columns. This is safer in the long run because it avoids class-versioning problems with serialized non-String classes.
          tablePrefix: QRTZ_ # Database table prefix
          misfireThreshold: 60000 # Milliseconds the scheduler "tolerates" a trigger passing its next fire time before it is considered misfired. The default (if the property is not set) is 60000 (60 seconds).
          clusterCheckinInterval: 5000 # How often (in milliseconds) this instance checks in with the other instances of the cluster. Affects how quickly failed instances are detected.
          isClustered: true # Turn on clustering
        threadPool: # Thread pool
          class: org.quartz.simpl.SimpleThreadPool
          threadCount: 10
          threadPriority: 5
          threadsInheritContextClassLoaderOfInitializingThread: true
Pay attention to initialize-schema: never - in cluster mode you need to initialize the schema yourself.
See official scripts: https://github.com/quartz-scheduler/quartz/tree/master/quartz-core/src/main/resources/org/quartz/impl/jdbcjobstore
You can apply these scripts through liquibase/flyway/etc., but remove the DROP ... queries! They are the reason the schema is not initialized automatically in a cluster.
See quartz docs
See spring boot docs quartz
See article with example