Single job execution across multiple servers

I have a setup where multiple servers each run a @Scheduled method that launches a Spring Batch job, which sends out emails to users. I want to make sure that only one instance of this job runs across all of the servers.

Based on this question, I implemented some logic to see if it's possible to solve this using only Spring Batch.

To run a job, I created a helper class, JobRunner, with the following run method:

public void run(Job job) {
    try {
        jobLauncher.run(job, new JobParameters());
    } catch (JobExecutionAlreadyRunningException e) {

        // Check if job is inactive and stop it if so.
        stopIfInactive(job);

    } catch (JobExecutionException e) {
        ...
    }
}

The stopIfInactive method:

private void stopIfInactive(Job job) {
    for (JobExecution execution : jobExplorer.findRunningJobExecutions(job.getName())) {
        Date createTime = execution.getCreateTime();

        DateTime now = DateTime.now();

        // Get running seconds for more info.
        int seconds = Seconds
                .secondsBetween(new DateTime(createTime), now)
                .getSeconds();

        LOGGER.debug("Job '{}' already has an execution with id: {} with age of {}s",
                job.getName(), execution.getId(), seconds);

        // If the execution was created longer ago than the allowed window, mark it as failed.
        if (createTime.before(now.minusMillis(EXECUTION_DEAD_MILLIS)
                .toDate())) {

            LOGGER.warn("Execution with id: {} is inactive, stopping",
                    execution.getId());

            execution.setExitStatus(new ExitStatus(BatchStatus.FAILED.name(),
                    String.format("Stopped due to being inactive for %d seconds", seconds)));

            execution.setStatus(BatchStatus.FAILED);
            execution.setEndTime(now.toDate());

            jobRepository.update(execution);
        }
    }
}

The job is then launched by the following scheduled method on every server:

@Scheduled(cron = "${email.cron}")
public void sendEmails() {
    jobRunner.run(emailJob);
}

Is this a valid solution for a multiple server setup? If not, what are the alternatives?

EDIT 1

I did a bit more testing: I set up two applications, each running a @Scheduled method every 5 seconds that launches the job using the helper class I created. It seems that my solution does not resolve the problem, since every 5-second tick still produces two executions created within a few milliseconds of each other. Here is the data from the batch_job_execution table that is used by Spring Batch:

 job_execution_id | version | job_instance_id |       create_time       |       start_time        |        end_time         |  status   | exit_code | exit_message |      last_updated       | job_configuration_location
------------------+---------+-----------------+-------------------------+-------------------------+-------------------------+-----------+-----------+--------------+-------------------------+----------------------------
             1007 |       2 |               2 | 2016-08-25 14:43:15.024 | 2016-08-25 14:43:15.028 | 2016-08-25 14:43:16.84  | COMPLETED | COMPLETED |              | 2016-08-25 14:43:16.84  |
             1006 |       1 |               2 | 2016-08-25 14:43:15.021 | 2016-08-25 14:43:15.025 |                         | STARTED   | UNKNOWN   |              | 2016-08-25 14:43:15.025 |
             1005 |       2 |               2 | 2016-08-25 14:43:10.326 | 2016-08-25 14:43:10.329 | 2016-08-25 14:43:12.047 | COMPLETED | COMPLETED |              | 2016-08-25 14:43:12.047 |
             1004 |       2 |               2 | 2016-08-25 14:43:10.317 | 2016-08-25 14:43:10.319 | 2016-08-25 14:43:12.03  | COMPLETED | COMPLETED |              | 2016-08-25 14:43:12.03  |
             1003 |       2 |               2 | 2016-08-25 14:43:05.017 | 2016-08-25 14:43:05.02  | 2016-08-25 14:43:06.819 | COMPLETED | COMPLETED |              | 2016-08-25 14:43:06.819 |
             1002 |       2 |               2 | 2016-08-25 14:43:05.016 | 2016-08-25 14:43:05.018 | 2016-08-25 14:43:06.811 | COMPLETED | COMPLETED |              | 2016-08-25 14:43:06.811 |
             1001 |       2 |               2 | 2016-08-25 14:43:00.038 | 2016-08-25 14:43:00.042 | 2016-08-25 14:43:01.944 | COMPLETED | COMPLETED |              | 2016-08-25 14:43:01.944 |
             1000 |       2 |               2 | 2016-08-25 14:43:00.038 | 2016-08-25 14:43:00.041 | 2016-08-25 14:43:01.922 | COMPLETED | COMPLETED |              | 2016-08-25 14:43:01.922 |
              999 |       2 |               2 | 2016-08-25 14:42:55.02  | 2016-08-25 14:42:55.024 | 2016-08-25 14:42:57.603 | COMPLETED | COMPLETED |              | 2016-08-25 14:42:57.603 |
              998 |       2 |               2 | 2016-08-25 14:42:55.02  | 2016-08-25 14:42:55.023 | 2016-08-25 14:42:57.559 | COMPLETED | COMPLETED |              | 2016-08-25 14:42:57.559 |
(10 rows)

I also tried the method suggested by @Palcente and got similar results.

Edd asked Aug 25 '16 09:08



1 Answer

Spring Integration's latest release added some functionality around distributed locks. This is really what you'd want to use to make sure that only one server fires the job: only the server that obtains the lock should launch it. You can read more about Spring Integration's locking capabilities in the documentation: http://projects.spring.io/spring-integration/
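
To illustrate the "only the server that obtains the lock launches the job" idea, here is a minimal sketch in plain Java. It is not the Spring Integration API itself. The nested LockRegistry interface is a hypothetical stand-in that only mirrors the shape of Spring Integration's LockRegistry (an obtain(Object) method returning a java.util.concurrent.locks.Lock), and InMemoryLockRegistry, runIfLockObtained and simulateTwoServers are names made up for this example. The in-memory registry only coordinates threads inside one JVM. Across real servers you would need a registry backed by shared storage, such as the JDBC-backed one that Spring Integration provides.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class SingleLaunchSketch {

    // Hypothetical stand-in that mirrors the shape of Spring Integration's LockRegistry.
    interface LockRegistry {
        Lock obtain(Object lockKey);
    }

    // Demo-only registry: one ReentrantLock per key, shared by threads in this JVM.
    // Assumption: a real multi-server setup would use a registry backed by shared storage.
    static class InMemoryLockRegistry implements LockRegistry {
        private final Map<Object, Lock> locks = new ConcurrentHashMap<>();

        @Override
        public Lock obtain(Object lockKey) {
            return locks.computeIfAbsent(lockKey, key -> new ReentrantLock());
        }
    }

    // Runs the job only if the lock is free right now; returns whether the job ran.
    static boolean runIfLockObtained(LockRegistry registry, String lockKey, Runnable job) {
        Lock lock = registry.obtain(lockKey);
        if (!lock.tryLock()) {
            return false; // someone else holds the lock, so skip this scheduled tick
        }
        try {
            job.run();
            return true;
        } finally {
            lock.unlock();
        }
    }

    // Simulates two "servers" whose schedules fire while the first job is still running.
    // Returns how many times the job body was actually executed.
    static int simulateTwoServers() {
        LockRegistry registry = new InMemoryLockRegistry();
        AtomicInteger launches = new AtomicInteger();
        CountDownLatch jobStarted = new CountDownLatch(1);
        CountDownLatch secondAttemptDone = new CountDownLatch(1);

        Thread serverA = new Thread(() -> runIfLockObtained(registry, "emailJob", () -> {
            launches.incrementAndGet();
            jobStarted.countDown();
            try {
                secondAttemptDone.await(); // keep the job "running" until server B has tried
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        serverA.start();

        try {
            jobStarted.await();
            // "Server B" (the calling thread) fires while A still holds the lock.
            runIfLockObtained(registry, "emailJob", launches::incrementAndGet);
            secondAttemptDone.countDown();
            serverA.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Simulation interrupted", e);
        }
        return launches.get();
    }

    public static void main(String[] args) {
        System.out.println("Job launches: " + simulateTwoServers());
    }
}
```

In the question's setup, the body passed to runIfLockObtained would be the jobRunner.run(emailJob) call inside the @Scheduled method. Note that a lock taken with tryLock only prevents overlapping launches. If one server's schedule fires after another server has already finished and released the lock, the job would run again, so clock drift between servers may still need separate handling.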

Michael Minella answered Nov 05 '22 14:11
