
How to control the number of parallel Spring Batch jobs

I have a report-generating application. Because preparing these reports is heavyweight, they are prepared asynchronously with Spring Batch. Requests for reports are created via a REST interface over HTTP.

The goal is that the REST resource simply queues the report execution and returns (as described in the documentation). To that end, a TaskExecutor has been provided for the JobLauncher:

    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository" />
        <property name="taskExecutor">
            <bean class="org.springframework.core.task.SimpleAsyncTaskExecutor"/>
        </property>
    </bean>

Because the reports are so heavyweight, only a limited number of them can be produced at any given time. Hoping to make Spring Batch run at most 2 jobs at a time, I specified concurrencyLimit:

    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository" />
        <property name="taskExecutor">
            <bean class="org.springframework.core.task.SimpleAsyncTaskExecutor">
                <property name="concurrencyLimit" value="2" />
            </bean>
        </property>
    </bean>

Unfortunately, when 2 jobs are already running, the launch job call is blocked: jobLauncher.run(job, builder.toJobParameters());

Apparently the jobLauncher attempts to execute the job immediately. I would have expected it to queue the job and execute it as soon as a thread becomes available. That way I could scale my application by simply adding more processing instances, all using the same job repository database.

A similar question was asked here. I'm about to start exploring Spring Batch Integration, but I'm not sure that's the right direction.

My use case does not seem that uncommon to me. Shouldn't there be a widely discussed pattern for it that I am apparently unable to find?

Thanks f

asked Jun 16 '15 20:06 by featur



1 Answer

SimpleAsyncTaskExecutor isn't recommended for heavy use since it spawns a new thread for each task. It also does not support more robust concepts like thread pooling and queueing of tasks.

If you take a look at the ThreadPoolTaskExecutor, it supports a more robust task execution paradigm: it queues tasks and reuses threads from a pool instead of spawning a new, never-reused thread for every task.
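A minimal sketch of what that could look like for the launcher in the question. It assumes a fixed pool of 2 threads to match the desired limit; the queueCapacity of 100 is purely illustrative and not taken from the question:

    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository" />
        <property name="taskExecutor">
            <bean class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
                <!-- Fixed-size pool: at most 2 jobs execute at the same time -->
                <property name="corePoolSize" value="2" />
                <property name="maxPoolSize" value="2" />
                <!-- Additional launches wait here instead of blocking the caller
                     (illustrative value, tune to your expected backlog) -->
                <property name="queueCapacity" value="100" />
            </bean>
        </property>
    </bean>

With a setup along these lines, jobLauncher.run(...) should return as soon as the task has been handed to the executor, and jobs beyond the first 2 wait in the queue. Note that this queue lives in memory inside one JVM, so it is not shared between application instances and does not survive a restart.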

You can read more about the ThreadPoolTaskExecutor in the javadoc here: http://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/scheduling/concurrent/ThreadPoolTaskExecutor.html

answered Sep 24 '22 17:09 by Michael Minella