
Spring batch single threaded reader and multi threaded writer

Tags: spring-batch

I tried to find out whether this has been asked before but couldn't find anything.

Here is the problem. The following has to be achieved via Spring Batch. There is one file to be read and processed. The item reader is not thread safe. The plan is to have multithreaded homogeneous processors and multithreaded homogeneous writers ingest items read by a single-threaded reader.

Kind of like below:

        ----------> Processor #1 ----------> Writer #1
       |
    Reader -------> Processor #2 ----------> Writer #2
       |
        ----------> Processor #3 ----------> Writer #3

I tried AsyncItemProcessor and AsyncItemWriter, but pausing at a breakpoint in the processor kept the reader from executing until the breakpoint was released, i.e. processing was effectively single threaded.

I also tried a task executor, like below:

<tasklet task-executor="taskExecutor" throttle-limit="20">

That launched multiple threads on the reader.

Synchronising the reader also didn't work.

I tried to read about the partitioner, but it seemed complex.

Is there an annotation to mark the reader as single threaded? Would pushing the read data to a global context be a good idea?

Please guide towards a solution.

Programmer asked Nov 24 '25 23:11



1 Answer

I don't think anything is built into the Spring Batch API for the pattern you are looking for. You would need to write some code of your own to achieve it.

The method ItemWriter.write already takes a List of processed items, sized by your chunk size, so you can divide that List among as many threads as you like. You spawn your own threads and pass a segment of the list to each thread to write.
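A minimal sketch of that splitting idea, in plain JDK code so it compiles without Spring on the classpath. ParallelChunkWriter and its segmentWriter callback are hypothetical names, not Spring Batch API; the intent is that your own ItemWriter.write would call write(items) with the list it receives (in Spring Batch 5 the argument is a Chunk rather than a List, so you would pass its items).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical helper, not part of Spring Batch: splits one chunk of items
// into segments and writes each segment on its own thread.
final class ParallelChunkWriter<T> {
    private final int threads;
    private final Consumer<List<T>> segmentWriter; // stands in for the real write logic

    ParallelChunkWriter(int threads, Consumer<List<T>> segmentWriter) {
        if (threads < 1) {
            throw new IllegalArgumentException("threads must be >= 1");
        }
        this.threads = threads;
        this.segmentWriter = segmentWriter;
    }

    // Splits items into at most `parts` contiguous segments of near-equal size.
    static <T> List<List<T>> split(List<T> items, int parts) {
        List<List<T>> segments = new ArrayList<>();
        if (items.isEmpty()) {
            return segments;
        }
        int size = (items.size() + parts - 1) / parts; // ceiling division
        for (int from = 0; from < items.size(); from += size) {
            int to = Math.min(from + size, items.size());
            segments.add(new ArrayList<>(items.subList(from, to)));
        }
        return segments;
    }

    // Blocks until every segment is written; a failure in any worker thread
    // surfaces here as an ExecutionException from Future.get().
    void write(List<T> items) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (List<T> segment : split(items, threads)) {
                futures.add(pool.submit(() -> segmentWriter.accept(segment)));
            }
            for (Future<?> future : futures) {
                future.get();
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

The pool is created per call only to keep the sketch short; in a real job you would probably reuse one executor for the whole step. The segmentWriter must itself be safe to call from several threads at once.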

The problem is the method ItemProcessor.process(): it processes one item at a time, so you are limited to a single item and there isn't much threading you can do for a single item.

So the challenge is to write your own reader that can hand a list of items to the processor instead of a single item. You can then process those items in parallel, and the writer will work on a list of lists.
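A rough sketch of such a list-returning reader, again in plain JDK code. BatchingReader is a hypothetical name, and the Supplier stands in for your existing single-threaded reader; I am assuming it follows the ItemReader convention of returning null once the input is exhausted. Only one thread ever calls the delegate, so the delegate does not need to be thread safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical wrapper, not part of Spring Batch: groups items from a
// single-item reader into lists so the processor receives a batch at a time.
final class BatchingReader<T> {
    private final Supplier<T> delegate; // stands in for the real reader's read()
    private final int batchSize;

    BatchingReader(Supplier<T> delegate, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.delegate = delegate;
        this.batchSize = batchSize;
    }

    // Returns up to batchSize items, or null when the input is exhausted,
    // mirroring the null-means-end convention of ItemReader.read().
    List<T> read() {
        List<T> batch = new ArrayList<>(batchSize);
        while (batch.size() < batchSize) {
            T item = delegate.get();
            if (item == null) {
                break; // delegate is exhausted
            }
            batch.add(item);
        }
        return batch.isEmpty() ? null : batch;
    }
}
```

With this shape, the step's item type becomes a list, so the chunk size counts batches rather than individual records.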

In all of this setup, you have to remember that the threads you spawn will be outside the read - process - write transaction boundary of Spring Batch, so you will have to take care of that on your own: merging the processed output of all threads, waiting until all threads are complete, and handling any errors. All in all, it's very risky.
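To make the waiting, merging, and error handling concrete, here is one way it could look with a plain ExecutorService. ParallelBatchProcessor and itemLogic are hypothetical names, not Spring Batch API, and this sketch does nothing about transactions: work done on the pool threads still sits outside the chunk transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of Spring Batch: processes one batch of items
// in parallel and returns the results in the same order as the input.
final class ParallelBatchProcessor<I, O> {
    private final ExecutorService pool;
    private final Function<I, O> itemLogic; // stands in for the per-item processing

    ParallelBatchProcessor(int threads, Function<I, O> itemLogic) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.itemLogic = itemLogic;
    }

    List<O> process(List<I> batch) throws InterruptedException {
        List<Callable<O>> tasks = new ArrayList<>();
        for (I item : batch) {
            tasks.add(() -> itemLogic.apply(item));
        }
        // invokeAll returns only after every task has completed or failed,
        // and its futures are in the same order as the submitted tasks.
        List<Future<O>> futures = pool.invokeAll(tasks);
        List<O> results = new ArrayList<>(batch.size());
        for (Future<O> future : futures) {
            try {
                results.add(future.get());
            } catch (ExecutionException e) {
                // Fail the whole batch if any single item failed.
                throw new IllegalStateException("item processing failed", e.getCause());
            }
        }
        return results;
    }

    // Call once when the step is finished, so the pool threads can exit.
    void shutdown() {
        pool.shutdown();
    }
}
```

Failing the whole batch on the first error is only one possible policy; skipping or retrying individual items would need extra bookkeeping of your own.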

Related question: Making a item reader to return a list instead single object - Spring batch

Sabir Khan answered Nov 26 '25 21:11



