In the Spring Batch documentation on configuring a step, a diagram clearly describes how the read/process/write cycle is performed:
read
process
...
read
process
// until #amountOfReadsAndProcesses = commit interval
write
According to the documentation, this corresponds to:
List<Object> items = new ArrayList<>();
for (int i = 0; i < commitInterval; i++) {
    Object item = itemReader.read();
    Object processedItem = itemProcessor.process(item);
    items.add(processedItem);
}
itemWriter.write(items);
However, when I debug with a breakpoint in the reader's read method and another in the processor's process method, I see the following behaviour:
read
...
read
// until #amountOfReads = commit interval
process
...
process
// until #amountOfProcesses = commit interval
write
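In code form, the observed behaviour is equivalent to the following. This is a minimal, self-contained sketch with stubbed read/process/write steps just to show the call order; `chunkCallOrder` and the stubs are illustrative, not Spring Batch API:

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkOrderDemo {

    // Simulates the observed chunk loop: read the whole chunk first,
    // then process every item, then write the chunk once.
    static List<String> chunkCallOrder(int commitInterval) {
        List<String> callLog = new ArrayList<>();

        // Read phase: the chunk is filled before any processing happens.
        List<String> items = new ArrayList<>();
        for (int i = 0; i < commitInterval; i++) {
            callLog.add("read");
            items.add("item-" + i);
        }

        // Process phase: only now is each item processed.
        List<String> processed = new ArrayList<>();
        for (String item : items) {
            callLog.add("process");
            processed.add(item.toUpperCase());
        }

        // Write phase: a single write for the whole chunk.
        callLog.add("write");
        return callLog;
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", chunkCallOrder(3)));
        // prints: read,read,read,process,process,process,write
    }
}
```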
So is the documentation wrong? Or am I missing some configuration that makes it behave as documented? (I didn't find anything there.)
The problem I have is that each consecutive read now depends on a status set by the processor. The reader is a composite that reads two sources in parallel; depending on the items read from one of the sources, only the first, only the second, or both sources are read during a single read operation. But the decision about which sources to read is made in the processor. Currently the only solution is a commit interval of 1, which isn't very good for performance.
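To illustrate the kind of composite reader involved, here is a stripped-down, self-contained sketch in which the source-selection decision lives inside the reader itself rather than in the processor. The decision rule used here (advance whichever source has the smaller head, i.e. a merge) is purely a stand-in for the real status logic, and none of this is Spring Batch API:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative composite reader over two sources. On each read() it decides
// which source to advance based on the sources' current heads, so the
// decision no longer depends on the processor having run.
public class MergingCompositeReader {
    private final Deque<Integer> sourceA;
    private final Deque<Integer> sourceB;

    public MergingCompositeReader(Deque<Integer> a, Deque<Integer> b) {
        this.sourceA = a;
        this.sourceB = b;
    }

    // Mirrors the ItemReader contract: returns null when both sources are exhausted.
    public Integer read() {
        if (sourceA.isEmpty() && sourceB.isEmpty()) return null;
        if (sourceB.isEmpty()) return sourceA.poll();
        if (sourceA.isEmpty()) return sourceB.poll();
        // The decision that would otherwise be made in the processor is made
        // here, so the interleaving of reads and processes stops mattering.
        return sourceA.peek() <= sourceB.peek() ? sourceA.poll() : sourceB.poll();
    }

    public static void main(String[] args) {
        Deque<Integer> a = new ArrayDeque<>(java.util.List.of(1, 4, 5));
        Deque<Integer> b = new ArrayDeque<>(java.util.List.of(2, 3, 6));
        MergingCompositeReader reader = new MergingCompositeReader(a, b);
        StringBuilder out = new StringBuilder();
        for (Integer item = reader.read(); item != null; item = reader.read()) {
            out.append(item).append(' ');
        }
        System.out.println(out.toString().trim());
        // prints: 1 2 3 4 5 6
    }
}
```

Whether this restructuring is feasible depends on how entangled the status logic is with the rest of the processing, so treat it only as one possible direction.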
The short answer is that you are correct: our documentation isn't accurate on the chunking model, and it needs to be updated. There are reasons why it works the way it does (mainly to do with how fault tolerance is handled), but that doesn't address your issue. For your use case, there are a couple of options: