In my understanding, "chunk-oriented processing" in Spring Batch helps me to efficiently process multiple items in a single transaction. This includes efficient use of interfaces to external systems. As external communication involves overhead, it should be limited and chunk-oriented too. That's why we have the commit interval for the ItemWriter.
So what I don't get is: why does the ItemReader still have to read item by item? Why can't I read chunks, too?
In my step, the reader has to call a web service, and the writer will send this information to another web service. That's why I want to make as few calls as necessary.
The interface of the ItemWriter is chunk-oriented, as you know for sure:
public abstract void write(List<? extends T> paramList) throws Exception;
But the ItemReader is not:
public abstract T read() throws Exception;
As a workaround I implemented a ChunkBufferingItemReader, which reads a list of items, stores it, and returns items one by one whenever its read() method is called.
But when it comes to exception handling and restarting of a job, this approach gets messy. I'm getting the feeling that I'm doing work here which the framework should do for me.
So am I missing something? Is there any existing functionality in Spring Batch I just overlooked?
In another post it was suggested to change the return type of the ItemReader to a List. But then my ItemProcessor would have to emit multiple outputs from a single input. Is this the right approach?
I'm grateful for any best practices. Thanks in advance :-)
Spring Batch uses a chunk-oriented style of processing: data is read one item at a time, and "chunks" are created that are written out within a transaction boundary. Each item is read by the ItemReader, handed to the ItemProcessor, and aggregated; once the number of items read equals the commit interval (the chunk size, which defaults to 1), the entire chunk is passed to the ItemWriter in a single call. So the ItemReader reads data into the Spring Batch application from a particular source, the ItemProcessor contains the code that processes each item, and the ItemWriter writes data from the application to a particular destination. With a chunk size of 5, for example, the step reads and processes five items, then writes all five at once.
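For intuition, that loop can be sketched in plain Java without any framework dependency (all names here are hypothetical, not Spring Batch API):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ChunkLoopSketch {

    // Simulates Spring Batch's chunk-oriented loop: read one item at a
    // time, aggregate up to commitInterval items, then hand the whole
    // chunk to the writer in one call (in the real framework, this
    // happens inside one transaction).
    static List<List<String>> process(Iterator<String> reader, int commitInterval) {
        List<List<String>> written = new ArrayList<>();
        List<String> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(reader.next());                // read() one item
            if (chunk.size() == commitInterval) {
                written.add(new ArrayList<>(chunk)); // write(chunk)
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {                      // flush the final partial chunk
            written.add(new ArrayList<>(chunk));
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> items = List.of("a", "b", "c", "d", "e", "f", "g");
        System.out.println(process(items.iterator(), 3));
        // → [[a, b, c], [d, e, f], [g]]
    }
}
```

Note how the reader side is one-at-a-time while the writer side receives whole lists, which mirrors the asymmetry between the two interfaces quoted in the question.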
Here is a draft implementation of the read() interface method:

public T read() throws Exception {
    while (this.items.isEmpty()) {
        final List<T> newItems = readChunk();
        if (newItems == null) {
            // the source is exhausted: signal end of input to the framework
            return null;
        }
        this.items.addAll(newItems);
    }
    return this.items.pop();
}
Please note that items is a buffer for the items that were read in chunks but not yet requested by the framework.
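Fleshed out, the whole workaround might look like this self-contained sketch (plain Java; the chunked web-service call is replaced by a hypothetical chunkSource supplier, and restart/ExecutionContext handling is deliberately left out):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

public class ChunkBufferingReader<T> {

    // Buffer for items already fetched in a chunk but not yet handed out.
    private final Deque<T> items = new ArrayDeque<>();

    // Stand-in for the chunked web-service call; returns null when exhausted.
    private final Supplier<List<T>> chunkSource;

    public ChunkBufferingReader(Supplier<List<T>> chunkSource) {
        this.chunkSource = chunkSource;
    }

    // Same contract as ItemReader.read(): one item per call, null at the end.
    public T read() {
        while (items.isEmpty()) {
            List<T> newItems = chunkSource.get();
            if (newItems == null) {
                return null; // no more chunks: end of input
            }
            items.addAll(newItems); // empty chunks simply trigger another fetch
        }
        return items.pop(); // hand out buffered items in FIFO order
    }

    public static void main(String[] args) {
        // Two chunks of two items, then exhaustion.
        Iterator<List<String>> chunks =
                List.of(List.of("a", "b"), List.of("c", "d")).iterator();
        ChunkBufferingReader<String> reader =
                new ChunkBufferingReader<>(() -> chunks.hasNext() ? chunks.next() : null);
        for (String item = reader.read(); item != null; item = reader.read()) {
            System.out.println(item); // prints a, b, c, d on separate lines
        }
    }
}
```

For a real restartable Spring Batch reader you would additionally implement ItemStream and persist the read position in the ExecutionContext, which is exactly the part the question found messy.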