What's the relationship/difference between the Spring Batch reader 'pageSize' property and the writer 'commit-interval'?
I may be wrong, but I see a pattern in my application: every time pageSize is exceeded, I see one commit being made. Is this true?
Thanks
The Commit Interval: As mentioned previously, a step reads in and writes out items, periodically committing using the supplied PlatformTransactionManager. With a commit-interval of 1, it commits after writing each individual item.
Spring Batch collects items one at a time from the item reader into a chunk, where the chunk size is configurable. Spring Batch then sends the chunk to the item writer and goes back to using the item reader to create another chunk, and so on until the input is exhausted. This is what we call chunk processing.
With the enable annotation, you can use Spring Batch features and get a base configuration for setting up batch jobs in a Configuration class. In the above code, the chunk size is set to 5; the default chunk size is 1. So it reads, processes, and writes 5 items of the data set at a time.
In this case, there's only one step performing only one tasklet. However, that tasklet defines a reader, a writer and a processor that will act over chunks of data. Note that the commit interval indicates the amount of data to be processed in one chunk. Our job will read, process and write two lines at a time.
The commit-interval defines how many items are processed within a single chunk. That number of items is read, processed, then written within the scope of a single transaction (skip/retry semantics notwithstanding).
The page-size attribute on the paging ItemReader implementations (JdbcPagingItemReader, for example) defines how many records are fetched per read of the underlying resource. So in the JDBC example, it's how many records are requested with a single hit to the DB.
There is no direct correlation between the two attributes, but it's typically considered a good idea to make them match. Independently, they give you two knobs you can turn to tune the performance of your application.
With regard to your direct question: if you have the page-size set to the same value as the commit-interval, then yes, I'd expect a single commit for each page.
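To make the independence of the two settings concrete, here is a minimal plain-Java sketch. It is not Spring Batch code: ChunkPagingDemo, PagingReader and simulate are hypothetical names made up for illustration, there is no processor, no real transaction and no skip/retry handling, and the toy reader is assumed to know the total item count so it never issues an empty trailing query. It only counts how many page fetches and how many chunk commits happen for a given page size and commit interval.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkPagingDemo {

    /** Toy paging reader (hypothetical): loads pageSize items per "query", hands them out one at a time. */
    static class PagingReader {
        private final int totalItems;
        private final int pageSize;
        private List<Integer> page = new ArrayList<>();
        private int indexInPage = 0;
        private int nextOffset = 0;
        int fetches = 0; // number of simulated hits to the underlying resource

        PagingReader(int totalItems, int pageSize) {
            this.totalItems = totalItems;
            this.pageSize = pageSize;
        }

        /** Returns the next item, or null once the input is exhausted. */
        Integer read() {
            if (indexInPage >= page.size()) {
                if (nextOffset >= totalItems) {
                    return null; // simplification: no empty trailing fetch
                }
                // Current page is used up: fetch the next one.
                int end = Math.min(nextOffset + pageSize, totalItems);
                page = new ArrayList<>();
                for (int i = nextOffset; i < end; i++) {
                    page.add(i);
                }
                nextOffset = end;
                indexInPage = 0;
                fetches++;
            }
            return page.get(indexInPage++);
        }
    }

    /** Runs a simplified chunk loop and returns {page fetches, chunk commits}. */
    static int[] simulate(int totalItems, int pageSize, int commitInterval) {
        if (pageSize <= 0 || commitInterval <= 0) {
            throw new IllegalArgumentException("pageSize and commitInterval must be positive");
        }
        PagingReader reader = new PagingReader(totalItems, pageSize);
        int commits = 0;
        while (true) {
            // Collect up to commitInterval items, one read() at a time.
            List<Integer> chunk = new ArrayList<>();
            while (chunk.size() < commitInterval) {
                Integer item = reader.read();
                if (item == null) {
                    break;
                }
                chunk.add(item);
            }
            if (chunk.isEmpty()) {
                break;
            }
            commits++; // stands in for "write the chunk, then commit"
            if (chunk.size() < commitInterval) {
                break; // short chunk means the input ran out
            }
        }
        return new int[] {reader.fetches, commits};
    }

    public static void main(String[] args) {
        int[][] cases = {{10, 5, 5}, {10, 2, 5}, {10, 5, 2}};
        for (int[] c : cases) {
            int[] r = simulate(c[0], c[1], c[2]);
            System.out.println("items=" + c[0] + " pageSize=" + c[1] + " commitInterval=" + c[2]
                    + " -> fetches=" + r[0] + ", commits=" + r[1]);
        }
    }
}
```

With 10 items, matching values (page size 5, commit interval 5) give 2 fetches and 2 commits, which is the one-commit-per-page pattern from the question. Page size 2 with commit interval 5 gives 5 fetches but still 2 commits, and page size 5 with commit interval 2 gives 2 fetches and 5 commits, so in this sketch the commit count follows the commit interval only.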
Commit interval determines how many items will be processed in a Chunk.
Page size determines how many items will be fetched every time it is needed.
Depending on the numbers you set, the behavior may be the one you describe. Both settings are used for performance tuning.