
Spring Batch - Skip Record On Process

Tags:

spring-batch

I want to skip some records during processing.

What I have tried so far: I created a custom exception and throw it whenever I want to skip a record. This calls the skip listener's onSkipInProcess method, and it works fine.

Here is the configuration:

<batch:chunk reader="masterFileItemReader" writer="masterFileWriter" processor="itemProcessor" commit-interval="5000" skip-limit="100000">
    <batch:skippable-exception-classes>
        <batch:include class="org.springframework.batch.item.file.FlatFileParseException"/>
        <batch:include class="com.exception.SkipException"/>
    </batch:skippable-exception-classes>
    <batch:listeners>
        <batch:listener ref="recordSkipListener"/>
    </batch:listeners>
</batch:chunk>

Is there any other way to skip a record during processing?

Regards, Shankar

Shankar asked Mar 13 '15 11:03



1 Answer

There are indeed two ways to do this: the skip mechanism you describe, and returning null from the processor, which filters the item out so that it is never written. The documentation section 6.3.2, "Filtering records", explains the difference between the two approaches. Also, this blog post explains skip in detail, along with transactions in batch.

For example, if you parse a CSV file and expect 5 items per line but one line holds 6, that line is an invalid item and you can choose to skip it (by marking the reader exception as skippable and defining the condition in the policy, as in your example). However, if each line holds a name and your use case is to not write items that start with the letter N, that is better implemented by returning null (filtering the item), since the item is valid but excluded by your business rules.
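A minimal sketch of the two approaches side by side in one processor. To keep it compilable without Spring on the classpath, the ItemProcessor interface below is a local stand-in for org.springframework.batch.item.ItemProcessor, and SkipException and NameProcessor are hypothetical names chosen for illustration (the former mirrors com.exception.SkipException from the question).

```java
// Local stand-in for org.springframework.batch.item.ItemProcessor,
// declared here only so the sketch compiles without Spring Batch.
interface ItemProcessor<I, O> {
    O process(I item) throws Exception;
}

// Hypothetical counterpart of com.exception.SkipException from the question.
class SkipException extends RuntimeException {
    SkipException(String message) {
        super(message);
    }
}

// Hypothetical processor: skips invalid names, filters names starting with "N".
class NameProcessor implements ItemProcessor<String, String> {
    @Override
    public String process(String name) {
        String trimmed = (name == null) ? "" : name.trim();
        if (trimmed.isEmpty()) {
            // Invalid record: throw an exception that the step declares as skippable.
            throw new SkipException("blank name");
        }
        if (trimmed.startsWith("N")) {
            // Valid record that the business rule excludes: return null to filter it.
            return null;
        }
        return trimmed;
    }
}
```

In a real job the class would implement Spring Batch's own ItemProcessor, and the exception would have to be listed under skippable-exception-classes for the skip branch to apply.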

Please also note that if you return null, the number of filtered items is available from StepExecution.getFilterCount(), whereas with the skip approach the counts are available from getReadSkipCount(), getProcessSkipCount() and getWriteSkipCount() respectively.
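The bookkeeping difference can be illustrated with a plain-Java simulation. This is not Spring Batch code and not how the framework is implemented; SkipFilterTally and its fields are hypothetical names that only mimic how a filtered item and a skipped item end up in different counters, and how a skip limit caps only the skipped ones.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation (an assumption for illustration, not Spring Batch itself).
class SkipFilterTally {
    // Hypothetical skippable exception, standing in for com.exception.SkipException.
    static class SkipException extends RuntimeException {
        SkipException(String message) {
            super(message);
        }
    }

    int filterCount;       // plays the role of StepExecution.getFilterCount()
    int processSkipCount;  // plays the role of StepExecution.getProcessSkipCount()
    final List<String> written = new ArrayList<>();

    // Blank names are invalid (skip); names starting with "N" are filtered.
    static String process(String name) {
        String trimmed = name.trim();
        if (trimmed.isEmpty()) {
            throw new SkipException("blank name");
        }
        return trimmed.startsWith("N") ? null : trimmed;
    }

    void run(List<String> items, int skipLimit) {
        for (String item : items) {
            try {
                String result = process(item);
                if (result == null) {
                    filterCount++;        // filtered: counted, never written
                } else {
                    written.add(result);  // passed on to the "writer"
                }
            } catch (SkipException e) {
                if (processSkipCount >= skipLimit) {
                    // Only skips count against the limit; filtered items do not.
                    throw new IllegalStateException("skip limit exceeded", e);
                }
                processSkipCount++;
            }
        }
    }
}
```

Running it over the items "Shankar", "Nenad", a blank string and "Anna" with a generous limit leaves one item in each counter and two items written.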

Nenad Bozic answered Sep 19 '22 16:09