I'm trying to use Spring Batch 2.2.5 with Java config. Here is the config that I have:
@Configuration
@EnableBatchProcessing
public class JobConfiguration {

    @Autowired
    private JobBuilderFactory jobBuilder;

    @Autowired
    private StepBuilderFactory stepBuilder;

    @Bean
    @Autowired
    public Job processDocumentsJob() {
        return jobBuilder.get("processDocumentsJob")
                .start(procesingStep())
                .build();
    }

    @Bean
    @Autowired
    public Step procesingStep() {
        CompositeItemProcessor<File, DocumentPackageFileMetadata> compositeProcessor = new CompositeItemProcessor<File, DocumentPackageFileMetadata>();
        compositeProcessor.setDelegates(Lists.newArrayList(
                documentPackageFileValidationProcessor(),
                documentMetadataFileTransformer()
        ));
        return stepBuilder.get("procesingStep")
                .<File, DocumentPackageFileMetadata>chunk(1)
                .reader(documentPackageFileReader())
                .processor(compositeProcessor)
                .writer(documentMetadataFileMigrator())
                .faultTolerant()
                .skip(DocumentImportException.class)
                .skipLimit(10)
                .listener(stepExecutionListener())
                .build();
    }
    ....
}
With the config above, if the ItemWriter (the bean returned by documentMetadataFileMigrator) throws a 'DocumentImportException', the exception is not simply skipped. Instead, Spring Batch retries the same input, i.e. it runs the same item through 'documentPackageFileValidationProcessor' again.
But if I move the logic from the ItemWriter into an ItemProcessor:
@Bean
@Autowired
public Step procesingStep() {
    CompositeItemProcessor<File, DocumentPackageFileMetadata> compositeProcessor = new CompositeItemProcessor<File, DocumentPackageFileMetadata>();
    compositeProcessor.setDelegates(Lists.newArrayList(
            documentPackageFileValidationProcessor(),
            documentMetadataFileTransformer(),
            documentMetadataFileMigratorAsProcessor() // same as itemwriter, but implemented as itemprocessor
    ));
    return stepBuilder.get("procesingStep")
            .<File, DocumentPackageFileMetadata>chunk(1)
            .reader(documentPackageFileReader())
            .processor(compositeProcessor)
            .faultTolerant()
            .skip(DocumentImportException.class)
            .skipLimit(10)
            .listener(stepExecutionListener())
            .build();
}
then the exception is skipped correctly, i.e. Spring Batch does not retry the same item against 'documentPackageFileValidationProcessor'. It moves on to the next item to process (the one returned from 'documentPackageFileReader').
Is this a bug in Spring Batch, or is it behaving as expected? If it is expected, can someone point me to the relevant documentation?
Thanks guys, and apologies if this is a fundamental question.
Best regards,
Alex
To enable skipping, the step must be built with a call to faultTolerant(). After that, skip() declares the exception types that may be skipped and skipLimit() sets the maximum number of skipped items.
ItemWriter is the part of a chunk-oriented step that writes data. Spring Batch provides the ItemWriter interface, which all writers implement. Its write method receives the whole chunk of items as a list, not one item at a time.
Skipping items: define a skip-limit on your chunk element to tell Spring Batch how many items can be skipped before the job fails (you might tolerate a few invalid records, but too many of them suggests the input data itself is invalid).
In the end, this is what works for me if I want to use an ItemWriter without the same item being reprocessed:
@Bean
@Autowired
public Step procesingStep() {
    CompositeItemProcessor<DocumentPackageFileMetadata, DocumentPackageFileMetadata> compositeProcessor = new CompositeItemProcessor<DocumentPackageFileMetadata, DocumentPackageFileMetadata>();
    compositeProcessor.setDelegates(Lists.newArrayList(
            documentPackageFileValidationProcessor(),
            documentPackageFileExtractionProcessor(),
            documentMetadataFileTransformer()
    ));
    return stepBuilder.get("procesingStep")
            .<DocumentPackageFileMetadata, DocumentPackageFileMetadata>chunk(1)
            .reader(documentPackageFileReader())
            .processor(compositeProcessor)
            .writer(documentMetadataFileMigrator())
            .faultTolerant()
            .skip(DocumentImportException.class)
            .noRetry(DocumentImportException.class)
            .noRollback(DocumentImportException.class)
            .skipLimit(10)
            .listener(skipListener())
            .listener(documentPackageReadyForProcessingListener())
            .listener(stepExecutionListener())
            .build();
}
Note that I have specified 'noRetry' and 'noRollback'.
That behavior is correct. The ItemWriter receives a list of items to write. If a skippable exception is thrown, Spring Batch attempts to determine which item actually caused the exception so that only that item is skipped. To do this, the transaction is rolled back, the commit interval is changed to 1, and each item is then reprocessed and written again on its own. This allows only the item with the error to be skipped instead of needing to skip the entire chunk.
This same issue is discussed here (only using XML config): How is the skipping implemented in Spring Batch?
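To make the rollback-and-rescan behavior described in this answer more concrete, here is a minimal plain-Java sketch of it. It is only a simplified model written for illustration, not Spring Batch's actual implementation: the class SkipScanSketch, its methods, and SkippableException (standing in for DocumentImportException) are all hypothetical names, and transactions are imitated by a writer that writes either the whole list or nothing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Simplified model of skip handling on a writer failure; NOT Spring Batch's real code. */
class SkipScanSketch {

    /** Hypothetical stand-in for a skippable exception such as DocumentImportException. */
    static class SkippableException extends RuntimeException {
        SkippableException(String message) {
            super(message);
        }
    }

    /** Records what happened so the behavior can be inspected afterwards. */
    static class Result {
        final List<String> processorCalls = new ArrayList<String>();
        final List<String> written = new ArrayList<String>();
        final List<String> skipped = new ArrayList<String>();
    }

    /** Runs one chunk; the writer rejects any list containing badItem. */
    static Result runChunk(List<String> chunk, String badItem) {
        Result result = new Result();
        try {
            // Normal path: process every item, then write the whole chunk at once.
            List<String> outputs = new ArrayList<String>();
            for (String item : chunk) {
                outputs.add(process(item, result));
            }
            write(outputs, badItem, result);
        } catch (SkippableException chunkFailure) {
            // "Rollback": nothing was written. Rescan with a commit interval of 1,
            // reprocessing each item and writing it on its own.
            for (String item : chunk) {
                try {
                    String output = process(item, result);
                    write(Collections.singletonList(output), badItem, result);
                } catch (SkippableException itemFailure) {
                    result.skipped.add(item); // only the failing item is skipped
                }
            }
        }
        return result;
    }

    /** Toy processor: records the call and upper-cases the item. */
    static String process(String item, Result result) {
        result.processorCalls.add(item);
        return item.toUpperCase();
    }

    /** Toy writer: all-or-nothing, to mimic a transaction that is rolled back. */
    static void write(List<String> items, String badItem, Result result) {
        for (String item : items) {
            if (item.equalsIgnoreCase(badItem)) {
                throw new SkippableException("cannot write " + item);
            }
        }
        result.written.addAll(items);
    }

    public static void main(String[] args) {
        Result r = runChunk(Arrays.asList("a", "b", "c"), "b");
        System.out.println(r.processorCalls); // [a, b, c, a, b, c]
        System.out.println(r.written);        // [A, C]
        System.out.println(r.skipped);        // [b]
    }
}
```

In this model, a chunk of size 1 whose only item fails in the writer is processed twice before it is skipped, which matches the reprocessing observed in the question with chunk(1).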