
Strange behavior in Spring Batch with a custom skip policy implementation

Tags:

spring-batch

I have a spring batch program.

The skip limit is set to 5 and the chunk size is 1000.

I have a job with two steps as below:

    <step id="myFileGenerator" next="myReportGenerator">
        <tasklet transaction-manager="jobRepository-transactionManager">
            <chunk reader="myItemReader" processor="myItemProcessor" writer="myItemWriter"  commit-interval="1000" skip-policy="skipPolicy"/>
        </tasklet>
        <listeners>
            <listener ref="mySkipListener"/>
        </listeners>
    </step>

    <step id="myReportGenerator">
        <tasklet ref="myReportTasklet" transaction-manager="jobRepository-transactionManager"/>
    </step> 

The skip policy is as below:

    <beans:bean id="skipPolicy" class="com.myPackage.util.Skip_Policy">
        <beans:property name="skipLimit" value="5"/>
    </beans:bean>

The SkipPolicy class is as below:

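    import org.springframework.batch.core.step.skip.SkipLimitExceededException;
    import org.springframework.batch.core.step.skip.SkipPolicy;

    public class Skip_Policy implements SkipPolicy {

        // Maximum number of items that may be skipped before the step fails.
        private int skipLimit;

        public void setSkipLimit(final int skipLimit) {
            this.skipLimit = skipLimit;
        }

        // Returns true (skip the item) while fewer than skipLimit items have been skipped so far.
        public boolean shouldSkip(final Throwable t, final int skipCount) throws SkipLimitExceededException {
            return skipCount < this.skipLimit;
        }
    }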

Thus for any error occurring before the skip limit is reached, the skip policy will ignore the error (return true). The job will fail for any error after the skip limit is reached.
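To make the counting concrete, here is a minimal stand-alone sketch of the same comparison. It assumes `skipCount` is the number of items already skipped when a new failure arrives; `SkipDecisionDemo` is a hypothetical name, and the class deliberately does not depend on Spring Batch.

```java
// Stand-alone sketch: mirrors the shouldSkip comparison above without Spring Batch on the classpath.
class SkipDecisionDemo {

    // Same comparison as Skip_Policy.shouldSkip: skip while fewer than skipLimit items were skipped.
    static boolean shouldSkip(int skipCount, int skipLimit) {
        return skipCount < skipLimit;
    }

    public static void main(String[] args) {
        int skipLimit = 5;
        // skipCount is assumed to be the number of items already skipped when the next failure arrives.
        for (int skipCount = 0; skipCount <= skipLimit; skipCount++) {
            System.out.println("failure #" + (skipCount + 1) + " -> skip? " + shouldSkip(skipCount, skipLimit));
        }
    }
}
```

Under that assumption, a limit of 5 lets the first five failures through as skips, and the sixth failure is the first one for which the policy returns false.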

The mySkipListener class is as below:

    import org.springframework.batch.core.SkipListener;

    public class mySkipListener implements SkipListener<MyItem, MyItem> {

        // Called for each item skipped because the processor threw an exception.
        public void onSkipInProcess(final MyItem item, final Throwable t) {
            System.out.println("Skipped details during PROCESS is: " + t.getMessage());
        }

        // Called for each item skipped because the reader threw an exception.
        public void onSkipInRead(final Throwable t) {
            System.out.println("Skipped details during READ is: " + t.getMessage());
        }

        // Called for each item skipped because the writer threw an exception.
        public void onSkipInWrite(final MyItem item, final Throwable t) {
            System.out.println("Skipped details during WRITE is: " + t.getMessage());
        }
    }

Now in myItemProcessor I have below code block:

    if (item.getTheNumber().charAt(4) == '-') {
        item.setProductNumber(item.getTheNumber().substring(0, 3));
    } else {
        item.setProductNumber("55");
    }

For some of the items the theNumber field is null, and so the above code block throws a "StringIndexOutOfBoundsException".
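As a side note on the failing check: `charAt(4)` on a `null` reference throws a `NullPointerException`, while on a string shorter than five characters it throws a `StringIndexOutOfBoundsException`. A minimal sketch of a guard that covers both cases follows. `ProductNumberGuard` and `deriveProductNumber` are hypothetical names, and the choice to fail with a descriptive `IllegalArgumentException` (so the item can still be skipped, but with a readable message) is an assumption, not something the original processor does.

```java
// Hypothetical helper, not part of the original processor.
class ProductNumberGuard {

    // Applies the same rule as the snippet above, but checks the input first.
    static String deriveProductNumber(String theNumber) {
        // Assumption: a missing or too-short value is a data error that should still be skippable,
        // so fail with a descriptive message instead of a bare index or null-pointer error.
        if (theNumber == null || theNumber.length() < 5) {
            throw new IllegalArgumentException(
                    "theNumber is missing or shorter than 5 characters: " + theNumber);
        }
        return theNumber.charAt(4) == '-' ? theNumber.substring(0, 3) : "55";
    }

    public static void main(String[] args) {
        System.out.println(deriveProductNumber("1234-567")); // dash at index 4
        System.out.println(deriveProductNumber("12345678")); // no dash at index 4
    }
}
```

With a guard like this, the message printed by a skip listener would name the actual data problem rather than an index.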

But I am seeing strange behavior that I do not understand.

In all, there are 6 items with this error, i.e. their theNumber field is null.

If the skip limit is at least the number of errors (i.e. 6 or more), the System.out calls in the skip listener class are executed and the skipped errors are reported.

However, if the skip limit is lower (say 5, as in my example), the System.out calls in the skip listener class are not executed at all, and I directly get the exception dump below on the console:

    org.springframework.batch.retry.RetryException: Non-skippable exception in recoverer while processing; nested exception is java.lang.StringIndexOutOfBoundsException
        at org.springframework.batch.core.step.item.FaultTolerantChunkProcessor$2.recover(FaultTolerantChunkProcessor.java:282)
        at org.springframework.batch.retry.support.RetryTemplate.handleRetryExhausted(RetryTemplate.java:416)
        at org.springframework.batch.retry.support.RetryTemplate.doExecute(RetryTemplate.java:285)
        at org.springframework.batch.retry.support.RetryTemplate.execute(RetryTemplate.java:187)

What is the reason behind this behavior? What should I do to resolve it?

Thanks for reading!

Vicky asked Feb 07 '12 10:02


1 Answer

The SkipListener is only invoked at the end of the chunk, just before its transaction commits, and only if the tasklet that contains it finishes normally. When you have more errors than the skip-limit allows, that is reported via the exception you see, and the tasklet is aborted.

If the number of errors does not exceed the skip-limit, then the tasklet finishes normally and the SkipListener is invoked once for each skipped line or item. Spring Batch builds a list of them internally as it goes along but only reports them at the end.
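The deferred reporting described above can be illustrated with a simplified model. This is a sketch only and not Spring Batch's actual implementation: `DeferredSkipReportDemo`, `processChunk`, and the "shorter than 5 characters" failure rule are all hypothetical stand-ins chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the behavior described above; this is NOT Spring Batch's implementation.
class DeferredSkipReportDemo {

    // Processes one chunk and returns the skip notifications that would be reported at the end.
    // Throws if the number of failures exceeds skipLimit, in which case nothing is reported.
    static List<String> processChunk(List<String> items, int skipLimit) {
        List<String> pendingSkips = new ArrayList<>();
        for (String item : items) {
            if (item.length() < 5) { // stand-in for the processor failing on this item
                if (pendingSkips.size() >= skipLimit) {
                    // Chunk aborted: the pending notifications are discarded along with it.
                    throw new IllegalStateException("Skip limit of " + skipLimit + " exceeded");
                }
                pendingSkips.add("Skipped during PROCESS: '" + item + "'");
            }
        }
        return pendingSkips; // reported only because the chunk completed
    }

    public static void main(String[] args) {
        // Six of these eight items are "bad" (shorter than 5 characters).
        List<String> items = List.of("1234-567", "", "x", "12345678", "ab", "abc", "a", "abcd");

        // Limit of 10: the chunk completes and all six skips are reported at the end.
        processChunk(items, 10).forEach(System.out::println);

        // Limit of 5: the sixth failure aborts the chunk, so no skip is ever reported.
        try {
            processChunk(items, 5).forEach(System.out::println);
        } catch (IllegalStateException e) {
            System.out.println("Step failed: " + e.getMessage());
        }
    }
}
```

In this model the notifications exist only as a pending list until the chunk completes, which is why the run that exceeds the limit prints the failure and none of the individual skips.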

The idea of this is that if the task fails you are probably going to retry it, so knowing what got skipped during an incomplete run is not useful: every time you retry, you will get the same notification. Only if everything else succeeds do you get to see what was skipped. Imagine you are logging the skipped items; you don't want them to be logged as skipped over and over again.

As you have seen, the simple solution is to make the skip-limit large enough. Again, the idea is that if you have to skip lots of items, there is probably a more serious problem.

Paul answered Sep 24 '22 02:09