
Unterminated Double Quotes in Spring Batch

I am new to Spring Batch and I have run into a problem.

The batch application I am working on reads and processes lines from a delimited text file. I have configured the application to use a FlatFileItemReader to read the file, but some of the data contains double quotes. A FlatFileParseException is thrown when the reader encounters a line with a single, unmatched double quote, but none is thrown when two double quotes are present.
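The odd-versus-even symptom described above is consistent with a parser that treats each double quote as opening or closing a quoted section, so an odd count leaves one section open. The following plain-Java sketch only illustrates that parity idea. It is not Spring Batch code, and the class and method names are hypothetical.

```java
// Illustration only: a hypothetical helper, not part of Spring Batch.
final class QuoteParity {

    // Returns true when the line contains an odd number of quote characters,
    // i.e. at least one quote has no partner to close it.
    static boolean hasUnterminatedQuote(String line, char quote) {
        int count = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == quote) {
                count++;
            }
        }
        return count % 2 != 0;
    }

    public static void main(String[] args) {
        // One quote: the quoted section is never closed.
        System.out.println(hasUnterminatedQuote("1,5\" pipe,10", '"'));       // true
        // Two quotes: the section opens and closes again.
        System.out.println(hasUnterminatedQuote("1,\"5 inch\" pipe,10", '"')); // false
    }
}
```

Running a check like this over a sample of the input can help locate the lines that trigger the exception.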

Has anyone come across this issue, and if so, what would be the proper resolution? Manipulating the data itself is not an option unfortunately. I have tried adding an escape character before every double quote, but an exception is still thrown regardless.

Any help would be greatly appreciated.

JPM asked Dec 05 '22 19:12


2 Answers

I ran into the same problem. However, the other proposed solution, changing the quote character, is not optimal: what if no character is guaranteed to be absent from your data? We don't always control the input data, and pre-processing it is often not a good idea. After exploring the DelimitedLineTokenizer source code, I adopted the solution shared in this answer. It requires subclassing DelimitedLineTokenizer, but it removes the quote character issue entirely.

    import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;

    // Tokenizer that never treats any character as a quote character.
    public class CustomDelimitedLineTokenizer extends DelimitedLineTokenizer {

        @Override
        protected boolean isQuoteCharacter(char c) {
            return false;
        }

    }

This way, the DelimitedLineTokenizer never recognizes any character as a quote character. Of course, if you need quoted fields, this solution is not applicable. Still, I think it is better than the other proposal, which only works around the issue instead of solving it. I hope it helps someone.
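To make the effect of overriding the hook concrete, here is a toy model in plain Java. It imitates only the idea of an overridable isQuoteCharacter method. ToyTokenizer and NoQuoteToyTokenizer are hypothetical classes written for this illustration and do not reproduce Spring Batch's actual parsing or exception types.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: imitates the override hook, not Spring Batch's real tokenizer.
class ToyTokenizer {

    // Hook: subclasses decide which characters count as quotes.
    protected boolean isQuoteCharacter(char c) {
        return c == '"';
    }

    // Splits on commas, ignoring commas that appear inside a quoted section.
    List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : line.toCharArray()) {
            if (isQuoteCharacter(c)) {
                inQuotes = !inQuotes;           // a quote toggles the quoted state
            } else if (c == ',' && !inQuotes) {
                tokens.add(current.toString()); // delimiter outside quotes ends a field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (inQuotes) {
            throw new IllegalStateException("Unterminated quote in: " + line);
        }
        tokens.add(current.toString());
        return tokens;
    }

    public static void main(String[] args) {
        String line = "1,5\" pipe,10";
        // The subclass treats the quote as ordinary data.
        System.out.println(new NoQuoteToyTokenizer().tokenize(line)); // [1, 5" pipe, 10]
    }
}

// With the hook disabled, a quote is handled like any other character.
class NoQuoteToyTokenizer extends ToyTokenizer {
    @Override
    protected boolean isQuoteCharacter(char c) {
        return false;
    }
}
```

In this toy version the base class rejects the sample line because its single quote is never closed, while the subclass keeps the quote as part of the second field. The trade-off is the same as stated above: delimiters inside quoted fields are no longer protected.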

Karura91 answered Dec 25 '22 20:12


If the files contain no genuinely quoted fields (values enclosed in a pair of quote characters), you could go with the solution from the Spring forum: change the quote character of the DelimitedLineTokenizer to one that does not occur in the data.

    <property name="lineTokenizer">
        <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
            <property name="quoteCharacter" value="@" />
        </bean>
    </property>
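
This approach only works if the replacement character, "@" in the configuration above, never appears in the input. A minimal plain-Java sketch for checking that against a sample of lines follows. The helper name and the sample data are hypothetical, and it is not part of Spring Batch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spring Batch.
final class QuoteCharacterCheck {

    // Returns true when the candidate character appears in none of the lines,
    // so it cannot be mistaken for a quote inside real data.
    static boolean isSafeQuoteCharacter(List<String> lines, char candidate) {
        for (String line : lines) {
            if (line.indexOf(candidate) >= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up sample lines for illustration.
        List<String> sample = Arrays.asList("1,5\" pipe,10", "2,user@example.com,20");
        System.out.println(isSafeQuoteCharacter(sample, '@')); // false
        System.out.println(isSafeQuoteCharacter(sample, '|')); // true
    }
}
```

If no candidate passes such a check for your data, changing the quote character is not a reliable fix.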

Michael Pralow answered Dec 25 '22 18:12