I'm using Spring Batch to read CSV files. When I open these files with Notepad++, I see that the encoding used is ANSI.
Now, when reading a line from a file, I notice that the accented characters are not shown correctly. For example, let's take this line:
Données issues de la reprise des données
It is transformed into something like this, with some special characters:
So as a first solution I set the encoding of my item reader to utf-8, but the problem still exists.
I thought that with UTF-8 encoding all my accented characters would be recognized. Is that not true? From what I have heard, UTF-8 is the best encoding for handling all characters, on a web page for example.
After setting my item reader's encoding to ISO-8859-1:
public class TestItemReader extends FlatFileItemReader<TestFileRow> {

    private static final Logger log = LoggerFactory.getLogger(TestItemReader.class);

    // The constructor name must match the class name.
    public TestItemReader(String path) {
        this.setResource(new FileSystemResource(path + "/Test.csv"));
        this.setEncoding("ISO-8859-1");
    }
}
I can see that these characters are now displayed correctly.
My output uses utf-8 as its encoding. Is it correct to use ISO-8859-1 as the input encoding and utf-8 as the output encoding?

I had the same problem. The input file is ANSI, and "ü" gets displayed as a square in the output.
That's because your input file is encoded in ANSI, but by default, Spring Batch assumes ISO-8859-1 encoding (6.6.2 FlatFileItemReader).
Therefore, you have to set the encoding for your reader to "Cp1252" (setEncoding("Cp1252")) - that's how Java refers to the ANSI encoding.
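The effect of a matching or mismatched charset can be reproduced with the JDK alone, without Spring Batch. This is only a minimal sketch, assuming that "ANSI" in Notepad++ means windows-1252 (the usual case on Western European Windows installs); the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;

class EncodingMismatchDemo {

    // Decode raw bytes using the charset with the given name.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // "Données €" written with Unicode escapes so the source file's own
        // encoding does not matter; the bytes are what an ANSI editor would store.
        byte[] ansiBytes = "Donn\u00e9es \u20ac".getBytes(Charset.forName("Cp1252"));

        // Matching charset: the text comes back unchanged.
        System.out.println(decode(ansiBytes, "Cp1252"));
        // UTF-8: the single byte for "é" is not a valid UTF-8 sequence,
        // so it is replaced by U+FFFD.
        System.out.println(decode(ansiBytes, "UTF-8"));
        // ISO-8859-1: "é" survives, but the euro sign byte (0x80) becomes
        // a control character, because the two charsets differ in 0x80-0x9F.
        System.out.println(decode(ansiBytes, "ISO-8859-1"));
    }
}
```

This would also explain why ISO-8859-1 appears to work for plain French accents: it shares those byte values with windows-1252 and differs only for characters such as "€" or "œ".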
Furthermore, you will have to set your writer's encoding to "utf-8". I'm not entirely sure why it doesn't work with other encodings (that are generally able to display "ü", such as ISO-8859-1), but it works with UTF-8, so that's what I'm using.
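The reader/writer pairing described here (decode the input as Cp1252, encode the output as UTF-8) can be sketched with plain JDK file I/O. This is not Spring Batch code, just an illustration of the same idea; the class name and the command-line arguments are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class AnsiToUtf8 {

    // Read every line as windows-1252 (Java alias "Cp1252") and write it as UTF-8.
    static void transcode(Path input, Path output) throws IOException {
        Charset ansi = Charset.forName("Cp1252");
        try (BufferedReader reader = Files.newBufferedReader(input, ansi);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine(); // uses the platform line separator
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: AnsiToUtf8 <input.csv> <output.csv>");
            return;
        }
        transcode(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

In a Spring Batch job the same split would correspond to calling setEncoding("Cp1252") on the reader and setEncoding("UTF-8") on the writer.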