We are using an external service to get data in CSV format, and we write that data to the response so that the client can download it as a CSV file. Unfortunately, the data arrives in the format below:
Amount inc. VAT    Balance
Â£112.83    Â£0.0
Â£97.55    Â£0.0
Â£15.28    Â£0.0
We are unable to decode the content. Is there a way to decode Â£ and display £ in Java? Is there a StringUtils class available to decode such strings?
The file seems to be encoded in UTF-8, so you should read it as UTF-8. If you are using java.io.FileReader and company, which decode with the platform default charset, you should open a FileInputStream and wrap it in an InputStreamReader instead:
// Before: Reader in = new FileReader(file)
Reader in = new InputStreamReader(new FileInputStream(file), "UTF-8");
If you are using some other method for reading the file (an external or internal class library perhaps?), check in its documentation if it allows specifying the text encoding used to read the file.
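When the data arrives as a stream rather than a file (for example, the body returned by the external service), the same idea applies. The following is a minimal sketch under that assumption; the class and method names are made up for illustration, and the byte array only stands in for the real stream.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class Utf8StreamRead {
    // Reads the first line of the stream, decoding the bytes explicitly as UTF-8
    // instead of relying on the platform default charset.
    static String firstLine(InputStream source) {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(source, StandardCharsets.UTF_8))) {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the service response: "£97.55" encoded as UTF-8.
        byte[] body = "\u00A397.55\n".getBytes(StandardCharsets.UTF_8);
        String line = firstLine(new ByteArrayInputStream(body));
        System.out.println(line.equals("\u00A397.55")); // prints true
    }
}
```

Passing the charset explicitly keeps the result the same on every machine, whatever its default encoding is.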
Update: If you already have a String of mojibake like Â£97.55 and cannot fix the way it is read, one way of recoding is to convert the string back into bytes and reinterpret those bytes as UTF-8. This does not require any external "StringUtils" or codec library; the Java standard API is powerful enough:
String input = ...obtain from somewhere...;
String output = new String(input.getBytes(/*use platform default*/), "UTF-8");
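As a self-contained illustration of that round trip, here is a sketch that assumes the text was wrongly decoded as ISO-8859-1 (an assumption; the snippet above uses the platform default instead). The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {
    // Recovers the original bytes by encoding with the charset that was
    // wrongly used for decoding (assumed here: ISO-8859-1), then decodes
    // those bytes as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String garbled = "\u00C2\u00A397.55"; // the two characters Â£ followed by 97.55
        String fixed = repair(garbled);
        System.out.println(fixed.equals("\u00A397.55")); // prints true
    }
}
```

Naming the charset explicitly avoids depending on the machine's default encoding, which may differ between the development and production environments.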
Problem: Calling getBytes() on the string without arguments encodes it using the platform default charset. If that default is not the charset that was wrongly used to decode the data, the round trip does not recover the original bytes, and decoding them as UTF-8 may not work.
Solution: One
The StringUtils class from Apache Commons Codec helps us decode these characters while writing back to the response. This class is available in the org.apache.commons.codec.binary package.
import org.apache.commons.codec.binary.StringUtils;

String CSVContent = "/* CSV data */";
/**
* Decode the bytes using UTF-8.
*/
String decodedStr = StringUtils.newStringUtf8(CSVContent.getBytes("UTF-8"));
/**
* Convert the decoded string to a byte array to write to the stream.
* Note: the type must be the primitive byte[], not Byte[].
*/
byte[] content = StringUtils.getBytesIso8859_1(decodedStr);
Maven 2.0 dependency.
<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>1.6</version>
</dependency>
Solution: Two
As @Joni suggested, a better solution uses only the standard API:
content = CSVContent.getBytes("ISO-8859-1");
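To see why this works, here is a rough sketch that writes such a string to an output stream and checks the result. It assumes the CSV text was wrongly decoded as ISO-8859-1; the ByteArrayOutputStream is only a stand-in for the real response stream, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CsvResponseSketch {
    // Encodes the wrongly decoded text with ISO-8859-1, which yields the
    // original UTF-8 bytes again, and writes them to the given stream.
    static void writeOriginalBytes(String misdecoded, OutputStream out) {
        try {
            out.write(misdecoded.getBytes(StandardCharsets.ISO_8859_1));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream response = new ByteArrayOutputStream(); // stand-in for the response stream
        writeOriginalBytes("\u00C2\u00A315.28", response);
        byte[] expected = "\u00A315.28".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(response.toByteArray(), expected)); // prints true
    }
}
```

Because the bytes written are UTF-8, the response should also declare UTF-8 as its character encoding so that the client decodes the downloaded file correctly.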