I want to get the encoding of a stream.
1st method: use InputStreamReader.
But it always returns the OS default encoding.
InputStreamReader reader = new InputStreamReader(new FileInputStream("aa.rar"));
System.out.println(reader.getEncoding());
output:GBK
2nd method: use UniversalDetector.
But it always returns null.
FileInputStream input = new FileInputStream("aa.rar");
UniversalDetector detector = new UniversalDetector(null);
byte[] buf = new byte[4096];
int nread;
// Feed the detector until the file ends or the detector is confident
while ((nread = input.read(buf)) > 0 && !detector.isDone()) {
    detector.handleData(buf, 0, nread);
}
// Tell the detector that there is no more data
detector.dataEnd();
// Ask for the result; null means nothing was detected
String encoding = detector.getDetectedCharset();
if (encoding != null) {
    System.out.println("Detected encoding = " + encoding);
} else {
    System.out.println("No encoding detected.");
}
// Reset the detector before reusing it
detector.reset();
output:null
How can I get the right encoding? :(
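Let's summarize the situation:
InputStreamReader does not detect an encoding. When it is constructed without a charset it decodes with the platform default, and getEncoding() simply reports the charset in use, which is why you get GBK.
So one needs to know the encoding before reading. Running a charset-detecting class first, as you did, is the right approach.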
According to http://code.google.com/p/juniversalchardet/ it should handle UTF-8 and UTF-16. You might open the file in the jEdit editor to verify its encoding and see whether there is some problem with the file itself.
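To illustrate the "know the encoding before reading" idea without any external library, here is a minimal sketch that only uses the JDK. It is not how juniversalchardet works; it merely checks for a UTF-8 or UTF-16 byte order mark and otherwise tests whether the bytes decode as strict UTF-8. The class name EncodingSniffer and the method sniff are hypothetical names chosen for this example, and the sketch ignores UTF-32 and all legacy encodings such as GBK.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the JDK or juniversalchardet.
class EncodingSniffer {

    // Returns a charset name, or null when nothing can be concluded.
    static String sniff(byte[] data) {
        // A byte order mark at the start of the data is the strongest hint.
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";

        // No BOM: accept UTF-8 only if the whole buffer decodes without errors.
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return "UTF-8";
        } catch (CharacterCodingException e) {
            // Binary data or a legacy encoding: this sketch cannot tell which.
            return null;
        }
    }

    // True when data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```

If sniff returns a non-null name, that name can be passed explicitly, as in new InputStreamReader(new FileInputStream(file), name), so the reader no longer falls back to the platform default. A null result means the data is either binary or in an encoding this sketch does not cover, and a real detector or manual inspection is needed.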