I need to convert the content of an InputStream into a String. The difficulty here is the input encoding, namely Latin-1. I tried several approaches and code snippets with String, getBytes, char[], etc. in order to get the encoding straight, but nothing seemed to work.
Finally, I came up with the working solution below. However, this code seems a little verbose to me, even for Java. So the question here is:
Is there a simpler and more elegant approach to achieve what is done here?
private String convertStreamToStringLatin1(java.io.InputStream is)
        throws IOException {
    String text = "";
    // setup readers with Latin-1 (ISO 8859-1) encoding
    BufferedReader i = new BufferedReader(new InputStreamReader(is, "8859_1"));
    int numBytes;
    CharBuffer buf = CharBuffer.allocate(512);
    while ((numBytes = i.read(buf)) != -1) {
        text += String.copyValueOf(buf.array(), 0, numBytes);
        buf.clear();
    }
    return text;
}
To convert an InputStream object to a String using a Scanner, instantiate the Scanner class by passing your InputStream object as a parameter. Read each line from this Scanner using the nextLine() method and append it to a StringBuffer object. Finally, convert the StringBuffer to a String using the toString() method.
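The Scanner steps above can be sketched roughly as follows. This is only an illustration: the class name ScannerLatin1 and the method name read are made up for the example, the charset name is passed to the Scanner constructor so the bytes are decoded as Latin-1, and StringBuilder stands in for StringBuffer since no synchronization is needed here. Note that reading line by line normalizes line endings to "\n" and appends one after the last line.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Scanner;

class ScannerLatin1 { // hypothetical name, for illustration only
    // Reads the whole stream line by line, decoding the bytes as ISO-8859-1.
    static String read(InputStream in) {
        StringBuilder sb = new StringBuilder();
        // Scanner(InputStream, String) takes the name of the charset to decode with.
        try (Scanner scanner = new Scanner(in, "ISO-8859-1")) {
            while (scanner.hasNextLine()) {
                sb.append(scanner.nextLine()).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 0xE9 is e-acute in Latin-1
        byte[] latin1 = {'c', 'a', 'f', (byte) 0xE9, '\n', 'o', 'k'};
        System.out.println(read(new ByteArrayInputStream(latin1)));
    }
}
```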
A BufferedReader can't read an InputStream directly, so we need an adapter like InputStreamReader to convert the bytes to characters. For example:
// BufferedReader -> InputStreamReader -> InputStream
BufferedReader br = new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8));
byte[] utf8 = ...
byte[] latin1 = new String(utf8, "UTF-8").getBytes("ISO-8859-1");
You can exercise more control by using the lower-level Charset APIs. For example, you can raise an exception when an un-encodable character is found, or use a different character for replacement text.
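As a rough sketch of what that lower-level control can look like, the example below uses java.nio.charset.CharsetEncoder. The class and method names (StrictLatin1, encodeStrict, encodeWithReplacement) are invented for this illustration; the encoder calls themselves are standard JDK API.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictLatin1 { // hypothetical name, for illustration only
    // Encodes text as ISO-8859-1 and throws if a character cannot be encoded,
    // instead of silently substituting '?' the way String.getBytes does.
    static byte[] encodeStrict(String text) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return toArray(encoder.encode(CharBuffer.wrap(text)));
    }

    // Same conversion, but un-encodable characters become a caller-chosen byte.
    static byte[] encodeWithReplacement(String text, byte replacement)
            throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(new byte[] {replacement});
        return toArray(encoder.encode(CharBuffer.wrap(text)));
    }

    // Copies the readable part of the buffer into a fresh array.
    private static byte[] toArray(ByteBuffer buffer) {
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return bytes;
    }
}
```

Here encodeStrict("\u20ac") throws a CharacterCodingException because the euro sign has no Latin-1 encoding, while encodeWithReplacement("a\u20ac", (byte) '_') yields the two bytes for "a_".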
Firstly, a few criticisms of the approach you've taken already. You shouldn't unnecessarily use an NIO CharBuffer when you merely want a char[512]. You don't need to clear the buffer each iteration, either.
int numBytes;
final char[] buf = new char[512];
while ((numBytes = i.read(buf)) != -1) {
    text += String.copyValueOf(buf, 0, numBytes);
}
You should also know that just constructing a String with those arguments will have the same effect, as the constructor too copies the data.
The contents of the subarray are copied; subsequent modification of the character array does not affect the newly created string.
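For illustration, here is how the read loop might look with the String constructor in place of String.copyValueOf. The class name Latin1CharLoop is made up, and the switch from += to a StringBuilder is my own substitution to avoid repeated string concatenation, not something the original loop does.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class Latin1CharLoop { // hypothetical name, for illustration only
    static String read(InputStream in) throws IOException {
        Reader reader = new InputStreamReader(in, StandardCharsets.ISO_8859_1);
        StringBuilder text = new StringBuilder();
        char[] buf = new char[512];
        int numChars; // read(char[]) reports characters, not bytes
        while ((numChars = reader.read(buf)) != -1) {
            // new String(char[], int, int) copies the subarray, so reusing
            // buf on the next iteration cannot alter text already appended.
            text.append(new String(buf, 0, numChars));
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] latin1 = {'c', 'a', 'f', (byte) 0xE9}; // 0xE9 is e-acute in Latin-1
        System.out.println(read(new ByteArrayInputStream(latin1)));
    }
}
```

In practice text.append(buf, 0, numChars) would skip the intermediate String altogether; the constructor is kept here only to show the equivalence.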
You can use a dynamic ByteArrayOutputStream, which grows an internal buffer to accommodate all the data. You can then use the entire byte[] from toByteArray to decode into a String.
The advantage is that deferring decoding until the end avoids decoding fragments individually; while that may work for simple charsets like ASCII or ISO-8859-1, it will not work on multi-byte schemes like UTF-8 and UTF-16. This means it is easier to change the character encoding in the future, since the code requires no modification.
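To make the fragment problem concrete, here is a small sketch (class and method names are hypothetical) that decodes a byte array either in one step or as two independently decoded pieces. With UTF-8, a split that lands inside a multi-byte character corrupts the text; with ISO-8859-1, where every byte is one character, the split is harmless.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class FragmentDecoding { // hypothetical name, for illustration only
    // Decodes the two halves separately and concatenates the results.
    static String decodeInFragments(byte[] data, int split, Charset cs) {
        return new String(data, 0, split, cs)
                + new String(data, split, data.length - split, cs);
    }

    // Decodes all the bytes in a single step.
    static String decodeWhole(byte[] data, Charset cs) {
        return new String(data, cs);
    }

    public static void main(String[] args) {
        // "caf" plus e-acute: 5 bytes in UTF-8, since e-acute is 0xC3 0xA9.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        String whole = decodeWhole(utf8, StandardCharsets.UTF_8);
        // Splitting at index 4 cuts the two-byte sequence in half.
        String pieces = decodeInFragments(utf8, 4, StandardCharsets.UTF_8);
        System.out.println(whole.equals(pieces)); // prints "false"
    }
}
```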
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

private static final String DEFAULT_ENCODING = "ISO-8859-1";
public static final String convert(final InputStream in) throws IOException {
    return convert(in, DEFAULT_ENCODING);
}
public static final String convert(final InputStream in, final String encoding) throws IOException {
    final ByteArrayOutputStream out = new ByteArrayOutputStream();
    final byte[] buf = new byte[2048];
    int rd;
    // read() returns -1 once the end of the stream is reached
    while ((rd = in.read(buf, 0, buf.length)) >= 0) {
        out.write(buf, 0, rd);
    }
    // decode all the bytes in one step, after the whole stream has been read
    return new String(out.toByteArray(), encoding);
}
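If you can assume Java 9 or newer (an assumption on my part, since the question doesn't state a version), InputStream.readAllBytes() does the buffering for you, so the same read-everything-then-decode idea shrinks to one line. The class name ReadAllLatin1 is made up for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ReadAllLatin1 { // hypothetical name, for illustration only
    // Requires Java 9+: readAllBytes() reads until end of stream.
    static String convert(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        byte[] latin1 = {'c', 'a', 'f', (byte) 0xE9}; // 0xE9 is e-acute in Latin-1
        System.out.println(convert(new ByteArrayInputStream(latin1)));
    }
}
```

Like the ByteArrayOutputStream version, this holds the whole stream in memory, so it suits inputs of modest size.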
I don't see how it could be much simpler. I did this a little differently once. If you already have a String, you can do this:
new String(originalString.getBytes(), "ISO-8859-1");
So something like this could also work:
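// decodes with the platform default charset
BufferedReader reader = new BufferedReader(new InputStreamReader(is));
StringBuilder sb = new StringBuilder();
String line;
while ((line = reader.readLine()) != null) {
    sb.append(line).append('\n');
}
reader.close(); // also closes the underlying stream
// getBytes() re-encodes with the platform default charset, so this round trip
// is only lossless if that charset maps every input byte to a character and back
return new String(sb.toString().getBytes(), "ISO-8859-1");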
EDIT: I should add, this is really just an alternative to your already working solution. When it comes to converting Streams in Java it won't be much simpler, so go for it. :)