I need to read a string that is encoded as UTF-8. Because UTF-8 is a variable-length encoding, I am having trouble decoding the bytes into a String, and I get weird characters when I print it. The characters seem to be Korean. This is the code I used, without success:
public static String byteToUTF8(byte[] bytes) {
    try {
        return (new String(bytes, "UTF-8"));
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
    Charset UTF8_CHARSET = Charset.forName("UTF-8");
    return new String(bytes, UTF8_CHARSET);
}
I also tried UTF-16 and got slightly better results, but it still gave me strange characters, and according to the doc provided above I should use UTF-8.
Thanks in advance for helping.
EDIT:
Base64 value: S0QtOTI2IEdHMDA2AAAAAA==\n
If you check the documentation for the Bluetooth adapter's setName(), you will find the following:
https://developer.android.com/reference/android/bluetooth/BluetoothAdapter.html#setName
Valid Bluetooth names are a maximum of 248 bytes using UTF-8 encoding, although many remote devices can only display the first 40 characters, and some may be limited to just 20.
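As a quick check, the Base64 sample from the question decodes to plain ASCII text followed by four NUL (0x00) bytes, so the odd characters may simply be padding rather than a charset problem. Below is a minimal sketch, assuming the name arrives in a NUL-padded buffer (the question does not say how the bytes were obtained). The class and method names are hypothetical. java.util.Base64 needs Java 8 or Android API 26; on older Android versions android.util.Base64 would be the substitute.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class DeviceNameDecoder {

    // Decodes UTF-8 bytes and cuts the result at the first NUL byte.
    // Assumption: everything from the first 0x00 onward is padding.
    // In UTF-8 a 0x00 byte never occurs inside a multi-byte sequence,
    // so cutting there cannot split a character.
    static String decodeName(byte[] bytes) {
        int length = 0;
        while (length < bytes.length && bytes[length] != 0) {
            length++;
        }
        return new String(bytes, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample from the question, without the trailing "\n".
        byte[] raw = Base64.getDecoder().decode("S0QtOTI2IEdHMDA2AAAAAA==");
        System.out.println(raw.length);        // 16 bytes in total
        System.out.println(decodeName(raw));   // KD-926 GG006
    }
}
```

If the real data contains no NUL bytes, decodeName behaves exactly like new String(bytes, StandardCharsets.UTF_8), so the extra scan is harmless.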
If you check the link https://stackoverflow.com/a/7989085/2293534, you will find the following table of Korean encodings:
--------------------------------------------------------------------------------------------------------
|               | DEC Korean | Korean EUC | ISO-2022-KR | KSC5601/cp949 | UCS-2/UTF-16 | UCS-4 | UTF-8 |
--------------------------------------------------------------------------------------------------------
| DEC Korean    | -          | Y          | N           | Y             | Y            | Y     | Y     |
--------------------------------------------------------------------------------------------------------
| Korean EUC    | Y          | -          | Y           | N             | N            | N     | N     |
--------------------------------------------------------------------------------------------------------
| ISO-2022-KR   | N          | Y          | -           | Y             | N            | N     | N     |
--------------------------------------------------------------------------------------------------------
| KSC5601/cp949 | Y          | N          | Y           | -             | Y            | Y     | Y     |
--------------------------------------------------------------------------------------------------------
| UCS-2/UTF-16  | Y          | N          | N           | Y             | -            | Y     | Y     |
--------------------------------------------------------------------------------------------------------
| UCS-4         | Y          | N          | N           | Y             | Y            | -     | Y     |
--------------------------------------------------------------------------------------------------------
| UTF-8         | Y          | N          | N           | Y             | Y            | Y     | -     |
--------------------------------------------------------------------------------------------------------
Solution#1:
Michael has given a good example of the conversion. For details, see https://stackoverflow.com/a/40070761/2293534
When you call getBytes(), you are getting the raw bytes of the string encoded under your system's native character encoding (which may or may not be UTF-8). Then, you are treating those bytes as if they were encoded in UTF-8, which they might not be.
A more reliable approach would be to read the ko_KR-euc file into a Java String. Then, write out the Java String using UTF-8 encoding.
InputStream in = ...
Reader reader = new InputStreamReader(in, "ko_KR-euc"); // you can use specific korean locale here
StringBuilder sb = new StringBuilder();
int read;
while ((read = reader.read()) != -1) {
    sb.append((char) read);
}
reader.close();
String string = sb.toString();

OutputStream out = ...
Writer writer = new OutputStreamWriter(out, "UTF-8");
writer.write(string);
writer.close();
N.B.: You should, of course, use the correct encoding name. In Java, for example, the charset name for Korean EUC is "EUC-KR".
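The mismatch described above, and the decode-then-re-encode fix, can be reproduced in memory. This is only a sketch: the class and method names are hypothetical, and it assumes the runtime ships the "EUC-KR" charset (a standard JDK does; a trimmed-down runtime might not).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class KoreanTranscode {

    // Re-encodes bytes from one charset to another by going through a String.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from);   // decode with the source charset
        return text.getBytes(to);                // encode with the target charset
    }

    public static void main(String[] args) {
        Charset eucKr = Charset.forName("EUC-KR");
        // Two Hangul syllables, written as escapes so the source file
        // encoding does not matter.
        String original = "\uD55C\uAE00";

        byte[] eucBytes = original.getBytes(eucKr);

        // Wrong: these EUC-KR bytes are not valid UTF-8, so the result is garbled.
        String garbled = new String(eucBytes, StandardCharsets.UTF_8);
        System.out.println(garbled.equals(original));    // false

        // Right: decode as EUC-KR first, then encode as UTF-8.
        byte[] utf8Bytes = transcode(eucBytes, eucKr, StandardCharsets.UTF_8);
        String roundTrip = new String(utf8Bytes, StandardCharsets.UTF_8);
        System.out.println(roundTrip.equals(original));  // true
    }
}
```

Passing a Charset object instead of a charset name also avoids the checked UnsupportedEncodingException.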
Solution#2:
You can also do it with StringUtils: https://stackoverflow.com/a/30170431/2293534
Solution#3:
You can use Apache Commons IO for the conversion. A good example is given here: http://www.utdallas.edu/~lmorenoc/research/icse2015/commons-io-2.4/examples/toString_49.html
String resource; // name of the classpath resource to read
// getClass().getResourceAsStream(resource) -> the InputStream to read from
// "UTF-8" -> the encoding to use; null means platform default
String text = IOUtils.toString(getClass().getResourceAsStream(resource), "UTF-8");
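If adding Commons IO is not an option, roughly the same result can be had with the standard library alone. The sketch below is not the library's implementation, just an illustration of reading a stream with an explicit charset. The class and method names are hypothetical, and a ByteArrayInputStream stands in for the resource stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StreamText {

    // Reads the whole stream and decodes it with the given charset.
    // The stream is closed when reading finishes.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4096];
        try (Reader reader = new InputStreamReader(in, charset)) {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Two Hangul syllables plus ASCII text, encoded as UTF-8.
        byte[] utf8 = "\uD55C\uAE00 KD-926".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        System.out.println(text.length());   // 9 characters
    }
}
```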
I suggest you use StringUtils from the Apache Commons Codec library. I believe the methods you need are documented here:
https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/binary/StringUtils.html