Why does this JUnit test fail?
import org.junit.Assert;
import org.junit.Test;

import java.io.UnsupportedEncodingException;

public class TestBytes {

    @Test
    public void testBytes() throws UnsupportedEncodingException {
        byte[] bytes = new byte[]{0, -121, -80, 116, -62};
        String string = new String(bytes, "UTF-8");
        byte[] bytes2 = string.getBytes("UTF-8");
        System.out.print("bytes2: [");
        for (byte b : bytes2) System.out.print(b + ", ");
        System.out.print("]\n");
        Assert.assertArrayEquals(bytes, bytes2);
    }
}
I would expect the resulting byte array to equal the input array, but somehow, possibly because some UTF-8 characters take two bytes, the result differs from the input in both content and length.
Please enlighten me.
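The reason is that 0, -121, -80, 116, -62 is not a valid UTF-8 byte sequence: -121 (0x87) and -80 (0xB0) are continuation bytes with no lead byte before them, and -62 (0xC2) starts a two-byte sequence that is cut off at the end of the array. new String(bytes, "UTF-8") does not throw an exception in such situations. The Javadoc leaves the behavior for invalid input unspecified; in practice each malformed sequence is replaced with U+FFFD (the replacement character), which getBytes("UTF-8") then encodes as the three bytes -17, -65, -67. That is why the output differs from the input in both content and length. See the "Invalid byte sequences" section of http://en.wikipedia.org/wiki/UTF-8.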
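To make the failure visible instead of silent, one option is a strict CharsetDecoder that reports malformed input. The sketch below is only an illustration: the class name Utf8RoundTrip and the helpers isValidUtf8 and roundTripLatin1 are hypothetical names, not part of the original test. It also shows ISO-8859-1 as one possible charset for round-tripping arbitrary bytes through a String, assuming the bytes are opaque data rather than real UTF-8 text.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8RoundTrip {

    // Hypothetical helper: returns true only if the bytes are well-formed UTF-8.
    // CodingErrorAction.REPORT makes the decoder throw instead of substituting U+FFFD.
    public static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Hypothetical helper: ISO-8859-1 maps every byte value to exactly one char,
    // so decoding and re-encoding with it returns the original bytes.
    public static byte[] roundTripLatin1(byte[] bytes) {
        String s = new String(bytes, StandardCharsets.ISO_8859_1);
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] bytes = {0, -121, -80, 116, -62};

        // The array from the question is rejected by the strict decoder.
        System.out.println("valid UTF-8: " + isValidUtf8(bytes)); // prints "valid UTF-8: false"

        // The lenient String constructor substitutes replacement characters,
        // so the re-encoded array no longer matches the input.
        byte[] lossy = new String(bytes, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.UTF_8);
        System.out.println("lossy round trip: " + Arrays.toString(lossy));

        // A single-byte charset preserves the bytes exactly.
        System.out.println("latin1 equal: " + Arrays.equals(bytes, roundTripLatin1(bytes))); // prints "latin1 equal: true"
    }
}
```

If the goal is to carry arbitrary binary data in a String, an encoding such as Base64 is usually a safer choice than any charset. The ISO-8859-1 variant here is only meant to show why a single-byte charset round-trips where UTF-8 does not.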