I think I can use \u**** to construct a character based on UTF-16. How do I construct a string using UTF-8?
Strings in Java are encoding-agnostic (they use UTF-16 internally, but that doesn't matter here). The codes you enter after \u are Unicode code points, not the actual binary representation of characters. (Strictly speaking, each \u escape is one UTF-16 code unit, which equals the code point for every character in the Basic Multilingual Plane.) Each character has an associated code point, and each encoding defines how code points map to a particular binary representation.
That being said, you construct a string using code points and then convert it to an arbitrary encoding using the getBytes() method. For example, the Euro sign (€):
"\u20AC".getBytes("UTF-8"); //-30, -126, -84
"\u20AC".getBytes("UTF-16"); //-2, -1, 32, -84
"\u20AC".getBytes("UTF-32"); // 0, 0, 32, -84
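To go the other way and build a string from UTF-8 bytes, you can pass the bytes and the charset to the String constructor. Below is a minimal sketch; the class name Utf8Demo and the helpers toUtf8 and fromUtf8 are made up for illustration. It uses StandardCharsets instead of a charset name, which avoids the checked UnsupportedEncodingException thrown by getBytes(String).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8Demo {
    // Hypothetical helper: encode a string into UTF-8 bytes.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical helper: decode UTF-8 bytes back into a string.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Encode the Euro sign and print its signed byte values.
        byte[] euro = toUtf8("\u20AC");
        System.out.println(Arrays.toString(euro)); // [-30, -126, -84]

        // Start from raw UTF-8 bytes (E2 82 AC) and decode them.
        byte[] raw = {(byte) 0xE2, (byte) 0x82, (byte) 0xAC};
        System.out.println(fromUtf8(raw).equals("\u20AC")); // true
    }
}
```

Either way, the resulting String is the same object type; UTF-8 only matters at the boundary where bytes are read or written.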
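Worth remembering: UTF-16 does not always use 16 bits per character! Characters outside the Basic Multilingual Plane (code points above U+FFFF) are encoded as a surrogate pair, that is, two 16-bit code units.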
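A small sketch of that point, using U+1F600 as an example of a code point above U+FFFF. The class name SurrogateDemo and the helper fromCodePoint are made up for illustration; Character.toChars and String.codePointCount are standard JDK methods.

```java
import java.nio.charset.StandardCharsets;

public class SurrogateDemo {
    // Hypothetical helper: build a string from a single Unicode code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String s = fromCodePoint(0x1F600);

        // One code point, but two UTF-16 code units (a surrogate pair).
        System.out.println(s.length());                      // 2
        System.out.println(s.codePointCount(0, s.length())); // 1

        // The same string written with two \u escapes.
        System.out.println(s.equals("\uD83D\uDE00"));        // true

        // In UTF-8 this code point takes four bytes.
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length); // 4
    }
}
```

This is why a single \u escape cannot express such a character: you need either two escapes (the surrogate pair) or a conversion from the code point.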