
Default Encoding and changes

By default, Character and String use UTF-16. However, for all practical purposes, in North America and most English locales, UTF-8 is sufficient (since it can go up to 4 bytes). So, if I use an InputStreamReader(InputStream), does it give me the default UTF-16 char encoding? Using an InputStreamReader(InputStream, "UTF-8") would provide UTF-8 encoding, which would suffice for my purpose.

How can I automatically set my JVM's default encoding to UTF-8 while using an English locale? The intention is to improve performance for Character and String manipulation by using an 8-bit scheme instead of a 16-bit encoding, since most ASCII text is covered by an 8-bit encoding, while still complying with the Unicode standard.

Any comments are appreciated. Thanks!

Ashley asked Oct 10 '13 14:10



1 Answer

The in-memory data types for text in Java (char, Character, and String) are UTF-16. Absolutely. Always. Unconditionally.

The only thing you can change is how Java converts from bytes-on-the-outside to chars-on-the-inside. There is no way to change the representation to UTF-8 to trade space for time.
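As a minimal sketch of that distinction: the charset passed to InputStreamReader only controls how incoming bytes are decoded, and the resulting String holds the same UTF-16 chars either way. The one-argument InputStreamReader(InputStream) constructor uses the platform default charset (Charset.defaultCharset()), not UTF-16. The class name EncodingDemo, the helper decode, and the sample text are hypothetical, chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: decodes bytes into a String using an explicit charset.
    static String decode(byte[] bytes, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c); // each value read is one UTF-16 code unit
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo"; // "hello" with an e-acute as the second letter

        // The same text has different sizes "on the outside".
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = text.getBytes(StandardCharsets.UTF_16BE);
        System.out.println(utf8.length);  // 6
        System.out.println(utf16.length); // 10

        // Decoding with the matching charset yields identical in-memory Strings.
        String fromUtf8 = decode(utf8, StandardCharsets.UTF_8);
        String fromUtf16 = decode(utf16, StandardCharsets.UTF_16BE);
        System.out.println(fromUtf8.equals(fromUtf16)); // true
        System.out.println(fromUtf8.length());          // 5

        // What the one-argument InputStreamReader constructor would use;
        // the value depends on the platform and JVM settings.
        System.out.println(Charset.defaultCharset());
    }
}
```

Passing the charset explicitly, as in decode above, keeps the result independent of whatever the JVM default happens to be on a given machine.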

bmargulies answered Sep 22 '22 15:09