
How to define a new Charset in Java/Android?

In Taiwan we have a character encoding called "Unicode At One (UAO)", an extension of Big5 that is not supported by Java or Android.
The mapping table is at http://moztw.org/docs/big5/table/uao241-b2u.txt

My question is: how can I build a String object from byte array data using this Charset?
I guess I would have to extend the String class and do something in it, but I have no idea how to create a new Charset.

Romulus Urakagi Ts'ai asked May 11 '11 07:05



1 Answer

You can add your own Charset implementation by writing a CharsetProvider and registering it via the service discovery mechanism, that is, a provider-configuration file named META-INF/services/java.nio.charset.spi.CharsetProvider that lists your provider class.

You'll need to extend Charset and implement its newDecoder and newEncoder methods to return an appropriate CharsetDecoder and CharsetEncoder respectively.
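
A minimal sketch of what that could look like, under some loud assumptions: the class names ToyUaoCharset and ToyUaoProvider and the charset name X-TOY-UAO are made up for illustration, and the two table entries are placeholders rather than real UAO data. A real implementation would presumably fill the tables from the uao241-b2u.txt file linked in the question. Bytes below 0x80 are treated as ASCII and everything else as the lead byte of a two-byte code.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.spi.CharsetProvider;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Toy double-byte charset (hypothetical name and table, NOT the real UAO mapping). */
class ToyUaoCharset extends Charset {
    static final Map<Integer, Character> B2U = new HashMap<>();
    static final Map<Character, Integer> U2B = new HashMap<>();
    static {
        put(0x8140, '\u4E00'); // placeholder entry, assumption for the demo
        put(0x8141, '\u4E01'); // placeholder entry, assumption for the demo
    }
    private static void put(int code, char ch) { B2U.put(code, ch); U2B.put(ch, code); }

    ToyUaoCharset() { super("X-TOY-UAO", null); } // null means no aliases

    @Override public boolean contains(Charset cs) { return cs instanceof ToyUaoCharset; }

    @Override public CharsetDecoder newDecoder() {
        // At most one char is produced per input byte.
        return new CharsetDecoder(this, 1.0f, 1.0f) {
            @Override protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
                while (in.hasRemaining()) {
                    int start = in.position();
                    int b1 = in.get() & 0xFF;
                    char ch;
                    if (b1 < 0x80) {
                        ch = (char) b1; // ASCII passes through unchanged
                    } else {
                        // Need a trail byte; rewind and wait if it has not arrived yet.
                        if (!in.hasRemaining()) { in.position(start); return CoderResult.UNDERFLOW; }
                        Character mapped = B2U.get((b1 << 8) | (in.get() & 0xFF));
                        if (mapped == null) { in.position(start); return CoderResult.unmappableForLength(2); }
                        ch = mapped;
                    }
                    if (!out.hasRemaining()) { in.position(start); return CoderResult.OVERFLOW; }
                    out.put(ch);
                }
                return CoderResult.UNDERFLOW;
            }
        };
    }

    @Override public CharsetEncoder newEncoder() {
        // At most two bytes are produced per char; surrogate pairs get no special handling here.
        return new CharsetEncoder(this, 2.0f, 2.0f) {
            @Override protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out) {
                while (in.hasRemaining()) {
                    char ch = in.get(in.position()); // peek without consuming
                    if (ch < 0x80) {
                        if (!out.hasRemaining()) return CoderResult.OVERFLOW;
                        out.put((byte) ch);
                    } else {
                        Integer code = U2B.get(ch);
                        if (code == null) return CoderResult.unmappableForLength(1);
                        if (out.remaining() < 2) return CoderResult.OVERFLOW;
                        out.put((byte) (code >> 8)).put((byte) (code & 0xFF));
                    }
                    in.position(in.position() + 1); // consume the char only after writing
                }
                return CoderResult.UNDERFLOW;
            }
        };
    }
}

/**
 * Provider for the toy charset. For service discovery, declare it public in its own
 * file and list its fully qualified name in
 * META-INF/services/java.nio.charset.spi.CharsetProvider.
 */
class ToyUaoProvider extends CharsetProvider {
    private static final Charset TOY = new ToyUaoCharset();

    @Override public Charset charsetForName(String name) {
        return TOY.name().equalsIgnoreCase(name) ? TOY : null;
    }

    @Override public Iterator<Charset> charsets() {
        return Collections.singletonList(TOY).iterator();
    }
}
```

With a charset instance in hand, the byte-array case from the question becomes `new String(bytes, new ToyUaoCharset())`, and `text.getBytes(charset)` goes the other way, so no String subclass is needed (String is final anyway). Once the provider is registered, `Charset.forName("X-TOY-UAO")` should resolve it on a standard JVM; I have not verified how Android treats the provider file.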

Joachim Sauer answered Oct 13 '22 21:10