
From locale to ANSI code page to Java charset?

Is there a way to get a java.nio.charset.Charset from an ANSI code page, and the ANSI code page from a locale? For example, if I have the locale "en_US", I want the charset "cp1252", so that I can call

private final Charset CS1252 = Charset.forName("cp1252");

or, when I have the locale "ja_JP" for Japanese, I want to get the corresponding charset, like

private final Charset CS932 = Charset.forName("ms932");

How can I achieve that in Java? What I need is a method like getCharsetForLocale(java.util.Locale loc).

Christian Schiepe asked Oct 23 '25 18:10


1 Answer

You can't, and it does not make sense. Any language can be written with several different character encodings. English, for example, can be written with ASCII, ISO-8859-1, ISO-8859-15, Windows-1252, UTF-7, UTF-8, UTF-16, UTF-32 and many more, including essentially all of the Windows code pages.
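To illustrate the point, here is a minimal sketch that encodes the same English string with several of the charsets named above and checks that each one round-trips. The class name EncodingDemo and the helper roundTrips are made up for this example; charsets that a given JVM does not ship are skipped rather than assumed to exist.

```java
import java.nio.charset.Charset;

class EncodingDemo {
    // Hypothetical helper: true if the text survives encode + decode with this charset.
    static boolean roundTrips(String text, Charset cs) {
        return new String(text.getBytes(cs), cs).equals(text);
    }

    public static void main(String[] args) {
        String english = "Hello, world";
        String[] names = {
            "US-ASCII", "ISO-8859-1", "ISO-8859-15", "windows-1252",
            "UTF-8", "UTF-16", "UTF-32"
        };
        for (String name : names) {
            // Not every JVM ships every charset, so check before looking it up.
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not supported on this JVM");
                continue;
            }
            Charset cs = Charset.forName(name);
            System.out.println(name + ": " + english.getBytes(cs).length
                    + " bytes, round trip ok = " + roundTrips(english, cs));
        }
    }
}
```

Every supported charset in the list can represent the English text, which is why a locale alone cannot tell you which one was actually used.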

I am not sure what you are looking for, so let me suggest this:

  1. If you want to save data, use UTF-8 regardless of locale. Always. Yes, always. Don't worry about the space: for many languages it is efficient enough, and disk space is cheap.

  2. If you want to know which character encoding users might use, it is not valid to assume they are restricted to a single one. Instead, consider detecting the encoding, for example with the ICU Charset Detector.

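  3. If you want to know the current code page of the system, the easiest way (and it is OS independent!) is to call Charset.defaultCharset(). Note that since Java 18 this returns UTF-8 by default; the platform's native encoding is available through the native.encoding system property.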
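Points 1 and 3 could be sketched roughly as follows. The class CharsetAdvice and its methods encode, decode and systemCharset are hypothetical names for this example, not part of any library. The native.encoding system property was introduced in JDK 17, so on older JVMs the last line prints null.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class CharsetAdvice {
    // Point 1: always encode stored text as UTF-8, whatever the locale is.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same charset that was used for writing.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Point 3: the charset the JVM uses when none is specified.
    static Charset systemCharset() {
        return Charset.defaultCharset();
    }

    public static void main(String[] args) throws IOException {
        // Mixed English and Japanese text, written with Unicode escapes so the
        // source file itself does not depend on any particular encoding.
        String text = "English and \u65e5\u672c\u8a9e";
        Path tmp = Files.createTempFile("charset-demo", ".txt");
        try {
            Files.write(tmp, encode(text));
            String loaded = decode(Files.readAllBytes(tmp));
            System.out.println("round trip ok = " + loaded.equals(text));
        } finally {
            Files.deleteIfExists(tmp);
        }
        System.out.println("default charset = " + systemCharset());
        System.out.println("native.encoding = " + System.getProperty("native.encoding"));
    }
}
```

Passing the charset explicitly on both the write and the read side keeps the stored data independent of whatever machine or locale the program later runs on.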

Next time, please try to describe your problem first, what you want to achieve and what you have already tried.

Paweł Dyda answered Oct 26 '25 08:10


