
Convert from 'fr_FR' type language codes to ISO 639-2 language codes

I need to convert, in Java, strings like fr_FR, en_GB, and ja_JP (meaning French, English, and Japanese) to their ISO 639-2 representations: fre/fra, eng, jpn.

Does the fr_FR notation style comply with a particular standard? I haven't found anything on this.

And how can I convert from this notation to ISO 639-2 (3-letter) language codes?

Thanks a lot!

Update: I know about the getISO3Language() method. I also know that I could iterate over the available locales, build strings like fr_FR, and map each one to its ISO 639-2 3-letter code, so that I can look up the 3-letter code in that map whenever I need it. However, a more direct solution would suit me much better. Sorry that I didn't explain this from the beginning.

ovdsrn asked Dec 12 '22 15:12



1 Answer

The {language}_{country} notation appears in the javadoc of java.util.ResourceBundle.getBundle(String, Locale, ClassLoader), so it is a reasonable notation to use. Note, however, that language tags (IETF BCP 47) use the {language}-{country} style, with a hyphen '-' rather than an underscore '_'. The javadoc of java.util.Locale describes both in detail.
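As a rough illustration of the two notations, the sketch below converts an underscore-style code to a hyphenated language tag and back using only java.util.Locale. The class name LocaleNotation and the helper toLanguageTag are made up for this example, and the simple character replacement assumes the input is a plain {language}_{country} string.

```java
import java.util.Locale;

public class LocaleNotation {
    // Hypothetical helper: turns "fr_FR" into the hyphenated form "fr-FR".
    // Assumes a plain {language}_{country} input with no variant or script.
    static String toLanguageTag(String underscoreCode) {
        return underscoreCode.replace('_', '-');
    }

    public static void main(String[] args) {
        Locale locale = Locale.forLanguageTag(toLanguageTag("fr_FR"));

        // Locale.toString() yields the underscore notation.
        System.out.println(locale);                 // prints "fr_FR"

        // Locale.toLanguageTag() yields the hyphen notation.
        System.out.println(locale.toLanguageTag()); // prints "fr-FR"
    }
}
```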

A simple way to convert {language}_{country} to an ISO 639-2 (3-letter) code is new Locale(str.substring(0,2)).getISO3Language(), which assumes the string starts with a 2-letter language code. It seems you are looking for a different approach, though, such as the following:

import com.neovisionaries.i18n.LanguageAlpha3Code;
import com.neovisionaries.i18n.LocaleCode;

String locale = "fr_FR";

try
{
    // LanguageAlpha3Code is a Java enum that represents ISO 639-2 codes.
    LanguageAlpha3Code alpha3;

    // LocaleCode.getByCode(String) [static method] accepts a string
    // whose format is {language}, {language}_{country}, or
    // {language}-{country}, where {language} is an ISO 639-1 (2-letter)
    // code and {country} is an ISO 3166-1 alpha-2 (2-letter) code, and
    // returns a LocaleCode enum. LocaleCode.getLanguage() [instance
    // method] returns a LanguageCode enum. Finally,
    // LanguageCode.getAlpha3() returns a LanguageAlpha3Code enum.
    alpha3 = LocaleCode.getByCode(locale).getLanguage().getAlpha3();

    // French has two ISO 639-2 codes. One is the "terminology" code
    // (ISO 639-2/T) and the other is the "bibliographic" code
    // (ISO 639-2/B). The two lines below print "fra" for ISO 639-2/T
    // and "fre" for ISO 639-2/B.
    System.out.println("ISO 639-2/T: " + alpha3.getAlpha3T());
    System.out.println("ISO 639-2/B: " + alpha3.getAlpha3B());
}
catch (NullPointerException e)
{
    System.out.println("Unknown locale: " + locale);
}

The example above requires the nv-i18n internationalization package. If you are using Maven, add the dependency below to your pom.xml,

<dependency>
    <groupId>com.neovisionaries</groupId>
    <artifactId>nv-i18n</artifactId>
    <version>1.1</version>
</dependency>

or download the nv-i18n jar directly from the Maven Central Repository.

The nv-i18n source code and javadoc are hosted on GitHub.

Source: https://github.com/TakahikoKawasaki/nv-i18n
Javadoc: http://takahikokawasaki.github.com/nv-i18n/
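For comparison, here is a minimal sketch of the JDK-only conversion mentioned at the start of this answer. The class name Iso3Lookup and the helper toIso3 are made up for illustration. It splits on '_' or '-' instead of taking the first two characters, and it uses Locale.forLanguageTag rather than the Locale constructor, which is deprecated in recent JDKs. Note that getISO3Language() gives only the terminology code ("fra"), not the bibliographic one ("fre").

```java
import java.util.Locale;
import java.util.MissingResourceException;

public class Iso3Lookup {
    // Hypothetical helper: returns the 3-letter language code for inputs
    // such as "fr", "fr_FR", or "fr-FR", or null if the language is unknown.
    static String toIso3(String code) {
        // Keep only the language part before the first '_' or '-'.
        String language = code.split("[_-]", 2)[0];
        try {
            String iso3 = Locale.forLanguageTag(language).getISO3Language();
            // An empty result means the language part was not recognized.
            return iso3.isEmpty() ? null : iso3;
        } catch (MissingResourceException e) {
            // Thrown when no 3-letter code exists for a 2-letter language.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(toIso3("fr_FR")); // prints "fra"
        System.out.println(toIso3("en_GB")); // prints "eng"
        System.out.println(toIso3("ja_JP")); // prints "jpn"
    }
}
```

Returning null for unknown input is just one possible choice here; throwing an exception or returning an Optional would work equally well.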

Takahiko Kawasaki answered Dec 28 '22 12:12
