
Converting language name to locale code

Tags: php, locale

Is there a canonical way in PHP to do what this Java question asks: "Locale: Language name to Country / Language code"?

This question is the inverse of: standard function to translate iso-639 codes to language name?

Namely, how do I convert a string such as French to the code fr? The mechanism only needs to support source strings in English, and I would like to avoid maintaining my own conversion list, as was done in this answer: https://stackoverflow.com/a/20520458/760706

I am thinking of something along the lines of Locale::getLocaleCodeForDisplayLanguage("French", "en"), which doesn't exist.

Nicholas Shanks asked Jul 30 '14 15:07


3 Answers

Define a lookup function that holds the code-to-name table and searches it by name:

function getLocaleCodeForDisplayLanguage($name){
    $languageCodes = array(
    "aa" => "Afar",
    "ab" => "Abkhazian",
    "ae" => "Avestan",
    "af" => "Afrikaans",
    "ak" => "Akan",
    "am" => "Amharic",
    "an" => "Aragonese",
    "ar" => "Arabic",
    "as" => "Assamese",
    "av" => "Avaric",
    "ay" => "Aymara",
    "az" => "Azerbaijani",
    "ba" => "Bashkir",
    "be" => "Belarusian",
    "bg" => "Bulgarian",
    "bh" => "Bihari",
    "bi" => "Bislama",
    "bm" => "Bambara",
    "bn" => "Bengali",
    "bo" => "Tibetan",
    "br" => "Breton",
    "bs" => "Bosnian",
    "ca" => "Catalan",
    "ce" => "Chechen",
    "ch" => "Chamorro",
    "co" => "Corsican",
    "cr" => "Cree",
    "cs" => "Czech",
    "cu" => "Church Slavic",
    "cv" => "Chuvash",
    "cy" => "Welsh",
    "da" => "Danish",
    "de" => "German",
    "dv" => "Divehi",
    "dz" => "Dzongkha",
    "ee" => "Ewe",
    "el" => "Greek",
    "en" => "English",
    "eo" => "Esperanto",
    "es" => "Spanish",
    "et" => "Estonian",
    "eu" => "Basque",
    "fa" => "Persian",
    "ff" => "Fulah",
    "fi" => "Finnish",
    "fj" => "Fijian",
    "fo" => "Faroese",
    "fr" => "French",
    "fy" => "Western Frisian",
    "ga" => "Irish",
    "gd" => "Scottish Gaelic",
    "gl" => "Galician",
    "gn" => "Guarani",
    "gu" => "Gujarati",
    "gv" => "Manx",
    "ha" => "Hausa",
    "he" => "Hebrew",
    "hi" => "Hindi",
    "ho" => "Hiri Motu",
    "hr" => "Croatian",
    "ht" => "Haitian",
    "hu" => "Hungarian",
    "hy" => "Armenian",
    "hz" => "Herero",
    "ia" => "Interlingua (International Auxiliary Language Association)",
    "id" => "Indonesian",
    "ie" => "Interlingue",
    "ig" => "Igbo",
    "ii" => "Sichuan Yi",
    "ik" => "Inupiaq",
    "io" => "Ido",
    "is" => "Icelandic",
    "it" => "Italian",
    "iu" => "Inuktitut",
    "ja" => "Japanese",
    "jv" => "Javanese",
    "ka" => "Georgian",
    "kg" => "Kongo",
    "ki" => "Kikuyu",
    "kj" => "Kwanyama",
    "kk" => "Kazakh",
    "kl" => "Kalaallisut",
    "km" => "Khmer",
    "kn" => "Kannada",
    "ko" => "Korean",
    "kr" => "Kanuri",
    "ks" => "Kashmiri",
    "ku" => "Kurdish",
    "kv" => "Komi",
    "kw" => "Cornish",
    "ky" => "Kirghiz",
    "la" => "Latin",
    "lb" => "Luxembourgish",
    "lg" => "Ganda",
    "li" => "Limburgish",
    "ln" => "Lingala",
    "lo" => "Lao",
    "lt" => "Lithuanian",
    "lu" => "Luba-Katanga",
    "lv" => "Latvian",
    "mg" => "Malagasy",
    "mh" => "Marshallese",
    "mi" => "Maori",
    "mk" => "Macedonian",
    "ml" => "Malayalam",
    "mn" => "Mongolian",
    "mr" => "Marathi",
    "ms" => "Malay",
    "mt" => "Maltese",
    "my" => "Burmese",
    "na" => "Nauru",
    "nb" => "Norwegian Bokmal",
    "nd" => "North Ndebele",
    "ne" => "Nepali",
    "ng" => "Ndonga",
    "nl" => "Dutch",
    "nn" => "Norwegian Nynorsk",
    "no" => "Norwegian",
    "nr" => "South Ndebele",
    "nv" => "Navajo",
    "ny" => "Chichewa",
    "oc" => "Occitan",
    "oj" => "Ojibwa",
    "om" => "Oromo",
    "or" => "Oriya",
    "os" => "Ossetian",
    "pa" => "Panjabi",
    "pi" => "Pali",
    "pl" => "Polish",
    "ps" => "Pashto",
    "pt" => "Portuguese",
    "qu" => "Quechua",
    "rm" => "Raeto-Romance",
    "rn" => "Kirundi",
    "ro" => "Romanian",
    "ru" => "Russian",
    "rw" => "Kinyarwanda",
    "sa" => "Sanskrit",
    "sc" => "Sardinian",
    "sd" => "Sindhi",
    "se" => "Northern Sami",
    "sg" => "Sango",
    "si" => "Sinhala",
    "sk" => "Slovak",
    "sl" => "Slovenian",
    "sm" => "Samoan",
    "sn" => "Shona",
    "so" => "Somali",
    "sq" => "Albanian",
    "sr" => "Serbian",
    "ss" => "Swati",
    "st" => "Southern Sotho",
    "su" => "Sundanese",
    "sv" => "Swedish",
    "sw" => "Swahili",
    "ta" => "Tamil",
    "te" => "Telugu",
    "tg" => "Tajik",
    "th" => "Thai",
    "ti" => "Tigrinya",
    "tk" => "Turkmen",
    "tl" => "Tagalog",
    "tn" => "Tswana",
    "to" => "Tonga",
    "tr" => "Turkish",
    "ts" => "Tsonga",
    "tt" => "Tatar",
    "tw" => "Twi",
    "ty" => "Tahitian",
    "ug" => "Uighur",
    "uk" => "Ukrainian",
    "ur" => "Urdu",
    "uz" => "Uzbek",
    "ve" => "Venda",
    "vi" => "Vietnamese",
    "vo" => "Volapuk",
    "wa" => "Walloon",
    "wo" => "Wolof",
    "xh" => "Xhosa",
    "yi" => "Yiddish",
    "yo" => "Yoruba",
    "za" => "Zhuang",
    "zh" => "Chinese",
    "zu" => "Zulu"
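    );
    // array_search() returns the matching key (e.g. "fr"), or false if the name is not listed.
    // The comparison is case-sensitive, so "french" will not match "French".
    return array_search($name, $languageCodes);
}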

Then you just call:

echo getLocaleCodeForDisplayLanguage("French");

and it will print fr.

I found the list of the ISO 639-1 codes on this site.
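
Because array_search() is case-sensitive and returns false on a miss, a case-insensitive variant may be more forgiving of user input. The following is only a minimal sketch: the function name findLanguageCode and the three-entry $subset table are assumptions made for illustration, and a real table would hold the full ISO 639-1 list from the answer above.

```php
<?php
// Hypothetical helper: case-insensitive name => code lookup.
function findLanguageCode($name, array $codeToName)
{
    // Build a lowercase name => code map so the lookup ignores case.
    $nameToCode = array();
    foreach ($codeToName as $code => $label) {
        $nameToCode[strtolower($label)] = $code;
    }
    $key = strtolower(trim($name));
    // Return null rather than false when the name is unknown.
    return isset($nameToCode[$key]) ? $nameToCode[$key] : null;
}

// Illustrative subset only, not the full table.
$subset = array("de" => "German", "fr" => "French", "ja" => "Japanese");

echo findLanguageCode("french", $subset), "\n"; // prints "fr"
var_dump(findLanguageCode("Klingon", $subset)); // NULL
```

If the table is looked up often, the lowercase map could be built once and reused instead of being rebuilt on every call.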

idmean answered Nov 19 '22 11:11


You can do this with the PHP intl extension (available on PHP 5.3.2+).

<?php

// Use '<' rather than '<=' so that PHP 5.3.2 itself passes the check.
if (version_compare(PHP_VERSION, '5.3.2', '<')) {
    exit ('php_intl extension is available on PHP 5.3.2 or later.');
}
if (!class_exists('Locale')) {
    exit ('You need to install php_intl extension.');
}

function getLocaleByDisplayName($displayName, $localeToSearch = 'en') {
    // get all available locales
    $allLocales = ResourceBundle::getLocales('');
    //var_dump($allLocales);

    $foundLocales = [];
    foreach ($allLocales as $locale) {
        $currentName = Locale::getDisplayLanguage($locale, $localeToSearch);
        // Compare whole names; a prefix match would also accept e.g. "Norwegian" for "Norwegian Nynorsk".
        if ($currentName === $displayName) {
            $foundLocales[] = $locale;
        }
    }
    return $foundLocales;
}

$locales = getLocaleByDisplayName('Japanese', 'en');
var_dump($locales);
/*
array(2) {
[0]=>
string(2) "ja"
[1]=>
string(5) "ja_JP"
}
*/

$locales = getLocaleByDisplayName('スワヒリ語', 'ja');
var_dump($locales);
/*
array(5) {
[0]=>
string(2) "sw"
[1]=>
string(5) "sw_CD"
[2]=>
string(5) "sw_KE"
[3]=>
string(5) "sw_TZ"
[4]=>
string(5) "sw_UG"
}
*/

Several locales can share the same language name. If you need only one, you may have to pick one yourself, for example the shortest locale identifier.
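
One way to reduce the result list to a single code is to take the shortest identifier, which for the outputs shown above is the bare language code. This is only a sketch: pickShortestLocale is a hypothetical helper, not part of the intl extension, and it assumes the input is the array returned by getLocaleByDisplayName().

```php
<?php
// Hypothetical helper: pick the shortest locale identifier from a result list.
function pickShortestLocale(array $locales)
{
    if (count($locales) === 0) {
        return null; // nothing matched
    }
    usort($locales, function ($a, $b) {
        // Shorter identifiers first; break ties alphabetically for a predictable result.
        $byLength = strlen($a) - strlen($b);
        return $byLength !== 0 ? $byLength : strcmp($a, $b);
    });
    return $locales[0];
}

echo pickShortestLocale(array("sw_CD", "sw", "sw_KE", "sw_TZ", "sw_UG")), "\n"; // prints "sw"
```

Sorting by length is a heuristic; it relies on the bare language code being present in the list.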

akky answered Nov 19 '22 11:11


You can use the thephpleague/iso3166 package by The PHP League, which implements the ISO 3166-1 data. Note that ISO 3166-1 covers country codes, not language codes. For example, you can use:

$data = (new League\ISO3166\ISO3166)->name('Netherlands');
$data = (new League\ISO3166\ISO3166)->alpha2('NL');
$data = (new League\ISO3166\ISO3166)->alpha3('NLD');
$data = (new League\ISO3166\ISO3166)->numeric('528');
/*
Each of the four lookups above returns the same record:
[
    'name' => 'Netherlands',
    'alpha2' => 'NL',
    'alpha3' => 'NLD',
    'numeric' => '528',
    'currency' => [
        'EUR',
    ]
]
*/

Ali Khalili answered Nov 19 '22 09:11