Possible Duplicate:
Return the language of a given string
The task is to sort a list of strings, giving priority to a specific language. The strings can be written in different languages, such as Chinese, English, or Russian, and I need to take all the Chinese strings first, then the rest.
To do this, I want to find out which language a particular character in a string belongs to (for example, the first letter).
Are there any classes or methods for this?
If we're talking about alphabets, you can simply check the integer representation of a char by casting it:
    int unicodeValue = (int)myString[0];
Then, using a table of Unicode character ranges, you check whether the value falls within the range used by a given script.
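For example, 丐 is 19984, which is 4E10 in hexadecimal (19984.ToString("X")), placing it in the CJK Unified Ideographs block (U+4E00 to U+9FFF). That looks like the block for Chinese characters, but you need to dig around and make sure: the same block is shared with Japanese kanji and Korean hanja, so a code point alone identifies a script rather than a language.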
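As a rough illustration of the range check applied to the sorting task from the question, here is a minimal sketch. It assumes C# 9 or later (top-level statements) and that looking at only the first character is good enough; the helper names IsCjkIdeograph and StartsWithCjk and the sample words are hypothetical, and only the main CJK Unified Ideographs block is covered, not the extension blocks.

```csharp
using System;
using System.Linq;

// True when c falls in the CJK Unified Ideographs block (U+4E00..U+9FFF).
// Assumption: extension blocks and supplementary-plane characters are ignored.
static bool IsCjkIdeograph(char c) => c >= '\u4E00' && c <= '\u9FFF';

// Hypothetical helper: classify a string by its first character only.
static bool StartsWithCjk(string s) =>
    !string.IsNullOrEmpty(s) && IsCjkIdeograph(s[0]);

// Sample data, made up for this example.
string[] words = { "apple", "中文", "яблоко", "丐帮", "banana" };

// Sorting descending on the bool key puts the CJK strings first;
// ThenBy then orders each group by raw code unit values.
string[] sorted = words
    .OrderByDescending(s => StartsWithCjk(s))
    .ThenBy(s => s, StringComparer.Ordinal)
    .ToArray();

Console.WriteLine(string.Join(", ", sorted));
```

The ordinal comparer is used only to keep the example deterministic; a culture-aware comparer is probably a better fit if the strings within each group should be ordered the way a reader of that language expects.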
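Now, if we're talking about determining which language a particular word comes from, that is a different problem, because many languages share the same script. Soundex will not help here: it is a phonetic matching algorithm, not a language detector. Language identification is usually done statistically, for example by comparing character n-gram frequencies.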
Try this link: How to detect the language of a string?
The code (copied from there) is:
    // Assumes the Google AJAX Language API (google.language) is already loaded on the page.
    var text = "¿Dónde está el baño?";
    google.language.detect(text, function(result) {
        if (!result.error) {
            var language = 'unknown';
            // Map the returned language code back to its name.
            for (var l in google.language.Languages) {
                if (google.language.Languages[l] == result.language) {
                    language = l;
                    break;
                }
            }
            var container = document.getElementById("detection");
            container.innerHTML = text + " is: " + language;
        }
    });