How to group Japanese and other non-Latin names in LongListSelector?

If you switch your Windows Phone emulator or device to Japanese, Korean, or another non-Latin language and open the People app, its implementation of the LongListSelector shows the Japanese grouping characters, then a Unicode "globe with meridians" character, followed by the a-z characters:

[Screenshot: WP7 People App]

With the LongListSelector from the Windows Phone Toolkit, you have to write your own grouping logic. How do I get the list of Japanese/Korean/etc. name grouping characters, and how do I determine which grouping character a name goes under (looking at my 2nd screenshot, the grouping character is nowhere in the user's name)?

bkaid asked Feb 02 '12 18:02


2 Answers

The short answer is: you add 96 (0x60) to the Unicode value of a hiragana character (ま) to get the equivalent katakana (マ).

You can determine whether a character is hiragana by checking that its Unicode value falls in the range U+3040 to U+309F.

Unfortunately, as Noah mentions, many names are spelt using kanji: a script of tens of thousands of characters, each with one or more hiragana readings, many of which depend on the surrounding characters. If you want to support those, you'll need to look for a Japanese language library to help you.

FYI, katakana is occasionally used much like CAPITAL letters, which would explain its usage here. (Given Metro's lowercase preference, I would have thought hiragana a better fit.)

If you only want to support hiragana, here's something that should help:

const int KatakanaStartCode = 0x30A0;
const int HiraganaStartCode = 0x3040;
const int HiraganaEndCode = 0x309F;

private char GetGroupChar(string name)
{
    // Check for null/blank
    // Check for numbers, etc

    char firstChar = name[0];
    int firstCharCode = (int)firstChar;

    bool isHiragana = (HiraganaStartCode <= firstCharCode && 
        firstCharCode <= HiraganaEndCode);

    if (isHiragana)
    {
        char katakanaChar = (char)(firstCharCode + 
            (KatakanaStartCode - HiraganaStartCode));

        return katakanaChar;
    }

    return Char.ToLowerInvariant(firstChar);
}

And then:

string name = "またな たなかあ";

char s = GetGroupChar(name);

Debug.WriteLine(s); // マ
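
If you would rather have one group per kana row (ア, カ, サ, ...) than one group per character, the same offset trick can be followed by a small range lookup. This is only a sketch: the row layout is an assumption about the grouping you want, not something confirmed for the People app, and `KanaGrouping` and `GetRowGroup` are made-up names. It also narrows the hiragana test to the letters U+3041 to U+3096 so that marks at the end of the block are not shifted.

```csharp
using System;

static class KanaGrouping
{
    // First katakana code point of each row (including small kana such as ァ, ャ, ヮ).
    // ASSUMPTION: one group per row; this layout is illustrative only.
    static readonly int[] RowStarts =
    {
        0x30A1, 0x30AB, 0x30B5, 0x30BF, 0x30CA,
        0x30CF, 0x30DE, 0x30E3, 0x30E9, 0x30EE
    };

    // Label shown for each row, in the same order as RowStarts.
    const string RowLabels = "アカサタナハマヤラワ";

    // Returns the row label for a kana character, or null if the
    // character is not a kana letter covered by the table above.
    public static char? GetRowGroup(char c)
    {
        int code = c;

        // Shift hiragana letters (U+3041..U+3096) onto the katakana block.
        if (code >= 0x3041 && code <= 0x3096)
        {
            code += 0x60;
        }

        // Only ァ (U+30A1) through ン (U+30F3) are assigned to a row here.
        if (code < 0x30A1 || code > 0x30F3)
        {
            return null;
        }

        // Walk the rows from last to first and take the first row
        // whose starting code point is not above the character.
        for (int i = RowStarts.Length - 1; i >= 0; i--)
        {
            if (code >= RowStarts[i])
            {
                return RowLabels[i];
            }
        }

        return null;
    }
}
```

With this, `KanaGrouping.GetRowGroup('み')` and `KanaGrouping.GetRowGroup('マ')` both land in the マ group, while a Latin letter returns null and can fall through to the existing a-z handling.
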
Richard Szalay answered Sep 22 '22 12:09


I don't know anything about the Windows Phone Toolkit in particular, but basically, it works like this: most Japanese names have a kanji form, which is how the name is normally written and what is displayed. Since the kanji form may have an ambiguous pronunciation, there are fields for the pronunciation as well. You use the pronunciation field to group the names, and you can put any names without data in the pronunciation field into a separate "other" group.

For example: kanji: 山本次郎, katakana: ヤマモトジロウ

Then your "grouping chars" are just a list (or partial list) of Katakana or Hiragana, and this guy would fall under "や" or "ヤ".
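
A rough sketch of that idea in C#, assuming a contact type with a display name and a pronunciation field. The `Contact` class, its property names, and the `'#'` key for the "other" group are all hypothetical, made up for illustration rather than taken from the toolkit:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical contact shape; the property names are illustrative only.
class Contact
{
    public string DisplayName { get; set; }   // e.g. the kanji form
    public string Pronunciation { get; set; } // e.g. the kana reading; may be null or empty
}

static class PronunciationGrouping
{
    // Hypothetical key for names that have no pronunciation data.
    const char OtherGroup = '#';

    // The group key is the first character of the reading, not of the display name.
    public static char GetGroupKey(Contact contact)
    {
        return string.IsNullOrEmpty(contact.Pronunciation)
            ? OtherGroup
            : contact.Pronunciation[0];
    }

    // Groups contacts by key; the groups come back in order of first appearance.
    public static List<IGrouping<char, Contact>> GroupByReading(IEnumerable<Contact> contacts)
    {
        return contacts.GroupBy(c => GetGroupKey(c)).ToList();
    }
}
```

For the example above, a contact with `DisplayName = "山本次郎"` and `Pronunciation = "ヤマモトジロウ"` gets the key ヤ, while a contact with an empty pronunciation ends up in the `'#'` group. Ordering the groups for display is left out of this sketch.
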

Noah answered Sep 20 '22 12:09
