Hiragana to Kanji converter

Do you know of a C# library or a dictionary that could help me convert Hiragana to Kanji? I know about the Windows IME, but I would like to fully customize the design of the Kanji candidate list for a given Hiragana string, and that is not possible with this IME.

Example: the user types "toru", which is first converted to Hiragana: "とる". I would then like to get this list of choices:

撮る 取る 盗る

Thanks!

Rodrigue Rens asked Apr 10 '12 13:04



3 Answers

Unfortunately, I do not know of a C# library. Everything I found involves importing native libraries, as in this SO thread: Japanese to Romaji with Kakasi

If you are willing to do so, perhaps JWPce might help.

Although it is implemented as a Japanese text editor, it also contains a dictionary function (it actually contains a multitude of character lookup systems) that does what you want.

Possibly you could compile the project and then import that lookup functionality. JWPce is licensed under the GPL, and both a binary executable and the source code can be downloaded directly from the homepage.

[Edit]

Researching some more, I stumbled upon Mozc on Google Code:

Mozc is a Japanese Input Method Editor (IME) designed for multi-platform such as Chromium OS, Windows, Mac and Linux. This open-source project originates from Google Japanese Input.

(BSD license)

I have not looked into it myself yet, but it might be closer to what you are looking for, as it does not have a full application "around it" and is instead intended to be used as a library, just like you wanted.

They also link to a short video showing what the input looks like: http://www.google.co.jp/ime/

Unfortunately, this is still C++, not .NET, but it might be a starting point.

Jens H answered Sep 20 '22 04:09


Microsoft publishes this functionality as a separate product, called the Visual Studio International Pack:

http://visualstudiogallery.msdn.microsoft.com/74609641-70BD-4A18-8550-97441850A7A8

Lex Li answered Sep 17 '22 04:09


I do not know of a C# library either. But given that a dictionary might be sufficient, you may want to look into using the IME dictionary that comes with Anthy.

If you download the sources of the most recent version, you'll find dictionary sources in the mkworddic and alt-cannadic directories. Look at the various files ending in .t.

Note that they are encoded in EUC-JP; you might want to convert them to UTF-8.
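To illustrate the idea, here is a rough Python sketch (not C#, but it should port easily) that re-encodes such a file from EUC-JP to UTF-8 and builds a reading-to-candidates lookup. The line layout it expects (a reading, followed by fields where anything starting with `#` is a tag and everything else is a candidate word) is my assumption and should be checked against the actual files. The function names and the sample lines, including their tags, are made up for the example.

```python
from collections import defaultdict
from pathlib import Path


def convert_to_utf8(src: Path, dst: Path) -> None:
    """Re-encode an EUC-JP text file as UTF-8."""
    text = src.read_text(encoding="euc_jp")
    dst.write_text(text, encoding="utf-8")


def build_index(lines):
    """Map each reading to its candidate words, in file order.

    ASSUMED line shape: the first field is the reading; of the remaining
    whitespace-separated fields, those starting with '#' are tags and the
    others are candidate words. Verify this against the real dictionary.
    """
    index = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # skip blank, comment, or malformed lines
        reading, rest = fields[0], fields[1:]
        for field in rest:
            if not field.startswith("#") and field not in index[reading]:
                index[reading].append(field)
    return dict(index)


# Hypothetical sample entries; the '#...' tags are placeholders.
sample = [
    "とる #R5*500 取る",
    "とる #R5*400 撮る",
    "とる #R5*300 盗る",
]
candidates = build_index(sample)["とる"]
print(candidates)
```

With a real file you would call `convert_to_utf8` once, then pass the lines of the UTF-8 copy to `build_index` and render the returned list in your own candidate window.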

jogojapan answered Sep 21 '22 04:09