Converting Chinese to pinyin [closed]

Tags:

parsing

cjk

I've found places on the web such as http://www.chinesetopinyin.com that convert Chinese characters to pinyin (romanization).

Does anyone know how to do this, or have a database that can be parsed?


EDIT: I'm using C# but would actually prefer a database/flatfile.

asked Aug 26 '10 01:08 by Mass



1 Answer

A possible solution using Python:

As far as I know, the Unicode Han database (Unihan) contains pinyin romanizations for Chinese characters, but these are not included in the data exposed by Python's unicodedata module.
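Since the question asks for a database or flat file, here is a minimal sketch of reading pinyin from Unihan-style data. It assumes the tab-separated layout used by Unihan_Readings.txt (code point, field name, value) and the kMandarin field. The helper names load_readings and to_pinyin are hypothetical, and the embedded sample lines are illustrative only, so check the real file for authoritative readings.

```python
import io

# Illustrative sample in the tab-separated layout of Unihan_Readings.txt:
# code point, field name, value. These lines are assumptions for the demo;
# use the real file for authoritative data.
SAMPLE = (
    "# comment lines start with '#'\n"
    "U+4F60\tkMandarin\tnǐ\n"
    "U+597D\tkMandarin\thǎo\n"
    "U+597D\tkDefinition\tgood, excellent, fine\n"
)


def load_readings(lines, field="kMandarin"):
    """Build a dict mapping each character to its list of readings."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 3 or parts[1] != field:
            continue  # skip other fields and malformed lines
        codepoint, _, value = parts
        char = chr(int(codepoint[2:], 16))  # "U+597D" -> "好"
        table[char] = value.split()  # a field may list several readings
    return table


def to_pinyin(text, table):
    """Replace each known character with its first listed reading."""
    return " ".join(table[ch][0] if ch in table else ch for ch in text)


readings = load_readings(io.StringIO(SAMPLE))
print(to_pinyin("你好", readings))
```

To use real data, pass an open file object for Unihan_Readings.txt (opened with encoding="utf-8") instead of the io.StringIO sample. Note that this picks the first listed reading per character and does not handle characters whose pronunciation depends on context.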

However, you can use an external library such as cjklib. For example:

# coding: UTF-8
from cjklib.characterlookup import CharacterLookup

c = u'好'

# 'T' selects the Traditional Chinese locale
cjk = CharacterLookup('T')
# returns every pinyin reading known for the character
readings = cjk.getReadingForCharacter(c, 'Pinyin')
for r in readings:
    print(r)

Output:

hāo
hǎo
hào

UPDATE

cjklib comes with a standalone cjknife utility, which might help. Some usage is described here

answered Oct 03 '22 06:10 by mykhal