For sorting Chinese language text, I want to convert Chinese characters to Pinyin, properly separating each Chinese character and grouping successive characters together.
Can you please help me in this task by providing the logic or source code for doing this?
Please let me know if any open source or lib already present for this.
In the Google Play store, search for "Google Pinyin App". Install the app. Open the app and tap to "Enable" the input method. In the "Language & input" menu, enable Google Pinyin Input with the slider.
Short answer: you don't.
Long answer: There is no one-to-one mapping for 汉字 to 汉语拼音. Just some quick examples:
And so on. I have a beginners' book on this topic that has 207 examples. I stress that this is a beginners' book and is by no means complete. Each one has a page or two of examples of use and conditions under which you choose the appropriate pronunciation. It is not something that could be easily programmed (if at all).
And this doesn't even address the other slippery thing you want to deal with: the separation of characters into grouped words. The very notion of a word is a bit slippery in Chinese. (There's two terms that correspond, roughly to "word" in Chinese for example: 字 and 词. The first is the character, the second groups of characters that are put together into one concept. (I frequently get asked by Chinese speakers how many "words" I can read when they really mean "characters".) While in some cases the distinction is clear (the 词 "乌鸦", for example, is "crow" -- the two 字 must be together to express the idea properly and it would be incorrect to translate it as "black crow"), in others it is not so clear. What does "你好" translate to? Is it one word meaning, idiomatically, "hello"? Or is it two words translating literally to "you good"? Each of the characters involved stands alone or in groups with other words, but together they mean something entirely different from their individual meanings. Given this, how, precisely, do you plan to group the 汉语拼音 transliterations (which are difficult to impossible to get right in the first place!) into "words"?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With