I have a mobile application that needs to be ported for a Japanese audience. Part of the application is a custom font file that needs to be extended from containing only Latin-1 characters to also containing Japanese characters. I realise that this will make it rather large, but that is not today's problem.
Note that I have no control over the text to be displayed by this application, so it needs to be able to support enough to be able to display user-generated content.
Here is what I believe to be a maximal set of Unicode ranges that would cover anything required of it.
Compatibility U+3300 - U+33FF
Compatibility forms U+FE30 - U+FE4F
Compatibility ideographs U+F900 - U+FAFF
Compatibility ideographs supplement U+2F800 - U+2FA1F
Radicals supplement U+2E80 - U+2EFF
Strokes U+31C0 - U+31EF
Symbols and punctuation U+3000 - U+303F
Unified Ideographs U+4E00 - U+9FBB
Unified Ideographs ext. A U+3400 - U+4DB5
Unified Ideographs ext. B U+20000 - U+2A6D6
Enclosed letters and months U+3200 - U+32FF
Hiragana U+3040 - U+309F
Kanbun U+3190 - U+319F
Katakana U+30A0 - U+30FF
Katakana phonetic U+31F0 - U+31FF
What I need to know is:
Enclosed Alphanumerics U+2460 - U+2473
Enclosed Alphanumerics U+2474 - U+24E9*
Enclosed Alphanumerics U+24EA - U+24FF
Miscellaneous Symbols U+2600 - U+2607
Miscellaneous Symbols U+2618 - U+2618
Miscellaneous Symbols U+260E - U+260F
Miscellaneous Symbols U+2614 - U+2615
Miscellaneous Symbols U+263D - U+2653
Miscellaneous Symbols U+2660 - U+266F
Symbols and punctuation U+3000 - U+303F
Hiragana U+3040 - U+309F
Katakana U+30A0 - U+30FF
Katakana phonetic U+31F0 - U+31FF
Enclosed letters and months U+321F - U+325F*
Enclosed letters and months U+3280 - U+32FF*
Unified Ideographs ext. A U+3400 - U+4DB5
Unified Ideographs U+4E00 - U+9FBB
Compatibility ideographs U+F900 - U+FAFF
Compatibility forms U+FE30 - U+FE4F
Full-Width Roman U+FF00 - U+FF5E
Half-Width Katakana U+FF61 - U+FF9F
Full- and Half-Width Symbols U+FFE0 - U+FFEE
Unified Ideographs ext. B U+20000 - U+2A6D6
Compatibility ideographs supplement U+2F800 - U+2FA1F
* = Lower priority
Don't forget the full-width Roman characters (FF00-FF5E), which are often used for the Roman alphabet in Japanese, and the half-width Katakana (FF61-FF9F). You will probably also need the full- and half-width symbols (FFE0-FFEE).
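To see why these need their own glyphs, here is a minimal sketch in Python using the standard `unicodedata` module. It shows that full-width Roman and half-width Katakana are separate code points from their ordinary counterparts, so a font that only has Latin-1 and the main Katakana block would not cover them. The helper name `width_class` is just for illustration.

```python
import unicodedata

def width_class(ch: str) -> str:
    """Return the East Asian Width property of a single character.

    'Na' = narrow, 'F' = full-width, 'W' = wide, 'H' = half-width.
    """
    return unicodedata.east_asian_width(ch)

# Ordinary A, full-width A, ordinary Katakana A, half-width Katakana A.
for ch in ["A", "\uff21", "\u30a2", "\uff71"]:
    print(f"U+{ord(ch):04X} {ch} {width_class(ch)}")
```

Each of the four characters has a different code point, so each needs its own entry in the font's character map.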
An argument can be made that the Kanbun annotation page (3190-319F) will generally not be used. Kanbun is an old style of Japanese which uses all Chinese characters (no Hiragana or Katakana) with a different set of grammar rules, generally taught at school. These annotation marks will not be used unless someone is trying to explain how to read or understand one of these passages, which is unlikely. It could be included for completeness, but it is not a high priority.
CJK Compatibility (3300-33FF) is generally used by newspapers in print media, but will almost certainly not be used by the average public (I have yet to see one on a website). In either event, they have equivalent long forms (e.g. ㌘ can be written as グラム instead), so this is also in the non-essential category.
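If you do leave this block out of the font, one possible fallback is to expand those characters to their long forms before rendering. The sketch below is only an illustration under that assumption: it uses Unicode NFKC normalization from Python's standard `unicodedata` module, and restricts it to U+3300-U+33FF so that other text (for example full-width Roman) is left alone. The function name `expand_squared` is made up for this example.

```python
import unicodedata

def expand_squared(text: str) -> str:
    """Expand CJK Compatibility characters (U+3300-U+33FF) to their
    long forms via NFKC, leaving all other characters unchanged."""
    return "".join(
        unicodedata.normalize("NFKC", ch) if 0x3300 <= ord(ch) <= 0x33FF else ch
        for ch in text
    )

# U+3318 is the squared "guramu" character mentioned above.
print(expand_squared("5\u3318"))
```

Whether this substitution is acceptable depends on the application, since it changes the text that the user typed.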
CJK Radicals Supplement (2E80-2EFF) is also non-essential, but could be used. These are not complete characters, but the "radical" (base part) of characters. They could be used to explain the derivation of a character, but are unlikely to appear in normal use of the language.
CJK Strokes (31C0-31E3) is in the same category as the CJK Radicals Supplement, and probably has an even lower likelihood of being used in everyday writing.
The first part of Enclosed CJK Letters and Months (3200-321E) is unnecessary. Those are Korean symbols. The same goes for 3260-327F. The rest of the page has a low usage rate, but I would include it for completeness because someone will probably try to use one occasionally. You can consider them lower priority.
The rest of the ranges you have called out in your original list are essential.
Also missing from the list is Enclosed Alphanumerics (2460-24FF). The circled numbers (2460-2473 and 24EA-24FF) are used relatively frequently. The circled alphabet, parenthesized numbers, and numbers followed by a period (2474-24E9) could be omitted as non-essential, however.
Also, you would do well to include Miscellaneous Symbols (2600-26FF), although some are used more often than others. Absolutely essential ones include some of the weather symbols (2600-2607), the telephones (260E-260F), umbrella and hot drink (2614-2615), shamrock (2618), astrological and zodiac symbols (263D-2653), and playing cards, hot springs, and musical symbols (2660-266F).
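As a rough way to sanity-check a chosen subset, the ranges marked essential in this thread can be written down as data and tested against sample text. This is a minimal sketch in Python, not a font tool: the names `ESSENTIAL_RANGES`, `is_covered`, and `uncovered` are made up for illustration, and Latin-1 is included on the assumption that the existing font already has it.

```python
# Inclusive (first, last) code point pairs, copied from the ranges in this thread.
ESSENTIAL_RANGES = [
    (0x0000, 0x00FF),    # Latin-1 (assumed to be in the existing font)
    (0x2460, 0x2473),    # Enclosed Alphanumerics: circled numbers
    (0x24EA, 0x24FF),    # Enclosed Alphanumerics: more circled numbers
    (0x2600, 0x2607),    # Miscellaneous Symbols: weather
    (0x260E, 0x260F),    # telephones
    (0x2614, 0x2615),    # umbrella and hot drink
    (0x2618, 0x2618),    # shamrock
    (0x263D, 0x2653),    # astrological and zodiac symbols
    (0x2660, 0x266F),    # playing cards, hot springs, music
    (0x3000, 0x303F),    # CJK symbols and punctuation
    (0x3040, 0x309F),    # Hiragana
    (0x30A0, 0x30FF),    # Katakana
    (0x31F0, 0x31FF),    # Katakana phonetic extensions
    (0x3400, 0x4DB5),    # Unified Ideographs ext. A
    (0x4E00, 0x9FBB),    # Unified Ideographs
    (0xF900, 0xFAFF),    # Compatibility ideographs
    (0xFE30, 0xFE4F),    # Compatibility forms
    (0xFF00, 0xFF5E),    # Full-width Roman
    (0xFF61, 0xFF9F),    # Half-width Katakana
    (0xFFE0, 0xFFEE),    # Full- and half-width symbols
    (0x20000, 0x2A6D6),  # Unified Ideographs ext. B
    (0x2F800, 0x2FA1F),  # Compatibility ideographs supplement
]

def is_covered(ch, ranges=ESSENTIAL_RANGES):
    """Return True if the character's code point falls in any range."""
    cp = ord(ch)
    return any(first <= cp <= last for first, last in ranges)

def uncovered(text, ranges=ESSENTIAL_RANGES):
    """Return the characters in text that no range covers, in order."""
    return [ch for ch in text if not is_covered(ch, ranges)]

# Total number of code points the font would need to address.
total = sum(last - first + 1 for first, last in ESSENTIAL_RANGES)
print(total)
print(uncovered("Hello, \u4e16\u754c \u2460 \u3318"))
```

Running `uncovered` over a sample of real user-generated content would show which of the lower-priority ranges actually turn up in practice.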