Different representation of Unicode code points in Japanese and Chinese

I am trying to display the glyph corresponding to Unicode code point U+95E8. This code point belongs to the CJK (Chinese, Japanese, Korean) Unified Ideographs block.

I am struggling to find out whether the glyph for this particular code point can differ between Japanese and Chinese.

When I display U+95E8 in a JTextArea, I see the "门" character on Linux/Windows. But when I display the same code point on my embedded device, the displayed character changes to:

[image: japanese_glyph]

I want to know whether U+95E8 should have a uniform representation in all CJK (Chinese, Japanese, Korean) locales, or whether it differs for some of them. Could this be caused by different fonts being installed on the different devices? I am sorry for my ignorance, but I am not very familiar with internationalization.

import java.awt.*;
import java.awt.event.*;
import java.util.Locale;

import javax.swing.*;

public class TextDemo extends JPanel implements ActionListener {

    public TextDemo() {
    }

    public void actionPerformed(ActionEvent evt) {
    }

    /**
     * Create the GUI and show it.  For thread safety,
     * this method should be invoked from the
     * event dispatch thread.
     * @throws InterruptedException 
     */
    private static void createAndShowGUI() throws InterruptedException {

        JFrame frame = new JFrame(java.util.Locale.getDefault().getDisplayName());

        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);

        Container contentPane = frame.getContentPane();
        contentPane.setLayout(new SpringLayout());

        Dimension size = new Dimension(500, 500);
        frame.setSize(size);
        JTextArea textArea = new JTextArea();

        //Font font1 = new Font("SansSerif", Font.BOLD, 20);
        //textArea.setFont(font1);

        textArea.setEditable(true);
        textArea.setSize(new Dimension(400,400));
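        // Note: setDefaultLocale is a static JComponent method; it only sets the
        // locale for components created after this call, not for textArea itself.
        textArea.setDefaultLocale(java.util.Locale.SIMPLIFIED_CHINESE);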

        textArea.setText("Printing U+95E8 : \u95e8");                
        contentPane.add(textArea);        
        frame.setVisible(true);
    }

    public static void main (String[] args) {
        java.util.Locale.setDefault(java.util.Locale.JAPANESE);
        javax.swing.SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                try {
                    createAndShowGUI();
                } catch (InterruptedException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
            }
        });
    }
}

asked Jul 22 '14 18:07

Yogesh

2 Answers

Generally, CJK characters in Unicode are “unified”, which means that a single code point is used even though the character has traditionally been somewhat different for the different languages. In theory, a single font can contain multiple glyphs for a code point, with some selection mechanism. In practice, a font that contains CJK characters typically has a single design for them, reflecting the design of Traditional Chinese, Simplified Chinese, Japanese, or Korean. In this sense, some fonts might be called “Traditional Chinese”, “Japanese”, etc.
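
To see what unification means in practice, here is a minimal sketch that queries the Unicode properties Java exposes for a code point. The class name HanUnificationCheck and the method describe are made up for illustration. These properties carry no language information, so the same result comes back under any default locale; any difference on screen has to come from the font that is picked.

```java
class HanUnificationCheck {

    /** Describes a code point using only locale-independent Unicode properties. */
    static String describe(int codePoint) {
        return String.format("U+%04X block=%s script=%s",
                codePoint,
                Character.UnicodeBlock.of(codePoint),
                Character.UnicodeScript.of(codePoint));
    }

    public static void main(String[] args) {
        // The same string literal the question uses.
        System.out.println(describe("\u95e8".codePointAt(0)));
    }
}
```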

Obviously, you should select the font according to the language of the text.

The glyph in the image in the question looks somewhat odd, and it deviates from the glyphs for U+95E8 in some common fonts, which generally show rather similar designs for this character. So for this specific character, the variation can be expected to be only in the general style (e.g., serif vs. sans-serif, stroke width). It seems that the font being used is somehow oddly designed, at least for this character.

answered Oct 06 '22 17:10

Jukka K. Korpela

Adding to Jukka's answer:

Here is some more info on the "Han unification": http://en.wikipedia.org/wiki/Han_unification

There are two main ways one can render the glyph desired:

1. Use a locale-specific font (that is, different fonts for Traditional Chinese, Simplified Chinese, Japanese, and Korean). The designers of such fonts take care to do the right thing. This is Jukka's answer. As an example, take a look at the Noto family of fonts (http://www.google.com/get/noto/cjk.html). Download the "Language specific fonts in OTF" files:
   - The Simplified Chinese font is NotoSansHans-Regular.otf
   - The Traditional Chinese font is NotoSansHant-Regular.otf
   - The Japanese font is NotoSansJP-Regular.otf
   - The Korean font is NotoSansKR-Regular.otf
2. Use a generic CJK font with multiple locale-specific glyphs. As an example, you can again use the CJK Noto fonts, this time the "Multilingual fonts in OTF" option. See "Script Table and Language System Record" in http://www.microsoft.com/typography/otspec/chapter2.htm. For this to work, though, the font has to contain that information, the text rendering engine has to understand how to deal with the language setting, and the API has to expose it.
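
For the first approach, a rough sketch of picking and loading one of the language-specific files could look like the code below. The class NotoFontChooser, its method names, and the idea of a fontDir directory holding the downloaded files are assumptions made for illustration; only the four file names come from the list above. The mapping from locale to file is a simplification (for example, it treats zh-TW, zh-HK and zh-MO as Traditional Chinese), and whether Font.createFont accepts these particular OTF files depends on the Java runtime on the device, so treat it as a starting point.

```java
import java.awt.Font;
import java.awt.FontFormatException;
import java.io.File;
import java.io.IOException;
import java.util.Locale;

class NotoFontChooser {

    /** Maps a locale to one of the language-specific Noto file names (simplified rule). */
    static String fontFileFor(Locale locale) {
        String language = locale.getLanguage();
        if (language.equals("ja")) {
            return "NotoSansJP-Regular.otf";
        }
        if (language.equals("ko")) {
            return "NotoSansKR-Regular.otf";
        }
        if (language.equals("zh")) {
            String script = locale.getScript();
            String country = locale.getCountry();
            // An explicit script wins; otherwise guess from the region.
            boolean traditional = script.equals("Hant")
                    || (script.isEmpty() && (country.equals("TW")
                            || country.equals("HK") || country.equals("MO")));
            return traditional ? "NotoSansHant-Regular.otf" : "NotoSansHans-Regular.otf";
        }
        throw new IllegalArgumentException("No CJK font mapped for " + locale);
    }

    /** Loads the font file for the locale from fontDir (a hypothetical directory). */
    static Font loadFont(File fontDir, Locale locale, float size)
            throws IOException, FontFormatException {
        Font base = Font.createFont(Font.TRUETYPE_FONT, new File(fontDir, fontFileFor(locale)));
        return base.deriveFont(size);
    }
}
```

With a JTextArea like the one in the question, usage would be along the lines of textArea.setFont(NotoFontChooser.loadFont(new File("fonts"), Locale.SIMPLIFIED_CHINESE, 20f)), where "fonts" is again a placeholder path.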

Now, all of this is very low level. When you use something like JTextArea, you have no control over it; you get whatever the implementers of JTextArea decided to do.

You can call setDefaultLocale on your component, and that might help; it is recommended regardless. But if you want to be sure of what is going on, take control and specify a language-specific font.

How can I recognize the correct font/environment on my PC that is causing "门" to be printed?

You can't do that reliably. The layers below Java might do their own fallback operations. And you can't legally distribute the Windows fonts.
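
What you can do is ask Java which of the fonts it knows about claim to have a glyph for the code point. The sketch below does that; the class GlyphCoverage and its method name are invented for illustration. It only narrows down the candidates: it does not tell you which font was actually chosen after fallback in the layers below Java, so the caveat above still applies. The same check, font.canDisplay(0x95E8), can also be run on the embedded device against the font you intend to use.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.List;

class GlyphCoverage {

    /** Returns the names of the given fonts that report a glyph for codePoint. */
    static List<String> fontsThatCanDisplay(Font[] fonts, int codePoint) {
        List<String> names = new ArrayList<>();
        for (Font font : fonts) {
            if (font.canDisplay(codePoint)) {
                names.add(font.getFontName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // All fonts visible to this Java runtime; the result differs per machine.
        Font[] installed = GraphicsEnvironment.getLocalGraphicsEnvironment().getAllFonts();
        for (String name : fontsThatCanDisplay(installed, 0x95E8)) {
            System.out.println(name);
        }
    }
}
```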

So that I can install the same font in my embedded device

Don't. Use an open source, good quality font. The Noto fonts are a very good option.

answered Oct 06 '22 19:10

Mihai Nita

Tags:

java

unicode

locale

localization

chinese-locale