Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

python unicode rendering: how to know if a unicode character is missing from the font

In Python when I render a unicode character, e.g. a Chinese character, with a selected font, sometimes the font is incomplete regarding the common unicode characters, and can't render the unicode character in question. In those cases, if I call the "print" function, the output usually just look like a square box, regardless what the underlying unicode character should look like.

Of course, once I print the unicode character, I can look at the output and then determine that the chosen font misses the particular unicode character. But is there a way to tell before I print, automatically, without having to resort to my own human eyes to determine if a character is included in the font?

I'd also clarify that I know of fonts that are more complete than others. My question is NOT which font I can use so that if I call "print" I'd generally have a reasonable output. Please also ignore the question of how I print the character or if I actually want to print a character. My question is simply, for any given font, how do I tell if a unicode character is missing from the font, without using any manual process relying on human judgement of the output.

like image 568
MichM Avatar asked May 07 '17 17:05

MichM


1 Answers

See https://unix.stackexchange.com/questions/247108/how-to-find-out-which-unicode-codepoints-are-defined-in-a-ttf-file

In short, one can install the fonttools package, supply it with the path to any .ttf font file of interest, and check if the long form of the unicode character of interest is included in the font file's unicode map table.

from fontTools.ttLib import TTFont
font = TTFont(fontpath)   # specify the path to the font in question


def char_in_font(unicode_char, font):
    for cmap in font['cmap'].tables:
        if cmap.isUnicode():
            if ord(unicode_char) in cmap.cmap:
                return True
    return False

Then just call the char_in_font function to check if the unicode character is included in the font.

like image 133
MichM Avatar answered Oct 14 '22 08:10

MichM