How does a Unicode character get mapped to a glyph in a font?

Tags: unicode, fonts

Each character in Unicode has a code point. What is the analogous term for a character in a font?

I have never understood the part of the process where decoded text needs to be mapped to a font (or to several fonts, with modern font substitution technology).

For example, suppose a text editor has decoded a file from its character encoding and the text contains Greek alpha α (U+03B1). What exactly is the process by which the app chooses a particular glyph in a font? Most apps have a preferred font; let's say it's Courier. (And what happens with a rarer Unicode character like the heart ♥ (U+2665) that's not in the default font? How does the app know the font doesn't contain that character?)

Does a font contain meta info about what symbols it has?

If two fonts both have the symbol alpha, do they necessarily share the same “code point”? Or does it depend on the type of font, such as Type 1, Type 3, TrueType, or OpenType? ...

Thanks for any pointers or references.

Xah Lee asked Aug 27 '10 09:08


1 Answer

TrueType fonts consist of a number of tables. The most important for this question are the table of glyphs ("glyf") and the table ("cmap") that maps characters to those glyphs.

Long story short, the text rendering system (usually part of the operating system) uses the "cmap" table to convert each character into a glyph index, substituting a default glyph for any character that has no matching entry. A missing entry is also how software can tell that a font does not contain a character. Unfortunately, there are multiple versions of the font file specification (not to mention different types of fonts) and several different subtable formats for storing the same mappings, so the actual process of doing the mapping, and doing it efficiently enough that text drawing is fast, ends up being extremely complex.
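As a rough illustration of that lookup, here is a minimal Python sketch. It is a toy model, not how any particular operating system does it: the font names, the glyph index numbers, and the `glyph_index` and `pick_glyph` helpers are all invented for the example, and a plain dict stands in for each font's "cmap" table. The one real convention it relies on is that glyph index 0 is the default "missing glyph" in TrueType fonts.

```python
# Toy model: each "font" is a dict standing in for its cmap table,
# mapping Unicode code points to glyph indexes. All numbers are invented.
MISSING_GLYPH = 0  # glyph index 0 is the conventional "missing glyph"

FONTS = {
    "Courier (hypothetical)": {0x0041: 36, 0x03B1: 302},
    "Symbols (hypothetical)": {0x2665: 17},
}


def glyph_index(cmap, code_point):
    """Return the glyph index for code_point, or 0 if the font lacks it."""
    return cmap.get(code_point, MISSING_GLYPH)


def pick_glyph(char, preferred, fallbacks):
    """Try the preferred font first, then each fallback font in order."""
    code_point = ord(char)
    for name in [preferred, *fallbacks]:
        gid = glyph_index(FONTS[name], code_point)
        if gid != MISSING_GLYPH:
            return name, gid
    # No font has the character: draw the preferred font's default glyph.
    return preferred, MISSING_GLYPH
```

With these made-up tables, `pick_glyph("α", "Courier (hypothetical)", ["Symbols (hypothetical)"])` stays in the preferred font, while `"♥"` falls through to the second font because the first lookup comes back as glyph 0.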

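A "code point" is the number Unicode assigns to a character, and it is completely independent of encodings and fonts. A particular code point is universal, but it can be stored in many encodings (UTF-8, UTF-16, etc.), and it will map to different glyph indexes in different fonts. The glyph index is the closest font-side analogue of a code point, and it is meaningful only within a single font.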
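A short Python sketch may make the three layers easier to tell apart. The byte values come from the UTF-8 and UTF-16 encodings themselves; the two cmap dicts and their glyph index numbers are invented stand-ins for two imaginary fonts.

```python
alpha = "\u03b1"  # GREEK SMALL LETTER ALPHA, code point U+03B1

# One code point, two encodings, two different byte sequences.
utf8 = alpha.encode("utf-8")       # b'\xce\xb1'
utf16 = alpha.encode("utf-16-be")  # b'\x03\xb1'

# Decoding either byte sequence gives back the same code point.
code_point_from_utf8 = ord(utf8.decode("utf-8"))
code_point_from_utf16 = ord(utf16.decode("utf-16-be"))

# Invented cmap contents for two imaginary fonts: the same code point
# maps to a different glyph index in each one.
font_a_cmap = {0x03B1: 302}
font_b_cmap = {0x03B1: 118}

glyph_in_a = font_a_cmap[code_point_from_utf8]
glyph_in_b = font_b_cmap[code_point_from_utf16]
```

The encoding only matters until the text is decoded; from then on the font lookup works on code points alone.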

Apple's developer documentation has a pretty good section on the details of TrueType fonts:

http://developer.apple.com/fonts/ttrefman/

Specifically:

Glyph table: https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6glyf.html

Character map: https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html
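To give a flavor of what the character map chapter describes, here is a simplified Python sketch of a lookup in a "cmap" format 4 subtable, which covers 16-bit code points with sorted segments. It handles only the case where a segment's idRangeOffset is 0, so the glyph index is the code point plus the segment's idDelta, modulo 65536. The segment values and the `format4_lookup` name are invented for the example, and a real implementation would read these arrays from the font file.

```python
from bisect import bisect_left

# Each segment is (start_code, end_code, id_delta), sorted by end_code.
# The values are invented; only the layout follows the format 4 idea.
SEGMENTS = [
    (0x0041, 0x005A, -29),   # A-Z maps to glyphs 36..61
    (0x03B1, 0x03C9, -643),  # α-ω maps to glyphs 302..326
    (0xFFFF, 0xFFFF, 1),     # required final segment, maps to glyph 0
]
END_CODES = [end for _start, end, _delta in SEGMENTS]


def format4_lookup(code_point):
    """Return a glyph index, or 0 (missing glyph) if nothing matches."""
    if code_point > 0xFFFF:
        return 0  # format 4 cannot represent code points above U+FFFF
    # Find the first segment whose end code is >= the code point.
    i = bisect_left(END_CODES, code_point)
    start, _end, delta = SEGMENTS[i]
    if code_point < start:
        return 0  # falls in the gap between two segments
    # idRangeOffset == 0 case: add the delta, wrapping at 16 bits.
    return (code_point + delta) & 0xFFFF
```

The binary search over segment end codes is one reason the table is laid out this way: the lookup stays fast even when a font covers many scattered ranges.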

I also recommend an application called BabelMap, which gives you a lot of interesting information about fonts. In particular, look at Tools/Unicode Summary, Fonts/Font Analysis Utility, and Fonts/Font Information, where you can copy the entire glyph mapping table to the clipboard.

Tim Sylvester answered Oct 10 '22 21:10