
How to set font per Unicode range / codepoint in fontconfig?

I recently figured out how to use fontconfig on Linux to set system default fonts for serif, sans-serif and monospaced fonts; basically, you save an XML configuration file to ~/.config/fontconfig/fonts.conf with the following content:

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>

<match>
  <test qual="any" name="family"><string>serif</string></test>
  <edit name="family" binding="strong" mode="prepend_first">
    <string>Gentium</string>
    <string>Sun-ExtA</string>
    <string>HanaMinA</string>
    <string>HanaMinB</string>
  </edit>
</match>

</fontconfig>

The binding="strong" mode="prepend_first" attributes ensure that these rules take precedence over other settings, and the order of the font names ensures that where a font doesn't contain a given code point / character, the next font in the list is tried (the list applies top to bottom; IMHO it should really be later-binds-stronger logic, but whatever).

The great thing about this configuration is that it works in text editors and terminal emulators alike.

However, there's still a nag: there are many cases where a given font does contain a given glyph, but another font would be preferable for that codepoint. For example, Sun-ExtA is a great default font for CJK characters, but it also covers lots and lots of non-CJK characters and has a few problematic glyphs.

Suppose I don't like the appearance of 〇 U+3007 IDEOGRAPHIC NUMBER ZERO in Sun-ExtA and would rather use HanaMinA for it. How could I do that with fontconfig? Obviously I can't just prioritize the entry for HanaMinA over Sun-ExtA, as that would affect all of the glyphs that are contained in both fonts.

My hunch is that there should be a solution involving the elements <charset> (according to the fontconfig user documentation, "This element holds at least one element of an Unicode code point or more") and/or <range> ("This element holds the two elements of a range representation", presumably to denote a range of Unicode code points). I couldn't find a single example of how to use these elements, though.

Is it possible to configure fontconfig to use a specific font for a single Unicode code point or a range of codepoints?

John Frazer asked Nov 26 '17 21:11



2 Answers

Inside a scan pattern, including a <minus> element in your <edit> tag allows you to subtract from the charset.

This was mostly designed for removing "bad" or buggy characters from a font, but going further, you can write a <test> that matches every font besides the one you want to use, so that the font you want is the only one left covering those code points:

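<match target="scan">
  <!-- Apply to every font except the one that should keep these code points -->
  <test name="family" compare="not_eq">
    <string>VL Gothic</string>
  </test>
  <!-- New charset = the font's own charset minus U+0021..U+00FF -->
  <edit name="charset" mode="assign">
    <minus>
      <name>charset</name>
      <charset>
        <range>
          <int>0x0021</int>
          <int>0x00FF</int>
        </range>
      </charset>
    </minus>
  </edit>
</match>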
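
Applied to the question's case, a minimal sketch might look like the following. It assumes the family names reported by fc-list are exactly Sun-ExtA and HanaMinA (taken from the question, not verified), and that a set of code points is written as a <charset> element around <int> or <range> children, which is my reading of the fontconfig documentation. With U+3007 removed from Sun-ExtA's charset, the serif fallback list from the question should move on to HanaMinA for that character.

```xml
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>

<!-- Assumption: the family name is exactly "Sun-ExtA", as in the question -->
<match target="scan">
  <test name="family">
    <string>Sun-ExtA</string>
  </test>
  <!-- Drop U+3007 from this font's charset so a later font supplies it -->
  <edit name="charset" mode="assign">
    <minus>
      <name>charset</name>
      <charset>
        <int>0x3007</int>
      </charset>
    </minus>
  </edit>
</match>

</fontconfig>
```

Because scan rules are applied when fonts are scanned into the cache, the cache probably has to be rebuilt (for example with fc-cache -f) before the change shows up.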

A similar configuration can also be used to remove entire langs from a font.
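
As a rough sketch of that variant, the same <minus> pattern can be pointed at the lang property instead of charset. The font name and the language code ja are placeholders chosen for illustration, and the <langset> wrapper is an assumption on my part about how a set of languages is written:

```xml
<match target="scan">
  <!-- Placeholder family name, taken from the question -->
  <test name="family">
    <string>Sun-ExtA</string>
  </test>
  <!-- New lang set = the font's own lang set minus "ja" (illustrative) -->
  <edit name="lang" mode="assign">
    <minus>
      <name>lang</name>
      <langset>
        <string>ja</string>
      </langset>
    </minus>
  </edit>
</match>
```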

As far as I know, this wasn't really documented anywhere before now; I found out about it from a Red Hat bug report.

Miss Blit answered Sep 19 '22 13:09



You can promote fonts for specific locales in fontconfig using:

<match>
  <test name="lang">
    <string>[RFC-3066 language code]</string>
  </test>
  <test name="family">
    <string>[genericname]</string>
  </test>
  <edit name="family" mode="prepend">
    <string>[fontname]</string>
  </edit>
</match>
<alias>
  <family>[fontname]</family>
  <default>
    <family>[genericname]</family>
  </default>
</alias>

Careful use of fontconfig priorities is required so that the font is promoted before those you don't want and after common Latin/Greek/Cyrillic fonts (since the Latin glyphs in CJK fonts tend to be horrible).

Of course, that supposes your software environment is able to signal fontconfig when you read/write in a locale that needs this override.
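
Filled in with values from the question, the template above might look like this. The language code ja, the generic name serif and the family name HanaMinA are illustrative choices, not something the question specifies for this purpose. Note that this selects a font per language, not per code point, so on its own it would not single out U+3007.

```xml
<match>
  <!-- Only applies when the requesting application asks for Japanese text -->
  <test name="lang">
    <string>ja</string>
  </test>
  <test name="family">
    <string>serif</string>
  </test>
  <!-- Put HanaMinA ahead of the other serif candidates for that language -->
  <edit name="family" mode="prepend">
    <string>HanaMinA</string>
  </edit>
</match>
<alias>
  <!-- Treat HanaMinA as a serif font when nothing else matches -->
  <family>HanaMinA</family>
  <default>
    <family>serif</family>
  </default>
</alias>
```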

nim answered Sep 22 '22 13:09
