
TeX hyphenation patterns: what do they represent?

If you scroll down this page a bit, you'll see UK English hyphenation patterns like:

\patterns{ % just type <return> if you're not using INITEX
.ab4i
.ab3ol
.ace4
.acet3
.ach4
.ac5tiva

What do these patterns like .ab4i mean?

asked Dec 19 '09 22:12 by understack


2 Answers

There are three kinds of characters in a TeX hyphenation pattern. The dot . anchors the pattern to a word boundary. A letter stands for itself, that is, a letter in the word to be hyphenated. A digit marks a potential hyphenation point between the adjacent letters, and its value is the hyphenation level at that point. The pattern format allows any single digit; Liang's original US English patterns use five levels, 1 through 5.
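As a rough illustration of how a pattern breaks down into these three kinds of characters, here is a minimal Python sketch. The function name `parse_pattern` is made up for this example and is not part of TeX or patgen; positions without an explicit digit are treated as level 0.

```python
def parse_pattern(pattern):
    """Split a pattern such as '.ab4i' into its letters and inter-letter levels."""
    letters = []
    levels = [0]  # levels[i] is the level just before letters[i]
    for ch in pattern:
        if ch.isdigit():
            levels[-1] = int(ch)   # a digit sets the level at the current gap
        else:
            letters.append(ch)     # a letter, or the '.' word-boundary anchor
            levels.append(0)       # open a new gap after this character
    return "".join(letters), levels


print(parse_pattern(".ab4i"))     # ('.abi', [0, 0, 0, 4, 0])
print(parse_pattern(".ac5tiva"))  # ('.activa', [0, 0, 0, 5, 0, 0, 0, 0])
```

The level list is always one element longer than the letter string, because there is one gap before each character plus one after the last.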

The basic idea of the algorithm is that a word is matched against every pattern, and each matching pattern contributes its levels at the positions where it matches. If levels from different patterns land on the same position, the highest one wins. Of the final values, only odd levels indicate allowed hyphenation points; even levels forbid them. This lets the patterns specify both possible hyphenation points and places where a hyphen should not be inserted. For example, if a position in a word matches two patterns that have a 1 and a 2 there, hyphenation at that point is not allowed, because the 2 overrides the 1 and only an odd value permits a hyphen.

Looking at your examples, .ab4i indicates that a word starting with abi will rarely receive a hyphen between b and i: a level of 4, being even, inhibits hyphenation unless a matching pattern with a higher odd level, such as 5, overrides it. On the other hand, a word beginning with activa can be hyphenated between the c and the t, because the 5 overrides any lower value and, being odd, permits hyphenation.
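The matching procedure described above can be sketched in Python as follows. This is a simplified illustration, not TeX's actual implementation: it scans the patterns linearly instead of using a trie, and it ignores \lefthyphenmin, \righthyphenmin and the exception list. The function names are invented for the example, and the patterns a1t, a2t and b1i are hypothetical, chosen only to show how levels compete.

```python
def parse_pattern(pattern):
    """Split a pattern into its letters and the level at each gap (0 if none)."""
    letters, levels = [], [0]
    for ch in pattern:
        if ch.isdigit():
            levels[-1] = int(ch)
        else:
            letters.append(ch)
            levels.append(0)
    return "".join(letters), levels


def hyphenation_levels(word, patterns):
    """Return the winning level before each character of '.' + word + '.'."""
    text = "." + word.lower() + "."
    result = [0] * (len(text) + 1)
    for pattern in patterns:
        letters, levels = parse_pattern(pattern)
        start = text.find(letters)
        while start != -1:
            # Merge this match into the result, keeping the higher level.
            for offset, level in enumerate(levels):
                result[start + offset] = max(result[start + offset], level)
            start = text.find(letters, start + 1)
    return result


def hyphenate(word, patterns):
    """Insert '-' wherever the final level between two letters is odd."""
    levels = hyphenation_levels(word, patterns)
    pieces, last = [], 0
    for i in range(1, len(word)):
        # levels[i + 1] is the gap before word[i], because of the leading '.'
        if levels[i + 1] % 2 == 1:
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return "-".join(pieces)


print(hyphenate("activate", [".ac5tiva"]))   # ac-tivate
print(hyphenate("native", ["a1t"]))          # na-tive
print(hyphenate("native", ["a1t", "a2t"]))   # native (the 2 overrides the 1)
print(hyphenate("abide", [".ab4i", "b1i"]))  # abide (the 4 overrides the 1)
```

With a real pattern file, many more patterns would match each word, so the break points for these words in actual TeX output may differ from what this handful of patterns produces.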

answered Nov 01 '22 07:11 by JaakkoK


These patterns are created using a tool called patgen2 (version 2 of patgen). There's TeX source for a tutorial about this tool at patgen2.tutorial, and Frank Liang's Ph.D. thesis on this topic, "Word Hy-phen-a-tion by Com-put-er", is available through tug.org.

answered Nov 01 '22 07:11 by Chip Uni