
Use of unicode characters in Haddock documentation

Haddock seems to incorrectly re-encode non-ASCII characters in the documentation in UTF-8 encoded source files. I often need to include mathematical formulas in the documentation and they are much more readable if some common math symbols such as summation (∑) can be used.

However, after running the files through Haddock, these symbols become blank squares. Haddock has the option --use-unicode, but that only converts function arrows in type signatures and the like into Unicode characters, while still breaking the actual documentation.

Even better would be if this can be controlled from cabal haddock!

I'm using Haddock version 2.9.4.

Grzegorz Chrupała asked Mar 01 '12 15:03



2 Answers

Note that Haddock uses the GHC API to do parsing. Non-ASCII characters in comments are not handled properly by GHC < 7.4, but it seems that with GHC 7.4 it works fine.
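A minimal sketch of the kind of comment in question, assuming the source file is saved as UTF-8 and built with GHC 7.4 or later; `weightedSum` is a hypothetical name used only for illustration:

```haskell
-- Assumption: this file is UTF-8 encoded, so the symbols in the Haddock
-- comment below are read as single Unicode characters.

-- | Weighted sum of two lists: ∑ wᵢ·xᵢ over all positions i.
-- Extra elements in the longer list are ignored.
weightedSum :: [Double] -> [Double] -> Double
weightedSum ws xs = sum (zipWith (*) ws xs)
```

With an older GHC, the same comment is where the blank squares described in the question would be expected to show up.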

Brent Yorgey answered Nov 07 '22 07:11



If UTF-8 cannot be used and numeric character references like &#8721; or &#x2211; (these are correct references for the n-ary summation symbol ∑) are regarded as unreadable, then I’m afraid the only option is to use named references like &sum;, if they get passed through to the HTML result and are supported by the browser(s) that will be used.

That’s a big “if,” since the new HTML5 entities have rather limited support, but perhaps in an intranet where everyone uses Firefox... HTML5 entities: http://www.whatwg.org/specs/web-apps/current-work/multipage/named-character-references.html

(And most of the references are not as mnemonic as &sum;.)
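If numeric references are acceptable in the generated output but not in what you type, they can be produced mechanically. A minimal sketch, assuming the references survive into the HTML result; `escapeNonAscii` is a hypothetical helper, not part of Haddock or GHC:

```haskell
import Data.Char (isAscii, ord)

-- | Replace every non-ASCII character with a decimal numeric character
-- reference (for example U+2211 becomes "&#8721;").
-- ASCII characters are passed through unchanged.
escapeNonAscii :: String -> String
escapeNonAscii = concatMap escape
  where
    escape c
      | isAscii c = [c]
      | otherwise = "&#" ++ show (ord c) ++ ";"
```

Applied to the text of a comment, `escapeNonAscii "\8721 x"` gives `"&#8721; x"`. Running it over a whole source file would also rewrite non-ASCII characters in string literals and identifiers, so it is only safe on the comment text itself.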

Jukka K. Korpela answered Nov 07 '22 06:11
