I have a text box implementation that uses Pango. If I put in a string that starts with a word in a right-to-left script, followed by a space, followed by a word in a left-to-right script, the word wrapping that Pango performs gets messed up (using PANGO_WRAP_WORD_CHAR). For the string العربية ENGLISH I get the following:
If I add the Unicode character U+200F after the space, then I get the expected word wrapping:
Also, if I replace the Arabic script above with Hindi (which is left-to-right, like the English next to it), I still get the problem, so it doesn't seem to be strictly a left-to-right versus right-to-left thing. In the Hindi case, a hack that inserts U+200E after the space resolves the problem.
Is this a bug in Pango? Are there workarounds I can try that are generic enough to fix the problem without breaking other cases? My current workaround inserts either U+200E or U+200F after every space, based on the direction of the previous strongly directed character in the string, but I'm not sure whether there are strings this will cause problems with.
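For reference, the workaround described above can be sketched roughly as follows. This is only an illustration in Python, not the code from my text box: the function name `insert_direction_marks` is made up for this example, it treats only the plain space character (U+0020) as a break point, and it has not been verified against Pango itself.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK


def insert_direction_marks(text: str) -> str:
    """Insert LRM or RLM after each space, matching the direction of the
    most recent strongly directed character seen before that space.

    Hypothetical helper for illustration; spaces that occur before any
    strong character are left untouched.
    """
    out = []
    last_mark = None  # mark matching the last strong character, if any
    for ch in text:
        out.append(ch)
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            # Strong left-to-right (Latin, Devanagari, ...)
            last_mark = LRM
        elif bidi in ("R", "AL"):
            # Strong right-to-left (Hebrew, Arabic letters)
            last_mark = RLM
        elif ch == " " and last_mark is not None:
            out.append(last_mark)
    return "".join(out)


if __name__ == "__main__":
    # Print escaped forms so the invisible marks are visible.
    print(ascii(insert_direction_marks("العربية ENGLISH")))
    print(ascii(insert_direction_marks("Hello पहुंचगया Hello")))
```

The marks are invisible and zero-width, so the visible text is unchanged; the result would then be handed to the layout (for example via `pango_layout_set_text`) instead of the raw string.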
Update: I was able to reproduce this problem on Ubuntu 12.04 with gedit (with the Enable Text Wrapping and Do not split words over two lines settings enabled). I simply typed Hello world over and over until it wrapped several times, then replaced all instances of world with पहुंचगया, and everything collapsed to a single line.
The symbols U+200F and U+200E are the RIGHT-TO-LEFT MARK and the LEFT-TO-RIGHT MARK, respectively.
It is a bug, because Pango should handle this automatically when laying out text; since Pango isn't doing it, you have to do it manually.
It seems to me to be a bug or an incomplete feature, since it shows up with mixed scripts.
It seems to me you are using an old Pango release, maybe the one from Ubuntu 12.04?
Ubuntu 12.04 contains Gedit 3.4
Ubuntu 15.10 contains Gedit 3.10
Pango underwent a radical change in 3.6: it replaced its shaping engine with HarfBuzz. [2]
I couldn't reproduce the bug using Gedit on Ubuntu 15.10. It always moves two words down, and it does not let me resize its window to try splitting those two words. See screenshot.
Update:
It seems its behavior has changed:
It does not wrap the first English-script word when the text starts with Arabic.
pango-view --text "وقعت أطراف سياسية ليبية اليوم في المغرب اتفاق سلام برعاية أممية aljazeeranet" --width=70 --margin=0 --wrap=word
Same as the previous case: it does not wrap and enforce the width.
pango-view --text "elections الجزيرة" --width=30 --margin=0 --wrap=word
References:
Note: we recently upgraded the version of Pango we use, from 1.36.1 to 1.38.1, and this issue went away. So I believe this was a bug in Pango or HarfBuzz that has since been fixed.