
Why is Unicode combining character order different between IDEA and Chrome?

In Java, I am generating a string with letters A and B with a COMBINING OVERLINE U+0305 character in between.

@Test
public void test() {
  System.out.println("A\u0305B");
}

I get this in IDEA:

[screenshot of the IDEA output]

But if I copy it here, it becomes A̅B.


This one is from the Chrome console:

[screenshot of the Chrome console output]

I am confused about which letter the combining character attaches to. Which one is correct?

I was writing this in Kotlin and compiling to JavaScript to run in the browser. Debugging in IDEA is correct, but the browser shows a different answer.

wener asked Jun 16 '19 17:06


1 Answer

Going by Wikipedia, rather than diving into the dense jungle of authoritative Unicode Consortium PDFs, the relevant text is: "In Unicode, diacritics are always added after the main character (in contrast to some older combining character sets such as ANSEL), so it is possible to add several diacritics to the same character, although as of 2010, few applications support correct rendering of such combinations." (Maybe I should edit the page to add a "citation needed" at that point, though.)
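The "mark follows its base" rule quoted above can also be checked without relying on any renderer. The following is a minimal sketch, assuming the JDK's standard `BreakIterator` character instance; the class name `CombiningDemo` and the helper `clusters` are made up for illustration. It splits the string from the question into user-perceived characters, and U+0305 should end up grouped with the preceding `A`, not with `B`.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

public class CombiningDemo {
    // Hypothetical helper: splits text into user-perceived characters,
    // i.e. a base character plus any combining marks that follow it.
    static List<String> clusters(String text) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "A\u0305B"; // the string from the question

        // U+0305 is classified as a non-spacing (combining) mark.
        System.out.println(Character.getType('\u0305') == Character.NON_SPACING_MARK);

        // Print each cluster as a list of code points, independent of any font.
        for (String c : clusters(s)) {
            StringBuilder sb = new StringBuilder();
            c.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
            System.out.println(sb.toString().trim());
        }
    }
}
```

Printing code points instead of the glyphs themselves keeps the check independent of the console font, which is the part suspected of misbehaving here.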

Anyway, in GTK+, SDL, and both browsers on my system, the overline is drawn on the preceding character. My Qt apps do not support this character, but all its sibling diacritics, including "\u0304" and "\u0306", are drawn on the preceding character. And unlike the overline, these are used in real-world text in languages written in Latin script, which would be rendered in an absurdly incorrect way if the diacritics were shifted.
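For the sibling marks there is another renderer-independent check: Unicode normalization. The sketch below assumes the standard `java.text.Normalizer`; the class name `ComposeDemo` and the helper `nfc` are made up for illustration. Under NFC, U+0304 (macron) and U+0306 (breve) compose with the letter before them, while the overline has no precomposed form and is left in place after `A`.

```java
import java.text.Normalizer;

public class ComposeDemo {
    // Hypothetical helper: canonical composition (NFC) of a string.
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Formats a string as a list of code points so the result does not depend on a font.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Macron and breve merge with the PRECEDING 'a', never with the following 'b'.
        System.out.println(codePoints(nfc("a\u0304b")));
        System.out.println(codePoints(nfc("a\u0306b")));
        // No precomposed "A with overline" exists, so this string is unchanged by NFC.
        System.out.println(codePoints(nfc("A\u0305B")));
    }
}
```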

From these points, I think it is clear that the subsystems rendering the mark on the following letter are buggy. Moreover, as we can see from the comments, the problem might lie just in the fonts in use, and a buggy font is better than a buggy IDE.

jsbueno answered Nov 05 '22 19:11