
Compatibility between CSS ::first-letter and UTF-8 mb4 characters

So, here's my problem: I'm building a website that displays posts. In those posts, I use "::first-letter" to make the first letter bigger, and it works perfectly.

But when I load a post whose first character is a Unicode emoticon that needs UTF-8 mb4 (4 bytes in UTF-8, stored as 2 UTF-16 code units), it fails: the single character is treated as 2 separate ones, so the result looks strange.

This is a screenshot:

[Screenshot: Error with unicode and ::first-letter]

As you can see, there are two unknown glyphs, one bigger and one smaller, followed by the same emoticon rendered correctly, because I created a post with the same emoticon written twice.

.first_letter_post::first-letter {
  float: left;
  padding-right: 20px;
  padding-top: 0px;
  margin-bottom: -15px;
  margin-top: -10px;
  font-size: 50px;
  font-weight: bold;
  text-transform: uppercase;
}
<p class="first_letter_post">🗿foobar</p>

This is the character: 🗿, and I'm using Google Chrome.

I hope someone can help me with this.

Davide I. asked Aug 19 '16 15:08



1 Answer

Chrome has a long-known history of problems with Unicode [bug]. This issue is a combination of two of those problems:

  1. Failing to correctly recognize symbols consisting of more than 3 bytes.
  2. Styling symbols regardless of whether they are letter units.

This results in Chrome tearing a single symbol apart.
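
To illustrate why a single symbol can be torn apart at all, here is a small sketch of how the character from the question looks from JavaScript, whose strings are sequences of UTF-16 code units. The variable names are only illustrative, and this shows the string representation, not Chrome's internal rendering code.

```javascript
// The moai emoji from the question is a single code point, U+1F5FF.
const moai = "\u{1F5FF}";

// JavaScript strings are sequences of UTF-16 code units, so its length is 2.
const unitCount = moai.length;

// The two code units form a surrogate pair (high surrogate, then low surrogate).
const units = [moai.charCodeAt(0), moai.charCodeAt(1)].map((u) => u.toString(16));

// Iterating by code point keeps the pair together, giving one element.
const codePointCount = Array.from(moai).length;

// Encoded as UTF-8, the same character takes 4 bytes.
const utf8ByteCount = new TextEncoder().encode(moai).length;

console.log(unitCount, units, codePointCount, utf8ByteCount);
```

Anything that slices such a string at a code-unit boundary, rather than a code-point boundary, ends up with two halves that cannot be displayed on their own.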

IE correctly recognizes Unicode symbols consisting of multiple code units and applies the styling, even though the spec states that ::first-letter should be applied to typographic letter units only.

Firefox follows the spec very strictly and does not apply the styles to non-letter units. I could not determine whether characters from the Enclosed Alphanumeric Supplement block should be treated as letters as well, but Firefox does not treat them as such.
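
As a rough way to reason about what counts as a "letter unit", one can look at the Unicode general category of the first code point. The sketch below assumes that "letter" roughly means a Letter (L*) or Number (N*) category; that is a simplification of the spec's definition (which also deals with surrounding punctuation), and `startsWithLetterOrNumber` is a hypothetical helper, not something a browser exposes. It says nothing about what a particular browser actually styles.

```javascript
// Hypothetical helper: is the first code point of `text` a letter (L*) or a number (N*)?
function startsWithLetterOrNumber(text) {
  // The string iterator yields whole code points, so surrogate pairs stay intact.
  const [first] = text;
  // With the "u" flag, the character class matches exactly one code point.
  return first !== undefined && /^[\p{L}\p{N}]$/u.test(first);
}

// The sample characters from the snippet in this answer, written as escapes.
const samples = {
  moai: "\u{1F5FF} 4 bytes symbol",            // U+1F5FF
  enclosed: "\u{1F170} Enclosed Alphanumeric", // U+1F170
  mathPi: "\u{1D7B9} Mathematical",            // U+1D7B9
  arabicAlef: "\u{1EE00} Arabic Mathematical", // U+1EE00
  plain: "a normal character",
};

const isLetter = {};
for (const [name, text] of Object.entries(samples)) {
  isLetter[name] = startsWithLetterOrNumber(text);
}
console.log(isLetter);
```

Under this assumption, the two emoji-like symbols fall into a Symbol category rather than a Letter category, while the mathematical pi and the Arabic mathematical alef are classified as letters.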

This means that you should refrain from relying heavily on ::first-letter when you know that such characters might occur.

A possible solution I could think of is to detect the first character manually via JavaScript, wrap it in a tag, and then apply the styles to that tag. My solution is a bit messy due to the hard-coded hex value, but it might be sufficient.

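// manually wrapping the "first character" of every <div> in a <span>
Array.prototype.forEach.call(document.querySelectorAll("div"), function (el) { wrapFirstChar(el); });

function wrapFirstChar(div) {
  let content = div.innerHTML;
  // a high surrogate (0xD800-0xDBFF) means the first symbol spans two UTF-16 code units
  let first = content.charCodeAt(0);
  let chars = first >= 0xD800 && first <= 0xDBFF ? 2 : 1;
  div.innerHTML = "<span>" + content.substring(0, chars) + "</span>" + content.substring(chars);
}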

// this is what javascript sees at the first two positions of the string
//Array.prototype.forEach.call(document.querySelectorAll("p"),(e)=>console.log(e.innerHTML.charCodeAt(0)+"+"+e.innerHTML.charCodeAt(1)));
p::first-letter {
  font-weight: bold;
  color: red;
}
span {
  font-weight: bold;
  color: blue;
}
p {
  margin: 0;
}
<h2>using ::first-letter</h2>
<p>🗿 4 bytes symbol</p>
<p>🅰 Enclosed Alphanumeric Supplement 1F170</p>
<p>𝞹 Mathematical Alphanumeric Symbols 1D7B9</p>
<p>𞸀 Arabic Mathematical Alphabetic Symbols 1EE00</p>
<p>a normal character (1 byte)</p>

<h2>manually replaced</h2>
<div>🗿 4 bytes symbol</div>
<div>🅰 Enclosed Alphanumeric Supplement 1F170</div>
<div>𝞹 Mathematical Alphanumeric Symbols 1D7B9</div>
<div>𞸀 Arabic Mathematical Alphabetic Symbols 1EE00</div>
<div>a normal character (1 byte)</div>
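
Instead of comparing the first code unit against a fixed value, the first character could also be taken with the string iterator (whole code points) or, where available, with `Intl.Segmenter` (whole grapheme clusters). The following is only a sketch under those assumptions: `splitFirstChar` and this variant of `wrapFirstChar` are hypothetical helpers, and the DOM part assumes that the element starts with a text node that has no leading whitespace.

```javascript
// Hypothetical helper: split `text` into its first user-perceived character and the rest.
function splitFirstChar(text) {
  if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
    // Grapheme segmentation also keeps multi-code-point sequences together.
    const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
    for (const { segment } of segmenter.segment(text)) {
      return [segment, text.slice(segment.length)];
    }
    return ["", ""]; // empty input
  }
  // Fallback: the string iterator yields whole code points (surrogate pairs stay intact).
  const [first = ""] = text;
  return [first, text.slice(first.length)];
}

// Hypothetical DOM wrapper: moves the first character of `el` into a <span>.
// Only touches the element when its first child is a text node.
function wrapFirstChar(el) {
  const node = el.firstChild;
  if (!node || node.nodeType !== 3) return; // 3 is the nodeType of a text node
  const [first, rest] = splitFirstChar(node.nodeValue);
  if (!first) return;
  const span = document.createElement("span");
  span.textContent = first;
  node.nodeValue = rest;
  el.insertBefore(span, node);
}

const [head, tail] = splitFirstChar("\u{1F5FF}foobar");
console.log(head.length, tail);
```

In the page, this `wrapFirstChar` would be called once per element, for example from the same `forEach` loop as above. Working on the text node instead of `innerHTML` avoids splitting an HTML entity or a leading tag by accident.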
Christoph answered Nov 11 '22 01:11
