
Unicode characters not rendering properly in HTML5 canvas

I am trying to render a Unicode treble clef using the HTML5 canvas element. When I use the correct code point (U+1D120), it renders fine in HTML, but when I try to draw it inside a canvas, a weird character appears.

The following code is in my javascript file which works its magic on the canvas...

var canvas = document.getElementById('canvas');
var context = canvas.getContext('2d');

context.font = "48px serif";
context.strokeText("\u1D120", 10, 50);

<h1>&#x1D120;</h1>

<canvas id="canvas" width="100" height="100">
</canvas>

Unfortunately I can't put a picture of the character because my rep is too low as of yet.

Any insight into what might be causing this problem is appreciated. Thanks in advance!

noocsharp asked Apr 05 '15 22:04



1 Answer

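JavaScript strings use UTF-16 encoding, and a \u escape takes exactly four hex digits, so "\u1D120" is read as U+1D12 followed by the character "0". Your character requires a two-part escape because its code point is above U+FFFF (it is a 4-byte sequence in UTF-8), so it needs 2 UTF-16 code units, known as a surrogate pair.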
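To see the difference for yourself, here is a quick sketch. The variable names are mine, and codePointAt assumes an ES2015-capable engine:

```javascript
// "\u" consumes exactly four hex digits, so this is U+1D12 followed by "0".
var wrong = "\u1D120";

// The treble clef (U+1D120) written as its UTF-16 surrogate pair.
var clef = "\uD834\uDD20";

console.log(wrong.length);                      // 2
console.log(wrong.charCodeAt(0).toString(16));  // "1d12"
console.log(wrong.charAt(1));                   // "0"

console.log(clef.length);                       // 2 (two code units, one character)
console.log(clef.codePointAt(0).toString(16));  // "1d120"
```

Both strings have length 2, but only the second one holds a single code point.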

Stolen from a blog post by somebody smarter than me is this handy function:

function toUTF16(codePoint) {
    var TEN_BITS = parseInt('1111111111', 2);
    function u(codeUnit) {
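        // Pad to four hex digits so small values such as 0x41 still yield a valid escape
        return '\\u' + ('0000' + codeUnit.toString(16).toUpperCase()).slice(-4);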
    }

    if (codePoint <= 0xFFFF) {
        return u(codePoint);
    }
    codePoint -= 0x10000;

    // Shift right to get to most significant 10 bits
    var leadSurrogate = 0xD800 + (codePoint >> 10);

    // Mask to get least significant 10 bits
    var tailSurrogate = 0xDC00 + (codePoint & TEN_BITS);

    return u(leadSurrogate) + u(tailSurrogate);
}

When you invoke that with your code point:

var treble = toUTF16(0x1D120);

you get back the escape sequence text "\uD834\uDD20", which you can put in a string literal in place of "\u1D120".

Thanks again to Dr. Axel Rauschmayer for the code above — read the excellent linked blog post for more information.
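If your target browsers support ES2015 (an assumption, so check before relying on it), the \u{...} escape and String.fromCodePoint give you the same string without working out the surrogates by hand. A minimal sketch follows; drawClef is a hypothetical helper name of mine, and the canvas id is the one from the question:

```javascript
// Three ways to spell U+1D120; all of them produce the same two-code-unit string.
var fromPair = "\uD834\uDD20";                     // explicit surrogate pair
var fromEscape = "\u{1D120}";                      // ES2015 code point escape
var fromFunction = String.fromCodePoint(0x1D120);  // ES2015 built-in

// drawClef (hypothetical helper) strokes the clef on any 2D drawing context.
function drawClef(context) {
    context.font = "48px serif";
    context.strokeText(fromFunction, 10, 50);
}

// In a browser, draw on the canvas from the question.
if (typeof document !== "undefined") {
    drawClef(document.getElementById("canvas").getContext("2d"));
}
```

Whether the glyph actually shows up still depends on a font containing U+1D120 being available to the canvas.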

Pointy answered Sep 26 '22 06:09
