
UTF-16 to UTF-8 conversion in JavaScript

I have Base64-encoded data that is in UTF-16. I am trying to decode the data, but most libraries only support UTF-8. I believe I have to drop the null bytes, but I am unsure how.

Currently I am using David Chambers' Base64 polyfill, but I have also tried other libraries such as the one from phpjs.org; none of them support UTF-16.

One thing to point out: in Chrome the atob method works without a problem, in Firefox I get the results described here, and in IE I am only returned the first character.

Any help is greatly appreciated.

Asked Jan 29 '13 by Don P


People also ask

Does JavaScript use UTF-8 or UTF-16?

Most JavaScript engines use UTF-16 encoding, so let's look at UTF-16 in detail. UTF-16 (long name: 16-bit Unicode Transformation Format) is a variable-length encoding: code points from the BMP are encoded using a single 16-bit code unit, while code points from the astral planes are encoded using two 16-bit code units each.
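
For example, a BMP character occupies one code unit, while an astral-plane character (here U+1D11E, the treble clef) occupies two:

"€".length;           // 1
"€".charCodeAt(0);    // 8364 (U+20AC, a single BMP code unit)

"𝄞".length;           // 2
"𝄞".charCodeAt(0);    // 55348 (0xD834, high surrogate)
"𝄞".charCodeAt(1);    // 56606 (0xDD1E, low surrogate)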

Why does JavaScript use UTF-16?

JS does require UTF-16, because the surrogate pairs of non-BMP characters are separable in JS strings. Any JS implementation using UTF-8 would have to convert to UTF-16 to give proper answers for .length and array indexing on strings. That still doesn't mean it has to store the strings in UTF-16.
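
This separability is easy to see in practice, with indexing handing back a lone surrogate half:

var s = "😀";             // U+1F600, stored as the surrogate pair 0xD83D 0xDE00
s.length;                 // 2 – .length counts 16-bit code units, not characters
s[0] === "\uD83D";        // true – indexing returns half of the pair
Array.from(s).length;     // 1 – ES6 iteration is code-point aware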

What is UTF-8 in JavaScript?

UTF-8 can represent any character in the Unicode standard, is backwards compatible with ASCII, and is the preferred encoding for e-mail and web pages. UTF-16 (16-bit Unicode Transformation Format) is a variable-length character encoding for Unicode, capable of encoding the entire Unicode repertoire.
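
A quick way to see UTF-8's variable-length bytes (this assumes an environment with the Encoding API, e.g. modern browsers or Node.js):

new TextEncoder().encode("A");    // Uint8Array [65] – one byte, same as ASCII
new TextEncoder().encode("é");    // Uint8Array [195, 169] – two bytes
new TextEncoder().encode("😀");   // Uint8Array [240, 159, 152, 128] – four bytes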

Are JavaScript strings UTF-8?

While a JavaScript source file can have any kind of encoding, JavaScript will convert it internally to UTF-16 before executing it. JavaScript strings are all UTF-16 sequences, as the ECMAScript standard says: "When a String contains actual textual data, each element is considered to be a single UTF-16 code unit."


1 Answer

You want to decode UTF-16, not convert it to UTF-8. Decoding means that the result is a string of abstract characters. Of course there is an internal encoding for strings as well, UTF-16 or UCS-2 in JavaScript, but that's an implementation detail.

With strings, the goal is that you don't have to worry about encodings, just about manipulating characters "as they are". So you can write string methods that don't need to decode their input at all. Of course, there are many edge cases where this falls apart.
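
A naive per-code-unit reversal is one such edge case – it tears surrogate pairs apart:

// Works only while every character is a single code unit
function naiveReverse( str ) {
    return str.split("").reverse().join("");
}

naiveReverse("abc");    // "cba" – fine for BMP text
naiveReverse("a💩b");   // garbage – the two halves of the surrogate pair
                        // end up reversed, leaving lone surrogates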

You cannot decode UTF-16 just by removing the nulls. That will work fine for the first 256 code points of Unicode, but you will get garbage when any of the other ~110,000 Unicode characters are used. You cannot even get the most popular non-ASCII characters, like em dashes or smart quotes, working that way.
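
To see why, here is a sketch of the null-stripping idea (stripNulls is a hypothetical helper, not something from the libraries you mentioned):

function stripNulls( binaryStr ) {
    return binaryStr.replace( /\u0000/g, "" );
}

// UTF-16LE bytes of "ab" are 0x61 0x00 0x62 0x00 – stripping happens to work:
stripNulls("a\u0000b\u0000");   // "ab"

// UTF-16LE bytes of "—" (em dash, U+2014) are 0x14 0x20 – no nulls at all,
// so stripping leaves two bytes of garbage instead of the dash:
stripNulls("\u0014 ");          // "\u0014 ", not "—"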

Also, judging by your example, the data is UTF-16LE.
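
If you ever need to check the byte order rather than assume it, here's a minimal sniffing sketch (assuming the data may start with a BOM; BOM-less input can only be guessed at):

function looksLikeUTF16LE( binaryStr ) {
    var b0 = binaryStr.charCodeAt(0),
        b1 = binaryStr.charCodeAt(1);
    if( b0 === 0xFF && b1 === 0xFE ) return true;   // little-endian BOM
    if( b0 === 0xFE && b1 === 0xFF ) return false;  // big-endian BOM
    return b1 === 0;  // ASCII text in LE has the null in the high byte
}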

//Braindead decoder that assumes fully valid UTF-16LE input
function decodeUTF16LE( binaryStr ) {
    var cp = [];
    for( var i = 0; i < binaryStr.length; i+=2) {
        //Combine each pair of bytes into one 16-bit code unit
        //(low byte first, because the input is little-endian)
        cp.push( 
             binaryStr.charCodeAt(i) |
            ( binaryStr.charCodeAt(i+1) << 8 )
        );
    }
    //Surrogate pairs pass through unchanged, so astral characters work too
    return String.fromCharCode.apply( String, cp );
}

var base64decode = atob; //In chrome and firefox, atob is a native method available for base64 decoding

var base64 = "VABlAHMAdABpAG4AZwA";
var binaryStr = base64decode(base64);
var result = decodeUTF16LE(binaryStr);

Now you can even get smart quotes working:

var base64 = "HCBoAGUAbABsAG8AHSA="
var binaryStr = base64decode(base64);
var result = decodeUTF16LE(binaryStr);
//"“hello”"
Answered Sep 22 '22 by Esailija