I wanted to write a method that escapes special characters like 'ä' to their corresponding Unicode escape sequences (e.g. \u00e4).
For some reason JS finds it amusing to not even store the 'ä' internally but to use 'üÜ' or some other garble instead, so the conversion spits out '\u00c3\u00b6\u00c3\u002013' because it converts those characters instead of 'ä'.
I have tried setting the HTML file's encoding to UTF-8 and loading the scripts with charset="UTF-8", to no avail. The code doesn't really do anything special, but here it is:
String.prototype.replaceWithUtf8 = function() {
    var str_newString = '';
    var str_procString = this;
    for (var i = 0; i < str_procString.length; i++) {
        if (str_procString.charCodeAt(i) > 126) {
            var hex_uniCode = '\\u00' + str_procString.charCodeAt(i).toString(16);
            console.log(hex_uniCode + " (" + str_procString.charAt(i) + ")");
            str_newString += hex_uniCode;
        } else {
            str_newString += str_procString.charAt(i);
        }
    }
    return str_newString;
}
var str_item = "Lärm, Lichter, Lücken, Löcher."
console.log(str_item); // Lärm, Lichter, Lücken, Löcher.
console.log(str_item.replaceWithUtf8()); //L\u00c3\u00a4rm, Lichter, L\u00c3\u00bccken, L\u00c3\u00b6cher.
JavaScript lets us add special characters to a string literal using a backslash (\). Placing the backslash just before the character gives escape sequences for the single quote, double quote, backslash, new line, tab, backspace, form feed, etc.
To encode or decode a string in JavaScript, we can use built-in functions. btoa(): this method encodes a string in Base64, using the characters “A-Z”, “a-z”, “0-9”, “+”, “/” and “=”.
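As a minimal sketch of the btoa() description above, assuming an environment where btoa and atob are available as globals (browsers, and as far as I know recent Node.js versions): btoa only accepts characters up to U+00FF, so anything beyond that range makes it throw. The variable names are just for illustration.

```javascript
// Encode a plain ASCII string to Base64 and decode it again.
const encoded = btoa('Hello');
console.log(encoded);        // SGVsbG8=
console.log(atob(encoded));  // Hello

// btoa treats each character as one byte, so characters above U+00FF are rejected.
let threw = false;
try {
    btoa('\u20ac'); // the euro sign does not fit into one byte
} catch (e) {
    threw = true;
}
console.log(threw); // true
```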
UTF-16 is used by systems such as the Microsoft Windows API, the Java programming language and JavaScript/ECMAScript. It is also sometimes used for plain text and word-processing data files on Microsoft Windows. It is rarely used for files on Unix-like systems.
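A small sketch of what UTF-16 means for the code in this thread: charCodeAt() returns 16-bit code units, so 'ä' is one unit, while a character outside the Basic Multilingual Plane is stored as two (a surrogate pair). The sample characters are my own choice, written as escapes so the result does not depend on the file's encoding.

```javascript
const umlaut = '\u00e4';    // 'ä', fits into a single UTF-16 code unit
const emoji = '\u{1F600}';  // a character above U+FFFF

console.log(umlaut.length);                       // 1
console.log(emoji.length);                        // 2 (surrogate pair)
console.log(emoji.charCodeAt(0).toString(16));    // d83d
console.log(emoji.charCodeAt(1).toString(16));    // de00
console.log(emoji.codePointAt(0).toString(16));   // 1f600
```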
In a regular expression, to use a special character as a regular one, prepend it with a backslash, e.g. \. for a literal dot. That's also called “escaping a character”.
I have no idea how or why, but I just restarted the server again and now it's displaying correctly. As a follow-up, here's the code for everyone who's interested:
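To make that concrete, here is a short sketch, assuming the sentence refers to regular expressions: an unescaped dot matches any character, while \. only matches a literal dot. The sample string is made up.

```javascript
const sample = 'a.b.c';

// Unescaped: . matches every character, so everything is replaced.
const allReplaced = sample.replace(/./g, '-');
console.log(allReplaced);  // -----

// Escaped: \. matches only the literal dots.
const dotsReplaced = sample.replace(/\./g, '-');
console.log(dotsReplaced); // a-b-c
```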
String.prototype.replaceWithUtf8 = function() {
    var str_newString = '';
    var str_procString = this;
    var arr_replace = new Array('/', '"');
    var arr_replaceWith = new Array('\\/', '\\"');
    for (var i = 0; i < str_procString.length; i++) {
        var int_charCode = str_procString.charCodeAt(i);
        var cha_charAt = str_procString.charAt(i);
        var int_chrIndex = arr_replace.indexOf(cha_charAt);
        if (int_chrIndex > -1) {
            console.log(arr_replaceWith[int_chrIndex]);
            str_newString += arr_replaceWith[int_chrIndex];
        } else {
            if (int_charCode > 126 && int_charCode < 65536) {
                var hex_uniCode = '\\u' + ("000" + int_charCode.toString(16)).substr(-4);
                console.log(hex_uniCode + " (" + cha_charAt + ")");
                str_newString += hex_uniCode;
            } else {
                str_newString += cha_charAt;
            }
        }
    }
    return str_newString;
}
Use '\\u' + ('000' + str_procString.charCodeAt(i).toString(16)).substr(-4); instead to get the right escape sequences; yours always start with 00. Also, instead of processing your string in a for-loop, .replace() might be faster.
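A rough sketch of what the .replace() variant could look like. The function name escapeNonAscii is hypothetical, and I use a plain function and .slice(-4) rather than extending String.prototype with .substr(-4); the character class is my reading of the "> 126" check in the question. Whether it is actually faster would need measuring.

```javascript
// Hypothetical helper: escapes every UTF-16 code unit above 126 as \uXXXX.
function escapeNonAscii(str) {
    return str.replace(/[\u007f-\uffff]/g, function (ch) {
        // Left-pad the hex code to four digits, as suggested above.
        return '\\u' + ('000' + ch.charCodeAt(0).toString(16)).slice(-4);
    });
}

// Input written with escapes so the result does not depend on the file encoding.
console.log(escapeNonAscii('L\u00e4rm, L\u00fccken, L\u00f6cher.'));
// L\u00e4rm, L\u00fccken, L\u00f6cher.
```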
On your question:
console.log("Lärm, Lichter, Lücken, Löcher."); // Lärm, Lichter, Lücken, Löcher.
does not sound as if you really sent the file with the right encoding. It might be a server problem, too, if the file is already saved correctly.
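To illustrate how output like \u00c3\u00a4 can come about, here is a minimal sketch. It assumes the script was stored as UTF-8 but decoded one byte per character (as with Latin-1); that is my guess at the mechanism, not something the thread confirms. TextEncoder is assumed to be available as a global, and the variable names are illustrative.

```javascript
// UTF-8 encodes 'ä' (U+00E4) as the two bytes 0xC3 0xA4.
const utf8Bytes = new TextEncoder().encode('\u00e4');

// Treating every byte as its own character produces two characters instead of one.
const misdecoded = String.fromCharCode(...utf8Bytes);

// These are the code units that charCodeAt() would then report.
const codes = Array.from(misdecoded, function (ch) {
    return ch.charCodeAt(0).toString(16);
});
console.log(codes.join(' ')); // c3 a4
```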