//string with correct json format
{"reaction":"\ud83d\udc4d","user":{"id":"xyz"}}
//after JSON.parse()
{ reaction: '👍', user: [Object] }
What I want is to keep the reaction value encoded (as the \ud83d\udc4d escape sequence), but JSON.parse() decodes it into the emoji character instead.
Update
In the end I decided to leave JSON.parse() alone and fix the database issue, as @Brad suggested. Changing the database format was not enough on its own to fix the problem, so I kept looking and found this: every statement must now start with SET NAMES utf8mb4; followed by the query. The connection options must also include {charset : 'utf8mb4', multipleStatements: true}. Without proper node-mysql documentation it's quite hard to find the best answer, but I learned a lot along the way. Thank you.
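A minimal sketch of the setup described in the update. The host, user, password, database, and table names are placeholders, and withUtf8mb4 is a hypothetical helper, not part of node-mysql. The mysql package itself is not loaded here; the commented lines show roughly how these values would be passed to it.

```javascript
// Connection options from the update. Everything except charset and
// multipleStatements is a placeholder for illustration.
const connectionOptions = {
  host: "localhost",
  user: "app_user",
  password: "secret",
  database: "app_db",
  charset: "utf8mb4",       // 4-byte UTF-8, needed for emoji
  multipleStatements: true, // allows "SET NAMES ...; <query>" in one call
};

// Hypothetical helper: prefix a statement with SET NAMES utf8mb4;
const withUtf8mb4 = (sql) => `SET NAMES utf8mb4; ${sql}`;

console.log(withUtf8mb4("INSERT INTO reactions SET ?"));

// With the mysql package installed, usage would look roughly like this:
//   const mysql = require("mysql");
//   const connection = mysql.createConnection(connectionOptions);
//   connection.query(withUtf8mb4("INSERT INTO reactions SET ?"),
//                    [{ reaction: "\ud83d\udc4d" }], callback);
```

Note that multipleStatements widens the impact of any SQL injection bug, so values should still go through placeholders rather than string concatenation.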
If you don't want JSON.parse() to decode that string, then you could escape the backslashes, e.g. "\\ud83d\\udc4d".
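As a rough illustration of that suggestion, the sketch below doubles the backslash in front of each \uXXXX escape in the raw JSON text before parsing. The variable names are made up for the example, and it assumes the input does not already contain escaped backslashes in front of a "u".

```javascript
// The JSON text from the question. String.raw keeps the backslashes
// literal instead of letting the JS source decode them.
const rawJson = String.raw`{"reaction":"\ud83d\udc4d","user":{"id":"xyz"}}`;

// Turn each \uXXXX into \\uXXXX so JSON.parse reads a literal
// backslash followed by "uXXXX" rather than a Unicode escape.
const escapedJson = rawJson.replace(/\\u([0-9a-fA-F]{4})/g, "\\\\u$1");

const parsed = JSON.parse(escapedJson);

// The value is now the 12-character text \ud83d\udc4d, not the emoji.
console.log(parsed.reaction, parsed.reaction.length);
```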
Do you control where that data comes from? Perhaps you want to provide a "replacer" in JSON.stringify to escape those, or a "reviver" in JSON.parse.
What options do you have for exercising control over the stringify or parse?
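If you do control the stringify side, a replacer along these lines might work. This is a sketch only: toUnicodeEscapes and escapingReplacer are hypothetical names, and it assumes only the "reaction" property needs protecting.

```javascript
// Rewrite every non-ASCII UTF-16 code unit as the text \uXXXX.
// Without the "u" flag the regex matches surrogate halves one by one.
const toUnicodeEscapes = (text) =>
  text.replace(/[\u0080-\uffff]/g, (unit) =>
    "\\u" + unit.charCodeAt(0).toString(16).padStart(4, "0"));

// Replacer that only touches the "reaction" property.
const escapingReplacer = (key, value) =>
  key === "reaction" && typeof value === "string"
    ? toUnicodeEscapes(value)
    : value;

// "\ud83d\udc4d" in JS source is the thumbs-up emoji.
const source = { reaction: "\ud83d\udc4d", user: { id: "xyz" } };
const json = JSON.stringify(source, escapingReplacer);

// The JSON text now contains doubled backslashes, so a later
// JSON.parse yields the escape text instead of the emoji.
console.log(json);
console.log(JSON.parse(json).reaction);
```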
const myReviver = (key, val) => key === "reaction" ? val.replace(/\\/g, "\\\\") : val;
var safeObj = JSON.parse(myJson, myReviver);
CAUTION: This doesn't work, in a browser or anywhere else. JSON.parse decodes the \uxxxx escapes while parsing the string, before the reviver is called, so by the time the reviver runs there are no backslashes left to escape!
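A possible workaround, sketched under the same assumptions as the reviver above: since the reviver only ever sees the decoded character, it can re-encode the value from its UTF-16 code units instead of looking for backslashes. The name reencodingReviver and the sawBackslash flag are hypothetical and exist only for this example.

```javascript
const myJson = String.raw`{"reaction":"\ud83d\udc4d","user":{"id":"xyz"}}`;

// Records whether the reviver ever saw a backslash in the value.
let sawBackslash = null;

const reencodingReviver = (key, val) => {
  if (key !== "reaction" || typeof val !== "string") return val;
  sawBackslash = val.includes("\\"); // the escape is already decoded here
  // Rebuild the \uXXXX text from each non-ASCII UTF-16 code unit.
  return val
    .split("")
    .map((unit) => {
      const code = unit.charCodeAt(0);
      return code > 0x7f
        ? "\\u" + code.toString(16).padStart(4, "0")
        : unit;
    })
    .join("");
};

const safeObj = JSON.parse(myJson, reencodingReviver);
console.log(sawBackslash, safeObj.reaction);
```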
Following on from a chat with the OP, it transpired that adding multiple escaped backslashes to the property with utf characters did eventually lead to the desired value being stored in the database. Several steps along the way were each unescaping the backslashes, until the real utf character was eventually exposed.
This is brittle and far from advisable, but it did help to identify what was/wasn't to blame.
This appears to be the best solution: strip all backslashes from the data before it is converted into utf characters or processed in any way. In effect, the database then stores deactivated "uxxxxuxxxx" codes.
Those codes can be revived to utf characters at the point of rendering by reinserting the backslashes using a regular expression:
database_field.replace(/(u[0-9a-fA-F]{4})/g, "\\$1");
Ironically, that seems to skip utf interpretation and you actually end up with the string that was wanted in the first place. So to force it to deliver the character that was previously seen, it can be processed with:
emoji = JSON.parse(`{"utf": "${myUtfString}"}`).utf;
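Put together, the strip, restore, and decode steps might look like the sketch below. The variable names are illustrative only. Note the approach is fragile: stripping every backslash also removes other escapes, and the restore regex will match any "u" followed by four hex digits, including inside ordinary words.

```javascript
// Incoming value as escape text (String.raw keeps the backslashes).
const incoming = String.raw`\ud83d\udc4d`;

// 1. Strip the backslashes before storing: "ud83dudc4d".
const stored = incoming.replace(/\\/g, "");

// 2. At render time, reinsert a backslash before each uXXXX code.
const restored = stored.replace(/(u[0-9a-fA-F]{4})/g, "\\$1");

// 3. Let JSON.parse decode the escape text into the real character.
const emoji = JSON.parse(`{"utf": "${restored}"}`).utf;

console.log(stored, restored, emoji);
```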