I'm using node.js, node-mysql
npm module, have a utf8 column in mysql (sql_mode=STRICT_ALL_TABLES
), and can store "regular" utf8 chars normally, but for some input it throws this:
AssertionError: Error: ER_TRUNCATED_WRONG_VALUE_FOR_FIELD: Incorrect string value: '\xF3\xA5\xB5\xA5\xEF\xBF...'
I imagine this is due to some characters being outside utf8
. I know it's possible to use utf8mb
charset in mysql, but in this case I don't care about those characters and want to just remove them from a string instead of changing mysql config.
Is it possible to remove such characters using node.js?
RFC here features a list of unicode char codes along with the number of chars they take:
Char. number range | UTF-8 octet sequence
(hexadecimal) | (binary)
--------------------+---------------------------------------------
0000 0000-0000 007F | 0xxxxxxx
0000 0080-0000 07FF | 110xxxxx 10xxxxxx
0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
While the error message I've provided did not show any wide characters:
node -e 'console.log("\xF3\xA5\xB5\xA5\xEF\xBF")'
ó¥µ¥ï¿
And this page shows that all of those characters are 2 byte ones, I've still tried removing wide (3 or more octets) characters from my string with .replace(/[\u0800-\uFFFF]/g, '')
, and it worked!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With