
Nodejs asymmetrical buffer <-> string conversion

Tags:

node.js

buffer

In Node.js, I had naively expected the following to always output true:

let buff = Buffer.allocUnsafe(20); // Essentially random contents
let str = buff.toString('utf8');
let decode = Buffer.from(str, 'utf8');
console.log(0 === buff.compare(decode));

Given a Buffer buff, how can I detect ahead of time whether buff will be exactly equal to Buffer.from(buff.toString('utf8'), 'utf8')?

Gershom Maes asked Mar 25 '26 06:03



1 Answer

You should probably be fine just testing that the input buffer contains valid UTF-8 data:

try {
    new TextDecoder('utf-8', { fatal: true }).decode(buff);
    console.log(true);
} catch {
    console.log(false);
}
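The check above can be wrapped in a small reusable helper. This is a minimal sketch rather than part of the original answer: the name isValidUtf8 is hypothetical, and it assumes a Node build in which TextDecoder supports the fatal option (the standard builds with ICU do). The decoder is created outside the try block so that a construction failure is not mistaken for invalid input.

```javascript
// Hypothetical helper: returns true if `buff` holds well-formed UTF-8.
function isValidUtf8(buff) {
    // Created outside the try so an unsupported option surfaces as an error
    // instead of being reported as "invalid input".
    const decoder = new TextDecoder('utf-8', { fatal: true });
    try {
        decoder.decode(buff); // throws a TypeError on malformed input
        return true;
    } catch {
        return false;
    }
}

console.log(isValidUtf8(Buffer.from('héllo €', 'utf8'))); // valid multi-byte text
console.log(isValidUtf8(Buffer.from([0xff])));            // 0xff never occurs in UTF-8
console.log(isValidUtf8(Buffer.from([0xe2, 0x82])));      // truncated 3-byte sequence
```

The first call should report true and the other two false, since 0xff is not a legal UTF-8 byte and 0xe2 0x82 is the start of a three-byte sequence with its last byte missing.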

That said, I wouldn't swear that Node is 100% consistent in how it handles invalid UTF-8 data when converting from string to buffer. If you want to be safe, you'll have to stick to buffer comparison. You can make the encode/decode round trip a little more efficient by using transcode, which does not require creating a temporary string:

import { transcode } from 'buffer';

let buff = Buffer.allocUnsafe(20);
let decode = transcode(buff, 'utf8', 'utf8');
console.log(0 === buff.compare(decode));
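To get some confidence that the validity test and the buffer comparison agree, the two can be run side by side on a few hand-picked inputs. The following is a sketch under assumptions: the helper names and the sample byte sequences are illustrative, and it uses the string round trip from the question rather than transcode. Agreement on these samples is evidence, not a proof of equivalence.

```javascript
// Hypothetical helper: strict UTF-8 validation via TextDecoder.
function isValidUtf8(buff) {
    const decoder = new TextDecoder('utf-8', { fatal: true });
    try {
        decoder.decode(buff);
        return true;
    } catch {
        return false;
    }
}

// Hypothetical helper: the round trip from the question, compared byte for byte.
function roundTrips(buff) {
    return buff.equals(Buffer.from(buff.toString('utf8'), 'utf8'));
}

// Illustrative inputs: valid text plus several classic malformed sequences.
const samples = [
    Buffer.from('plain ascii', 'utf8'),
    Buffer.from('héllo € 😀', 'utf8'),
    Buffer.alloc(0),
    Buffer.from([0xff]),             // byte that never occurs in UTF-8
    Buffer.from([0xc0, 0x80]),       // overlong encoding of U+0000
    Buffer.from([0xed, 0xa0, 0x80]), // encoded UTF-16 surrogate
    Buffer.from([0xe2, 0x82]),       // truncated 3-byte sequence
];

// Collect every sample on which the two checks give different answers.
const disagreements = samples.filter((s) => isValidUtf8(s) !== roundTrips(s));

for (const s of samples) {
    console.log(s.toString('hex') || '(empty)', isValidUtf8(s), roundTrips(s));
}
console.log('disagreements:', disagreements.length);
```

Malformed input fails the round trip because toString('utf8') replaces each invalid sequence with U+FFFD, which re-encodes as the bytes EF BF BD instead of the original ones.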

If you're interested in how TextDecoder determines whether a buffer represents a valid UTF-8 string, the rigorous definition of this procedure can be found here.

GOTO 0 answered Mar 27 '26 19:03



