
Reading binary data in node.js

I'm having problems reading binary data in node.js. This is what I do:

$ cat test.js 
var fs = require('fs'),
    binary = fs.readFileSync('./binary', 'binary').toString('binary');
process.stdout.write(binary.substring(0, 48));
$ xxd binary
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0300 3e00 0100 0000 0008 0000 0000 0000  ..>.............
00000020: 4000 0000 0000 0000 10a0 0000 0000 0000  @...............
$ node test.js | xxd
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0300 3e00 0100 0000 0008 0000 0000 0000  ..>.............
00000020: 4000 0000 0000 0000 10c2 a000 0000 0000  @...............
00000030: 00                                       .
$

Notice how a 0xc2 byte is inserted at index 0x29 when reading with node. Why is that? I've specified the binary encoding for both readFileSync and toString. I've also tried ascii, but then I get a different and equally wrong result.

Robert Larsen asked Sep 27 '17 07:09


2 Answers

The 'binary' encoding is an alias for 'latin1', which you clearly don't want when reading non-character data.

If you want the raw data, don't specify an encoding at all (or supply null)*. You'll get a Buffer instead of a string, which you'd then want to use directly rather than using toString on it.

* (Some APIs, such as fs.watch, also accept 'buffer', but it isn't on the documented list of encodings, and the readFileSync documentation doesn't say it accepts it. Thanks to Patrick for providing the link to that list.)
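
As a rough illustration of the difference, the sketch below reads the same file with and without an encoding. The temporary file name and its two-byte content (the bytes at offset 0x28 of the question's dump) are made up for the example; nothing here is taken from the asker's actual setup.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical sample file holding the two bytes 0x10 0xa0.
const samplePath = path.join(os.tmpdir(), 'binary-sample-' + process.pid + '.bin');
fs.writeFileSync(samplePath, Buffer.from([0x10, 0xa0]));

// No encoding argument: the result is a Buffer with the bytes unchanged.
const raw = fs.readFileSync(samplePath);

// 'latin1' (alias 'binary'): the result is a string, one character per byte.
const text = fs.readFileSync(samplePath, 'latin1');

console.log(raw);                      // <Buffer 10 a0>
console.log(typeof text, text.length); // string 2

// Clean up the sample file.
fs.unlinkSync(samplePath);
```

The Buffer can be passed straight to process.stdout.write(); the string would have to be encoded again first, which is where the extra byte comes from.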

T.J. Crowder answered Sep 18 '22 16:09


Just to add some more information: this happens because you're passing a string to stdout.write(), which implicitly converts it back into a Buffer before writing, using UTF-8 by default. The character U+00A0, which came from the byte 0xa0 at offset 0x29 of your file, is encoded in UTF-8 as the two bytes c2 a0. You can reproduce the behavior you described in the Node.js REPL with the two characters at position 0x28 of your binary file:

> Buffer.from('\u0010\u00a0')
<Buffer 10 c2 a0>
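
To make the encoding step explicit, here is a small sketch that converts the same two-character string to bytes twice, once with the default encoding and once with 'latin1'. The variable names are only for illustration.

```javascript
// The two characters found at offset 0x28 after a 'latin1' read.
const text = '\u0010\u00a0';

// Default encoding is 'utf8': U+00A0 becomes the two bytes c2 a0.
const asUtf8 = Buffer.from(text);

// 'latin1' maps each character back to a single byte.
const asLatin1 = Buffer.from(text, 'latin1');

console.log(asUtf8);   // <Buffer 10 c2 a0>
console.log(asLatin1); // <Buffer 10 a0>
```

Only the 'latin1' conversion gives back the original bytes, so a string produced by a 'latin1' read would also have to be written as 'latin1' to survive the round trip.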

So as @T.J.Crowder correctly suggested, here's how to fix your script:

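var fs = require('fs'),
    // No encoding argument, so readFileSync returns a Buffer of raw bytes.
    binary = fs.readFileSync('./binary');
// Write the first 48 bytes as they are; no string conversion takes place.
process.stdout.write(binary.slice(0, 48));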

This also uses Buffer#slice() instead of String#substring().
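
The difference between the two calls can be sketched as follows. The 48 bytes are rebuilt from the xxd dump in the question, so the example does not need the original file; the variable names are hypothetical.

```javascript
// The first 48 bytes of the file, copied from the question's xxd output.
const hex =
  '7f454c46020101000000000000000000' +
  '03003e00010000000008000000000000' +
  '400000000000000010a0000000000000';
const bytes = Buffer.from(hex, 'hex');

// Buffer#slice() counts bytes, and the result is still raw bytes.
const viaBuffer = bytes.slice(0, 48);

// String#substring() counts characters; turning those 48 characters
// back into bytes as UTF-8 expands the one character above U+007F.
const viaString = Buffer.from(bytes.toString('latin1').substring(0, 48));

console.log(viaBuffer.length); // 48
console.log(viaString.length); // 49
```

The 49-byte result corresponds to the extra line at offset 0x30 in the question's node output.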

Patrick Roberts answered Sep 17 '22 16:09