I'm considering building an application in Node.js which would need to stream large (>1 GB) files containing an array of integers. Crucially, the array needs to be serialised compactly, so not ASCII-based: ideally using 8 bits for smaller integers (which would be the vast majority of the data) while still being able to represent larger numbers.
This question is maybe about more than Node.js, but how does one go about this in Node.js? Are there readily available solutions for streaming files with custom byte encodings from disk? Or, better, integer arrays?
Ideally, decoding each part of the stream should be disk-bound rather than CPU-bound, even with an SSD.
I feel silly for not diving into the documentation first (the purpose of this project is for me to learn Node.js, after all).
It turns out the default behaviour of the File System module looks up to the job, though I haven't implemented the variable-length quantity decoding part or tested it for speed yet.
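var fs = require('fs');
var rs = fs.createReadStream('/Path/to/file');
var bufferSize = 10;
// read() returns null until the stream has buffered data, so wait for 'readable'
rs.on('readable', function () {
  var buffer, i, byte;
  // read(size) returns null when fewer than size bytes are buffered, except at end of stream
  while ((buffer = rs.read(bufferSize)) !== null) {
    for (i = 0; i < buffer.length; i++) {
      byte = buffer[i];
      // interpret byte given as integer according to 'variable-length quantity' encoding
    }
  }
});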
http://en.wikipedia.org/wiki/Variable-length_quantity
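For what it's worth, here is a rough sketch of what the variable-length quantity step could look like. It is not the script from the gist. It assumes the layout described on the Wikipedia page (7 data bits per byte, most significant group first, high bit set on every byte except the last one of each integer) and non-negative integers small enough to stay exact as JavaScript numbers. The names `encodeVlq` and `createVlqDecoder` are made up for illustration.

```javascript
// Hypothetical helpers, assuming big-endian 7-bit groups with a continuation bit.

// Encode one non-negative integer as a Buffer of VLQ bytes.
function encodeVlq(n) {
  var bytes = [n % 128];               // last byte: continuation bit clear
  n = Math.floor(n / 128);
  while (n > 0) {
    bytes.unshift((n % 128) | 0x80);   // earlier bytes: continuation bit set
    n = Math.floor(n / 128);
  }
  return Buffer.from(bytes);
}

// Returns a function that decodes successive chunks and calls onInteger
// for every completed integer. The partial value is kept between calls,
// so an integer may be split across two chunks.
function createVlqDecoder(onInteger) {
  var value = 0;
  return function decodeChunk(buffer) {
    for (var i = 0; i < buffer.length; i++) {
      var byte = buffer[i];
      // multiply instead of shifting so values above 2^31 are not truncated
      value = value * 128 + (byte & 0x7f);
      if ((byte & 0x80) === 0) {       // high bit clear: integer is complete
        onInteger(value);
        value = 0;
      }
    }
  };
}
```

The function returned by `createVlqDecoder` can be called with each buffer that comes back from `rs.read()`. Small values (0 to 127) take one byte, and the per-byte work is a multiply, a mask and a compare, so it has a reasonable chance of keeping up with the disk, though that is untested.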
EDIT: I made a gist of the fully functioning script.