 

Node.js: How to read a stream into a buffer?

Tags:

node.js

I wrote a pretty simple function that downloads an image from a given URL, resizes it, and uploads it to S3 (using 'gm' and 'knox'). I have no idea if I'm doing the reading of a stream into a buffer correctly. (Everything works, but is it the correct way?)

Also, I want to understand something about the event loop: how do I know that one invocation of the function won't leak anything, or change the 'buf' variable of another invocation that is already running? (Or is this scenario impossible because the callbacks are anonymous functions?)

var http = require('http');
var https = require('https');
var s3 = require('./s3');
var gm = require('gm');

module.exports.processImageUrl = function(imageUrl, filename, callback) {
    var client = http;
    if (imageUrl.substr(0, 5) == 'https') {
        client = https;
    }

    client.get(imageUrl, function(res) {
        if (res.statusCode != 200) {
            return callback(new Error('HTTP Response code ' + res.statusCode));
        }

        gm(res)
            .geometry(1024, 768, '>')
            .stream('jpg', function(err, stdout, stderr) {
                if (!err) {
                    var buf = new Buffer(0);
                    stdout.on('data', function(d) {
                        buf = Buffer.concat([buf, d]);
                    });

                    stdout.on('end', function() {
                        var headers = {
                            'Content-Length': buf.length
                            , 'Content-Type': 'Image/jpeg'
                            , 'x-amz-acl': 'public-read'
                        };

                        s3.putBuffer(buf, '/img/d/' + filename + '.jpg', headers, function(err, res) {
                            if (err) {
                                return callback(err);
                            } else {
                                return callback(null, res.client._httpMessage.url);
                            }
                        });
                    });
                } else {
                    callback(err);
                }
            });
    }).on('error', function(err) {
        callback(err);
    });
};
asked Jan 10 '13 by Gal Ben-Haim


2 Answers

Overall I don't see anything that would break in your code.

Two suggestions:

The way you are combining Buffer objects is suboptimal, because it has to copy all of the pre-existing data on every 'data' event. It would be better to put the chunks in an array and concat them all at the end.

var bufs = [];
stdout.on('data', function(d) { bufs.push(d); });
stdout.on('end', function() {
    var buf = Buffer.concat(bufs);
});

For performance, I would look into whether the S3 library you are using supports streams. Ideally you wouldn't need to create one large buffer at all, and would instead just pass the stdout stream directly to the S3 library.
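For illustration only, here is a rough sketch of that idea with knox (not from the original answer). It assumes your './s3' wrapper exposes the underlying knox client as s3.client, which is a made-up name, and that the upload size is known up front, since a plain S3 PUT wants a Content-Length:

gm(res)
    .geometry(1024, 768, '>')
    .stream('jpg', function(err, stdout, stderr) {
        if (err) return callback(err);

        var headers = {
            'Content-Length': knownLength, // hypothetical: you would need the size in advance
            'Content-Type': 'image/jpeg',
            'x-amz-acl': 'public-read'
        };

        // knox can pipe a readable stream straight to S3 via putStream,
        // so no intermediate Buffer is ever allocated.
        s3.client.putStream(stdout, '/img/d/' + filename + '.jpg', headers, function(err, s3res) {
            if (err) return callback(err);
            callback(null, s3res);
        });
    });

The catch is exactly that Content-Length requirement, which is what the update below works around.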

As for the second part of your question, that isn't possible. When a function is called, it is allocated its own private context, and everything defined inside it is only accessible from other code defined inside that function. Each invocation therefore gets its own 'buf'; one call cannot touch another's.
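To make that concrete, here is a small illustration (not from the original post) of why each call gets its own variables:

function makeCounter(label) {
    // Each call to makeCounter gets its own private `count`;
    // one invocation can never see or modify another's copy.
    var count = 0;
    return function() {
        count++;
        console.log(label + ': ' + count);
    };
}

var a = makeCounter('a');
var b = makeCounter('b');
a(); // a: 1
a(); // a: 2
b(); // b: 1  (b's count is independent of a's)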

Update

Dumping the file to the filesystem would probably mean less memory usage per request, but file IO can be pretty slow so it might not be worth it. I'd say that you shouldn't optimize too much until you can profile and stress-test this function. If the garbage collector is doing its job you may be overoptimizing.

With all that said, there are better ways anyway, so don't use files. Since all you want is the length, you can calculate that without needing to append all of the buffers together, so then you don't need to allocate a new Buffer at all.

var pause_stream = require('pause-stream');

// Your other code.

var bufs = [];
stdout.on('data', function(d) { bufs.push(d); });
stdout.on('end', function() {
    var contentLength = bufs.reduce(function(sum, buf) {
        return sum + buf.length;
    }, 0);

    // Create a stream that will emit your chunks when resumed.
    var stream = pause_stream();
    stream.pause();
    while (bufs.length) stream.write(bufs.shift());
    stream.end();

    var headers = {
        'Content-Length': contentLength,
        // ...
    };

    s3.putStream(stream, ....);
});
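As a side note (not part of the original answer), on reasonably recent Node versions the built-in stream.PassThrough can play the same role as pause-stream here: it buffers whatever you write until a consumer starts reading.

var stream = require('stream');

stdout.on('end', function() {
    var contentLength = bufs.reduce(function(sum, buf) {
        return sum + buf.length;
    }, 0);

    // PassThrough holds the written chunks internally until they are read,
    // so it can stand in for pause-stream in the snippet above.
    var pass = new stream.PassThrough();
    while (bufs.length) pass.write(bufs.shift());
    pass.end();

    // hand `pass` and contentLength to the S3 library as before
});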
answered Sep 20 '22 by loganfsmyth


JavaScript snippet

function stream2buffer(stream) {
    return new Promise((resolve, reject) => {
        const _buf = [];

        stream.on("data", (chunk) => _buf.push(chunk));
        stream.on("end", () => resolve(Buffer.concat(_buf)));
        stream.on("error", (err) => reject(err));
    });
}
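For example, buffering a file stream with it could look like this (usage sketch, not part of the original answer; the path is made up):

const fs = require('fs');

// Read an entire file (or any readable stream) into a single Buffer.
stream2buffer(fs.createReadStream('./image.jpg'))
    .then((buf) => console.log('got ' + buf.length + ' bytes'))
    .catch((err) => console.error(err));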

TypeScript snippet

import { Stream } from "stream";

async function stream2buffer(stream: Stream): Promise<Buffer> {
    return new Promise<Buffer>((resolve, reject) => {
        const _buf = Array<any>();

        stream.on("data", (chunk) => _buf.push(chunk));
        stream.on("end", () => resolve(Buffer.concat(_buf)));
        stream.on("error", (err) => reject(`error converting stream - ${err}`));
    });
}
answered Sep 18 '22 by bsorrentino