
Sending large image data over HTTP in Node.js

In my development environment I have two servers. One sends an image to the other in an HTTP POST request.

Client server does this:

    fs.readFile(rawFile.path, 'binary', function (err, file) {
        restler.post("http://0.0.0.0:5000", {
            data: file,
            headers: {
                "Content-Type": rawFile.type,
            }
        }).on('complete', function (data, response) {
            console.log(data);
            res.send("file went through")
        })
    })

The server that receives the request does this:

    server.post('/', function (req, res, next) {
        fs.writeFileSync("test.png", req.body, "binary", function (err) {
            if (err) throw err;
            res.send("OK")
        })
    })

If I send a small image it works fine. However, if I send a large image, the file is saved with the correct size but only the top portion of the image is displayed. The rest is black.

I guess only the first chunk of the image is being written to the file. I've tried creating a readStream and a writeStream, but that doesn't seem to work:

req.body.pipe(fs.createWriteStream('test.png')) 

Can I stream directly from the binary data and pipe it into the file? From what I've seen, readStream is usually used to stream from files, not from raw binary data.

I read a few posts but it doesn't seem to work for me.

I'm using restler module in the client server and restify in the other.

Thanks!

Maroshii asked Feb 21 '13 12:02



1 Answer

Sorry to be blunt, but there's a lot wrong here.

readFile reads the entire contents of a file into memory before invoking the callback, at which point you begin uploading the file.

This is bad, especially when dealing with large files like images, because there's really no reason to read the whole file into memory. It's wasteful, and under load you'll find that your server runs out of memory and crashes.

Instead, you want to get a stream, which emits chunks of data as they're read from disk. All you have to do is pass those chunks along to your upload stream (pipe), and then discard the data from memory. In this way, you never use more than a small amount of buffer memory.

(A readable stream's default behavior is to deal in raw binary data; it's only if you pass an encoding that it deals in text.)
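To make the chunking concrete, here is a minimal sketch that copies a file through a read stream and counts the chunks as they pass. The helper name `streamCopy` is made up for illustration; only Node's built-in `fs` module is assumed.

```javascript
const fs = require('fs');

// Hypothetical helper: copy src to dest chunk by chunk and report what passed through.
// Resolves with { chunks, bytes } once the destination has been fully flushed.
function streamCopy(src, dest) {
  return new Promise((resolve, reject) => {
    let chunks = 0;
    let bytes = 0;

    // No encoding is passed, so each chunk is a raw Buffer rather than a string.
    const reader = fs.createReadStream(src);
    const writer = fs.createWriteStream(dest);

    // Observe the chunks; pipe() below still delivers the same data to the writer.
    reader.on('data', (chunk) => {
      chunks += 1;
      bytes += chunk.length;
    });

    reader.on('error', reject);
    writer.on('error', reject);
    writer.on('finish', () => resolve({ chunks, bytes }));

    // pipe() forwards each chunk and pauses the reader when the writer falls behind.
    reader.pipe(writer);
  });
}
```

Calling `streamCopy('big.png', 'copy.png')` on a large file should report many chunks, and at no point is more than a small part of the file held in memory.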

The request module makes this especially easy:

fs.createReadStream('test.png').pipe(request.post('http://0.0.0.0:5000/')); 
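If the request module isn't an option, roughly the same upload can be sketched with Node's built-in `http` module. The function name `uploadFile` and its parameters are assumptions for illustration, not part of any library.

```javascript
const fs = require('fs');
const http = require('http');

// Hypothetical helper: stream a file from disk as the body of a POST request.
// Resolves with the response status and body once the server has answered.
function uploadFile(filePath, url, contentType) {
  return new Promise((resolve, reject) => {
    const req = http.request(
      url,
      { method: 'POST', headers: { 'Content-Type': contentType } },
      (res) => {
        let body = '';
        res.setEncoding('utf8');
        res.on('data', (part) => { body += part; });
        res.on('end', () => resolve({ status: res.statusCode, body }));
      }
    );
    req.on('error', reject);

    const file = fs.createReadStream(filePath);
    file.on('error', (err) => {
      // Abort the request if the file can't be read.
      req.destroy();
      reject(err);
    });

    // pipe() writes each chunk to the request and ends it when the file is done.
    file.pipe(req);
  });
}
```

Because no Content-Length header is set, Node sends the body with chunked transfer encoding, so the file size never has to be known up front.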

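On the server, you have a larger problem. Never use *Sync methods. They block your server from doing anything (like responding to other requests) until the entire file is flushed to disk, which can take seconds. (writeFileSync also takes no callback, so the function you pass it never runs and res.send("OK") is never called.)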

So instead, we want to take the incoming data stream and pipe it to a filesystem stream. You were on the right track originally; the reason that req.body.pipe(fs.createWriteStream('test.png')) didn't work is that body is not a stream.

body is generated by the bodyParser middleware. In restify, that middleware acts much like readFile in that it buffers the entire incoming request-entity in memory. In this case, we don't want that. Disable the body parser middleware.

So where is the incoming data stream? It is the req object itself. restify's Request inherits from Node's http.IncomingMessage, which is a readable stream. So:

req.pipe(fs.createWriteStream('test.png'));
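For a fuller picture, here is a minimal sketch of a receiving server that pipes the request into a file and only replies once the data is on disk. It uses Node's built-in `http` module rather than restify so that it stands alone; the same `req.pipe(...)` call should carry over to a restify handler with the body parser disabled. The name `createUploadServer` is made up for this example.

```javascript
const fs = require('fs');
const http = require('http');

// Hypothetical helper: build a server that saves each POST body to destPath.
function createUploadServer(destPath) {
  return http.createServer((req, res) => {
    if (req.method !== 'POST') {
      req.resume(); // drain and discard any body
      res.statusCode = 405;
      res.end('POST only');
      return;
    }

    const out = fs.createWriteStream(destPath);

    out.on('error', () => {
      res.statusCode = 500;
      res.end('write failed');
    });

    // 'finish' fires after the last chunk has been handed to the filesystem.
    out.on('finish', () => res.end('OK'));

    // req is a readable stream: data flows from the socket straight to disk.
    req.pipe(out);
  });
}
```

Something like `createUploadServer('test.png').listen(5000)` would start it. Replying from the 'finish' handler, rather than right after calling pipe, means the client only hears "OK" once the whole file has been written.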

I should also mention that this all works so simply because there's no form parsing overhead involved. request simply sends the file with no multipart/form-data wrappers:

    POST / HTTP/1.1
    host: localhost:5000
    content-type: application/octet-stream
    Connection: keep-alive
    Transfer-Encoding: chunked

    <image data>...

This means that a browser's form upload could not post a file to this URL. If that's a need, look into formidable, which does streaming parsing of request-entities.

josh3736 answered Oct 09 '22 01:10
