NodeJS - How to stream request body without buffering

In the code below I can't figure out why req.pipe(res) doesn't work, yet doesn't throw an error either. A hunch tells me it's due to Node.js's async behavior, but this is a very simple case without a callback.

What am I missing?

http.createServer(function (req, res) {

  res.writeHead(200, { 'Content-Type': 'text/plain' });

  res.write('Echo service: \nUrl:  ' + req.url);
  res.write('\nHeaders:\n' + JSON.stringify(req.headers, true, 2));

  res.write('\nBody:\n'); 

  req.pipe(res); // does not work

  res.end();

}).listen(8000);

Here's the curl:

➜  ldap-auth-gateway git:(master) ✗ curl -v -X POST --data "test.payload" --header "Cookie:  token=12345678" --header "Content-Type:text/plain" localhost:9002 

Here's the debug output (note that the body was uploaded):

  About to connect() to localhost port 9002 (#0)
  Trying 127.0.0.1...
    connected
    Connected to localhost (127.0.0.1) port 9002 (#0)
  POST / HTTP/1.1
  User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8x zlib/1.2.5
  Host: localhost:9002
  Accept: */*
  Cookie:  token=12345678
  Content-Type:text/plain
  Content-Length: 243360
  Expect: 100-continue

  HTTP/1.1 100 Continue
  HTTP/1.1 200 OK
  Content-Type: text/plain
  Date: Sun, 04 Aug 2013 17:12:39 GMT
  Connection: keep-alive
  Transfer-Encoding: chunked

And the service responds without echoing the request body:

Echo service: 
Url:  /
Headers:
{
  "user-agent": "curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8x zlib/1.2.5",
  "host": "localhost:9002",
  "accept": "*/*",
  "cookie": "token=12345678",
  "content-type": "text/plain",
  "content-length": "243360",
  "expect": "100-continue"
}

... and the final curl debug output is:

Body:
 Connection #0 to host localhost left intact
 Closing connection #0

Additionally, when I stress test with a large request body, I get an EPIPE error. How can I avoid this?

-- EDIT: Through trial and error I did get this to work, and it still points to a timing issue. It is still strange, though: adding a timeout causes the payload to be returned, but the timeout duration doesn't seem to matter. In other words, whether I set the timeout to 5 seconds or 500 seconds, the payload is properly piped back to the client and the connection is terminated.

Here's the edit:

http.createServer(function (req, res) {

    try {
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.write('Echo service: ' + req.url + '\n' + JSON.stringify(req.headers, true, 2));
      res.write('\nBody:\n');
      req.pipe(res);
    } catch(ex) {
      console.log(ex);
      // how to change response code to error here?  since headers have already been written?
    } finally {
      setTimeout((function() {
        res.end();
      }), 500000);
    }

}).listen(TARGET_SERVER.port);

asked Jul 31 '13 by Robert Christian

2 Answers

Pipe req into res: req is a readable stream and res is a writable stream. This works on its own, as long as you don't call res.end() yourself afterwards:

   var http = require('http');

   http.createServer(function (req, res) {

       res.writeHead(200, { 'Content-Type': 'text/plain' });
       res.write('Echo service: ' + req.url + '\n' + JSON.stringify(req.headers, null, 2));

       // pipe the request body directly into the response body;
       // pipe() also ends the response once the request body has been fully read
       req.pipe(res);

   }).listen(9002);

answered by Chandu

So first, it looks like your curl command is off: the filename of the posted data should be preceded by an @, otherwise you'd just be posting the literal filename as the body.
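
For illustration, assuming test.payload is an actual file in the working directory (which the Content-Length in the debug output suggests), the corrected command would be:

    curl -v -X POST --data @test.payload --header "Cookie: token=12345678" --header "Content-Type: text/plain" localhost:9002

Note that curl's --data strips newlines when reading from a file; --data-binary @test.payload sends it byte-for-byte.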

Aside from that, Chandu is correct in saying that the call to res.end() is the problem here.

Since IO is asynchronous in Node, when you issue the .pipe call, control returns immediately to the current context while the pipe does its work in the background. When you then call res.end(), you close the stream and prevent any more data from being written.

The solution here is to let .pipe end the stream itself, which is the default.
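
If you genuinely need to run code after the body has gone through (say, to append a trailer the way the original snippet tries to), a minimal sketch is to disable pipe's automatic end and finish the response yourself on the request's 'end' event. The port matches the one the question's curl command targets; the trailer string is just a placeholder:

    var http = require('http');

    http.createServer(function (req, res) {
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.write('Body:\n');

      // keep the response open after the request body has been piped through
      req.pipe(res, { end: false });

      // 'end' fires once the request body has been fully consumed
      req.on('end', function () {
        res.end('\n-- end of echo --');
      });
    }).listen(9002);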

I'd imagine that timing came into play because, on different machines and with different data sizes, the asynchronous IO could theoretically finish (fast IO of a small dataset) before the end event on the writable stream was fully processed.

I'd recommend this blog post for some more context.
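
As for the EPIPE errors the question mentions under load: EPIPE generally means the client closed its end of the connection while the server was still writing. Treat the following as a sketch of one common mitigation, not something either answer proposes: attach 'error' handlers to both streams so the otherwise-unhandled 'error' event doesn't take the process down.

    var http = require('http');

    http.createServer(function (req, res) {
      // EPIPE / ECONNRESET surface here when the client disappears mid-transfer;
      // without a listener, an 'error' event on a stream is thrown and crashes Node
      req.on('error', function (err) { console.error('request error:', err.message); });
      res.on('error', function (err) { console.error('response error:', err.message); });

      res.writeHead(200, { 'Content-Type': 'text/plain' });
      req.pipe(res);
    }).listen(9002);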

answered by Wyatt