
How does simply piping to the response object render data to the client?

In the example code in this article, how is the last segment of the stream working on the line:

fs.createReadStream(filePath).pipe(brotli()).pipe(res)

I understand that the first part reads the file and the second compresses it, but what does .pipe(res) do? It seems to do the job I'd usually do with res.send or res.sendFile.

Full code:

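const fs = require('fs')
const path = require('path')
const express = require('express')
const accepts = require('accepts')
const brotli = require('iltorb').compressStream

function onRequest (req, res) {
  res.setHeader('Content-Type', 'text/html')
  const fileName = req.params.fileName
  const filePath = path.resolve(__dirname, 'files', fileName)
  // Encodings the client lists in its Accept-Encoding header
  const encodings = new Set(accepts(req).encodings())
  if (encodings.has('br')) {
    res.setHeader('Content-Encoding', 'br')
    // read file -> compress with Brotli -> write to the response
    fs.createReadStream(filePath).pipe(brotli()).pipe(res)
  } else {
    // Without this fallback, a client that doesn't accept Brotli never gets a response
    fs.createReadStream(filePath).pipe(res)
  }
}

const app = express()
app.use('/files/:fileName', onRequest)
app.listen(5000)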

localhost:5000/files/test.txt => Browser displays text contents of that file

How does simply piping the data to the response object render the data back to the client?

† which I changed slightly to use Express, along with a few other minor changes.

Asked by 1252748 on Dec 31 '17 at 01:12


1 Answer

"How does simply piping the data to the response object render the data back to the client?"

The phrase "the response object" in the question suggests the asker is trying to understand why piping data from a stream to res does anything at all. The misconception is that res is just a plain object.

Piping to res works because every Express response (res) inherits from http.ServerResponse (on this line), which is a writable stream. Whenever data is written to res, http.ServerResponse handles it and sends it to the client as the response body.
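To see that writing to res is all it takes, here's a minimal sketch that uses only Node's built-in http module (no Express) and nothing but the stream methods write and end on res. The fetchGreeting helper and the greeting text are made up for the illustration.

```javascript
const http = require('http');

// Hypothetical helper: starts a throwaway server, requests it once,
// and resolves with the body the client received.
function fetchGreeting() {
  return new Promise((resolve, reject) => {
    const server = http.createServer((req, res) => {
      res.setHeader('Content-Type', 'text/plain');
      res.write('hello '); // queued for sending to the client as body data
      res.end('world');    // writes the last chunk and finishes the response
    });

    server.on('error', reject);
    // Port 0 lets the OS pick any free port.
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      // agent: false avoids a kept-alive socket holding the server open.
      http.get({ host: '127.0.0.1', port, path: '/', agent: false }, (clientRes) => {
        let body = '';
        clientRes.setEncoding('utf8');
        clientRes.on('data', (chunk) => { body += chunk; });
        clientRes.on('end', () => {
          server.close();
          resolve(body);
        });
      }).on('error', reject);
    });
  });
}

fetchGreeting().then((body) => console.log(body)); // hello world
```

Nothing here calls res.send or res.sendFile; the client still receives the body because the writes go straight to the response stream.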

Internally, res.send just writes its argument to the underlying stream, which is res itself, and res.sendFile pipes the data read from the file into res.

In case the act of "piping" data from one stream to another is unclear, see the section at the bottom.


If, instead, the flow of data from file to client isn't clear to the asker, then here's a separate explanation.

I'd say the first step to understanding this line is to break it up into smaller, more understandable fragments:

First, fs.createReadStream is used to get a readable stream of a file's contents.

const fileStream = fs.createReadStream(filePath);

Next, a transform stream that transforms data into a compressed format is created and the data in the fileStream is "piped" (passed) into it.

const compressionStream = brotli();
fileStream.pipe(compressionStream);

Finally, the data that passes through the compressionStream (the transform stream) is piped into the response, which is also a writable stream.

compressionStream.pipe(res);

The process is quite simple when laid out visually:

stream flowchart

Following the flow of data is now quite simple: the data first comes from a file, through a compressor, and finally to the response, which internally sends the data back to the client.
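The same three-stage flow can be tried without a server or a file on disk. This is only a sketch under a few assumptions: Node's built-in zlib.createBrotliCompress stands in for iltorb's brotli(), Readable.from stands in for fs.createReadStream, and a small collecting writable (createCollector, a made-up helper) stands in for res.

```javascript
const { Readable, Writable } = require('stream');
const zlib = require('zlib');

// Stand-in for res: a writable stream that just collects what it receives.
function createCollector() {
  const chunks = [];
  const collector = new Writable({
    write(chunk, encoding, callback) {
      chunks.push(chunk);
      callback();
    },
  });
  collector.getBuffer = () => Buffer.concat(chunks);
  return collector;
}

function compressThroughPipes(text) {
  return new Promise((resolve, reject) => {
    const source = Readable.from([Buffer.from(text)]);     // stand-in for the file stream
    const compressionStream = zlib.createBrotliCompress(); // stand-in for brotli()
    const fakeRes = createCollector();                      // stand-in for res

    compressionStream.on('error', reject);
    fakeRes.on('finish', () => resolve(fakeRes.getBuffer()));

    source.pipe(compressionStream).pipe(fakeRes);
  });
}

compressThroughPipes('<h1>Hello</h1>').then((compressed) => {
  // Decompressing what arrived at the end of the pipe gives the input back.
  console.log(zlib.brotliDecompressSync(compressed).toString()); // <h1>Hello</h1>
});
```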

Wait, but how does the compression stream pipe into the response stream?

The answer is that pipe returns the destination stream. That means when you do a.pipe(b), you'll get b back from the method call.

Take the line a.pipe(b).pipe(c) for example. First, a.pipe(b) is evaluated, returning b. Then, .pipe(c) is called on the result of a.pipe(b), which is b, thus being equivalent to b.pipe(c).

pipe flowchart

a.pipe(b).pipe(c);

// is the same as

a.pipe(b); // returns `b`
b.pipe(c);

// is the same as

(a.pipe(b)).pipe(c);

The wording "simply piping the data to the response object" in the question could also suggest that the asker doesn't understand the flow of the data and thinks it goes directly from a to c. Instead, the above should clarify that the data goes from a to b, then from b to c: fileStream to compressionStream, then compressionStream to res.
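The return value of pipe is easy to check directly. In this small sketch, three PassThrough streams play the roles of a, b and c:

```javascript
const { PassThrough } = require('stream');

const a = new PassThrough();
const b = new PassThrough();
const c = new PassThrough();

const returnedByFirstPipe = a.pipe(b);                    // connects a -> b
const returnedBySecondPipe = returnedByFirstPipe.pipe(c); // connects b -> c

console.log(returnedByFirstPipe === b);  // true
console.log(returnedBySecondPipe === c); // true
```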


A Code Analogy

If the whole process still makes no sense, it might be beneficial to rewrite the process without the concept of streams:

First, the data is read from the file.

const fileContents = fs.readFileSync(filePath);

The fileContents are then compressed. This is done using some compress function.

function compress(data) {
    // ...
}

const compressedData = compress(fileContents);

Finally, the data is sent back to the client through the response res.

res.send(compressedData);

The original line of code in the question and the above process are more or less the same, barring the inclusion of streams in the original.

The act of taking some data in from an outside source (fs.readFileSync) is like a readable Stream. The act of transforming the data (compress) via a function is like a transform Stream. The act of sending the data to an outside destination (res.send) is like a writable Stream.
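For a version of the analogy that actually runs, one option is to fill in compress with Node's built-in zlib.brotliCompressSync. That choice is an assumption made for this sketch (the original uses iltorb's stream), and the inline buffer stands in for the file contents.

```javascript
const zlib = require('zlib');

// Hypothetical body for the `compress` stub, using Node's built-in Brotli.
function compress(data) {
  return zlib.brotliCompressSync(data);
}

// Stand-in for fs.readFileSync(filePath)
const fileContents = Buffer.from('<h1>Hello</h1>');

const compressedData = compress(fileContents);
// In a request handler, this is where res.send(compressedData) would go.

// Decompressing gives the original contents back.
console.log(zlib.brotliDecompressSync(compressedData).toString()); // <h1>Hello</h1>
```

The practical difference is that this version holds the whole file and the whole compressed result in memory at once, while the stream version handles the data chunk by chunk.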


"Streams are Confusing"

If you're confused about how streams work, here's a simple analogy: each type of stream can be thought of in the context of water (data) flowing down the side of a mountain from a lake on the top.

  • Readable streams are like the lake on the top, the source of the water (data).
  • Writable streams are like people or plants at the bottom of the mountain, consuming the water (data).
  • Duplex streams are just streams that are both Readable and Writable. They'd be akin to a facility at the bottom that takes in water and puts out some type of product (e.g. purified water, carbonated water, etc.).
  • Transform streams are also Duplex streams. They're like rocks or trees on the side of the mountain, forcing the water (data) to take a different path to get to the bottom.

A convenient way of writing all data read from a readable stream directly to a writable stream is to just pipe it, which directly connects the lake to the people.

readable.pipe(writable); // easy & simple

This is in contrast to reading data from the readable stream, then manually writing it to the writable stream:

// "pipe" data from a `readable` stream to a `writable` one.
readable.on('data', (chunk) => {
    writable.write(chunk);
});
readable.on('end', () => writable.end());
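One thing the manual version above leaves out is backpressure: pipe also slows the readable down when the writable can't keep up. Here's a rough sketch of that part, assuming a made-up manualPipe helper and a deliberately slow writable; it's an illustration of the idea, not how pipe is actually implemented.

```javascript
const { Readable, Writable } = require('stream');

// Hypothetical helper: a hand-rolled pipe that respects backpressure.
function manualPipe(readable, writable) {
  readable.on('data', (chunk) => {
    // write() returns false once the writable's internal buffer is full.
    if (!writable.write(chunk)) {
      readable.pause();                                // stop reading...
      writable.once('drain', () => readable.resume()); // ...until the buffer empties
    }
  });
  readable.on('end', () => writable.end());
  return writable; // mirrors pipe(), which returns the destination
}

const received = [];
const slowWritable = new Writable({
  highWaterMark: 1, // tiny buffer so that every write triggers backpressure
  write(chunk, encoding, callback) {
    received.push(chunk.toString());
    setImmediate(callback); // pretend each write takes a moment
  },
});

manualPipe(Readable.from(['a', 'b', 'c']), slowWritable)
  .on('finish', () => console.log(received.join(''))); // abc
```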

You might wonder how Transform streams differ from Duplex streams if a Transform stream is also a Duplex stream. The difference lies in how they're implemented.

A Transform stream implements a _transform function that takes in written data and produces the readable data from it, so its output is derived from its input. A plain Duplex stream is simply both a Readable and a Writable stream, so it has to implement _read and _write separately, and its two sides don't have to be related.
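To make the difference concrete, here's a small sketch with one class of each kind. The class names (Upcase, Independent) and the readAll helper are invented for the example.

```javascript
const { Transform, Duplex } = require('stream');

// Transform: whatever is written comes out the readable side, transformed.
class Upcase extends Transform {
  _transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  }
}

// Duplex: the readable and writable sides are implemented separately.
class Independent extends Duplex {
  constructor(options) {
    super(options);
    this.received = [];
  }
  _read() {
    this.push('from the readable side');
    this.push(null); // no more data to read
  }
  _write(chunk, encoding, callback) {
    this.received.push(chunk.toString()); // written data is only stored
    callback();
  }
}

// Hypothetical helper: collects everything a stream emits into one string.
function readAll(stream) {
  return new Promise((resolve, reject) => {
    let out = '';
    stream.on('data', (chunk) => { out += chunk.toString(); });
    stream.on('end', () => resolve(out));
    stream.on('error', reject);
  });
}

const upcase = new Upcase();
readAll(upcase).then((out) => console.log(out)); // HELLO
upcase.end('hello');

const independent = new Independent();
readAll(independent).then((out) => console.log(out)); // from the readable side
independent.end('to the writable side'); // ends up in independent.received only
```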

Answered by Clavin on Oct 06 '22 at 08:10