Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Stream highWaterMark misunderstanding

Tags:

node.js

After reading some code on Github, it seems like i misandurstood how the highWaterMark concept works.

On a case of a writable stream which would write a big amount of data as fast as possible, here is the idea i had of the lifecycle:

1) While the highWaterMark limit is not reached, the stream is able to buffer and write data.

2) If the highWaterMark limit is reached, the stream cannot buffer anymore, so the #write method returns false to let you know that what you tried to write won't be write (never).

3) Once the stream emits a drain event, it means that the buffer has been cleaned up, and you can write again from where you got "rejected".

It was clear and simple in my mind, but it looks like this is not exactly true (on the step 2), is the data you try to write really "rejected" when the #write method returns false ? Or is it buffered (or something else) ?

Sorry for the basic question but i need to be sure !

like image 560
Ludo Avatar asked Mar 04 '16 16:03

Ludo


People also ask

What is highWaterMark stream?

highWaterMark is just an indicator to communicate to upstream that no more data should be written/pushed. Obviously you can write more data and it will just continue to be buffered in memory. highWaterMark does not dictate the size of chunks passed to _transform() / _write() .

What is highWaterMark in node?

The highWaterMark option gives you some control on the amount of "buffer memory" used. Once you've written more than the amount specified, write will return false to give you an opportunity to stop writing.


2 Answers

2) If the highWaterMark limit is reached, the stream cannot buffer anymore, so the #write method returns false to let you know that what you tried to write won't be write (never).

This is false, data is still buffered, the stream doesn't lose it. But you should stop writing at this point. This is to allow backpressure to propagate.

Your question is addressed in the writable.write(chunk[, encoding][, callback]) docs:

This return value is strictly advisory. You MAY continue to write, even if it returns false. However, writes will be buffered in memory, so it is best not to do this excessively. Instead, wait for the 'drain' event before writing more data.

like image 69
Matt Harrison Avatar answered Oct 23 '22 15:10

Matt Harrison


is the data you try to write really "rejected" when the #write method returns false ? Or is it buffered (or something else) ?

The data is buffered. However, excessive calls to write() without allowing the buffer to drain will cause high memory usage, poor garbage collector performance, and could even cause Node.js to crash with an Allocation failed - JavaScript heap out of memory error. See this related question:

Node: fs write() doesn't write inside loop. Why not?


For reference, here are some relevant details on highWaterMark and backpressure from the current docs (v8.4.0):

writable.write()

The return value is true if the internal buffer is less than the highWaterMark configured when the stream was created after admitting chunk. If false is returned, further attempts to write data to the stream should stop until the 'drain' event is emitted.

While a stream is not draining, calls to write() will buffer chunk, and return false. Once all currently buffered chunks are drained (accepted for delivery by the operating system), the 'drain' event will be emitted. It is recommended that once write() returns false, no more chunks be written until the 'drain' event is emitted. While calling write() on a stream that is not draining is allowed, Node.js will buffer all written chunks until maximum memory usage occurs, at which point it will abort unconditionally. Even before it aborts, high memory usage will cause poor garbage collector performance and high RSS (which is not typically released back to the system, even after the memory is no longer required).

Backpressuring in Streams

In any scenario where the data buffer has exceeded the highWaterMark or the write queue is currently busy, .write() will return false.

When a false value is returned, the backpressure system kicks in. It will pause the incoming Readable stream from sending any data and wait until the consumer is ready again. Once the data buffer is emptied, a .drain() event will be emitted and resume the incoming data flow.

Once the queue is finished, backpressure will allow data to be sent again. The space in memory that was being used will free itself up and prepare for the next batch of data.

               +-------------------+         +=================+
               |  Writable Stream  +--------->  .write(chunk)  |
               +-------------------+         +=======+=========+
                                                     |
                                  +------------------v---------+
   +-> if (!chunk)                |    Is this chunk too big?  |
   |     emit .end();             |    Is the queue busy?      |
   +-> else                       +-------+----------------+---+
   |     emit .write();                   |                |
   ^                                   +--v---+        +---v---+
   ^-----------------------------------<  No  |        |  Yes  |
                                       +------+        +---v---+
                                                           |
           emit .pause();          +=================+     |
           ^-----------------------+  return false;  <-----+---+
                                   +=================+         |
                                                               |
when queue is empty     +============+                         |
^-----------------------<  Buffering |                         |
|                       |============|                         |
+> emit .drain();       |  ^Buffer^  |                         |
+> emit .resume();      +------------+                         |
                        |  ^Buffer^  |                         |
                        +------------+   add chunk to queue    |
                        |            <---^---------------------<
                        +============+
like image 26
TachyonVortex Avatar answered Oct 23 '22 14:10

TachyonVortex