After reading some code on GitHub, it seems like I misunderstood how the highWaterMark concept works.
In the case of a writable stream that writes a large amount of data as fast as possible, here is the lifecycle I had in mind:
1) While the highWaterMark limit is not reached, the stream is able to buffer and write data.
2) If the highWaterMark limit is reached, the stream cannot buffer anymore, so the #write method returns false to let you know that what you tried to write will never be written.
3) Once the stream emits a drain event, it means that the buffer has been flushed, and you can write again from where you got "rejected".
It was clear and simple in my mind, but it looks like this is not exactly true (at step 2). Is the data you try to write really "rejected" when the #write method returns false? Or is it buffered (or something else)?
Sorry for the basic question, but I need to be sure!
highWaterMark is just an indicator to communicate upstream that no more data should be written/pushed. You can still write more data, and it will simply continue to be buffered in memory. highWaterMark does not dictate the size of chunks passed to _transform() / _write().
The highWaterMark option gives you some control on the amount of "buffer memory" used. Once you've written more than the amount specified, write will return false to give you an opportunity to stop writing.
2) If the highWaterMark limit is reached, the stream cannot buffer anymore, so the #write method returns false to let you know that what you tried to write will never be written.
This is false: the data is still buffered, and the stream doesn't lose it. But you should stop writing at this point, to allow the backpressure to propagate.
Your question is addressed in the writable.write(chunk[, encoding][, callback]) docs:

This return value is strictly advisory. You MAY continue to write, even if it returns false. However, writes will be buffered in memory, so it is best not to do this excessively. Instead, wait for the 'drain' event before writing more data.
Is the data you try to write really "rejected" when the #write method returns false? Or is it buffered (or something else)?
The data is buffered. However, excessive calls to write() without allowing the buffer to drain will cause high memory usage, poor garbage collector performance, and could even cause Node.js to crash with an Allocation failed - JavaScript heap out of memory error. See this related question: Node: fs write() doesn't write inside loop. Why not?
For reference, here are some relevant details on highWaterMark and backpressure from the current docs (v8.4.0):
writable.write()

The return value is true if the internal buffer is less than the highWaterMark configured when the stream was created after admitting chunk. If false is returned, further attempts to write data to the stream should stop until the 'drain' event is emitted.

While a stream is not draining, calls to write() will buffer chunk, and return false. Once all currently buffered chunks are drained (accepted for delivery by the operating system), the 'drain' event will be emitted. It is recommended that once write() returns false, no more chunks be written until the 'drain' event is emitted. While calling write() on a stream that is not draining is allowed, Node.js will buffer all written chunks until maximum memory usage occurs, at which point it will abort unconditionally. Even before it aborts, high memory usage will cause poor garbage collector performance and high RSS (which is not typically released back to the system, even after the memory is no longer required).
In any scenario where the data buffer has exceeded the highWaterMark or the write queue is currently busy, .write() will return false.

When a false value is returned, the backpressure system kicks in. It will pause the incoming Readable stream from sending any data and wait until the consumer is ready again. Once the data buffer is emptied, a 'drain' event will be emitted and resume the incoming data flow.

Once the queue is finished, backpressure will allow data to be sent again. The space in memory that was being used will free itself up and prepare for the next batch of data.
+-------------------+ +=================+
| Writable Stream +---------> .write(chunk) |
+-------------------+ +=======+=========+
|
+------------------v---------+
+-> if (!chunk) | Is this chunk too big? |
| emit .end(); | Is the queue busy? |
+-> else +-------+----------------+---+
| emit .write(); | |
^ +--v---+ +---v---+
^-----------------------------------< No | | Yes |
+------+ +---v---+
|
emit .pause(); +=================+ |
^-----------------------+ return false; <-----+---+
+=================+ |
|
when queue is empty +============+ |
^-----------------------< Buffering | |
| |============| |
+> emit .drain(); | ^Buffer^ | |
+> emit .resume(); +------------+ |
| ^Buffer^ | |
+------------+ add chunk to queue |
| <---^---------------------<
+============+