Here's my code
const fs = require('fs');
const src = fs.createReadStream('bigfile3.txt');
const des = fs.createWriteStream('newTest.txt');
I can use either
src.on('data',(chunk)=>{
des.write(chunk);});
Or
src.pipe(des);
Is there any difference between these two ways of handling the file operation? The pipe method gives me an error of
> "size" argument must not be larger than 2147483647
whenever I try it with a large file (~2 GB).
Can anyone explain the working behind pipe and stream? Thanks.
A pipe is a communication channel between two processes. It has a writing end and a reading end. When one opens either end, one gets a (writing or reading) stream. So, to a first approximation, there is a stream at each end of a pipe.
Piping is a mechanism where we provide the output of one stream as the input to another stream. It is normally used to get data from one stream and pass it along to another stream. There is no limit on the number of piping operations; in other words, piping can be used to process streamed data in multiple steps.
All streams are instances of EventEmitter. They emit events that can be used to read and write data. However, we can consume stream data in a simpler way using the pipe method, and this is one of the things that makes streams in Node so convenient.
So what is the difference between a stream and a buffer? A buffer has a specified, definite length, whereas a stream does not. A stream is a sequence of bytes that is read and/or written over time, while a buffer is a sequence of bytes stored in memory.
You should use the pipe method because the flow of data will be automatically managed so that the destination Writable stream is not overwhelmed by a faster Readable stream.
If your readable stream is faster than the writable stream, then with bare des.write(data) the writable's internal buffer grows without bound, wasting memory and risking lost data, so it is better to use src.pipe(des);
If the file size is big then you should use streams; that's the correct way of doing it. I tried an example similar to yours, copying a 3.5 GB file with streams and pipe, and it worked flawlessly in my case, so check your code again; you must be doing something wrong.
The example which I tried
'use strict'
const fs = require('fs')
const readStream = fs.createReadStream('./Archive.zip')
const writeStream = fs.createWriteStream('./Archive3.zip')
readStream.pipe(writeStream)
However, if you still need to use des.write(data), you can handle backpressure yourself to avoid losing data when the readStream is faster. If des.write(data) returns false, the writeStream's internal buffer is full, so pause the readStream with src.pause().
To continue once the writeStream has drained, handle the drain event on the writeStream and resume in the callback:
des.on("drain", () => src.resume())
To give the writeStream a larger buffer, you can set highWaterMark on the writeStream to a very high value, for example:
const des = fs.createWriteStream('newTest.txt',{
highWaterMark: 1628920128
});
Be careful with a massive highWaterMark, because it takes up too much memory and defeats the primary advantage of streaming data.
I would still definitely recommend using pipe, as it handles everything for you with less code.
Docs:
https://nodejs.org/api/stream.html#stream_writable_write_chunk_encoding_callback
https://nodejs.org/api/stream.html#stream_readable_pipe_destination_options