I am copying a file with Node.js on an SSD under VMware, but the performance is very low. The benchmark I ran to measure the actual disk speed is as follows:
$ hdparm -tT /dev/sda
/dev/sda:
Timing cached reads: 12004 MB in 1.99 seconds = 6025.64 MB/sec
Timing buffered disk reads: 1370 MB in 3.00 seconds = 456.29 MB/sec
However, the following Node code that copies the file is very slow; even subsequent runs do not make it faster:
var fs = require("fs");
fs.createReadStream("bigfile").pipe(fs.createWriteStream("tempbigfile"));
And I run it as follows:
$ seq 1 10000000 > bigfile
$ ll bigfile -h
-rw-rw-r-- 1 mustafa mustafa 848M Jun 3 03:30 bigfile
$ time node test.js
real 0m4.973s
user 0m2.621s
sys 0m7.236s
$ time node test.js
real 0m5.370s
user 0m2.496s
sys 0m7.190s
What is the issue here, and how can I speed it up? I believe I could write this faster in C just by adjusting the buffer size. What confuses me is that when I wrote a simple, almost pv-equivalent program that pipes stdin to stdout, as below, it is very fast.
process.stdin.pipe(process.stdout);
And here are the runs:
$ dd if=/dev/zero bs=8M count=128 | pv | dd of=/dev/null
128+0 records in 174MB/s] [ <=> ]
128+0 records out
1073741824 bytes (1.1 GB) copied, 5.78077 s, 186 MB/s
1GB 0:00:05 [ 177MB/s] [ <=> ]
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 5.78131 s, 186 MB/s
$ dd if=/dev/zero bs=8M count=128 | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 5.57005 s, 193 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 5.5704 s, 193 MB/s
$ dd if=/dev/zero bs=8M count=128 | node test.js | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 4.61734 s, 233 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 4.62766 s, 232 MB/s
$ dd if=/dev/zero bs=8M count=128 | node test.js | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 4.22107 s, 254 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 4.23231 s, 254 MB/s
$ dd if=/dev/zero bs=8M count=128 | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 5.70124 s, 188 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 5.70144 s, 188 MB/s
$ dd if=/dev/zero bs=8M count=128 | node test.js | dd of=/dev/null
128+0 records in
128+0 records out
1073741824 bytes (1.1 GB) copied, 4.51055 s, 238 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 4.52087 s, 238 MB/s
I don't know the answer to your question, but perhaps this helps in your investigation of the problem.
The Node.js documentation on stream buffering says:

Both Writable and Readable streams will store data in an internal buffer that can be retrieved using writable.writableBuffer or readable.readableBuffer, respectively.

The amount of data potentially buffered depends on the highWaterMark option passed into the stream's constructor. For normal streams, the highWaterMark option specifies a total number of bytes. For streams operating in object mode, the highWaterMark specifies a total number of objects.

...

A key goal of the stream API, particularly the stream.pipe() method, is to limit the buffering of data to acceptable levels such that sources and destinations of differing speeds will not overwhelm the available memory.
So, you can play with the buffer sizes to improve speed:
var fs = require('fs');
var path = require('path');
var from = path.normalize(process.argv[2]);
var to = path.normalize(process.argv[3]);
var readOpts = {highWaterMark: Math.pow(2,16)}; // 65536
var writeOpts = {highWaterMark: Math.pow(2,16)}; // 65536
var source = fs.createReadStream(from, readOpts);
var destiny = fs.createWriteStream(to, writeOpts);
source.pipe(destiny);