 

Fast file copy with progress information in Node.js?

Tags:

node.js

Is there a way to copy large files quickly with Node.js, with progress information?

Solution 1: fs.createReadStream().pipe(...) = useless, up to 5× slower than native cp

See: Fastest way to copy file in node.js; progress information is possible (with the npm package 'progress-stream'):

const fs = require('fs');
fs.createReadStream('test.log').pipe(fs.createWriteStream('newLog.log'));

The only problem with this approach is that it easily takes 5 times longer than "cp source dest". See the appendix below for the full test code.

Solution 2: rsync --info=progress2 = just as slow as solution 1 = useless

Solution 3: My last resort: write a native module for Node.js, using "CoreUtils" (the Linux sources for cp and friends) or other functions, as shown in Fast file copy with progress

Does anyone know a better option than solution 3? I'd like to avoid native code, but it seems the best fit.

Thanks! Any package recommendations or hints (I've tried all the fs** functions) are welcome!

Appendix:

test code, using pipe and progress:

var path = require('path');
var progress = require('progress-stream');
var fs = require('fs');
var _source = path.resolve('../inc/big.avi'); // 1.5GB
var _target = '/tmp/a.avi';

var stat = fs.statSync(_source);
var str = progress({
    length: stat.size,
    time: 100
});

str.on('progress', function(progress) {
    console.log(progress.percentage);
});

function copyFile(source, target, cb) {
    var cbCalled = false;

    var rd = fs.createReadStream(source);
    rd.on("error", function(err) {
        done(err);
    });

    var wr = fs.createWriteStream(target);

    wr.on("error", function(err) {
        done(err);
    });

    wr.on("close", function(ex) {
        done();
    });

    rd.pipe(str).pipe(wr);

    function done(err) {
        if (!cbCalled) {
            console.log('done');
            cb && cb(err);
            cbCalled = true;
        }
    }
}
copyFile(_source, _target);

update: a fast C version (with detailed progress!) is implemented here: https://github.com/MidnightCommander/mc/blob/master/src/filemanager/file.c#L1480. Seems the best place to start from :-)

asked Dec 07 '15 by xamiro

3 Answers

I have the same issue. I want to copy large files as fast as possible and want progress information. I created a test utility that tests the different copy methods:

https://www.npmjs.com/package/copy-speed-test

You can run it simply with:

npx copy-speed-test --source someFile.zip --destination someNonExistentFolder

It does a native copy using child_process.exec(), a copy using fs.copyFile, and it uses createReadStream with a variety of buffer sizes (you can change the buffer sizes by passing them on the command line; run npx copy-speed-test -h for more info).

Some things I learnt:

  • fs.copyFile is just as fast as native
  • you can get quite inconsistent results on all these methods, particularly when copying from and to the same disc and with SSDs
  • if using a large buffer then createReadStream is nearly as good as the other methods
  • if you use a very large buffer then the progress is not very accurate.

The last point is because the progress is based on the read stream, not the write stream. If you copy a 1.5GB file with a 1GB buffer, the progress immediately jumps to 66%, then to 100%, and you then have to wait while the write stream finishes writing. I don't think you can display the progress of the write stream.

If you have the same issue, I would recommend running these tests with file sizes similar to what you will be dealing with and across similar media. My end use case is copying a file from an SD card plugged into a Raspberry Pi across a network to a NAS, so that's the scenario I ran the tests for.

I hope someone other than me finds it useful!

answered Nov 03 '22 by Roaders


One aspect that may slow down the process is console.log itself. Take a look at this code:

const fs = require('fs');
const sourceFile = 'large.exe'
const destFile = 'large_copy.exe'

console.time('copying')
fs.stat(sourceFile, function(err, stat){
  const filesize = stat.size
  let bytesCopied = 0

  const readStream = fs.createReadStream(sourceFile)

  readStream.on('data', function(buffer){
    bytesCopied += buffer.length
    let percentage = ((bytesCopied/filesize)*100).toFixed(2)
    console.log(percentage + '%') // run once with this line, then again with it commented out
  })
  readStream.on('end', function(){
    console.timeEnd('copying')
  })
  readStream.pipe(fs.createWriteStream(destFile));
})

Here are the execution times copying a 400mb file:

with console.log: 692.950ms

without console.log: 382.540ms

answered Nov 03 '22 by Tulio Faria


cpy and cp-file both support progress reporting.

answered Nov 03 '22 by Yury Solovyov