I'm trying to create a file downloader as a background service, but when a large file is scheduled, it is first held entirely in memory and only written to disk once the download completes.
How can I make the file be written to disk gradually, preserving memory, considering that I may have many files being downloaded at the same time?
Here's the code I'm using:
var sys = require("sys"),
http = require("http"),
url = require("url"),
path = require("path"),
fs = require("fs"),
events = require("events");
var downloadfile = "http://nodejs.org/dist/node-v0.2.6.tar.gz";
var host = url.parse(downloadfile).hostname
var filename = url.parse(downloadfile).pathname.split("/").pop()
var theurl = http.createClient(80, host);
var requestUrl = downloadfile;
sys.puts("Downloading file: " + filename);
sys.puts("Before download request");
var request = theurl.request('GET', requestUrl, {"host": host});
request.end();
var dlprogress = 0;
setInterval(function () {
sys.puts("Download progress: " + dlprogress + " bytes");
}, 1000);
request.addListener('response', function (response) {
response.setEncoding('binary')
sys.puts("File size: " + response.headers['content-length'] + " bytes.")
var body = '';
response.addListener('data', function (chunk) {
dlprogress += chunk.length;
body += chunk;
});
response.addListener("end", function() {
fs.writeFileSync(filename, body, 'binary');
sys.puts("After download finished");
});
});
Some background on Node's memory behaviour helps explain why. Out of the box, a 64-bit Node.js process assumes a memory ceiling of about 1.5 GB per process, and 256 MB of RAM (e.g. on a Linux VPS instance) is enough to run Node.js as long as no other memory-hungry software is competing for it. When a Node.js application runs inside a container with a memory limit (set with Docker's --memory option or the equivalent flag in your orchestration system), pass --max-old-space-size so that Node knows its limit, and keep that value smaller than the container limit.
As it turns out, although the code above streams the file in and writes it out, in between it still tries to hold the entire file contents in memory, which it cannot do once the file approaches that roughly 1.5 GB ceiling.
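For illustration, a hypothetical invocation for a downloader running in a container capped at 512 MB (the script name and the 400 MB figure are assumptions, not from the original post) would look like:

node --max-old-space-size=400 downloader.js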
Processing large files is nothing new to JavaScript; the core functionality of Node.js ships with standard solutions for reading and writing files. The most straightforward is fs.readFile(), where the whole file is read into memory and then acted upon once Node has read it; the second option is fs.createReadStream(), which streams the data in (and out), similar to what you would do in other languages like Python and Java.
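As a rough sketch of the difference (the file name here is just a placeholder), the two approaches look like this:

var fs = require("fs");

// Option 1: fs.readFile() buffers the whole file in memory before the callback runs.
fs.readFile("bigfile.tar.gz", function (err, data) {
    if (err) throw err;
    // 'data' is the entire file as a single buffer
});

// Option 2: fs.createReadStream() hands the file over one chunk at a time.
var stream = fs.createReadStream("bigfile.tar.gz");
stream.addListener("data", function (chunk) {
    // each 'chunk' is a small piece of the file, so memory use stays flat
});
stream.addListener("end", function () {
    // all chunks have been delivered
});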
I changed the callback to:
request.addListener('response', function (response) {
    var downloadfile = fs.createWriteStream(filename, {'flags': 'a'});
    sys.puts("File size " + filename + ": " + response.headers['content-length'] + " bytes.");
    response.addListener('data', function (chunk) {
        dlprogress += chunk.length;
        // Write each chunk to disk as it arrives instead of buffering the whole body.
        downloadfile.write(chunk, 'binary');
    });
    response.addListener("end", function () {
        downloadfile.end();
        sys.puts("Finished downloading " + filename);
    });
});
This worked perfectly.
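On current Node versions the same idea can be written as a pipe, letting the stream machinery handle the chunk-by-chunk writing (and backpressure) for you. This is only a sketch of the core-module equivalent, reusing the downloadfile and filename variables from above, not code from the original post:

var http = require("http");
var fs = require("fs");

http.get(downloadfile, function (response) {
    // Pipe the response straight into a write stream; chunks hit the disk as they arrive.
    response.pipe(fs.createWriteStream(filename));
});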
Does the request package work for your uses?
It lets you do things like this:
request(downloadurl).pipe(fs.createWriteStream(downloadtohere))
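If you also want the running byte counter from the original code, a hedged variation (still assuming the request package and the dlprogress variable from above) is to listen for 'data' events on the request stream before piping it:

var req = request(downloadurl);
req.addListener("data", function (chunk) {
    dlprogress += chunk.length;
});
req.pipe(fs.createWriteStream(downloadtohere));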