I want to download a file with the Request library. It's pretty straightforward:
var request = require('request');
var fs = require('fs');

request({
    url: urlToFile
}).pipe(fs.createWriteStream(file));
Since the URL is supplied by users (in my case), I would like to limit the maximum file size my application will download - let's say 10 MB. I could rely on the content-length header, like so:
request({
    url: urlToFile
}, function (err, res, body) {
    var size = parseInt(res.headers['content-length'], 10);
    if (size > 10485760) {
        // ooops - file size too large
    }
}).pipe(fs.createWriteStream(file));
The question is - how reliable is this? I guess this callback will be called after the file has been downloaded, right? But then it's too late if someone supplies the URL of a file that is 1 GB. My application will first download that whole 1 GB file just to check (in the callback) that it's too big.
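Another thought: request seems to emit a 'response' event with the headers before the body is consumed, so maybe I could check content-length there and abort early - a rough, untested sketch:
var request = require('request');
var fs = require('fs');

var req = request({ url: urlToFile });
req.on('response', function (res) {
    var size = parseInt(res.headers['content-length'], 10);
    if (size > 10485760) {
        req.abort(); // give up before (most of) the body is transferred
    }
});
req.pipe(fs.createWriteStream(file));
But this still wouldn't help when the header is missing or wrong.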
I was also thinking about good old Node's http.get() method. In this case I would do this:
var http = require('http');
var fs = require('fs');

var opts = {
    host: host,
    port: port,
    path: path
};
var file = fs.createWriteStream(fileName),
    fileLength = 0;

http.get(opts, function (res) {
    res.on('data', function (chunk) {
        fileLength += chunk.length;
        if (fileLength > 10485760) { // ooops - file size too large
            file.end();
            return res.destroy(); // stop receiving data (a readable response has no end() method)
        }
        file.write(chunk);
    }).on('end', function () {
        file.end();
    });
});
What approach would you recommend to limit the maximum download size, without actually downloading the whole thing and checking its size afterwards?
I would actually use both methods you've discussed: check the content-length header, and watch the data stream to make sure it doesn't exceed your limit.
To do this I'd first make a HEAD request to the URL to see if the content-length header is available. If it's larger than your limit, you can stop right there. If it doesn't exist or it's smaller than your limit, make the actual GET request. Since a HEAD request will only return the headers and no actual content, this will help weed out large files with valid content-lengths quickly.
Next, make the actual GET request and watch your incoming data size to make sure that it doesn't exceed your limit (this can be done with the request module; see below). You'll want to do this regardless of whether the HEAD request found a content-length header, as a sanity check (the server could be lying about the content-length).
Something like this:
var fs = require('fs');
var request = require('request');

var maxSize = 10485760;

request({
    url: url,
    method: 'HEAD'
}, function (err, headRes) {
    if (err) return console.error(err);
    var size = parseInt(headRes.headers['content-length'], 10);
    if (size > maxSize) {
        console.log('Resource size exceeds limit (' + size + ')');
    } else {
        // A missing content-length gives NaN, which fails the check above,
        // so we still fall through to the GET and the streaming check below
        var file = fs.createWriteStream(filename);
        var downloaded = 0;
        var res = request({ url: url });
        res.on('data', function (data) {
            downloaded += data.length;
            if (downloaded > maxSize) {
                console.log('Resource stream exceeded limit (' + downloaded + ')');
                res.abort(); // Abort the response (close and clean up the stream)
                fs.unlink(filename, function () {}); // Delete the partial file
            }
        }).pipe(file);
    }
});
The trick to watching the incoming data size using the request module is to bind to the data event on the response (like you were thinking about doing with the http module) before you start piping it to your file stream. If the data size exceeds your maximum file size, call the response's abort() method.
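If you need this in more than one place, the whole HEAD-then-GET dance wraps up nicely into a helper. A sketch along those lines (downloadCapped and its callback signature are just names I made up here):
var fs = require('fs');
var request = require('request');

// Hypothetical helper: download url to filename, enforcing maxSize bytes,
// and call cb exactly once with an Error or null.
function downloadCapped(url, filename, maxSize, cb) {
    var done = false;
    function finish(err) {
        if (done) return;
        done = true;
        cb(err);
    }

    request({ url: url, method: 'HEAD' }, function (err, headRes) {
        if (err) return finish(err);
        var reported = parseInt(headRes.headers['content-length'], 10);
        if (reported > maxSize) {
            return finish(new Error('Resource size exceeds limit (' + reported + ')'));
        }

        var size = 0;
        var res = request({ url: url });
        res.on('data', function (data) {
            size += data.length;
            if (size > maxSize) {
                res.abort(); // stop the transfer
                finish(new Error('Resource stream exceeded limit (' + size + ')'));
                fs.unlink(filename, function () {}); // best-effort cleanup of the partial file
            }
        });
        res.on('error', finish);
        res.pipe(fs.createWriteStream(filename))
            .on('finish', function () { finish(null); });
    });
}
You'd then call it like downloadCapped(url, '/tmp/out.bin', 10485760, function (err) { ... }), treating any error as "too big or failed".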
I had a similar issue. I now use fetch (node-fetch, which supports a size option) to limit the download size.
const response = await fetch(url, {
    method: 'GET',
    size: 5000000 // maximum response body size in bytes, 5000000 = 5MB
});
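For completeness, here's roughly how that wires up end to end - a sketch assuming node-fetch v2, where (as far as I remember) exceeding size makes the body read reject with a FetchError whose type is 'max-size':
const fs = require('fs');
const fetch = require('node-fetch');

async function download(url, filename) {
    const response = await fetch(url, { size: 5000000 }); // 5 MB cap
    try {
        // buffer() accumulates the body; node-fetch rejects once the cap is hit
        const body = await response.buffer();
        fs.writeFileSync(filename, body);
    } catch (err) {
        if (err.type === 'max-size') {
            console.log('Download exceeded the 5 MB limit');
            return;
        }
        throw err;
    }
}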