I am downloading a file through Puppeteer into a local directory. I need to upload this file to an S3 bucket, so I need to know the file name. The problem is that the name contains a timestamp that changes on every download, so I can't hard-code it. Is there a way to get a constant name every time (even if the old file is replaced), or to rename the file as it is downloaded?
I thought of using Node's fs.rename() function, but that would again require the current file name.
I want a constant file name that I can hard-code and then upload to the S3 bucket.
await page._client.send('Page.setDownloadBehavior', {behavior: 'allow', downloadPath: './xml'}); // This sets the directory
await page.keyboard.press('Tab');
await page.keyboard.press('Enter'); // This downloads an XML file.
Setting up a download path and reading from the disk: this uses the private _client API, which gives us access to all the functions of the underlying DevTools protocol. Basically, it extends Puppeteer's functionality. Then we can download the file by clicking on it.
Click the "Edit" menu > Preferences > General tab. Locate the "Save downloaded files to" section, click "Downloads" > "Other...", then browse to and select your new download location.
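You have two options:
1. Monitor the responses to get the name of the downloaded file, then rename it locally via Node.js.
2. Use the Chrome DevTools Protocol to modify the response header so the file is saved under the name you want.
The first option is the most straightforward way to do it. Monitor all responses and, when you notice the response that is being downloaded, use its name to rename the file locally via fs.rename.
Code sample: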
const path = require('path');
// ...
page.on('response', response => {
  const url = response.request().url();
  const contentType = response.headers()['content-type'] || '';
  // Adjust this check to match your download's URL and/or content type
  if (contentType.includes('application/pdf')) {
    // Take the last path segment, ignoring any query string
    const fileName = path.basename(new URL(url).pathname);
    // handle and rename the file (after making sure it has finished downloading)
  }
});
The code listens to all responses and waits for a specific pattern (e.g. contentType === 'application/pdf'). Then it takes the file name from the request. Depending on your use case, you might also want to check the Content-Disposition header. After that, you have to wait until the file has finished downloading (e.g. the file is present and its size no longer changes), and then you can rename it.
I'm 99% sure that the second option is possible. You need to intercept the response, which Puppeteer itself currently does not support. But since the Chrome DevTools Protocol supports this functionality, you can do it through the low-level protocol.
The idea is to intercept the response and change the Content-Disposition header to your desired file name.
Here is the idea:
1. Use chrome-remote-interface or a CDP Session to activate Network.requestIntercepted
2. Listen for Network.requestIntercepted events
3. Send Network.getResponseBodyForInterception to receive the body of the response
4. Add (or change) the Content-Disposition header to include your filename
5. Call Network.continueInterceptedRequest with your modified response
Your file should then be saved with your modified file name. Check out this comment on GitHub for a code sample. As I already explained, it is a rather sophisticated approach as long as Puppeteer does not support modifying responses.
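To make the steps above more concrete, here is a rough, untested-against-a-browser sketch using a CDP session. It is not the code from the linked comment. The names buildRawResponse, forceDownloadName and isTarget are hypothetical, the sketch only rewrites 200 responses, and it assumes the body returned by the protocol is already decoded (so it drops Content-Encoding and recomputes Content-Length). As far as I know, these Network interception methods are marked deprecated in newer protocol versions in favor of the Fetch domain, so treat this as an illustration of the idea rather than a recommended implementation.

```javascript
// Hypothetical helper: build a base64-encoded raw HTTP response whose
// Content-Disposition header carries the desired file name.
function buildRawResponse(headers, body, fileName) {
  const dropped = ['content-disposition', 'content-length', 'content-encoding', 'transfer-encoding'];
  const kept = Object.entries(headers).filter(([name]) => !dropped.includes(name.toLowerCase()));
  const head = [
    'HTTP/1.1 200 OK',
    ...kept.map(([name, value]) => `${name}: ${value}`),
    `Content-Disposition: attachment; filename="${fileName}"`,
    `Content-Length: ${body.length}`,
    '',
    '',
  ].join('\r\n');
  return Buffer.concat([Buffer.from(head, 'utf8'), body]).toString('base64');
}

// Hypothetical wiring: isTarget(url, headers) is a caller-supplied predicate
// that decides which response is the download to rename.
async function forceDownloadName(page, fileName, isTarget) {
  const client = await page.target().createCDPSession();
  await client.send('Network.enable');
  await client.send('Network.setRequestInterception', {
    patterns: [{ urlPattern: '*', interceptionStage: 'HeadersReceived' }],
  });
  client.on('Network.requestIntercepted', async (event) => {
    const { interceptionId, responseStatusCode, responseHeaders = {} } = event;
    if (responseStatusCode !== 200 || !isTarget(event.request.url, responseHeaders)) {
      // Not the download we care about: let it through unchanged.
      await client.send('Network.continueInterceptedRequest', { interceptionId });
      return;
    }
    const { body, base64Encoded } = await client.send(
      'Network.getResponseBodyForInterception',
      { interceptionId }
    );
    const bodyBuffer = Buffer.from(body, base64Encoded ? 'base64' : 'utf8');
    await client.send('Network.continueInterceptedRequest', {
      interceptionId,
      rawResponse: buildRawResponse(responseHeaders, bodyBuffer, fileName),
    });
  });
  return client;
}
```

You would call forceDownloadName(page, 'latest.xml', (url, headers) => /* your own check */ true) before triggering the download; 'latest.xml' is again just a placeholder name.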