Is it possible to know the name of a file that is being downloaded or to set a name before download?

Tags:

puppeteer

I'm downloading a ton of files using Puppeteer, but I need to know each file's name before or after the download completes. Watching the folder for file changes doesn't solve my problem, because many processes download files at the same time and I have no way to match them.

I've been trying to set a custom download path for each file, but Puppeteer behaves inconsistently: some downloads go to that folder and others to /Downloads.

So, I would like to know if there's a way to know the name before download or to set the name of the file before downloading. This way I can properly match it through code.

Note: files are downloaded via JS, i.e. when a button is clicked. There is no way to know the file name via scraping because it is auto-generated.

asked Jun 11 '19 23:06

Carlos Ortiz

1 Answer

If the download is triggered by the page, the server most likely marks the response as a download with the Content-Disposition header. That header usually also carries the file name.

Example

An example of the header:

Content-Disposition: attachment; filename="name_of_download.ext"

To read the file name, look at response.headers() for the download response (for example inside a page.on('response', ...) handler). The following example then uses a regular expression to extract the file name:

// Header names returned by response.headers() are lower-case.
const contentDisposition = response.headers()['content-disposition'];
// The header is missing on responses that are not downloads, so guard before matching.
const matchFilename = contentDisposition && contentDisposition.match(/filename="([^"]*)"/);
if (matchFilename) {
  const filename = matchFilename[1];
}
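To match names to downloads across many pages, the same check can be wrapped in a listener. The following is a minimal sketch, not a definitive implementation: watchDownloadNames and the onFilename callback are hypothetical names introduced here, and it assumes the download is served as a normal HTTP response with a quoted filename (a file built in the page from a blob: URL would not show up this way).

```javascript
// Sketch: report the server-suggested file name for each download response.
// `page` is expected to be a Puppeteer Page; `onFilename` is a hypothetical
// callback that receives (filename, responseUrl).
function watchDownloadNames(page, onFilename) {
  page.on('response', (response) => {
    // Puppeteer exposes header names in lower case.
    const contentDisposition = response.headers()['content-disposition'];
    if (!contentDisposition) {
      return; // not a download response
    }
    const match = contentDisposition.match(/filename="([^"]*)"/);
    if (match) {
      onFilename(match[1], response.url());
    }
  });
}

// Usage (inside an async function, with `page` from browser.newPage()):
//   watchDownloadNames(page, (name, url) => console.log(name, url));
//   await page.click('#download'); // '#download' is a placeholder selector
```

Because the callback also receives the response URL, each process can record the name against the page or request that triggered it instead of watching a shared folder.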

Non-ASCII characters

Depending on the files you are downloading, you might also want to check out this Stack Overflow answer regarding the encoding of non-ASCII file names.
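One common way servers send non-ASCII names is the extended filename* parameter (RFC 6266), which carries a percent-encoded UTF-8 value alongside or instead of the plain filename. The sketch below is one possible way to handle it; filenameFromContentDisposition is a hypothetical helper, and it only understands the UTF-8 charset, falling back to the plain form otherwise.

```javascript
// Sketch: extract a file name from a Content-Disposition header value,
// preferring the extended UTF-8 form when it is present.
function filenameFromContentDisposition(header) {
  if (!header) {
    return null;
  }
  // Extended form: filename*=UTF-8''percent-encoded-name
  const extended = header.match(/filename\*\s*=\s*UTF-8''([^;\s]+)/i);
  if (extended) {
    try {
      return decodeURIComponent(extended[1]);
    } catch (err) {
      // Malformed percent-encoding: fall through to the plain form.
    }
  }
  // Plain form, either quoted or unquoted.
  const plain = header.match(/filename\s*=\s*(?:"([^"]*)"|([^;\s]+))/i);
  if (plain) {
    return plain[1] !== undefined ? plain[1] : plain[2];
  }
  return null;
}
```

It can be called with response.headers()['content-disposition'] and returns null when no name is present.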

answered Jan 02 '23 22:01

Thomas Dondorf
