I know this should be simple, but how do I return the values for use outside the function? I cannot get it to work. The code below downloads the file, and the console prints
value: attachment; filename="filename"
await page._client.send('Page.setDownloadBehavior', {behavior: 'allow', downloadPath: './tmp'})
await page.click('download');
await page.on('response', resp => {
var header = resp.headers();
console.log("value: " + header['content-disposition']);
});
But this, and everything else I have tried, returns nothing:
await page.on('response', resp => {
var header = resp.headers();
return header['content-disposition'];
});
I want to be able to return the filename, file size, etc. of a downloaded file for further use in the script.
How do I return and access the response values?
You shouldn't use the await operator before page.on().
The Puppeteer page class extends Node.js's native EventEmitter, which means that whenever you call page.on(), you are setting up an event listener using Node.js's emitter.on().
This means that the functionality you include in page.on('response') will execute when the response event is fired.
You don't return values from an event handler. Instead, the functionality within the event handler is executed when the event occurs.
If you want to use the result of page.on() in a function, you can use the following method:
const example_function = value => {
console.log(value);
};
page.on('response', resp => {
var header = resp.headers();
example_function(header['content-disposition']);
});
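If the value is needed later in an async flow rather than inside the callback, one option is to wrap the listener in a Promise and await that instead. The sketch below is only an illustration of the pattern: it uses a plain Node.js EventEmitter as a stand-in for the Puppeteer page, and waitForEvent is a hypothetical helper name, not part of Puppeteer.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: resolves with the first event payload that satisfies `predicate`.
function waitForEvent(emitter, eventName, predicate = () => true) {
  return new Promise(resolve => {
    const listener = payload => {
      if (!predicate(payload)) return;   // ignore unrelated events
      emitter.off(eventName, listener);  // stop listening once a match arrives
      resolve(payload);
    };
    emitter.on(eventName, listener);
  });
}

// Stand-in for `page`; the assumption is that a real page emits 'response' the same way.
const fakePage = new EventEmitter();

async function demo() {
  // Start listening BEFORE triggering the action that causes the event.
  const pending = waitForEvent(
    fakePage,
    'response',
    resp => 'content-disposition' in resp.headers()
  );

  // Simulated responses; in a real script this would be the page.click(...) call.
  fakePage.emit('response', { headers: () => ({ 'content-type': 'text/html' }) });
  fakePage.emit('response', {
    headers: () => ({ 'content-disposition': 'attachment; filename="filename"' }),
  });

  const resp = await pending;
  return resp.headers()['content-disposition'];
}

demo().then(value => console.log('value: ' + value));
```

The key design point is ordering: the listener is registered first, then the action is triggered, and only then is the Promise awaited, so the event cannot fire before anything is listening.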
Grant, I've realised from your answer that I have made a few beginner mistakes.
Puppeteer await - I thought await page.on() would pause the script until complete. I was wrong.
I had placed page.on() inside the loop causing errors, it should have been outside.
The script was going to the next download page before the download had started and before the page.on() handler had been called.
I should have saved the file inside page.on() instead of outside.
Correct me if I am wrong.
This is what I was trying to do (abbreviated).
async function main() {
await page.goto(page, { waitUntil: 'networkidle0' });
for(loop through download pages){
await page.click(download);
await page.on('response', resp => {
var header = resp.headers();
return header['content-disposition'];
});
save.write(header['content-disposition']);
}
}
main();
This is what has worked.
async function main() {
page.on('response', resp => {
var header = resp.headers();
var fileName = header['content-disposition'];
save.write(fileName);
});
await page.goto(startPage, { waitUntil: 'networkidle0' });
for(loop through download pages){
await page.goto(downloadPage, { waitUntil: 'networkidle0' });
await page.click(download);
await page.waitFor(30000);
//download starts
//page.on called and saves fileName
//page.waitFor gives it time to complete before starting next loop
}
}
main();
await page.waitFor(30000);
I don't know if await is required.
And page.waitFor(30000); slows the script down, but I could not get it to work without it. There might be a better way.
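One possible way to avoid the fixed 30-second pause is Puppeteer's page.waitForResponse(), which returns a Promise that resolves with the first response matching a predicate. The sketch below has not been run against a real download page. clickAndReadDownloadInfo and parseFilename are hypothetical names, and it assumes the download response carries a content-disposition header (as in the console output above) and possibly a content-length header.

```javascript
// Hypothetical helper: pull the filename out of a Content-Disposition value.
// Handles only the simple filename= form, not the RFC 5987 filename*= form.
function parseFilename(disposition) {
  const match = /filename="?([^";]+)"?/i.exec(disposition || '');
  return match ? match[1] : null;
}

// Hypothetical helper: click a download link and return details from the response headers.
// `page` is assumed to be a Puppeteer Page; `selector` is the download link's selector.
async function clickAndReadDownloadInfo(page, selector, timeout = 30000) {
  const [response] = await Promise.all([
    // Register the wait first so the response cannot be missed.
    page.waitForResponse(
      resp => 'content-disposition' in resp.headers(),
      { timeout }
    ),
    page.click(selector),
  ]);

  const headers = response.headers();
  return {
    fileName: parseFilename(headers['content-disposition']),
    // content-length is optional; null when the server does not send it.
    size: 'content-length' in headers ? Number(headers['content-length']) : null,
  };
}
```

Inside the loop this would be used as something like const info = await clickAndReadDownloadInfo(page, download); followed by save.write(info.fileName);. Note that the Promise resolves when the response headers arrive, not when the file has finished writing to disk, so a further wait may still be needed before moving on if the download itself must complete.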