I thought I had a pretty good set of catches for those rare timeouts that I get from puppeteer, but somehow this timeout is not caught by any of them. My question is: why?
Here is the code:
var readHtml = (url) => {
  return new Promise( async (resolve,reject)=> {
    var browser = await puppeteer.launch()
    var page = await browser.newPage()
    await page.waitForSelector('.allDataLoaded')
      .then(() => {
        console.log ("Finished reading: " + url)
        return resolve("COOL");
      })
      .catch((err) => {
        console.log ("Timeout or other error: ", err)
        return resolve("TRYAGAIN");
      });
  })}
And here is the error:
(node:23124) UnhandledPromiseRejectionWarning: Error: Navigation Timeout Exceeded: 30000ms exceeded at Promise.then
(node:23124) UnhandledPromiseRejectionWarning:
Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 2)
I did some research which said it might be because there are some urls not yet finished inside the puppeteer newPage()
But how come this does not get caught by my .catch?
I need it to "TRYAGAIN" in case it fails for whatever reason. Now it just stops with the error and does nothing.
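You can handle this error in several ways. First, you can wrap the awaited calls in a try/catch block, or attach a .catch() failure handler to the promise. Second, you can register a handler for unhandled rejections (the unhandledRejection event on process in Node.js, or the unhandledrejection event on window in browsers).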
A try...catch statement consists of a try block followed by a single catch block (and, optionally, a finally block). When an exception is thrown inside the try block, including a rejected promise that is being awaited there, control jumps to the catch block. A promise that is neither awaited inside the try block nor given its own .catch() handler is not covered.
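As a rough illustration of the options just listed, here is a minimal sketch. `failingStep` is a hypothetical stand-in for any Puppeteer call that rejects (it is not part of Puppeteer), and the "COOL"/"TRYAGAIN" values are borrowed from the question.

```javascript
// Hypothetical stand-in for a Puppeteer call that times out.
const failingStep = () =>
  Promise.reject(new Error("Navigation Timeout Exceeded: 30000ms exceeded"));

// Option 1: await the call inside try/catch.
async function withTryCatch() {
  try {
    await failingStep();
    return "COOL";
  } catch (err) {
    return "TRYAGAIN";
  }
}

// Option 2: chain a .catch() handler onto the same promise.
function withCatchHandler() {
  return failingStep()
    .then(() => "COOL")
    .catch(() => "TRYAGAIN");
}

// Option 3: a process-level hook in Node.js. It only observes rejections
// that nothing else handled; it cannot resume the code that failed.
process.on("unhandledRejection", (reason) => {
  console.log("Unhandled rejection:", reason);
});
```

The first two options let the function recover and return a value; the process-level hook is better treated as a safety net for logging.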
You're properly catching the waitForSelector call and its chained promises, but you're not doing the same for the launch and newPage calls - they're not connected to the catch later.
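To make that failure mode concrete, here is a small sketch that reproduces it without Puppeteer. `fakeLaunch` and `fakeWaitForSelector` are hypothetical stubs, and `settleOrPending` is a helper invented for this illustration; only the shape of `broken` mirrors the code in the question.

```javascript
// Hypothetical stubs standing in for puppeteer.launch() and page.waitForSelector().
const fakeLaunch = () => Promise.reject(new Error("launch failed"));
const fakeWaitForSelector = () => Promise.resolve();

// Record rejections that no handler picked up.
const unhandled = [];
process.on("unhandledRejection", (reason) => unhandled.push(reason.message));

// Same shape as the question: an async executor inside new Promise.
const broken = () =>
  new Promise(async (resolve) => {
    // Rejects here. The promise returned by the async executor is discarded,
    // so this rejection has no handler at all.
    await fakeLaunch();
    await fakeWaitForSelector()
      .then(() => resolve("COOL"))
      .catch(() => resolve("TRYAGAIN")); // never reached
  });

// Resolves to "PENDING" if the given promise has not settled within ms milliseconds.
const settleOrPending = (promise, ms) =>
  Promise.race([
    promise,
    new Promise((resolve) => setTimeout(() => resolve("PENDING"), ms)),
  ]);
```

Calling `settleOrPending(broken(), 50)` resolves to "PENDING": the outer promise never settles, which matches the "stops with the error and does nothing" behaviour, while the launch error surfaces only through the unhandled-rejection hook.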
Because async functions automatically return Promises already, you might consider avoiding the Promise constructor entirely:
var readHtml = async (url) => {
  try {
    var browser = await puppeteer.launch()
    var page = await browser.newPage()
  } catch(e) {
    // handle initialization error
    console.log ("Initialization error: ", e)
    return "TRYAGAIN";
  }
  // There is no Promise constructor here, so return values instead of calling resolve()
  return page.waitForSelector('.allDataLoaded')
    .then(() => {
      console.log ("Finished reading: " + url)
      return "COOL";
    })
    .catch((err) => {
      console.log ("Timeout or other error: ", err)
      return "TRYAGAIN";
    });
}
Or, you might consider putting the catch in the consumer of readHtml:
var readHtml = async (url) => {
  var browser = await puppeteer.launch()
  var page = await browser.newPage()
  await page.waitForSelector('.allDataLoaded')
  console.log ("Finished reading: " + url)
};
readHtml(someurl)
  .catch((e) => console.log('err: ' + e));
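Since the question asks for a "TRYAGAIN" behaviour, the consumer-side catch can also drive a retry loop. The sketch below is one possible approach, not something the answer above prescribes; `withRetries` and its `maxAttempts` parameter are hypothetical names.

```javascript
// Hypothetical helper: runs an async task, retrying up to maxAttempts times.
// Resolves with the first successful result, or rethrows the last error.
async function withRetries(task, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Awaiting inside try means a rejection from any step of the task lands in catch.
      return await task();
    } catch (err) {
      lastError = err;
      console.log(`Attempt ${attempt} failed: ${err.message}`);
    }
  }
  throw lastError;
}
```

With a readHtml that lets its errors propagate, a call might look like `withRetries(() => readHtml(someurl))`, with a final .catch() for the case where every attempt fails.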
The tip I'll give you is that you can catch errors at each step of Puppeteer since each returns a promise.
So instead of a try / catch block you can, if you feel the need to, do the following:
const browser = await puppeteer
  .launch()
  .catch(function (error) {
    /* Handle error here for Puppeteer launch and return
       expected value for browser if things fail */
    console.log(error);
  });
const page = await browser
  .newPage()
  .catch(function (error) {
    /* Handle error here for browser new page and return
       expected value for page if things fail */
    console.log(error);
  });
This, for me, is a much cleaner way of catching any expected exceptions at each step.
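One caveat with per-step handlers: if a handler only logs, the awaited expression evaluates to undefined, and the next step then throws a TypeError. A minimal sketch of one way to guard against that follows. `openPage` and its `puppeteerLike` parameter are hypothetical, introduced so the example can run against stubs; a real call would pass the puppeteer module.

```javascript
// Hypothetical wrapper: puppeteerLike is anything exposing launch(), for
// example the puppeteer module itself or a stub used for testing.
async function openPage(puppeteerLike) {
  const browser = await puppeteerLike.launch().catch((error) => {
    console.log("launch failed:", error.message);
    return null; // explicit fallback value instead of an implicit undefined
  });
  if (browser === null) return "TRYAGAIN";

  const page = await browser.newPage().catch((error) => {
    console.log("newPage failed:", error.message);
    return null;
  });
  if (page === null) {
    await browser.close(); // do not leave the browser running on failure
    return "TRYAGAIN";
  }
  return "COOL";
}
```

Returning an explicit fallback and checking it before the next step keeps the per-step style while avoiding a second, less informative error further down.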