So here's the code snippet:
    for (let item of items)
    {
        await page.waitFor(10000)
        await page.click("#item_"+item)
        await page.click("#i"+item)
        let pages = await browser.pages()
        let tempPage = pages[pages.length-1]
        await tempPage.waitFor("a.orange", {timeout: 60000, visible: true})
        await tempPage.click("a.orange")
        counter++
    }
page and tempPage are two different pages.
What happens is that page waits for 10 seconds, then clicks some stuff, which opens a second page.
What's supposed to happen is that tempPage waits for an element, clicks it, then page should wait 10 seconds before doing it all over again.
However, what actually happens is that page waits for 10 seconds, clicks the stuff, then starts waiting for 10 seconds without waiting for tempPage to finish its tasks.
Is this a bug, or am I misunderstanding something? How should I fix this so that the for loop only starts its next iteration after tempPage has clicked?
You can wait for a promise either with the async/await syntax or by calling its .then() method. Inside a function marked with the async keyword, putting await before a call that returns a promise pauses that function at that point until the promise settles: the fulfilled value becomes the result of the await expression, and a rejection is thrown as an exception.
If the click triggers a navigation, you can use Puppeteer's page.waitForNavigation() method to wait for it explicitly before continuing your script. The usual pattern is to pass both the page.waitForNavigation() promise and the page.click() promise to Promise.all(), so that the script waits for the click and the navigation together.
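As a rough illustration of that Promise.all() pattern, here is a minimal sketch. The helper name clickAndWaitForNavigation is made up for this example, and it assumes the click navigates the same page it was issued on. It would not cover a click that opens a separate tab, as in the question.

```javascript
// Hypothetical helper (the name is not part of Puppeteer).
// `page` is expected to be a Puppeteer Page, or any object exposing
// waitForNavigation() and click() with the same behaviour.
async function clickAndWaitForNavigation(page, selector, options = {}) {
  // Start waiting for the navigation before the click is sent, so a fast
  // navigation cannot finish before we begin listening for it.
  const [response] = await Promise.all([
    page.waitForNavigation(options),
    page.click(selector),
  ]);
  // Whatever waitForNavigation() resolved with is handed back to the caller.
  return response;
}
```

With a real Puppeteer page this would be called as await clickAndWaitForNavigation(page, "#some-link"), where "#some-link" is a placeholder selector.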
Generally, you cannot rely on await tempPage.click("a.orange") to pause execution until tempPage has "finish[ed] its tasks". For super simple code that executes synchronously, it may work. But in general, you cannot rely on it.
If the click triggers an Ajax operation, or starts a CSS animation, or starts a computation that cannot be immediately computed, or opens a new page, etc., then the result you are waiting for is asynchronous, and the .click method will not wait for this asynchronous operation to complete.
What can you do? In some cases you may be able to hook into the code that is running on the page and wait for some event that matters to you. For instance, if you want to wait for an Ajax operation to be done and the code on the page uses jQuery, then you might use ajaxComplete to detect when the operation is complete. If you cannot hook into any event system to detect when the operation is done, then you may need to poll the page to wait for evidence that the operation is done.
Here is an example that shows the issue:
    const puppeteer = require('puppeteer');

    // Read the two counters back from the browser side.
    function getResults(page) {
        return page.evaluate(() => ({
            clicked: window.clicked,
            asynchronousResponse: window.asynchronousResponse,
        }));
    }

    puppeteer.launch().then(async browser => {
        const page = await browser.newPage();
        await page.goto("https://example.com");
        // We add a button to the page that we will click later.
        await page.evaluate(() => {
            const button = document.createElement("button");
            button.id = "myButton";
            button.textContent = "My Button";
            document.body.appendChild(button);
            window.clicked = 0;
            window.asynchronousResponse = 0;
            button.addEventListener("click", () => {
                // Synchronous operation.
                window.clicked++;
                // Asynchronous operation.
                setTimeout(() => {
                    window.asynchronousResponse++;
                }, 1000);
            });
        });
        console.log("before clicks", await getResults(page));
        const button = await page.$("#myButton");
        await button.click();
        await button.click();
        console.log("after clicks", await getResults(page));
        // Poll the page until both asynchronous responses have arrived.
        await page.waitForFunction(() => window.asynchronousResponse === 2);
        console.log("after wait", await getResults(page));
        await browser.close();
    });
The setTimeout code simulates any kind of asynchronous operation started by the click.
When you run this code, you'll see on the console:
    before clicks { clicked: 0, asynchronousResponse: 0 }
    after clicks { clicked: 2, asynchronousResponse: 0 }
    after wait { clicked: 2, asynchronousResponse: 2 }
You see that clicked is immediately incremented twice by the two clicks. However, it takes a while before asynchronousResponse is incremented. The statement await page.waitForFunction(() => window.asynchronousResponse === 2) polls the page until the condition we are waiting for is realized.
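page.waitForFunction checks a condition inside the page. When the evidence you are waiting for lives on the Node side instead (for example, the number of pages the browser reports), the same polling idea can be written by hand. The following is only a sketch: pollUntil, sleep, and the option names intervalMs and timeoutMs are invented for this example and are not Puppeteer APIs.

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: call `check` repeatedly until it returns a truthy
// value, then resolve with that value. `check` may be sync or async.
// Rejects if the condition is still not met once `timeoutMs` has elapsed.
async function pollUntil(check, { intervalMs = 100, timeoutMs = 30000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const result = await check();
    if (result) return result;
    if (Date.now() >= deadline) {
      throw new Error(`pollUntil: condition not met within ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}
```

Assuming browser is a Puppeteer Browser, a call such as pollUntil(async () => (await browser.pages()).length > 1) would wait until a second page shows up. The timeout keeps a condition that never becomes true from hanging the script forever.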
You mentioned in a comment that the button is closing the tab. Opening and closing tabs are asynchronous operations. Here's an example:
    const puppeteer = require('puppeteer');

    puppeteer.launch().then(async browser => {
        let pages = await browser.pages();
        console.log("number of pages", pages.length);
        const page = pages[0];
        await page.goto("https://example.com");
        await page.evaluate(() => {
            window.open("https://example.com");
        });
        do {
            pages = await browser.pages();
            // For whatever reason, I need to have this here otherwise
            // browser.pages() always returns the same value. And the loop
            // never terminates.
            await page.evaluate(() => {});
            console.log("number of pages after evaluating open", pages.length);
        } while (pages.length === 1);
        let tempPage = pages[pages.length - 1];
        // Add a button that will close the page when we click it.
        // The await makes sure the button exists before we look it up below.
        await tempPage.evaluate(() => {
            const button = document.createElement("button");
            button.id = "myButton";
            button.textContent = "My Button";
            document.body.appendChild(button);
            window.clicked = 0;
            window.asynchronousResponse = 0;
            button.addEventListener("click", () => {
                window.close();
            });
        });
        const button = await tempPage.$("#myButton");
        await button.click();
        do {
            pages = await browser.pages();
            // For whatever reason, I need to have this here otherwise
            // browser.pages() always returns the same value. And the loop
            // never terminates.
            await page.evaluate(() => {});
            console.log("number of pages after click", pages.length);
        } while (pages.length > 1);
        await browser.close();
    });
When I run the above, I get:
    number of pages 1
    number of pages after evaluating open 1
    number of pages after evaluating open 1
    number of pages after evaluating open 2
    number of pages after click 2
    number of pages after click 1
You can see it takes a bit before window.open() and window.close() have detectable effects.
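Instead of polling browser.pages(), it may be possible to wait for the events Puppeteer emits: the browser's targetcreated event when a tab opens, and the page's close event when it goes away. The sketch below applies that idea to the loop from the question. The helper names (waitForNewPage, waitForPageClose, clickThroughItems) and the 60-second default timeouts are assumptions made for illustration, as is the idea that clicking a.orange closes the tab (taken from the comment mentioned above). I have not run this against the asker's site.

```javascript
// Hypothetical helper: resolves with the Page of the next tab `browser` opens.
function waitForNewPage(browser, timeoutMs = 60000) {
  return new Promise((resolve, reject) => {
    const cleanup = () => {
      clearTimeout(timer);
      browser.off('targetcreated', onTarget);
    };
    const onTarget = (target) => {
      if (target.type() !== 'page') return; // skip workers and other targets
      cleanup();
      target.page().then(resolve, reject);
    };
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error(`No new page opened within ${timeoutMs} ms`));
    }, timeoutMs);
    browser.on('targetcreated', onTarget);
  });
}

// Hypothetical helper: resolves once `page` has been closed.
function waitForPageClose(page, timeoutMs = 60000) {
  if (page.isClosed()) return Promise.resolve();
  return new Promise((resolve, reject) => {
    const onClose = () => {
      clearTimeout(timer);
      page.off('close', onClose);
      resolve();
    };
    const timer = setTimeout(() => {
      page.off('close', onClose);
      reject(new Error(`Page still open after ${timeoutMs} ms`));
    }, timeoutMs);
    page.on('close', onClose);
  });
}

// The question's loop, restructured so that each iteration only ends after
// the temporary page has actually closed.
async function clickThroughItems(browser, page, items) {
  for (const item of items) {
    // Start listening before the clicks so the new tab cannot be missed.
    const newPage = waitForNewPage(browser);
    await page.click('#item_' + item);
    await page.click('#i' + item);
    const tempPage = await newPage;
    await tempPage.waitForSelector('a.orange', { visible: true, timeout: 60000 });
    // Same idea: subscribe to the close event before triggering it.
    const closed = waitForPageClose(tempPage);
    await tempPage.click('a.orange');
    await closed;
  }
}
```

The main design point is ordering: each listener is registered before the action that triggers the event, which avoids the race where the tab opens or closes before the script starts waiting for it.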
In your comment you also wrote:
"I thought await was basically what turned an asynchronous function into a synchronous one"
I would not say it turns asynchronous functions into synchronous ones. It makes the current code wait for an asynchronous operation's promise to be resolved or rejected. However, more importantly for the issue at hand here, the problem is that you have two virtual machines executing JavaScript code: there's Node which runs puppeteer and the script that controls the browser, and there's the browser itself which has its own JavaScript virtual machine. Any await that you use on the Node side affects only the Node code: it has no bearing on the code that runs in the browser.
It can get confusing when you see things like await page.evaluate(() => { some code; }). It looks like it is all of one piece, and all executing in the same virtual machine, but it is not. puppeteer takes the parameter passed to .evaluate, serializes it, and sends it over to the browser, where it executes. Try adding something like await page.evaluate(() => { button.click(); }); in the script above, after const button = .... Something like this:
    const button = await tempPage.$("#myButton");
    await button.click();
    await page.evaluate(() => { button.click(); });
In the script, button is defined before page.evaluate, but you'll get a ReferenceError when page.evaluate runs because button is not defined on the browser side!
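If you do want browser-side code to act on something you found from Node, the value has to cross the boundary as an explicit argument to page.evaluate rather than through a closure. An element handle returned by page.$ can be passed this way and arrives in the browser as the DOM element. Here is a minimal sketch: the helper name clickViaEvaluate is invented, and the handle is assumed to come from the same page that evaluate is called on.

```javascript
// Hypothetical helper: look up an element from Node, then click it from
// code that runs inside the browser.
async function clickViaEvaluate(page, selector) {
  const handle = await page.$(selector);
  if (!handle) {
    throw new Error(`No element matches ${selector}`);
  }
  // The arrow function is serialized and executed in the browser. It cannot
  // see `handle` or any other Node variable, so the handle is passed as an
  // argument and shows up on the browser side as the parameter `el`.
  await page.evaluate((el) => el.click(), handle);
}
```

Applied to the script above, this would be called as await clickViaEvaluate(tempPage, "#myButton").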