I'm interested in the differences between these two blocks of code.
const $anchor = await page.$('a.buy-now');
const link = await $anchor.getProperty('href');
await $anchor.click();
await page.evaluate(() => {
  const $anchor = document.querySelector('a.buy-now');
  const text = $anchor.href;
  $anchor.click();
});
I've generally found raw DOM elements in page.evaluate() easier to work with, and the ElementHandles returned by the $ methods an abstraction too far.
However, I wondered whether the async Puppeteer methods might be more performant or more reliable. I couldn't find any guidance on this in the docs and would be interested in learning more about the pros and cons of each approach, and the motivation behind adding methods like page.$$().
The main difference between the two snippets is how much interaction takes place between the Node.js environment and the browser environment.
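The first code snippet will do the following:
1. Run document.querySelector in the browser and return the element handle (to the Node.js environment)
2. Run getProperty on the handle and return the result (to the Node.js environment)
3. Click the element inside the browser
The second code snippet simply does this:
1. Run the given function in the browser context (and return the result to the Node.js environment)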
Regarding the performance of these statements, one has to remember that Puppeteer communicates with the browser via WebSockets. Therefore the second snippet will run faster, as just one command is sent to the browser (in contrast to three).
This might make a big difference if the browser you are connecting to is running on a different machine (connected to using puppeteer.connect). If the script and the browser are located on the same machine, it will likely only amount to a difference of a few milliseconds.
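To make the round-trip argument concrete, here is a minimal sketch of both approaches as functions, plus a small timing helper. The names viaHandle, viaEvaluate and timeIt are my own and not part of Puppeteer, and `page` is assumed to be a Puppeteer Page that has already navigated to the target site. Note that the handle version needs an extra jsonValue() call to turn the property handle into a plain string.

```javascript
// Approach 1: element handle. Every await below is at least one
// separate message between Node.js and the browser.
async function viaHandle(page, selector) {
  const handle = await page.$(selector);
  if (!handle) throw new Error(`No element matches ${selector}`);
  const hrefHandle = await handle.getProperty('href'); // returns a JSHandle
  const href = await hrefHandle.jsonValue();           // plain string in Node.js
  await handle.click();
  return href;
}

// Approach 2: one page.evaluate call. The whole function body runs in
// the browser and only the return value travels back to Node.js.
async function viaEvaluate(page, selector) {
  return page.evaluate((sel) => {
    const anchor = document.querySelector(sel);
    if (!anchor) throw new Error(`No element matches ${sel}`);
    const href = anchor.href;
    anchor.click();
    return href;
  }, selector);
}

// Hypothetical helper: measures how long an async function takes, in ms.
async function timeIt(label, fn) {
  const start = process.hrtime.bigint();
  const result = await fn();
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { label, ms, result };
}
```

Wrapping each call, for example `await timeIt('handle', () => viaHandle(page, 'a.buy-now'))`, against a local and a remote browser should show how much the extra messages cost in your own setup.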
Using element handles has some advantages. First, functions like elementHandle.click will behave more "human-like" in contrast to using document.querySelector('...').click(). Puppeteer will, for example, move the mouse to the location and click in the center of the element instead of just executing the click function.
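As a rough illustration of what "click in the center of the element" means, the sketch below approximates that behavior by hand with boundingBox() and page.mouse.click(). The function name clickCenter is my own, and this is only an approximation: the real elementHandle.click also scrolls the element into view first.

```javascript
// Approximates elementHandle.click(): find the element's box, then
// click its center with real mouse events (move, press, release).
async function clickCenter(page, selector) {
  const handle = await page.$(selector);
  if (!handle) throw new Error(`No element matches ${selector}`);

  // boundingBox() resolves to null when the element is not visible.
  const box = await handle.boundingBox();
  if (!box) throw new Error(`Element ${selector} is not visible`);

  const x = box.x + box.width / 2;
  const y = box.y + box.height / 2;
  await page.mouse.click(x, y);
  return { x, y };
}
```

By contrast, calling click() on the DOM element inside page.evaluate only dispatches a click event, without any mouse movement.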
In general, I recommend using page.evaluate whenever possible, as this API is also a lot easier to debug. When an error happens, you can simply reproduce it by opening the DevTools in your Chrome browser and rerunning the same lines there. If you are mixing a lot of page.$ statements together, it might be much harder to understand what the problem is and whether it happened inside the Node.js or the browser runtime.
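One way to keep the debugging benefit is to write the browser-side logic as a standalone function that only uses browser globals and its own arguments. The sketch below assumes that style; readLink and getLinkInfo are hypothetical names of my own. Because readLink has no Node.js dependencies, the same function can be pasted into the DevTools console and called there.

```javascript
// Runs inside the page. It must not reference any Node.js variables,
// because Puppeteer sends the function to the browser as source text.
function readLink(selector) {
  const anchor = document.querySelector(selector);
  if (!anchor) return null;
  // Return plain, serializable data, not the DOM element itself.
  return { href: anchor.href, text: anchor.textContent.trim() };
}

// Runs in Node.js. Extra arguments to page.evaluate are passed on
// to the page function.
async function getLinkInfo(page, selector) {
  return page.evaluate(readLink, selector);
}
```

In the console, `readLink('a.buy-now')` should then give the same result the script sees.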
Use element handles if you need the element for longer (for example, because you have to make some complex calculations or wait for an external event before you can extract information from it).
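A minimal sketch of that longer-lived case, assuming the external event is represented by a promise: the handle is looked up once, kept while the script waits, then passed back into page.evaluate to read from the same element. The function name hrefAfterEvent and the externalEvent parameter are my own.

```javascript
// Keeps a handle to the element across an asynchronous wait, then
// reads a property from that same element.
async function hrefAfterEvent(page, selector, externalEvent) {
  const handle = await page.$(selector);
  if (!handle) throw new Error(`No element matches ${selector}`);
  try {
    // Anything awaitable on the Node.js side, e.g. a timer or a response
    // from another service.
    await externalEvent;
    // A handle passed as an argument arrives in the page as the element.
    return await page.evaluate((el) => el.href, handle);
  } finally {
    // Release the reference so the browser can garbage-collect the element.
    await handle.dispose();
  }
}
```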