When I take a screenshot of a website using Puppeteer, cookie consent prompts are displayed. I want to dismiss or accept these prompts before taking the screenshot. The problem is that most websites present the cookie prompt in different ways, so it's difficult to isolate them.
How can I best target and dismiss these prompts using Puppeteer?
I don't believe there is a general way of doing this, as these prompts are elements like any other element on the page. Having said that, there are some attempts to block them with extensions or filter lists that you can try.
I haven't tested any of these and do not know whether they're effective.
Keep in mind that headless Chrome doesn't support extensions, so the browser has to be launched with headless: false. Loading extensions in Puppeteer:
const browser = await puppeteer.launch({
  headless: false, // extensions are not loaded in headless mode
  args: [
    '--disable-extensions-except=/path/to/manifest/folder/',
    '--load-extension=/path/to/manifest/folder/',
  ]
});
Update: A more general way to combat cookie consents with headless puppeteer
This approach is nowhere near complete either, but shows an efficient way to eliminate cookie consent pop-ups in a less specific way. It uses language and generalized selectors to detect consent buttons and links rather than solely relying on exact selectors for each website.
In the following example I target the elements a and button that sit within a container whose id or class contains the name cookie. I limit the buttons to this context so that I don't click around the website by accident.
Furthermore, it uses a case-insensitive regular expression to identify button text that is commonly used to accept cookies. The German pattern in the example can be replaced with ^(Accept all|Accept|I understand|Agree|Okay|OK)$ or translated into any language of your choice.
await page.evaluate(_ => {
  function xcc_contains(selector, text) {
    var elements = document.querySelectorAll(selector);
    return Array.prototype.filter.call(elements, function(element){
      return RegExp(text, "i").test(element.textContent.trim());
    });
  }
  var _xcc;
  _xcc = xcc_contains('[id*=cookie] a, [class*=cookie] a, [id*=cookie] button, [class*=cookie] button', '^(Alle akzeptieren|Akzeptieren|Verstanden|Zustimmen|Okay|OK)$');
  if (_xcc != null && _xcc.length != 0) { _xcc[0].click(); }
});
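As a minimal sketch of the text-matching step, the pattern can also be built from a plain list of phrases, which makes it easier to add languages than editing one long regular expression. The helper names buildConsentPattern and isConsentLabel are made up for illustration, and the phrase list is only a starting point taken from the labels mentioned above.

```javascript
// Hypothetical helper: builds one anchored, case-insensitive pattern
// from a list of button labels, escaping regex metacharacters.
function buildConsentPattern(phrases) {
  const escaped = phrases.map(p => p.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
  return new RegExp('^(' + escaped.join('|') + ')$', 'i');
}

// Hypothetical helper: true if a button's visible text is an accept label.
function isConsentLabel(text, pattern) {
  return pattern.test(text.trim());
}

const pattern = buildConsentPattern([
  'Accept all', 'Accept', 'I understand', 'Agree', 'Okay', 'OK', // English
  'Alle akzeptieren', 'Akzeptieren', 'Verstanden', 'Zustimmen',   // German
]);
```

Functions defined in Node are not visible inside page.evaluate, so one option is to pass pattern.source as an argument to page.evaluate and rebuild the expression there with new RegExp(source, 'i').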
Old Answer:
There is indeed no general way to handle cookie consent pop-ups, as they vary greatly, and even the Chrome extensions won't handle all of them. However, you can replicate what the extensions do and manage your own list by evaluating JS code on the target site before taking a screenshot.
In my case I just accept them all, in headless mode. Add more selectors as you identify them. You could use the selectors of dismiss buttons instead, if you prefer.
Below are some real-world examples that should help to get you going:
await page.evaluate(_ => {
  var xcc;
  // ids
  var xcc_id = [
    'borlabsCookieOptionAll',
    'cookie-apply-all',
    'cookie-settings-all',
    // add ids here
  ];
  for (let i = 0; i < xcc_id.length; i++) {
    xcc = document.getElementById(xcc_id[i]);
    if (xcc != null) {
      xcc.click();
    }
  }
  // classes
  var xcc_class = [
    'accept-all',
    'accept-cookies-button',
    'avia-cookie-select-all',
    // add classes here
  ];
  for (let i = 0; i < xcc_class.length; i++) {
    xcc = document.getElementsByClassName(xcc_class[i]);
    if (xcc != null && xcc.length != 0) {
      xcc[0].click();
    }
  }
  // custom data attributes
  xcc = document.querySelectorAll('[data-cookieman-accept-all]'); if (xcc != null && xcc.length != 0) { xcc[0].click(); }
  // hide iframes, can't eval
  xcc = document.querySelectorAll("iframe[src*=eurocookie]"); if (xcc != null && xcc.length != 0) { xcc[0].style.display = 'none'; }
  xcc = document.querySelectorAll("iframe[src*=eurocookie]"); if (xcc != null && xcc.length > 1) { xcc[1].style.display = 'none'; }
});
There is surely a more elegant way of doing this, but this way I was able to quickly organize my list, make changes on the fly, and sort and remove duplicates in the code editor by keeping the entries as one-liners or in arrays.
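One possible tidier variant, as a rough sketch only: keep the ids, classes and data attributes as plain lists and fold them into a single CSS selector. The helper name buildConsentSelector is hypothetical, and the sketch assumes none of the names need CSS escaping.

```javascript
// Hypothetical helper: turns the id, class and attribute lists into one
// comma-separated CSS selector (assumes the names need no CSS escaping).
function buildConsentSelector({ ids = [], classes = [], attributes = [] }) {
  return [
    ...ids.map(id => '#' + id),
    ...classes.map(c => '.' + c),
    ...attributes.map(a => '[' + a + ']'),
  ].join(', ');
}

const selector = buildConsentSelector({
  ids: ['borlabsCookieOptionAll', 'cookie-apply-all', 'cookie-settings-all'],
  classes: ['accept-all', 'accept-cookies-button', 'avia-cookie-select-all'],
  attributes: ['data-cookieman-accept-all'],
});

// Usage with Puppeteer (inside an async function, with `page` already open):
// await page.evaluate(sel => {
//   const el = document.querySelector(sel);
//   if (el) el.click();
// }, selector);
```

Note that this is not identical to the loops above: document.querySelector returns only the first match in document order, so a single element is clicked rather than one per id or class.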
Alternatively, just use the { headless: false } option and load an extension that does it for you, as suggested. Cheers.
Side note: Interacting with cookie consent pop-ups can cause your code to break if the page reloads (page navigation error). To circumvent this, I use a fixed delay of 3000-4000 ms after await page.evaluate( ... ):
const delay = (ms) => new Promise(resolve => setTimeout(resolve, ms));
await delay(3500);
This also catches plenty of meta refreshes and JS redirects, and gives some extra time for large resources to load.
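If the fixed delay feels wasteful, a possible alternative is to wait for the navigation itself and fall back to the timeout only when nothing happens. The sketch below is an assumption-laden illustration, not a tested drop-in: actAndSettle is a made-up name, and it relies on Puppeteer's page.waitForNavigation with its waitUntil and timeout options.

```javascript
// Hypothetical helper: runs `action` (e.g. the page.evaluate call that
// clicks the consent button) while waiting for a navigation it may trigger.
// If no navigation happens within `timeoutMs`, the timeout error is swallowed.
async function actAndSettle(page, action, timeoutMs = 3500) {
  // Start listening before the click so a fast reload is not missed.
  const navigation = page
    .waitForNavigation({ waitUntil: 'load', timeout: timeoutMs })
    .catch(() => null); // no navigation is the normal case
  const result = await action();
  await navigation;
  return result;
}

// Usage (inside an async function, with `page` already open):
// await actAndSettle(page, () => page.evaluate(() => { /* click consent */ }));
```

When the page does reload, this returns as soon as the load event fires; when it doesn't, it waits the full timeout, just like the fixed delay. Unlike the fixed delay, it gives no extra time for late meta refreshes or slow resources after the load event.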