Puppeteer page doesn't want to load totally in headless mode

Here is my code:

// Open the browser
let browser = await puppeteer.launch({
    args: ["--no-sandbox"]
});
let page = await browser.newPage();

const navPromise = page.waitForSelector('#js_boite_reception').then(() => {
    console.log('received');
});
await page.goto(entMessagesURL);
await navPromise;

// Wait 10 seconds to rule out a slow connection (mine is not slow)
logger.log(`On the messages page (session=${username})`);
await delay(10000);

// Write an html file with the page content
let pageContent = await page.content();
require('fs').writeFileSync('./test.html', pageContent);

The "received" message is never logged and I get a timeout error. However, if I remove the waitForSelector call and only write the test.html file, I can see the following:

Headless mode enabled, a part of the page is not loaded

Headless mode disabled, all the page is loaded

In headless mode, only part of the page content is loaded, and I don't know why. Even with a timeout of one minute, nothing more loads. What can I do?

Note: I tried setting a user agent:

await page.setUserAgent("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36");

(placed right after the let page = await browser.newPage() line)

Androz2091 asked Jan 27 '20 16:01


2 Answers

await page.setUserAgent("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36");

This worked for me! My website was blocking headless mode when I tried it locally. After setting the user agent, I was finally able to find the selector.
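If hardcoding an old Chrome version string is undesirable, a variation is to derive the user agent from the browser itself. The sketch below assumes the default headless user agent contains the token "HeadlessChrome", which was the case for headless Chrome builds around the time of the question; the helper names toRegularUserAgent and newPageWithRegularUserAgent are made up for illustration, and browser is expected to be the object returned by puppeteer.launch().

```javascript
// Sketch only. Assumption: the default headless user agent contains
// "HeadlessChrome"; if it does not, the string is returned unchanged.
function toRegularUserAgent(userAgent) {
    return userAgent.replace('HeadlessChrome', 'Chrome');
}

// Hypothetical helper: `browser` is expected to be a Puppeteer Browser.
async function newPageWithRegularUserAgent(browser) {
    const page = await browser.newPage();
    // browser.userAgent() resolves to the browser's default user agent string.
    const defaultUserAgent = await browser.userAgent();
    // Set it before calling page.goto() so the first request already carries it.
    await page.setUserAgent(toRegularUserAgent(defaultUserAgent));
    return page;
}
```

With this in place, let page = await newPageWithRegularUserAgent(browser); would stand in for the newPage() and setUserAgent() calls. Sites can detect headless browsers in other ways, so this may not be enough on its own.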

Boost answered Nov 03 '22 04:11


I'm pretty sure it's a race condition, and it happens because you are trying to wait for the selector before you navigate to the page.

Try reordering those lines so that the navigation comes first:

await page.goto(entMessagesURL);
const navPromise = page.waitForSelector('#js_boite_reception').then(() => {
    console.log('received');
});

I can't try to reproduce your error because I don't know which page it is or what it was written in.
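For illustration, the reordering could be wrapped in a small helper. This is only a sketch: gotoAndWaitFor is a made-up name, page is expected to be a Puppeteer Page, and the waitUntil: 'networkidle2' option and the 30-second default are assumptions added here, not something taken from the question.

```javascript
// Hypothetical helper: navigate first, then wait for the element.
async function gotoAndWaitFor(page, url, selector, timeoutMs = 30000) {
    // Assumption: 'networkidle2' resolves once network activity has mostly settled.
    await page.goto(url, { waitUntil: 'networkidle2' });
    // Rejects with a timeout error if the selector never appears within timeoutMs.
    return page.waitForSelector(selector, { timeout: timeoutMs });
}
```

It would be called as await gotoAndWaitFor(page, entMessagesURL, '#js_boite_reception'); in place of the goto and waitForSelector lines. If the element is still missing in headless mode only, the ordering is probably not the cause.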

Alejandro Molina answered Nov 03 '22 05:11