Puppeteer querySelector returns null

I am trying to scrape some data with Puppeteer, but for some sites querySelector returns null and I have no idea what is wrong. I found some answers about this issue on Stack Overflow, but none of them worked. Here is the code, with an example link that does not work.

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    await page.goto('https://www.macys.com/shop/product/the-north-face-mens-logo-half-dome-t-shirt?ID=2085687&CategoryID=30423&cm_kws=2085687');

    const textContent = await page.evaluate(() => {
        return document.querySelector('.price');
    });

    console.log(textContent);

    browser.close();
})();
Aram Sheroyan asked Jun 03 '18 06:06


2 Answers

The elements are probably loaded asynchronously via JavaScript and are not yet in the DOM when you call .evaluate().

Try waiting for the selector with Puppeteer's page.waitForSelector function before querying it:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    await page.goto('https://www.macys.com/shop/product/the-north-face-mens-logo-half-dome-t-shirt?ID=2085687&CategoryID=30423&cm_kws=2085687');

    // Block until the element exists in the DOM (rejects if the timeout elapses).
    await page.waitForSelector('.price');

    // Return the text rather than the element: DOM nodes do not survive
    // the trip from the browser back to Node.js.
    const textContent = await page.evaluate(() => {
        return document.querySelector('.price').textContent;
    });

    console.log(textContent);

    await browser.close();
})();
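
If the selector never appears (for example because the page was blocked), page.waitForSelector rejects once its timeout elapses. The following is only a minimal sketch of one way to handle that case. The helper name getTextOrNull and the 5000 ms default are assumptions made for illustration, not part of the original answer. It relies on page.waitForSelector and page.$eval from the Puppeteer Page API.

```javascript
// Hypothetical helper: wait for `selector`, then return its trimmed text.
// Returns null instead of throwing when the element does not show up.
async function getTextOrNull(page, selector, timeoutMs = 5000) {
  try {
    // Rejects if the element is not in the DOM within timeoutMs.
    await page.waitForSelector(selector, { timeout: timeoutMs });
  } catch (err) {
    // Any rejection here (typically a timeout) is treated as "not found".
    return null;
  }
  // The callback runs in the browser; only the string is sent back to Node.js.
  return page.$eval(selector, (el) => el.textContent.trim());
}
```

It could be called as `const price = await getTextOrNull(page, '.price');` right after page.goto. A null result suggests the page that was served does not contain the element at all.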
Pjotr Raskolnikov answered Oct 17 '22 14:10


After taking a snapshot of the page, it turned out that my request was being blocked by a bot detection system. Here is the solution. We just need to pass some more data (here, a realistic User-Agent) so the request won't be detected as a bot. If it is still not working, you can check out this tutorial.

const puppeteer = require('puppeteer');

// This is where we'll put the code to get around the tests.
const preparePageForTests = async (page) => {
  // Pass the User-Agent Test.
  const userAgent = 'Mozilla/5.0 (X11; Linux x86_64) ' +
    'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.39 Safari/537.36';
  await page.setUserAgent(userAgent);
};

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await preparePageForTests(page);

  // await page.setRequestInterception(true);
  await page.goto('websiteURL');

  // Return the element's text so the value can be sent back to Node.js.
  const textContent = await page.evaluate(() => {
    return document.querySelector('yourCSSselector').textContent;
  });
  console.log(textContent);

  await browser.close();
})();
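
The snapshot step mentioned above can be sketched roughly as follows. This is a hedged illustration, not the exact code used. The helper name diagnoseMissingSelector and the default file name debug.png are assumptions. It uses page.$, page.screenshot and page.title from the Puppeteer Page API.

```javascript
// Hypothetical helper: if `selector` is missing, save a screenshot and
// report the page title so a block page is easy to recognise.
async function diagnoseMissingSelector(page, selector, screenshotPath = 'debug.png') {
  // page.$ resolves to null when nothing matches the selector.
  const handle = await page.$(selector);
  if (handle !== null) {
    return { found: true };
  }
  // Capture what the browser actually rendered.
  await page.screenshot({ path: screenshotPath, fullPage: true });
  const title = await page.title();
  return { found: false, title, screenshotPath };
}
```

Calling `await diagnoseMissingSelector(page, 'yourCSSselector')` after page.goto writes the image next to the script. A title or screenshot showing an error or verification page is a hint that the request was treated as a bot.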
Aram Sheroyan answered Oct 17 '22 14:10