
Get title from newly opened page puppeteer

I am trying to switch to a newly opened tab and scrape the title of that page with Puppeteer.

This is what I have so far:

// use puppeteer
const puppeteer = require('puppeteer');

// Wait length in ms: 1000 ms = 1 s
const short_wait_ms = 1000;

async function run() {
    const browser = await puppeteer.launch({ headless: false, timeout: 0 });
    const page = await browser.newPage();

    await page.goto('https://biologyforfun.wordpress.com/2017/04/03/interpreting-random-effects-in-linear-mixed-effect-models/');

    // second page DOM elements
    const CLICKHERE_SELECTOR = '#post-2068 > div > div.entry-content > p:nth-child(2) > a:nth-child(1)';

    // main page
    await page.waitFor(short_wait_ms);
    await page.click(CLICKHERE_SELECTOR);


    // new tab opens - move to new tab
    let pages = await browser.pages();

    //go to the newly opened page

    //console.log title -- Generalized Linear Mixed Models in Ecology and in R

}

run();

I can't figure out how to use browser.pages() to start working on the new page.

Alex asked Dec 14 '22 19:12


2 Answers

According to the Puppeteer Documentation:

page.title()

  • returns: <Promise<string>> Returns page's title.

Shortcut for page.mainFrame().title().

Therefore, once you have the Page object for the newly opened tab, call page.title() on it to get that tab's title.

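Alternatively, you can evaluate document.title in the main frame directly:

page.mainFrame().evaluate(() => document.title)

This skips the page.title() shortcut, though any performance gain is likely negligible. Avoid reaching into private fields such as page._frameManager._mainFrame for this, since underscore-prefixed internals can change between releases.

Note: page.title() returns a Promise, so use the await operator to get the resolved string rather than the pending Promise.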
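
For the situation in the question, where the click opens a new tab, one way to get hold of the new Page object is to listen for the browser's 'targetcreated' event before clicking. The following is a minimal sketch, not a definitive implementation: getTitleOfNewTab is a hypothetical helper name, and it assumes the link really opens a new tab and that the new page eventually sets a non-empty title.

```javascript
// Hypothetical helper: click a link that opens a new tab and return that tab's title.
// `browser` and `page` are the Puppeteer Browser and Page objects from the question.
async function getTitleOfNewTab(browser, page, selector) {
  // Start listening before the click so the 'targetcreated' event is not missed.
  const newTargetPromise = new Promise((resolve) => {
    browser.once('targetcreated', resolve);
  });

  await page.click(selector);

  // Resolve the new target into a Page object.
  const target = await newTargetPromise;
  const newPage = await target.page();
  if (!newPage) {
    throw new Error('The newly created target is not a page');
  }

  // Assumption: the new page sets a non-empty title once it has loaded.
  await newPage.waitForFunction(() => document.title !== '');
  return newPage.title();
}
```

Inside run() from the question, this could be called as console.log(await getTitleOfNewTab(browser, page, CLICKHERE_SELECTOR));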

Grant Miller answered Dec 16 '22 10:12


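You don't need to bring the new tab to the front: Puppeteer works with Page objects, so you only need a reference to the page you want to read.

To get the title of any page, call title() on its Page object: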

const pageTitle = await page.title();
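
Since title() works on any Page object, the browser.pages() call from the question can be used to inspect every open tab. This is a rough sketch in which getAllTitles is a hypothetical helper name; it assumes each tab has finished loading, otherwise a title may come back empty.

```javascript
// Hypothetical helper: return the titles of all tabs currently open in the browser.
async function getAllTitles(browser) {
  // browser.pages() resolves to an array of Page objects, one per open tab.
  const pages = await browser.pages();
  // title() returns a Promise, so wait for all of them together.
  return Promise.all(pages.map((p) => p.title()));
}
```

Logging the result after the click, for example console.log(await getAllTitles(browser));, shows which entry belongs to the newly opened tab.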

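Also, after you click something that triggers a navigation, wait for the load event or for the network to become idle before reading the title:

// Wait for the navigation to finish (no network connections for at least 500 ms)
await page.waitForNavigation({ waitUntil: 'networkidle0' });

Early releases of Puppeteer used 'networkidle' together with a networkIdleTimeout option; current releases use 'networkidle0' or 'networkidle2' instead.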

Check the docs: https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagewaitfornavigationoptions
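
When the click navigates in the same tab, a common pattern is to start waiting for the navigation before the click resolves, so that a fast navigation is not missed. This is a minimal sketch, assuming a Puppeteer version that accepts 'networkidle0' as a waitUntil value; clickAndGetTitle is a hypothetical helper name.

```javascript
// Hypothetical helper: click a link that navigates in the same tab,
// wait for the navigation to settle, then return the new title.
async function clickAndGetTitle(page, selector) {
  // Both promises are created before either is awaited, so the
  // navigation listener is already in place when the click happens.
  await Promise.all([
    page.waitForNavigation({ waitUntil: 'networkidle0' }),
    page.click(selector),
  ]);
  return page.title();
}
```

If the link opens a new tab instead, the original page does not navigate and waitForNavigation would time out, so this helper only fits same-tab navigation.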

EcoVirtual answered Dec 16 '22 09:12