
How to read the value of a span element with Puppeteer

I am trying to do some web scraping by reading some lines inside an HTML page. I need to find text that is repeated throughout the page inside several <span> elements. In the following example I would like to end up with an array of strings: ['Text number 1','Text number 2','Text number 3']

<html>
    ...
    <span>Text number 1</span>
    ...  
    <span>Text number 2</span>
    ...
    <span>Text number 3</span>
    ...
</html>

I have the following code:

sElements = ' ... span'; // I declare the selector.
cs = await page.$$(sElements); // I get an array of ElementHandle

The selector works: in the Google Chrome developer tools it captures exactly the 3 elements I am looking for, and the cs variable is filled with an array of three elements. But then I try:

for(c in cs)
    console.log(c.innerText);

But undefined is logged. I have tried .text, .value, .innerText, .innerHTML, .textContent ... I do not know what I am missing, as I think this should be really simple.

I have also tried this, with the same undefined result:

cs = await page.$$eval(sElements, e => e.innerHTML);

usuario asked Jul 12 '18 13:07


1 Answer

Here is an example that would get the innerText of the last span element.

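  // `page` is the Puppeteer Page from the question; run this inside an async function.
  const spanHandles = await page.$$('span');                  // array of ElementHandle
  const lastSpan = spanHandles.pop();                         // keep only the last <span>
  const textHandle = await lastSpan.getProperty('innerText'); // JSHandle wrapping the property
  const text = await textHandle.jsonValue();                  // plain string usable in Node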

If you are still unable to get any text, make sure the selector is correct and that the span elements have an innerText defined (not outerText). You can run $(selector) in the Chrome console to check.
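
To collect the text of every matching span, as the question asks, rather than only the last one, something along these lines may help. It is a minimal sketch, not a tested drop-in: the helper names getTextsWithEval and getTextsWithHandles are made up for illustration, and page is assumed to be an already opened Puppeteer Page. Two details explain the undefined results in the question. for...in iterates over the array keys ("0", "1", ...) instead of the handles, and an ElementHandle has no innerText property of its own. Also, the callback given to page.$$eval receives the whole array of matched elements, so it has to map over it.

```javascript
// Sketch only: `page` is assumed to be an open Puppeteer Page and
// `selector` a CSS selector matching the target <span> elements.
// Both helper names are hypothetical, chosen for this example.

// Option 1: read the text inside the browser context in one call.
// page.$$eval hands the callback an ARRAY of matched elements.
async function getTextsWithEval(page, selector) {
  return page.$$eval(selector, (elements) =>
    elements.map((el) => el.innerText)
  );
}

// Option 2: walk the ElementHandle array returned by page.$$.
// for...of yields the handles themselves, unlike for...in.
async function getTextsWithHandles(page, selector) {
  const handles = await page.$$(selector);
  const texts = [];
  for (const handle of handles) {
    const property = await handle.getProperty('innerText'); // JSHandle
    texts.push(await property.jsonValue());                 // plain string
  }
  return texts;
}
```

With the selector from the question, const texts = await getTextsWithEval(page, sElements); should give the array of strings directly. The $$eval variant is usually the shorter choice; the handle-based variant is closer to the answer above and is useful if the handles are needed for something else afterwards.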

Matt Shirley answered Sep 19 '22 08:09