
How do I get the text from a span in Python using Scrapy?

Tags:

python

scrapy

Here is the HTML I'm working with:

<div class="rendering rendering_person rendering_short rendering_person_short">
  <h3 class="title">
    <a rel="Person" href="https://moh-it.pure.elsevier.com/en/persons/massimo-eraldo-abate" class="link person"><span>Massimo Eraldo Abate</span></a>
  </h3>
  <ul class="relations email">
    <li class="email"><a href="[email protected]" class="link"><span>[email protected]</span></a></li>
  </ul>
  <p class="type"><span class="family">Person: </span>Academic</p>
</div>

From the code above, how can I extract the name Massimo Eraldo Abate?

Please help me.

rajeshbojja asked Aug 29 '17 06:08




1 Answer

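You can extract the name with this XPath expression, which selects the text node of the span inside the link under the h3 with class "title":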

response.xpath('//h3[@class="title"]/a/span/text()').extract_first()

Also, take a look at Scrapinghub's blog post for an introduction to XPath.

Tomáš Linhart answered Nov 04 '22 19:11