
Scrapy Xpath with text() contains

Tags: xpath, scrapy

I'm using Scrapy, and I'm trying to find a span that contains a specific piece of text. I have:

response.selector.xpath('//*[@class="ParamText"]/span/text()')

which returns:

[<Selector xpath='//*[@class="ParamText"]/span/text()' data=u' MILES STODOLINK'>,
 <Selector xpath='//*[@class="ParamText"]/span/text()' data=u'C'>,
 <Selector xpath='//*[@class="ParamText"]/span/text()' data=u'  MILES STODOLINK'>]

However when I run:

>>> response.selector.xpath('//*[@class="ParamText"]/span[contains(text(),"STODOLINK")]')
Out[11]: []

Why does the contains function not work?

user1592380 asked Oct 11 '16 02:10


1 Answer

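contains() cannot evaluate multiple nodes at once. Its first argument is treated as a string, and when XPath 1.0 converts a node-set to a string it uses only the first node in document order. So in this predicate, text() effectively means "the first text node child of the span":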

/span[contains(text(),"STODOLINK")]

So if the span has several text nodes and "STODOLINK" isn't in the first one, contains() in the expression above won't match. Apply the contains() check to each text node individually instead, as follows:

//*[@class="ParamText"]/span[text()[contains(.,"STODOLINK")]]

Or, if "STODOLINK" isn't necessarily a direct text child of the span (it may be nested inside another element within the span), you can simply use . instead of text():

//*[@class="ParamText"]/span[contains(.,"STODOLINK")]
har07 answered Oct 04 '22 19:10