scrapy get the entire text including children

Question

I have a series of <p> elements inside a document I'm scraping with Scrapy.
Some of them look like: bla bla bla, or bla bla bla followed by a child element containing second bla bla

I want to extract all the text, including the text of the children (assume I already have the selector for the <p>).
(For the second example, the result should be the string bla bla bla second bla bla.)

Brian Lynch · Accepted Answer

Here are two options; each has its benefits depending on the situation.

HTML sample

<p>Something outside the span<span> and something inside the span</span></p>

Option 01: use //text() -> returns list

response.xpath('//p//text()').getall()

# returns
>>> ['Something outside the span', ' and something inside the span']
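Since the question asks for a single string, the list from Option 01 still has to be joined. A minimal sketch, assuming you want each fragment stripped and separated by one space; `join_text` is a hypothetical helper name, not part of Scrapy:

```python
def join_text(fragments):
    """Join text fragments (e.g. the result of .getall()) into one string.

    Assumption: each fragment is stripped and empty fragments are dropped,
    so adjacent child elements end up separated by exactly one space.
    """
    cleaned = (fragment.strip() for fragment in fragments)
    return " ".join(fragment for fragment in cleaned if fragment)


# The list returned by Option 01 for the HTML sample
print(join_text(['Something outside the span', ' and something inside the span']))

# Fragments like the second example in the question
print(join_text(['bla bla bla', 'second bla bla']))
```

Joining with a space matters when child elements sit next to each other with no whitespace between them, as in the question's second example; a plain concatenation would run the words together.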

Option 02: use string() -> returns string

response.xpath('string(//p)').get()

# returns
>>> 'Something outside the span and something inside the span'

Anzel · Answer

You can use //text() to extract all the text from descendant nodes.

For example:

.//p//text()

Tags: python, html, scrapy

Boaz
