This is a small sample of my XML file:
<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:pPr>
<w:rPr>
<w:highlight w:val="yellow"/>
</w:rPr>
</w:pPr>
<w:bookmarkStart w:id="0" w:name="_GoBack"/>
<w:bookmarkEnd w:id="0"/>
<w:r w:rsidRPr="00D1434D">
<w:rPr>
<w:rFonts w:ascii="Times New Roman"
w:eastAsia="MS PGothic"
w:hAnsi="Times New Roman"/>
<w:b/>
<w:color w:val="000000"/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
<w:highlight w:val="yellow"/>
</w:rPr>
<w:t xml:space="preserve">Responses to </w:t>
</w:r>
<w:r w:rsidR="00335D4A" w:rsidRPr="00D1434D">
<w:rPr>
<w:rFonts w:ascii="Times New Roman"
w:eastAsia="MS PGothic"
w:hAnsi="Times New Roman"/>
<w:b/>
<w:color w:val="000000"/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
<w:highlight w:val="yellow"/>
<w:lang w:eastAsia="ja-JP"/>
</w:rPr>
<w:t>the Reviewer</w:t>
</w:r>
</w:p>
I want to extract the text of runs whose w:highlight tag has the attribute value "yellow". I searched for it but wasn't able to come up with a solution.
The following works for highlight in general:
for t in source.xpath('.//*[local-name()="highlight"]/../..//*[local-name()="t"]'):
    ...  # do something
I tried:
for t in lxml_tree.xpath('//*[local-name()="highlight"][@val="yellow"]/../..//*[local-name()="t"]'):
    ...  # do something
This doesn't work; it returns nothing.
The local-name function returns a string representing the local name of the first node in a given node-set.
The w:val attribute is in a namespace, so you can't address it as just @val. One possible solution is to use the @*[local-name()='attribute name'] expression to address the attribute by its local name, similar to what you've done for elements:
//*[local-name()="highlight"][@*[local-name()='val' and .='yellow']]/../..//*[local-name()="t"]
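For reference, here is a minimal sketch of that expression run with lxml against a trimmed-down version of the sample. The names SAMPLE, W_NS, local_name_query and prefixed_query are made up for this illustration, and the run with w:val="green" is an addition (it is not in the original file) included only to show that non-yellow runs are skipped. The second query is an alternative, not part of the answer above: it binds the w prefix through lxml's namespaces argument instead of using local-name().

```python
from lxml import etree

# WordprocessingML namespace, copied from the xmlns:w declaration in the sample.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Trimmed-down sample; the "green" run is a hypothetical addition for contrast.
SAMPLE = f"""\
<w:p xmlns:w="{W_NS}">
  <w:pPr><w:rPr><w:highlight w:val="yellow"/></w:rPr></w:pPr>
  <w:r>
    <w:rPr><w:b/><w:highlight w:val="yellow"/></w:rPr>
    <w:t xml:space="preserve">Responses to </w:t>
  </w:r>
  <w:r>
    <w:rPr><w:highlight w:val="green"/></w:rPr>
    <w:t>not wanted</w:t>
  </w:r>
  <w:r>
    <w:rPr><w:highlight w:val="yellow"/></w:rPr>
    <w:t>the Reviewer</w:t>
  </w:r>
</w:p>
"""

tree = etree.fromstring(SAMPLE)

# Approach from the answer: match both the element and the attribute by local name.
local_name_query = (
    '//*[local-name()="highlight"]'
    '[@*[local-name()="val" and .="yellow"]]'
    '/../..//*[local-name()="t"]'
)
by_local_name = [t.text for t in tree.xpath(local_name_query)]

# Alternative: map the "w" prefix to its namespace URI and use prefixed names.
prefixed_query = '//w:highlight[@w:val="yellow"]/../..//w:t'
by_prefix = [t.text for t in tree.xpath(prefixed_query, namespaces={"w": W_NS})]

print(by_local_name)
print(by_prefix)
```

Both queries should select the same w:t elements. The prefixed form is usually easier to read, while the local-name() form avoids having to know the namespace URI up front.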