Using XPath in ElementTree

My XML file looks like the following:

<?xml version="1.0"?>
<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
      <Offers>
        <Offer>
          <OfferListing>
            <Price>
              <Amount>1853</Amount>
            </Price>
          </OfferListing>
        </Offer>
      </Offers>
    </Item>
  </Items>
</ItemSearchResponse>

All I want to do is extract the ListPrice.

This is the code I am using:

>> from elementtree import ElementTree as ET
>> fp = open("output.xml","r")
>> element = ET.parse(fp).getroot()
>> e = element.findall('ItemSearchResponse/Items/Item/ItemAttributes/ListPrice/Amount')
>> for i in e:
>>    print i.text
>>
>> e
>>

Absolutely no output. I also tried

>> e = element.findall('Items/Item/ItemAttributes/ListPrice/Amount') 

No difference.

What am I doing wrong?

Ryan R. Rosario asked Aug 23 '09 19:08



1 Answer

You have two problems.

1) element is the root Element (ItemSearchResponse), not an ElementTree. Search paths are resolved relative to it, so a path that starts with the root's own tag, like your first attempt, matches nothing.

2) Your search string needs to use namespaces if you keep the namespace in the XML.
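To see why problem 2 bites, it helps to look at the tag names the parser actually stores. The following is a minimal sketch, assuming Python 3 and the standard library's xml.etree.ElementTree rather than the standalone elementtree package from the question; the inline XML string and the names NS_URI, plain_hits and qualified_hits are illustrative only.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://webservices.amazon.com/AWSECommerceService/2008-08-19"

# Trimmed copy of the document from the question, inlined for illustration.
XML = """<?xml version="1.0"?>
<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
    </Item>
  </Items>
</ItemSearchResponse>"""

root = ET.fromstring(XML)

# The default namespace is folded into every tag name as "{uri}local".
print(root.tag)

# A bare name therefore matches nothing ...
plain_hits = root.findall("Items")
# ... while the fully qualified name finds the child element.
qualified_hits = root.findall("{%s}Items" % NS_URI)

print(len(plain_hits), len(qualified_hits))
```

The last line should report zero matches for the bare name and one for the qualified name, which is the same mismatch that leaves the original search empty.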

To fix problem #1:

Search from the parsed tree and leave the root tag out of the path. Change:

element = ET.parse(fp).getroot() 

to:

element = ET.parse(fp) 

To fix problem #2:

You can take off the xmlns from the XML document so it looks like this:

<?xml version="1.0"?>
<ItemSearchResponse>
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
      <Offers>
        <Offer>
          <OfferListing>
            <Price>
              <Amount>1853</Amount>
            </Price>
          </OfferListing>
        </Offer>
      </Offers>
    </Item>
  </Items>
</ItemSearchResponse>

With this document you can use the following search string:

e = element.findall('Items/Item/ItemAttributes/ListPrice/Amount') 

The full code:

from elementtree import ElementTree as ET
fp = open("output.xml","r")
element = ET.parse(fp)
e = element.findall('Items/Item/ItemAttributes/ListPrice/Amount')
for i in e:
  print i.text
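For readers on a current interpreter, here is a rough Python 3 equivalent of the namespace-free search. It is a sketch under assumptions: it uses the standard library's xml.etree.ElementTree instead of the elementtree package, and it parses an inline string (XML_NO_NS, a name made up here) instead of output.xml. It also shows that a path starting with the root's own tag comes back empty.

```python
import xml.etree.ElementTree as ET

# The document from above with the xmlns attribute removed (inlined for illustration).
XML_NO_NS = """<ItemSearchResponse>
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
      <Offers>
        <Offer>
          <OfferListing>
            <Price>
              <Amount>1853</Amount>
            </Price>
          </OfferListing>
        </Offer>
      </Offers>
    </Item>
  </Items>
</ItemSearchResponse>"""

root = ET.fromstring(XML_NO_NS)

# Paths are relative to the element being searched, so start at Items.
path = "Items/Item/ItemAttributes/ListPrice/Amount"
list_prices = [amount.text for amount in root.findall(path)]

# Prefixing the root's own tag looks for a child named ItemSearchResponse,
# which does not exist, so nothing is returned.
with_root_tag = root.findall("ItemSearchResponse/" + path)

print(list_prices)
print(with_root_tag)
```

Only the ListPrice amount is selected; the Offers amount sits under a different path and is left out.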

Alternate fix to problem #2:

Otherwise, keep the xmlns in the document and prefix every element in the search string with the namespace.

The full code:

from elementtree import ElementTree as ET
fp = open("output.xml","r")
element = ET.parse(fp)

namespace = "{http://webservices.amazon.com/AWSECommerceService/2008-08-19}"
e = element.findall('{0}Items/{0}Item/{0}ItemAttributes/{0}ListPrice/{0}Amount'.format(namespace))
for i in e:
    print i.text

Both print:

2260
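Repeating the full URI in every path step gets noisy. As a possible alternative, the standard library's xml.etree.ElementTree accepts a prefix-to-URI mapping as the second argument of findall. The sketch below assumes Python 3; the prefix "aws" and the names NS, amounts and all_amounts are arbitrary choices made for this example, not anything defined by the document or the service.

```python
import xml.etree.ElementTree as ET

# The namespaced document from the question, inlined for illustration.
XML = """<?xml version="1.0"?>
<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
      <Offers>
        <Offer>
          <OfferListing>
            <Price>
              <Amount>1853</Amount>
            </Price>
          </OfferListing>
        </Offer>
      </Offers>
    </Item>
  </Items>
</ItemSearchResponse>"""

# "aws" is a local alias; only the URI has to match the document.
NS = {"aws": "http://webservices.amazon.com/AWSECommerceService/2008-08-19"}

root = ET.fromstring(XML)

# Each step carries the prefix, which findall expands to "{uri}name".
path = "aws:Items/aws:Item/aws:ItemAttributes/aws:ListPrice/aws:Amount"
amounts = [amount.text for amount in root.findall(path, NS)]

# A descendant search is shorter but also picks up the offer price.
all_amounts = [amount.text for amount in root.findall(".//aws:Amount", NS)]

print(amounts)
print(all_amounts)
```

The explicit path keeps the result limited to the ListPrice amount, while the shorter ".//aws:Amount" form returns every Amount element in document order.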

Brian R. Bondy answered Sep 17 '22 18:09