Access nested children in xml file parsed with ElementTree

I am new to XML parsing. This XML file has the following tree:

FHRSEstablishment
 |--> Header
 |    |--> ...
 |--> EstablishmentCollection
 |    |--> EstablishmentDetail
 |    |    |-->...
 |    |--> Scores
 |    |    |-->...
 |--> EstablishmentCollection
 |    |--> EstablishmentDetail
 |    |    |-->...
 |    |--> Scores
 |    |    |-->...

but when I access it with ElementTree and look for the child tags and attributes,

import xml.etree.ElementTree as ET
import urllib2
# ET.parse() accepts a filename or an open file-like object
tree = ET.parse(
   urllib2.urlopen('http://ratings.food.gov.uk/OpenDataFiles/FHRS408en-GB.xml'))
root = tree.getroot()
# iterating over an element yields only its direct children
for child in root:
   print child.tag, child.attrib

I only get:

Header {}
EstablishmentCollection {}

which I assume means that their attributes are empty. Why is it so, and how can I access the children nested inside EstablishmentDetail and Scores?

EDIT

Thanks to the answers below I can get inside the tree, but if I want to retrieve values such as those in Scores, this fails:

for node in root.find('.//EstablishmentDetail/Scores'):
    rating = node.attrib.get('Hygiene')
    print rating

and produces

None
None
None

Why is that?

asked May 11 '17 16:05

FaCoffee

2 Answers

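You have to call iter() on your root. Looping over root directly yields only its immediate children, whereas root.iter() walks the whole tree, including every nested element: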

import xml.etree.ElementTree as ET
import urllib2
tree = ET.parse(urllib2.urlopen('http://ratings.food.gov.uk/OpenDataFiles/FHRS408en-GB.xml'))
root = tree.getroot()
for child in root.iter():
   print child.tag, child.attrib

Output:

FHRSEstablishment {}
Header {}
ExtractDate {}
ItemCount {}
ReturnCode {}
EstablishmentCollection {}
EstablishmentDetail {}
FHRSID {}
LocalAuthorityBusinessID {}
...
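
To see the difference without downloading the file, here is a minimal Python 3 sketch (the snippets above are Python 2). The SAMPLE string is a made-up miniature of the tree from the question, not the real data, and the values in it are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document shaped like the tree in the question.
SAMPLE = """<FHRSEstablishment>
  <Header><ItemCount>1</ItemCount></Header>
  <EstablishmentCollection>
    <EstablishmentDetail>
      <FHRSID>1</FHRSID>
      <Scores><Hygiene>5</Hygiene></Scores>
    </EstablishmentDetail>
  </EstablishmentCollection>
</FHRSEstablishment>"""

root = ET.fromstring(SAMPLE)

# Iterating over an element yields only its direct children.
direct_children = [child.tag for child in root]

# iter() yields the element itself, then every descendant in document order.
all_tags = [elem.tag for elem in root.iter()]

print(direct_children)
print(all_tags)
```

The first list stops at Header and EstablishmentCollection, which matches what the question observed; the second reaches all the way down to Hygiene.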

To get all the tags inside EstablishmentDetail, find that tag and then loop through its children. Note that find() returns only the first matching element.

For example:

for child in root.find('.//EstablishmentDetail'):
    print child.tag, child.attrib

Output:

FHRSID {}
LocalAuthorityBusinessID {}
BusinessName {}
BusinessType {}
BusinessTypeID {}
RatingValue {}
RatingKey {}
RatingDate {}
LocalAuthorityCode {}
LocalAuthorityName {}
LocalAuthorityWebSite {}
LocalAuthorityEmailAddress {}
Scores {}
SchemeType {}
NewRatingPending {}
Geocode {}
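
Since find() stops at the first match, visiting every establishment needs findall(). A small Python 3 sketch of the difference, assuming a made-up two-entry sample (the business names and ratings are invented for illustration):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with two establishments; names and values are made up.
SAMPLE = """<FHRSEstablishment>
  <EstablishmentCollection>
    <EstablishmentDetail>
      <BusinessName>Cafe A</BusinessName>
      <RatingValue>5</RatingValue>
    </EstablishmentDetail>
    <EstablishmentDetail>
      <BusinessName>Cafe B</BusinessName>
      <RatingValue>3</RatingValue>
    </EstablishmentDetail>
  </EstablishmentCollection>
</FHRSEstablishment>"""

root = ET.fromstring(SAMPLE)

# find() returns only the first matching element (or None if nothing matches).
first = root.find('.//EstablishmentDetail')
first_name = first.findtext('BusinessName')

# findall() returns a list with every matching element.
names = [detail.findtext('BusinessName')
         for detail in root.findall('.//EstablishmentDetail')]

print(first_name)
print(names)
```

findtext() is used here as a shortcut for finding a child and reading its text in one call.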

To get the Hygiene score, as you mentioned in the comment:

root.find('.//Scores') returns only the first Scores element, so a loop such as for each in root.find('.//Scores'): rating = each.get('Hygiene') visits that element's three children: Hygiene, ConfidenceInManagement and Structural. get('Hygiene') looks up an attribute named Hygiene, and none of the three children has one, so you get None three times. The score is the text of the Hygiene child element, not an attribute.

Instead you need to:

- find all Scores tags with findall()
- find the Hygiene tag inside each Scores tag found and read its text

for each in root.findall('.//Scores'):
    rating = each.find('.//Hygiene')
    print '' if rating is None else rating.text

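This prints the text of the Hygiene element for every Scores element, or an empty string where a Scores element has no Hygiene child.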
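
For a self-contained check of the attribute-versus-child distinction, here is a Python 3 sketch. The sample XML and its score values are invented, and the second establishment deliberately has an empty Scores element to exercise the missing-child case.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the second establishment has an empty Scores element.
SAMPLE = """<EstablishmentCollection>
  <EstablishmentDetail>
    <Scores>
      <Hygiene>5</Hygiene>
      <Structural>10</Structural>
      <ConfidenceInManagement>0</ConfidenceInManagement>
    </Scores>
  </EstablishmentDetail>
  <EstablishmentDetail>
    <Scores/>
  </EstablishmentDetail>
</EstablishmentCollection>"""

root = ET.fromstring(SAMPLE)

# Looking for an attribute called Hygiene finds nothing: the element has none.
attribute_lookup = root.find('.//Hygiene').attrib.get('Hygiene')

hygiene_scores = []
for scores in root.findall('.//Scores'):
    # The score is the text of the Hygiene child element, not an attribute.
    # findtext() returns the default when there is no such child.
    hygiene_scores.append(scores.findtext('Hygiene', default=''))

print(attribute_lookup)
print(hygiene_scores)
```

The attribute lookup gives None, which is the same symptom as in the question, while the child lookup returns the score text.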

answered Oct 19 '22 13:10

Keerthana Prabhakaran

Hope this is useful. iterparse() reads the file incrementally and reports each element as it starts and ends:

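import xml.etree.ElementTree as etree
with open('filename.xml') as tmpfile:
    # iterparse yields (event, element) pairs while the file is being read
    doc = etree.iterparse(tmpfile, events=("start", "end"))
    doc = iter(doc)
    # the first event is the "start" of the root element
    event, root = next(doc)
    for event, elem in doc:
        print event, elem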
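
A possible Python 3 variant of the same idea, narrowed to the elements the question cares about. This is only a sketch: the in-memory SAMPLE stands in for 'filename.xml', and the business names in it are invented.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical in-memory stand-in for 'filename.xml'.
SAMPLE = b"""<EstablishmentCollection>
  <EstablishmentDetail><BusinessName>Cafe A</BusinessName></EstablishmentDetail>
  <EstablishmentDetail><BusinessName>Cafe B</BusinessName></EstablishmentDetail>
</EstablishmentCollection>"""

names = []
# With events=("end",) each element is delivered once it is fully parsed,
# so its text and children are available.
for event, elem in ET.iterparse(io.BytesIO(SAMPLE), events=("end",)):
    if elem.tag == 'EstablishmentDetail':
        names.append(elem.findtext('BusinessName'))
        # Drop the finished subtree to keep memory use low on large files.
        elem.clear()

print(names)
```

Listening only for "end" events avoids handling half-built elements, which is usually what you want when extracting values.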

answered Oct 19 '22 11:10

Andrea

Tags: python, xml, xml-parsing, tree, elementtree
