
Python version 2.7: XML ElementTree: How to iterate through certain elements of a child element in order to find a match

I'm a programming novice and only rarely use Python, so please bear with me as I try to explain what I am trying to do :)

I have the following XML:

<?xml version = "1.0" encoding = "utf-8"?>
<Patients>
    <Patient>
               <PatientCharacteristics>
                   <patientCode>3</patientCode>
               </PatientCharacteristics>
               <Visits>
                   <Visit>
                          <DAS>
                               <CRP>14</CRP>
                               <ESR/>
                               <Joints>
                                       <DAS_PROFILE>28/28</DAS_PROFILE>
                                       <SWOL28>20</SWOL28>
                                       <TEN28>20</TEN28>
                               </Joints>
                          </DAS>
                          <VisitDate>2010-02-17</VisitDate>
                   </Visit>
                   <Visit>
                          <DAS>
                               <CRP>10</CRP>
                               <ESR/>
                               <Joints>
                                       <DAS_PROFILE>28/28</DAS_PROFILE>
                                       <SWOL28>15</SWOL28>
                                       <TEN28>20</TEN28>
                               </Joints>
                          </DAS>
                          <VisitDate>2010-02-10</VisitDate>
                   </Visit>
               </Visits>
    </Patient>
    <Patient>
        <PatientCharacteristics>
                   <patientCode>3</patientCode>
        </PatientCharacteristics>
               <Visits>
                   <Visit>
                          <DAS>
                               <CRP>14</CRP>
                               <ESR/>
                               <Joints>
                                       <DAS_PROFILE>28/28</DAS_PROFILE>
                                       <SWOL28>34</SWOL28>
                                       <TEN28>0</TEN28>
                               </Joints>
                          </DAS>
                          <VisitDate>2010-08-17</VisitDate>
                   </Visit>
                   <Visit>
                          <DAS>
                               <CRP>10</CRP>
                               <ESR/>
                               <Joints>
                                       <DAS_PROFILE>28/28</DAS_PROFILE>
                                       <SWOL28></SWOL28>
                                       <TEN28>2</TEN28>
                               </Joints>
                          </DAS>
                          <VisitDate>2010-07-10</VisitDate>
                   </Visit>
                   <Visit>
                          <DAS>
                               <CRP>9</CRP>
                               <ESR/>
                               <Joints>
                                       <DAS_PROFILE>28/28</DAS_PROFILE>
                                       <SWOL28>56</SWOL28>
                                       <TEN28>6</TEN28>
                               </Joints>
                          </DAS>
                          <VisitDate>2009-07-10</VisitDate>
                   </Visit>
               </Visits>

    </Patient>
</Patients>

All I want to do here is update certain 'SWOL28' values if they match the patientCode and VisitDate that I have stored in a text file. As I understand it, ElementTree elements do not keep a reference to their parent; if they did, I could just use findall() from the root and work backwards from there. As it stands, here is my pseudocode:

  1. For each line in the text file:
  2. Put Visit_Date Patient_Code New_SWOL28 into variables
  3. For each patient element:
  4. If patientCode = Patient_Code
  5. For each Visit element:
  6. If VisitDate = Visit_Date
  7. If SWOL28 element exists for this visit
  8. Update SWOL28 to New_SWOL28

But I am stuck at step number 5. How do I get a list of visits to iterate through? Apologies if this is a very dumb question, but I have searched high and low for an answer, I assure you! I have stripped down my code to the bare example of the part I need to fix below:

import xml.etree.ElementTree as ET
tree = ET.parse('DB3.xml')
root = tree.getroot()
for child in root: # THIS GETS ME ALL THE PATIENT ATTRIBUTES
    print child.tag 
    for x in child/Visit: # THIS IS WHAT I CANNOT FIND THE CORRECT SYNTAX FOR
        # I WOULD THEN PERFORM STEPS 6, 7 AND 8 HERE

I would be deeply appreciative of any ideas any of you may have on this. I am not a programming natural that's for sure!

Thanks in advance, Sarah

Edit 1:

On the advice of SVK below I tried the following:

import xml.etree.ElementTree as ET
tree = ET.parse('Untitled.xml')
root = tree.getroot()
for child in root:
    print child.tag 
    child.find( "visits" )
    for x in child.iter("visit"):
        print x.tag, x.text

But the only output I get is: Patient Patient and none of the lower tags. Any ideas?

Sarah-Ann asked Mar 26 '13 17:03


2 Answers

You can iterate over all the "Visit" elements anywhere beneath an element "element" (iter() walks the whole subtree, not just direct children) like this:

for x in element.iter("Visit"):

You can find the first direct child of element matching a certain tag with:

element.find( "Visits" )

Tag names are case-sensitive, so they must be spelled exactly as in the XML ("Visits", not "visits"). It looks like you will first have to locate the "Visits" element, which is the parent of "Visit", and then iterate through its "Visit" children. Putting those together you'd have something like this:

for patient_element in root:
    print patient_element.tag
    visits_element = patient_element.find("Visits")
    # findall() returns only the direct "Visit" children of "Visits"
    for visit_element in visits_element.findall("Visit"):
        print visit_element.tag, visit_element.findtext("VisitDate")
        # ... further processing of each visit element here
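
One likely reason a loop like this can print only the Patient tags is that ElementTree matches tag names case-sensitively, so "visit" does not match <Visit>. Here is a minimal sketch of the difference, using a cut-down copy of the question's XML (the SAMPLE string and the variable names are made up for illustration, and the code is written to run on both Python 2.7 and 3):

```python
import xml.etree.ElementTree as ET

# Cut-down, hypothetical sample modelled on the question's XML.
SAMPLE = """<Patients>
  <Patient>
    <Visits>
      <Visit><VisitDate>2010-02-17</VisitDate></Visit>
      <Visit><VisitDate>2010-02-10</VisitDate></Visit>
    </Visits>
  </Patient>
</Patients>"""

root = ET.fromstring(SAMPLE)
patient = root.find("Patient")

# Tag names are case-sensitive: the lowercase spelling matches nothing.
lower_matches = list(patient.iter("visit"))
exact_matches = list(patient.iter("Visit"))

# find() looks only at direct children and returns None when nothing matches.
visits_lower = patient.find("visits")
visits_exact = patient.find("Visits")

print(len(lower_matches))    # 0
print(len(exact_matches))    # 2
print(visits_lower is None)  # True
print([v.findtext("VisitDate") for v in visits_exact.findall("Visit")])
```

Because find() returns None for a tag it cannot match, it is worth checking its result before iterating over it.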

In general look at the section "Finding interesting elements" in the documentation for xml.etree.ElementTree: http://docs.python.org/2/library/xml.etree.elementtree.html#finding-interesting-elements
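
The path syntax described in that section can shorten chains of find() calls. A small sketch, assuming the element layout from the question (the sample data and the names SAMPLE and rows are illustrative only):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample following the layout of the question's XML.
SAMPLE = """<Patients>
  <Patient>
    <PatientCharacteristics><patientCode>3</patientCode></PatientCharacteristics>
    <Visits>
      <Visit>
        <DAS><Joints><SWOL28>20</SWOL28></Joints></DAS>
        <VisitDate>2010-02-17</VisitDate>
      </Visit>
      <Visit>
        <DAS><Joints><SWOL28/></Joints></DAS>
        <VisitDate>2010-02-10</VisitDate>
      </Visit>
    </Visits>
  </Patient>
</Patients>"""

root = ET.fromstring(SAMPLE)
rows = []
for patient in root.findall("Patient"):
    # A slash-separated path descends through child elements in one call.
    code = patient.findtext("PatientCharacteristics/patientCode")
    for visit in patient.findall("Visits/Visit"):
        date = visit.findtext("VisitDate")
        # findtext() gives "" for an element that exists but has no text.
        swol28 = visit.findtext("DAS/Joints/SWOL28")
        rows.append((code, date, swol28))

for row in rows:
    print(row)
```

Paths are relative to the element they are called on, which is why the inner loop starts from each patient rather than from the root.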

svk answered Oct 12 '22 05:10


This is untested but it should be fairly close to what you want.

# code, date and new_swol28 are the values read from one line of the text file
for patient in root:
    patient_code = patient.find('PatientCharacteristics').find('patientCode')
    if patient_code.text == code:
        for visit in patient.find('Visits'):
            visit_date = visit.find('VisitDate')
            if visit_date.text == date:
                swol28 = visit.find('DAS').find('Joints').find('SWOL28')
                # an Element with no children is falsy, so test against None
                if swol28 is not None:
                    # set() would add an attribute; assign .text to change the value
                    swol28.text = new_swol28
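
For completeness, here is one way the whole loop from the question's pseudocode might be put together. It is only a sketch: the function name update_swol28, the inline SAMPLE data, and the assumption that each text-file line is "Visit_Date Patient_Code New_SWOL28" separated by whitespace are mine, not something the question states.

```python
import xml.etree.ElementTree as ET

def update_swol28(root, patient_code, visit_date, new_value):
    """Set SWOL28 on every matching patient/visit pair; return the count."""
    changed = 0
    for patient in root.findall("Patient"):
        if patient.findtext("PatientCharacteristics/patientCode") != patient_code:
            continue
        for visit in patient.findall("Visits/Visit"):
            if visit.findtext("VisitDate") != visit_date:
                continue
            swol28 = visit.find("DAS/Joints/SWOL28")
            if swol28 is not None:       # step 7: the element exists
                swol28.text = new_value  # step 8: replace its text
                changed += 1
    return changed

# Hypothetical sample following the layout of the question's XML.
SAMPLE = """<Patients>
  <Patient>
    <PatientCharacteristics><patientCode>3</patientCode></PatientCharacteristics>
    <Visits>
      <Visit>
        <DAS><Joints><SWOL28>20</SWOL28></Joints></DAS>
        <VisitDate>2010-02-17</VisitDate>
      </Visit>
    </Visits>
  </Patient>
  <Patient>
    <PatientCharacteristics><patientCode>3</patientCode></PatientCharacteristics>
    <Visits>
      <Visit>
        <DAS><Joints><SWOL28></SWOL28></Joints></DAS>
        <VisitDate>2010-07-10</VisitDate>
      </Visit>
    </Visits>
  </Patient>
</Patients>"""

root = ET.fromstring(SAMPLE)

# Stand-in for the lines of the text file (assumed format, see above).
update_lines = ["2010-02-17 3 22", "2010-07-10 3 5", "1999-01-01 3 9"]

total = 0
for line in update_lines:
    visit_date, patient_code, new_value = line.split()
    total += update_swol28(root, patient_code, visit_date, new_value)

print(total)  # 2
print([e.text for e in root.iter("SWOL28")])
```

With a real file you would presumably start from tree = ET.parse('DB3.xml') and root = tree.getroot(), then save the result with tree.write(...) once all lines have been processed.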

Peter Enns answered Oct 12 '22 05:10