Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Editing the XML texts from a XML file using Python

Tags:

python

xml

I have an XML file which contains some data as given.

<?xml version="1.0" encoding="UTF-8" ?> 
- <ParameterData>
  <CreationInfo date="10/28/2009 03:05:14 PM" user="manoj" /> 
- <ParameterList count="85">
- <Parameter name="Spec 2 Included" type="boolean" mode="both">
  <Value>n/a</Value> 
  <Result>n/a</Result> 
  </Parameter>
- <Parameter name="Spec 2 Label" type="string" mode="both">
  <Value>n/a</Value> 
  <Result>n/a</Result> 
  </Parameter>
- <Parameter name="Spec 3 Included" type="boolean" mode="both">
  <Value>n/a</Value> 
  <Result>n/a</Result> 
  </Parameter>
- <Parameter name="Spec 3 Label" type="string" mode="both">
  <Value>n/a</Value> 
  <Result>n/a</Result> 
  </Parameter>
  </ParameterList>
  </ParameterData>

I have one text file with lines as

Spec 2 Included : TRUE
Spec 2 Label: 19-Flat2-HS3   
Spec 3 Included : FALSE
Spec 3 Label: 4-1-Bead1-HS3

Now I want to edit XML texts; i,e. I want to replace the field (n/a) with the corresponding values from the text file. Like I want the file to looks like

<?xml version="1.0" encoding="UTF-8" ?> 
- <ParameterData>
  <CreationInfo date="10/28/2009 03:05:14 PM" user="manoj" /> 
- <ParameterList count="85">
- <Parameter name="Spec 2 Included" type="boolean" mode="both">
  <Value>TRUE</Value> 
  <Result>TRUE</Result> 
  </Parameter>
- <Parameter name="Spec 2 Label" type="string" mode="both">
  <Value>19-Flat2-HS3</Value> 
  <Result>19-Flat2-HS3</Result> 
  </Parameter>
- <Parameter name="Spec 3 Included" type="boolean" mode="both">
  <Value>FALSE</Value> 
  <Result>FALSE</Result> 
  </Parameter>
- <Parameter name="Spec 3 Label" type="string" mode="both">
  <Value>4-1-Bead1-HS3</Value> 
  <Result>4-1-Bead1-HS3</Result> 
  </Parameter>
  </ParameterList>
  </ParameterData>

I am new to this Python-XML coding. I dont have idea about how to edit the text fields in a XML file. I am trying to Use elementtree.ElementTree module. but to read the lines in XML file and extract the attributes I dont know which modules need to be imported.

Please help.

Thanks and Regards.

like image 771
manoj1123 Avatar asked Dec 18 '09 05:12

manoj1123


Video Answer


1 Answers

You can convert your data text into python dictionary by regular expression

data="""Spec 2 Included : TRUE
Spec 2 Label: 19-Flat2-HS3
Spec 3 Included : FALSE
Spec 3 Label: 4-1-Bead1-HS3"""

#data=open("data.txt").read()

import re

data=dict(re.findall('(Spec \d+ (?:Included|Label))\s*:\s*(\S+)',data))

data will be as follows

{'Spec 3 Included': 'FALSE', 'Spec 2 Included': 'TRUE', 'Spec 3 Label': '4-1-Bead1-HS3', 'Spec 2 Label': '19-Flat2-HS3'}

Then you can convert it by using any of your favoriate xml parser, I will use minidom here.

from xml.dom import minidom

dom = minidom.parseString(xml_text)
params=dom.getElementsByTagName("Parameter")
for param in params:
    name=param.getAttribute("name")
    if name in data:
        for item in param.getElementsByTagName("*"): # You may change to "Result" or "Value" only
            item.firstChild.replaceWholeText(data[name])

print dom.toxml()

#write to file
open("output.xml","wb").write(dom.toxml())

Results

<?xml version="1.0" ?><ParameterData>
  <CreationInfo date="10/28/2009 03:05:14 PM" user="manoj"/>
  <ParameterList count="85">
    <Parameter mode="both" name="Spec 2 Included" type="boolean">
      <Value>TRUE</Value>
      <Result>TRUE</Result>
    </Parameter>
    <Parameter mode="both" name="Spec 2 Label" type="string">
      <Value>19-Flat2-HS3</Value>
      <Result>19-Flat2-HS3</Result>
    </Parameter>
    <Parameter mode="both" name="Spec 3 Included" type="boolean">
      <Value>FALSE</Value>
      <Result>FALSE</Result>
    </Parameter>
    <Parameter mode="both" name="Spec 3 Label" type="string">
      <Value>4-1-Bead1-HS3</Value>
      <Result>4-1-Bead1-HS3</Result>
    </Parameter>
  </ParameterList>
</ParameterData>
like image 81
YOU Avatar answered Nov 14 '22 23:11

YOU