Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to update/modify an XML file in python?

Tags:

python

io

xml

I have an XML document that I would like to update after it already contains data.

I thought about opening the XML file in "a" (append) mode. The problem is that the new data will be written after the root closing tag.

How can I delete the last line of a file, then start writing data from that point, and then close the root tag?

Of course I could read the whole file and do some string manipulations, but I don't think that's the best idea..

Thanks for your time.

like image 744
nunos Avatar asked Oct 19 '09 22:10

nunos


People also ask

How do I update an XML file?

To update data in an XML column, use the SQL UPDATE statement. Include a WHERE clause when you want to update specific rows. The entire column value will be replaced. The input to the XML column must be a well-formed XML document.

How does Python manage XML?

To read an XML file using ElementTree, firstly, we import the ElementTree class found inside xml library, under the name ET (common convension). Then passed the filename of the xml file to the ElementTree. parse() method, to enable parsing of our xml file. Then got the root (parent tag) of our xml file using getroot().


2 Answers

Using ElementTree:

import xml.etree.ElementTree  # Open original file et = xml.etree.ElementTree.parse('file.xml')  # Append new tag: <a x='1' y='abc'>body text</a> new_tag = xml.etree.ElementTree.SubElement(et.getroot(), 'a') new_tag.text = 'body text' new_tag.attrib['x'] = '1' # must be str; cannot be an int new_tag.attrib['y'] = 'abc'  # Write back to file #et.write('file.xml') et.write('file_new.xml') 

note: output written to file_new.xml for you to experiment, writing back to file.xml will replace the old content.

IMPORTANT: the ElementTree library stores attributes in a dict, as such, the order in which these attributes are listed in the xml text will NOT be preserved. Instead, they will be output in alphabetical order. (also, comments are removed. I'm finding this rather annoying)

ie: the xml input text <b y='xxx' x='2'>some body</b> will be output as <b x='2' y='xxx'>some body</b>(after alphabetising the order parameters are defined)

This means when committing the original, and changed files to a revision control system (such as SVN, CSV, ClearCase, etc), a diff between the 2 files may not look pretty.

like image 161
FraggaMuffin Avatar answered Sep 20 '22 08:09

FraggaMuffin


Useful Python XML parsers:

  1. Minidom - functional but limited
  2. ElementTree - decent performance, more functionality
  3. lxml - high-performance in most cases, high functionality including real xpath support

Any of those is better than trying to update the XML file as strings of text.

What that means to you:

Open your file with an XML parser of your choice, find the node you're interested in, replace the value, serialize the file back out.

like image 25
Gabriel Hurley Avatar answered Sep 22 '22 08:09

Gabriel Hurley