
Python/ElementTree: Write to file without namespaces

I'm trying to write an ElementTree object to disk. Everything works, except that the output file looks like this:

<html:html lang="en-US" xml:lang="en-US" xmlns:html="http://www.w3.org/1999/xhtml">
<html:head>
<html:title>vocab</html:title>
<html:style type="text/css"> ...

Since it's got the html: namespace info, the browser can't render it.

How can I make etree save some html to disk without the html: namespace info?

Here's the code I'm using to write:

with open('/path/to/file.html', mode='w', encoding='utf-8') as outfile:
    mypage.write(outfile)

Thanks!

Nathan asked May 22 '11 14:05


2 Answers

I've been using this workaround:

from xml.etree import ElementTree as ET
ET.register_namespace('', 'http://www.w3.org/1999/xhtml')

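This registers the XHTML namespace under the empty prefix, so it is written as the default namespace: elements come out without the html: prefix, and the root element carries a plain xmlns="http://www.w3.org/1999/xhtml" declaration. Call it before writing the tree.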
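To see the effect in isolation, here is a minimal sketch. The small tree built below is a hypothetical stand-in for the asker's mypage, and XHTML_NS is just a convenience name introduced for the example.

```python
from xml.etree import ElementTree as ET

XHTML_NS = 'http://www.w3.org/1999/xhtml'

# Map the XHTML namespace to the empty prefix (i.e. the default namespace).
# This changes a global registry, so it affects all later serialization.
ET.register_namespace('', XHTML_NS)

# Hypothetical stand-in for the real document: html > head > title.
root = ET.Element('{%s}html' % XHTML_NS)
head = ET.SubElement(root, '{%s}head' % XHTML_NS)
title = ET.SubElement(head, '{%s}title' % XHTML_NS)
title.text = 'vocab'

# encoding='unicode' makes tostring() return a str instead of bytes.
serialized = ET.tostring(root, encoding='unicode')
print(serialized)
```

The printed markup should start with an unprefixed html element carrying the xmlns declaration, with no html: prefixes on the child elements.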

nupanick answered Nov 18 '22 02:11


Well, I've got it working, but with a kind of roundabout method.

I'm getting a string for the tree (with etree.tostring()), and then using re.sub('html:', '', thetext) to remove the namespace prefix. Then, I just write the string to disk normally.
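A rough sketch of that approach, under a few assumptions: the tiny tree is a hypothetical stand-in for the real page, the output goes to a temporary directory rather than the real path, and the html prefix is registered explicitly so the result does not depend on earlier register_namespace calls. Note that the substitution is purely textual, so it would also strip any literal "html:" that appears in the page's text content, and it leaves the xmlns:html declaration on the root element.

```python
import os
import re
import tempfile
from xml.etree import ElementTree as ET

XHTML_NS = 'http://www.w3.org/1999/xhtml'

# ElementTree already uses the 'html' prefix for this namespace by default;
# registering it explicitly keeps the sketch independent of earlier calls.
ET.register_namespace('html', XHTML_NS)

# Hypothetical stand-in for the real document.
root = ET.Element('{%s}html' % XHTML_NS)
title = ET.SubElement(root, '{%s}title' % XHTML_NS)
title.text = 'vocab'

# Serialize to a str, then strip the prefix with a plain text substitution.
thetext = ET.tostring(root, encoding='unicode')
cleaned = re.sub('html:', '', thetext)

# Write the cleaned string to disk as ordinary text.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'file.html')
    with open(path, mode='w', encoding='utf-8') as outfile:
        outfile.write(cleaned)
    with open(path, encoding='utf-8') as infile:
        written = infile.read()

print(cleaned)
```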

Nathan answered Nov 18 '22 01:11