How can I tell ElementTree to ignore namespaces in an XML file?
For example, I would prefer to query modelVersion (as in statement 1) rather than {http://maven.apache.org/POM/4.0.0}modelVersion (as in statement 2).
pom="""
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
</project>
"""
from xml.etree import ElementTree
ElementTree.register_namespace("","http://maven.apache.org/POM/4.0.0")
root = ElementTree.fromstring(pom)
print(1, root.findall('modelVersion'))
print(2, root.findall('{http://maven.apache.org/POM/4.0.0}modelVersion'))
1 []
2 [<Element '{http://maven.apache.org/POM/4.0.0}modelVersion' at 0x1006bff10>]
There doesn't appear to be a straightforward way, so I'd simply wrap the find calls, e.g.
from xml.etree import ElementTree as ET
POM = """
<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://maven.apache.org/POM/4.0.0">
<modelVersion>4.0.0</modelVersion>
</project>
"""
NSPS = {'foo' : "http://maven.apache.org/POM/4.0.0"}
# sic!
def findall(node, tag):
    # Prepend the placeholder prefix so callers can pass bare tag names.
    return node.findall('foo:' + tag, NSPS)
root = ET.fromstring(POM)
print([ET.tostring(el, encoding='unicode') for el in findall(root, 'modelVersion')])
output:
['<ns0:modelVersion xmlns:ns0="http://maven.apache.org/POM/4.0.0">4.0.0</ns0:modelVersion>\n']
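On Python 3.8 or later, the wrapper may not be needed: ElementTree's path syntax accepts a `{*}` wildcard that matches a tag name in any namespace (or none). The following is a minimal sketch, assuming that Python version and reusing the POM from above; the names `matches` and `versions` are only for illustration.

```python
from xml.etree import ElementTree as ET

POM = """
<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
</project>
"""

root = ET.fromstring(POM)

# "{*}" matches modelVersion regardless of its namespace (Python 3.8+ only).
matches = root.findall('{*}modelVersion')

# The elements keep their fully qualified tags; only the query ignores the namespace.
versions = [el.text for el in matches]
print(versions)
```

The tree itself is left untouched, so serializing it afterwards still emits the original namespace.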
Here's what I'm presently doing, which makes me incredibly confident that there's a better way.
$ cat pom.xml |
tr '\n' ' ' |
sed 's/<project [^>]*>/<project>/' |
myprogram |
sed 's/<project>/<project xmlns="http:\/\/maven.apache.org\/POM\/4.0.0" xmlns:xsi="http:\/\/www.w3.org\/2001\/XMLSchema-instance" xsi:schemaLocation="http:\/\/maven.apache.org\/POM\/4.0.0 http:\/\/maven.apache.org\/maven-v4_0_0.xsd">/'
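The stripping step of this pipeline can probably be done inside the program instead of with sed. Below is a rough sketch, not a definitive solution: `strip_namespaces` is a hypothetical helper that rewrites each element's tag in place after parsing. It assumes that losing the namespace on the parsed tree is acceptable; attribute names such as `xsi:schemaLocation` are left as they are, and the namespace would have to be added back separately before writing the file out.

```python
from xml.etree import ElementTree as ET

def strip_namespaces(root):
    """Remove the '{uri}' prefix from every element tag, in place."""
    for el in root.iter():
        # Comments and processing instructions have non-string tags; skip them.
        if isinstance(el.tag, str) and el.tag.startswith('{'):
            el.tag = el.tag.split('}', 1)[1]
    return root

POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
</project>
"""

root = strip_namespaces(ET.fromstring(POM))

# Plain tag names now work in queries.
found = root.findall('modelVersion')
print(root.tag, found[0].text)
```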