Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Saving XML using ETree in Python. It's not retaining namespaces, and adding ns0, ns1 and removing xmlns tags

I see there are similar questions here, but nothing that has totally helped me. I've also looked at the official documentation on namespaces but can't find anything that is really helping me, perhaps I'm just too new at XML formatting. I understand that perhaps I need to create my own namespace dictionary? Either way, here is my situation:

I am getting a result from an API call, it gives me an XML that is stored as a string in my Python application.

What I'm trying to accomplish is just grab this XML, swap out a tiny value (The b:string value user ConditionValue/Default but that's irrelevant to this question) and then save it as a string to send later on in a Rest POST call.

The source XML looks like this:

<Context xmlns="http://Test.the.Sdk/2010/07" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<xmlns i:nil="true" xmlns="http://schema.test.org/2004/07/Test.Soa.Vocab" xmlns:a="http://schema.test.org/2004/07/System.Xml.Serialize"/>
<Conditions xmlns:a="http://schema.test.org/2004/07/Test.Soa.Vocab">
    <a:Condition>
        <a:xmlns i:nil="true" xmlns:b="http://schema.test.org/2004/07/System.Xml.Serialize"/>
        <Identifier>a23aacaf-9b6b-424f-92bb-5ab71505e3bc</Identifier>
        <Name>Code</Name>
        <ParameterSelections/>
        <ParameterSetCollections/>
        <Parameters/>
        <Summary i:nil="true"/>
        <Instance>25486d6c-36ba-4ab2-9fa6-0dbafbcf0389</Instance>
        <ConditionValue>
            <ComplexValue i:nil="true"/>
            <Text i:nil="true" xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays"/>
            <Default>
                <ComplexValue i:nil="true"/>
                <Text xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
                    <b:string>NULLCODE</b:string>
                </Text>
            </Default>
        </ConditionValue>
        <TypeCode>String</TypeCode>
    </a:Condition>
    <a:Condition>
        <a:xmlns i:nil="true" xmlns:b="http://schema.test.org/2004/07/System.Xml.Serialize"/>
        <Identifier>0af860f6-5611-4a23-96dc-eb3863975529</Identifier>
        <Name>Content Type</Name>
        <ParameterSelections/>
        <ParameterSetCollections/>
        <Parameters/>
        <Summary i:nil="true"/>
        <Instance>6364ec20-306a-4cab-aabc-8ec65c0903c9</Instance>
        <ConditionValue>
            <ComplexValue i:nil="true"/>
            <Text i:nil="true" xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays"/>
            <Default>
                <ComplexValue i:nil="true"/>
                <Text xmlns:b="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
                    <b:string>Standard</b:string>
                </Text>
            </Default>
        </ConditionValue>
        <TypeCode>String</TypeCode>
    </a:Condition>
</Conditions>

My job is to swap out one of the values, retaining the entire structure of the source, and use this to submit a POST later on in the application.

The problem that I am having is that when it saves to a string or to a file, it totally messes up the namespaces:

<ns0:Context xmlns:ns0="http://Test.the.Sdk/2010/07" xmlns:ns1="http://schema.test.org/2004/07/Test.Soa.Vocab" xmlns:ns3="http://schemas.microsoft.com/2003/10/Serialization/Arrays" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ns1:xmlns xsi:nil="true" />
<ns0:Conditions>
<ns1:Condition>
<ns1:xmlns xsi:nil="true" />
<ns0:Identifier>a23aacaf-9b6b-424f-92bb-5ab71505e3bc</ns0:Identifier>
<ns0:Name>Code</ns0:Name>
<ns0:ParameterSelections />
<ns0:ParameterSetCollections />
<ns0:Parameters />
<ns0:Summary xsi:nil="true" />
<ns0:Instance>25486d6c-36ba-4ab2-9fa6-0dbafbcf0389</ns0:Instance>
<ns0:ConditionValue>
<ns0:ComplexValue xsi:nil="true" />
<ns0:Text xsi:nil="true" />
<ns0:Default>
<ns0:ComplexValue xsi:nil="true" />
<ns0:Text>
<ns3:string>NULLCODE</ns3:string>
</ns0:Text>
</ns0:Default>
</ns0:ConditionValue>
<ns0:TypeCode>String</ns0:TypeCode>
</ns1:Condition>
<ns1:Condition>
<ns1:xmlns xsi:nil="true" />
<ns0:Identifier>0af860f6-5611-4a23-96dc-eb3863975529</ns0:Identifier>
<ns0:Name>Content Type</ns0:Name>
<ns0:ParameterSelections />
<ns0:ParameterSetCollections />
<ns0:Parameters />
<ns0:Summary xsi:nil="true" />
<ns0:Instance>6364ec20-306a-4cab-aabc-8ec65c0903c9</ns0:Instance>
<ns0:ConditionValue>
<ns0:ComplexValue xsi:nil="true" />
<ns0:Text xsi:nil="true" />
<ns0:Default>
<ns0:ComplexValue xsi:nil="true" />
<ns0:Text>
<ns3:string>Standard</ns3:string>
</ns0:Text>
</ns0:Default>
</ns0:ConditionValue>
<ns0:TypeCode>String</ns0:TypeCode>
</ns1:Condition>
</ns0:Conditions>

I've narrowed the code down to the most basic form and I'm still getting the same results so it's not anything to do with how I'm manipulating the file normally:

import xml.etree.ElementTree as ET
import requests

get_context_xml = 'http://localhost/testapi/returnxml' #returns first XML example above.
source_context_xml = requests.get(get_context_xml)

Tree = ET.fromstring(source_context_xml)

#Ensure the original namespaces are intact.
for Conditions in Tree.iter('{http://schema.test.org/2004/07/Test.Soa.Vocab}Condition'): 
    print "success"

with open('/home/memyself/output.xml','w') as f:
    f.write(ET.tostring(Tree))
like image 285
emmdee Avatar asked Aug 04 '15 09:08

emmdee


1 Answers

You need to register the prefix and the namespace before you do fromstring() (Reading the xml) to avoid the default namespace prefixes (like ns0 and ns1 , etc.) .

You can use the ET.register_namespace() function for that, Example -

ET.register_namespace('<prefix>','http://Test.the.Sdk/2010/07')
ET.register_namespace('a','http://schema.test.org/2004/07/Test.Soa.Vocab')

You can leave the <prefix> empty if you do not want a prefix.


Example/Demo -

>>> r = ET.fromstring('<a xmlns="blah">a</a>')
>>> ET.tostring(r)
b'<ns0:a xmlns:ns0="blah">a</ns0:a>'
>>> ET.register_namespace('','blah')
>>> r = ET.fromstring('<a xmlns="blah">a</a>')
>>> ET.tostring(r)
b'<a xmlns="blah">a</a>'
like image 153
Anand S Kumar Avatar answered Sep 28 '22 11:09

Anand S Kumar