Convert XML to dictionary in Python using lxml

Tags: python, dictionary, xml

There seem to be lots of solutions on StackOverflow for converting XML to a Python dictionary, but none of them generate the output I'm looking for. I have the following XML:

<?xml version="1.0" encoding="UTF-8"?>
<status xmlns:mystatus="http://localhost/mystatus">
<section1
    mystatus:field1="data1"
    mystatus:field2="data2" />
<section2
    mystatus:lineA="outputA"
    mystatus:lineB="outputB" />
</status>

lxml has an elegantly simple solution for converting XML to a dictionary:

def recursive_dict(element):
    return element.tag, dict(map(recursive_dict, element)) or element.text

Unfortunately, I get:

('status', {'section2': None, 'section1': None})

instead of:

('status', {'section1':
                       {'field1':'data1','field2':'data2'},
            'section2':
                       {'lineA':'outputA','lineB':'outputB'}
            })

I can't figure out how to get my desired output without greatly complicating the recursive_dict() function.

I'm not tied to lxml, and I'm also fine with a different organization of the dictionary, as long as it gives me all the info in the xml. Thanks!

asked Oct 31 '14 01:10

proximous

2 Answers

Personally, I like xmltodict. You can install it with pip: pip install xmltodict.

Note that older releases create OrderedDict objects; recent versions return plain dict objects on Python 3.7+. Example usage:

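import xmltodict as xd

# Open in binary mode: on Python 3 the underlying expat parser
# expects bytes when it reads from a file object.
with open('test.xml', 'rb') as f:
    d = xd.parse(f)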

answered Oct 11 '22 04:10

TheSchwa

I found a solution in this gist: https://gist.github.com/jacobian/795571

def elem2dict(node):
    """
    Convert an lxml.etree node tree into a dict.
    """
    result = {}

    for element in node.iterchildren():
        # Remove namespace prefix
        key = element.tag.split('}')[1] if '}' in element.tag else element.tag

        # Use the text as the value if the element has non-whitespace text;
        # otherwise recurse into its children
        if element.text and element.text.strip():
            value = element.text
        else:
            value = elem2dict(element)

        if key in result:
            # Repeated tag: collect the values in a list
            if isinstance(result[key], list):
                result[key].append(value)
            else:
                # No .copy() here: the existing value may be a str
                result[key] = [result[key], value]
        else:
            result[key] = value
    return result
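
One caveat: elem2dict only looks at child elements and their text, so for the XML in the question, where all the data lives in attributes, it returns an empty dict for each section. Below is a minimal sketch of a variant that also collects attributes. The names strip_ns and elem2dict_with_attrs are made up for this example, and it uses the standard library's xml.etree.ElementTree so it runs without extra packages; lxml elements expose the same .tag, .attrib and child iteration, so the same function should work there too.

```python
import xml.etree.ElementTree as ET


def strip_ns(name):
    """Drop a '{namespace-uri}' prefix from a tag or attribute name."""
    return name.split('}', 1)[1] if '}' in name else name


def elem2dict_with_attrs(node):
    """Convert an element into a dict of its attributes and children.

    Hypothetical helper for illustration, not part of lxml or the gist.
    """
    # Start from the element's own attributes, namespace prefix removed.
    result = {strip_ns(name): value for name, value in node.attrib.items()}

    for child in node:
        key = strip_ns(child.tag)
        # Same rule as elem2dict: non-whitespace text wins, else recurse.
        if child.text and child.text.strip():
            value = child.text
        else:
            value = elem2dict_with_attrs(child)

        if key in result:
            # Repeated key: collect the values in a list.
            if not isinstance(result[key], list):
                result[key] = [result[key]]
            result[key].append(value)
        else:
            result[key] = value
    return result


# The XML from the question, as bytes so the encoding declaration is
# accepted by both ElementTree and lxml.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<status xmlns:mystatus="http://localhost/mystatus">
<section1 mystatus:field1="data1" mystatus:field2="data2" />
<section2 mystatus:lineA="outputA" mystatus:lineB="outputB" />
</status>"""

root = ET.fromstring(XML)
result = (strip_ns(root.tag), elem2dict_with_attrs(root))
print(result)
```

For the question's XML this gives each section a dict of its attributes, keyed by the attribute name without the namespace. Two limitations of the sketch: an element that has both text and attributes keeps only its text, and stripping namespaces can make names from different namespaces collide.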

answered Oct 11 '22 02:10

guettli
