
Return a list of all variable names in a python nested dict/json document in dot notation

I'm looking for a function that takes an arbitrarily nested Python dict/list structure in JSON-like format and returns a list of strings, in dot notation, naming every key it contains, to arbitrary depth. So, if the object is...

x = {
    'a': 'meow',
    'b': {
        'c': 'asd'
    },
    'd': [
        {
            "e": "stuff",
            "f": 1
        },
        {
            "e": "more stuff",
            "f": 2
        }
    ]
}

mylist = f(x) would return...

>>> mylist
['a', 'b', 'b.c', 'd[0].e', 'd[0].f', 'd[1].e', 'd[1].f']
Mittenchops asked Jul 30 '13 16:07



2 Answers

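def dot_notation(obj, prefix=''):
    """Yield the dot-notation path of every leaf value in obj."""
    if isinstance(obj, dict):
        # Separate nested keys with a dot, but not at the top level.
        if prefix:
            prefix += '.'
        for k, v in obj.items():
            yield from dot_notation(v, prefix + str(k))
    elif isinstance(obj, list):
        # List items are addressed by their index in brackets.
        for i, v in enumerate(obj):
            yield from dot_notation(v, prefix + '[' + str(i) + ']')
    else:
        # Leaf value: the accumulated prefix is its full path.
        yield prefix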

Example:

>>> list(dot_notation(x))
['a', 'b.c', 'd[0].e', 'd[0].f', 'd[1].e', 'd[1].f']
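
The expected output in the question also lists 'b', a key whose value is itself a dict, which a leaf-only traversal skips. The following is a minimal sketch of one way to emit container keys as well; it is not part of the original answer, the name dot_notation_all is hypothetical, and it assumes Python 3.7+ so that dicts keep insertion order. Note that it also reports list paths such as 'd' and 'd[0]', which goes slightly beyond the list shown in the question.

```python
def dot_notation_all(obj, prefix=''):
    """Yield a path for every key and index, containers included."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f'{prefix}.{key}' if prefix else str(key)
            yield path
            # Only containers have children worth descending into.
            if isinstance(value, (dict, list)):
                yield from dot_notation_all(value, path)
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            path = f'{prefix}[{index}]'
            yield path
            if isinstance(value, (dict, list)):
                yield from dot_notation_all(value, path)


x = {
    'a': 'meow',
    'b': {'c': 'asd'},
    'd': [{'e': 'stuff', 'f': 1}, {'e': 'more stuff', 'f': 2}],
}

print(list(dot_notation_all(x)))
# ['a', 'b', 'b.c', 'd', 'd[0]', 'd[0].e', 'd[0].f', 'd[1]', 'd[1].e', 'd[1].f']
```

Filtering out the paths that contain or precede a bracket would bring the result closer to the exact list in the question, if that is what is wanted.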
Andrew Clark answered Oct 17 '22 00:10



This is a fun one. I solved it using recursion.

def parse(d):
    return parse_dict(d)

def parse_dict(d):
    items = []
    for key, val in d.items():  # use d.iteritems() on Python 2
        if isinstance(val, dict):
            # use dot notation for dicts
            items += ['{}.{}'.format(key, vals) for vals in parse_dict(val)]
        elif isinstance(val, list):
            # use bracket notation for lists
            items += ['{}{}'.format(key, vals) for vals in parse_list(val)]
        else:
            # just use the key for everything else
            items.append(key)
    return items

def parse_list(l):
    items = []
    for idx, val in enumerate(l):
        if isinstance(val, dict):
            items += ['[{}].{}'.format(idx, vals) for vals in parse_dict(val)]
        elif isinstance(val, list):
            items += ['[{}]{}'.format(idx, vals) for vals in parse_list(val)]
        else:
            # scalar list element: report its index, not its value
            items.append('[{}]'.format(idx))
    return items

Here is my result:

>>> parse(x)
['a', 'b.c', 'd[0].e', 'd[0].f', 'd[1].e', 'd[1].f']

EDIT

Here it is again using generators, because I liked the answer by F.j.

def parse(d):
    return list(parse_dict(d))

def parse_dict(d):
    for key, val in d.items():  # use d.iteritems() on Python 2
        if isinstance(val, dict):
            # use dot notation for dicts
            for item in parse_dict(val):
                yield '{}.{}'.format(key, item)
        elif isinstance(val, list):
            # use bracket notation
            for item in parse_list(val):
                yield '{}{}'.format(key, item)
        else:
            # lowest level - just use the key
            yield key

def parse_list(l):
    for idx, val in enumerate(l):
        if isinstance(val, dict):
            for item in parse_dict(val):
                yield '[{}].{}'.format(idx, item)
        elif isinstance(val, list):
            for item in parse_list(val):
                yield '[{}]{}'.format(idx, item)
        else:
            # scalar list element: report its index, not its value
            yield '[{}]'.format(idx)

The same result:

>>> parse(x)
['a', 'b.c', 'd[0].e', 'd[0].f', 'd[1].e', 'd[1].f']
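
Since the question asks for arbitrary depth, it may be worth noting that recursive solutions are bounded by Python's recursion limit (1000 frames by default). Below is a rough sketch of the same leaf-path traversal using an explicit stack instead; it is not from either answer, the name dot_notation_iter is hypothetical, and it assumes Python 3.7+ for ordered dicts.

```python
def dot_notation_iter(obj):
    """Yield leaf paths using an explicit stack instead of recursion."""
    stack = [(obj, '')]
    while stack:
        node, prefix = stack.pop()
        if isinstance(node, dict):
            # Push in reverse so items are popped in their original order.
            for key, value in reversed(list(node.items())):
                path = f'{prefix}.{key}' if prefix else str(key)
                stack.append((value, path))
        elif isinstance(node, list):
            for index in range(len(node) - 1, -1, -1):
                stack.append((node[index], f'{prefix}[{index}]'))
        else:
            # Leaf value: the accumulated prefix is its full path.
            yield prefix


x = {
    'a': 'meow',
    'b': {'c': 'asd'},
    'd': [{'e': 'stuff', 'f': 1}, {'e': 'more stuff', 'f': 2}],
}

print(list(dot_notation_iter(x)))
# ['a', 'b.c', 'd[0].e', 'd[0].f', 'd[1].e', 'd[1].f']
```

The trade-off is that the stack holds pending siblings in memory, but nesting depth no longer matters.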
Patch Rick Walsh answered Oct 16 '22 23:10
