I'm dealing with a JSON structure which is output to me in structures like this:
[{u'item': u'something',
u'data': {
u'other': u'',
u'else':
[
{
u'more': u'even more',
u'argh':
{
...etc..etc
As you can see, these are nested dicts and lists. There is much discussion about flattening such structures recursively, but I haven't yet found an approach that can deal with a list of dictionaries which may in turn contain dictionaries of lists, lists of lists, dictionaries of dictionaries, and so on, all of unknown depth. In some cases the depth may be up to 100 or so. This is what I've been trying so far, without much luck (Python 2.7.2):
def flatten(structure):
    out = []
    for item in structure:
        if isinstance(item, (list, tuple)):
            out.extend(flatten(item))
        if isinstance(item, (dict)):
            for dictkey in item.keys():
                out.extend(flatten(item[dictkey]))
        else:
            out.append(item)
    return out
Any ideas?
UPDATE: This pretty much works:
def flatten(l):
    out = []
    if isinstance(l, (list, tuple)):
        for item in l:
            out.extend(flatten(item))
    elif isinstance(l, (dict)):
        for dictkey in l.keys():
            out.extend(flatten(l[dictkey]))
    elif isinstance(l, (str, int, unicode)):
        out.append(l)
    return out
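For readers on Python 3, where the name unicode no longer exists, the same idea can be sketched as below. This is only a sketch: the name flatten_values and the sample data are made up for illustration, and unlike the version above it treats every non-container value (floats, booleans, None as well) as a leaf instead of silently dropping it.

```python
def flatten_values(node):
    """Return every leaf value found in nested lists, tuples and dicts."""
    out = []
    if isinstance(node, (list, tuple)):
        for item in node:
            out.extend(flatten_values(item))
    elif isinstance(node, dict):
        for value in node.values():
            out.extend(flatten_values(value))
    else:
        # Assumption: anything that is not a list, tuple or dict is a leaf.
        out.append(node)
    return out


# Hypothetical sample shaped like the structure in the question.
sample = [{"item": "something",
           "data": {"other": "",
                    "else": [{"more": "even more", "argh": {"n": 1}}]}}]
print(flatten_values(sample))
```

Note that only the values survive; the keys that led to each value are discarded.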
Approach to flatten JSON: There are several ways to flatten JSON, including a recursive function and the json-flatten library. A recursive approach is quite easy to understand and can flatten an array of dictionaries, though it may be a bit slower than using the json-flatten library.
Basically, you flatten it the same way you would flatten a nested list; you just have to do the extra work of iterating the dict by key/value, creating new keys for the new dictionary, and building the dictionary in the final step. For Python >= 3.3, change the import to from collections.
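A minimal sketch of that idea, assuming Python 3.3 or later (where MutableMapping lives in collections.abc). The function name flatten_dict, the parent_key and sep parameters, and the choice to turn list positions into key parts are illustrative assumptions, not something taken from the text above.

```python
from collections.abc import MutableMapping


def flatten_dict(d, parent_key="", sep="_"):
    """Flatten nested mappings (and lists inside them) into one flat dict."""
    items = []
    for key, value in d.items():
        # Build the new key from the path walked so far.
        new_key = parent_key + sep + str(key) if parent_key else str(key)
        if isinstance(value, MutableMapping):
            items.extend(flatten_dict(value, new_key, sep).items())
        elif isinstance(value, (list, tuple)):
            # Treat list positions as keys so lists can be handled like dicts.
            indexed = {str(i): v for i, v in enumerate(value)}
            items.extend(flatten_dict(indexed, new_key, sep).items())
        else:
            items.append((new_key, value))
    # The flat dictionary is created only in the final step.
    return dict(items)


print(flatten_dict({"a": 1, "b": {"c": 2, "d": [3, {"e": 4}]}}))
```

With this sketch, empty dicts and empty lists produce no keys at all, which may or may not be what you want.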
Since the depth of your data is arbitrary, it is easier to resort to recursion to flatten it. This function creates a flat dictionary, with the path to each data item composed as the key, in order to avoid collisions.
You can retrieve its contents later with for key in sorted(dic_.keys()), for example.
I didn't test it, since you did not provide a "valid" snippet of your data.
def flatten(structure, key="", path="", flattened=None):
    if flattened is None:
        flattened = {}
    # Full path to this node; empty parts are skipped to avoid stray underscores.
    full_key = "_".join(part for part in (path, key) if part)
    if not isinstance(structure, (dict, list)):
        flattened[full_key] = structure
    elif isinstance(structure, list):
        for i, item in enumerate(structure):
            flatten(item, "%d" % i, full_key, flattened)
    else:
        for new_key, value in structure.items():
            flatten(value, new_key, full_key, flattened)
    return flattened
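CPython's default recursion limit of 1000 is well above a depth of 100 or so, so recursion should be fine for the data described. If much deeper nesting is ever a concern, one alternative is an explicit stack instead of recursion. The sketch below swaps in that technique; the name flatten_iterative, the sep parameter and the sample data are assumptions made for illustration. It also shows the sorted-key retrieval mentioned above.

```python
def flatten_iterative(structure, sep="_"):
    """Flatten nested dicts/lists into {path: value} without recursion."""
    flattened = {}
    stack = [("", structure)]  # (path so far, node still to visit)
    while stack:
        path, node = stack.pop()
        if isinstance(node, dict):
            children = node.items()
        elif isinstance(node, (list, tuple)):
            children = enumerate(node)
        else:
            # Leaf value: store it under the path that led here.
            flattened[path] = node
            continue
        for key, child in children:
            child_path = path + sep + str(key) if path else str(key)
            stack.append((child_path, child))
    return flattened


# Hypothetical sample shaped like the structure in the question.
sample = [{"item": "something",
           "data": {"other": "", "else": [{"more": "even more"}]}}]
flat = flatten_iterative(sample)
for key in sorted(flat):
    print(key, "=", repr(flat[key]))
```

Because each key is the full path to its value, two leaves with the same name in different branches do not collide, provided the separator does not already appear in the original keys.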