
Python 3 replacement for deprecated compiler.ast flatten function

What's the recommended way to flatten nested lists since the deprecation of the compiler package?

>>> from compiler.ast import flatten
>>> flatten(["junk",["nested stuff"],[],[[]]])
['junk', 'nested stuff']

I know there are a few Stack Overflow answers for list flattening, but I'm hoping for the Pythonic, standard-package, "one, and preferably only one, obvious way" to do this.

Mittenchops asked Apr 23 '13 18:04


4 Answers

itertools.chain (usually via itertools.chain.from_iterable) is the best solution for flattening a nested iterable by one level; it's highly efficient compared to any pure-Python solution.

It works on all iterables, though, so some checking is required if you want to avoid it flattening out strings, for example.

Likewise, it won't magically flatten to an arbitrary depth. Generally, such a generic solution isn't required; it's better to keep your data structured so that it doesn't need flattening in that way.
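
As a minimal sketch of the one-level case, assuming the input is a list whose items are themselves iterables (the sample data is made up for illustration), including the string and depth caveats mentioned above:

```python
from itertools import chain

# Hypothetical sample data: every item is itself a list.
nested = [["junk"], ["nested stuff"], [], ["a", "b"]]

# chain.from_iterable lazily concatenates the inner iterables (one level only).
flat = list(chain.from_iterable(nested))
print(flat)

# Caveat 1: a bare string is iterable too, so it gets split into characters.
chars = list(chain.from_iterable(["ab", ["cd"]]))
print(chars)

# Caveat 2: only one level of nesting is removed.
deeper = list(chain.from_iterable([[1, [2, 3]], [4]]))
print(deeper)
```

The second and third calls show why some checking is needed before applying this to mixed or deeply nested data.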

Edit: I would argue that if one had to do arbitrary flattening, this is the best way:

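# The ABCs live in collections.abc (the collections.Iterable alias was removed in Python 3.10).
from collections.abc import Iterable

def flatten(iterable):
    for el in iterable:
        # Recurse into nested iterables, but treat strings as atoms.
        if isinstance(el, Iterable) and not isinstance(el, str):
            yield from flatten(el)
        else:
            yield el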

On Python 2.x, use basestring instead of str; before Python 3.3, replace yield from flatten(el) with for subel in flatten(el): yield subel.

As noted in the comments, I would argue this is the nuclear option, and it is likely to cause more problems than it solves. The better idea is to make your output more regular (for example, output that contains a single item should still be returned as a one-item tuple) and to flatten by one level at the point where the nesting is introduced, rather than flattening everything at the end.

This produces code that is more logical, more readable, and easier to work with. Naturally, there are cases where you need this kind of flattening (for example, when the data comes from a source you can't change, so you have to accept its poorly structured format). In those cases this kind of solution might be needed, but in general it's probably a bad idea.
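
A rough sketch of that advice, assuming a hypothetical producer function (split_words is invented here purely for illustration) that always returns a tuple, so one level of flattening at the point of use is enough:

```python
from itertools import chain

def split_words(line):
    """Hypothetical producer: always returns a tuple, even for one or zero words."""
    return tuple(line.split())

lines = ["junk", "nested stuff", ""]

# Every result has the same shape (a tuple), so a single one-level flatten,
# applied right where the nesting is introduced, is sufficient.
words = list(chain.from_iterable(split_words(line) for line in lines))
print(words)
```

Because the producer never returns a bare item, no type checks or recursion are needed downstream.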

Gareth Latty answered Nov 15 '22 19:11


Your stated function takes a nested list and flattens that into a new list.

To flatten an arbitrarily nested list into a new list, this works on Python 3 as you expect:

import collections.abc

def flatten(x):
    result = []
    for el in x:
        # Test the element (not the container) and keep strings whole.
        if isinstance(el, collections.abc.Iterable) and not isinstance(el, str):
            result.extend(flatten(el))
        else:
            result.append(el)
    return result

print(flatten(["junk",["nested stuff"],[],[[]]]))

Prints:

['junk', 'nested stuff']

If you want a generator that does the same thing:

from collections.abc import Iterable

def flat_gen(x):
    def iselement(e):
        # Anything non-iterable, and any string, counts as a leaf.
        return not (isinstance(e, Iterable) and not isinstance(e, str))
    for el in x:
        if iselement(el):
            yield el
        else:
            for sub in flat_gen(el):
                yield sub

print(list(flat_gen(["junk",["nested stuff"],[],[[[],['deep']]]])))
# ['junk', 'nested stuff', 'deep']

For Python 3.3 and later, use yield from instead of the loop:

from collections.abc import Iterable

def flat_gen(x):
    def iselement(e):
        # Anything non-iterable, and any string, counts as a leaf.
        return not (isinstance(e, Iterable) and not isinstance(e, str))
    for el in x:
        if iselement(el):
            yield el
        else:
            yield from flat_gen(el)

dawg answered Nov 15 '22 19:11


You can use the flatten function from the funcy library:

from funcy import flatten, isa
flat_list = flatten(your_list)

You can also explicitly specify which values to follow:

# Follow only sets
flat_list = flatten(your_list, follow=isa(set))

Take a peek at its implementation if you want an algorithm.

Suor answered Nov 15 '22 18:11


My ugly while-chain solution, just for fun:

from collections.abc import Iterable
from itertools import chain

def flatten3(seq, exclude=(str,)):
    sub = iter(seq)
    try:
        while sub:
            while True:
                j = next(sub)
                if not isinstance(j, Iterable) or isinstance(j, exclude):
                    yield j
                else:
                    sub = chain(j, sub)
                    break
    except StopIteration:
        return
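
For comparison, the same non-recursive idea can be sketched with an explicit stack of iterators instead of re-chaining. This is only an illustrative variant, not part of the solution above: the name flatten_stack is hypothetical, and adding bytes to the default exclude tuple is an assumption.

```python
from collections.abc import Iterable

def flatten_stack(seq, exclude=(str, bytes)):
    """Yield leaf items depth-first using an explicit stack of iterators."""
    stack = [iter(seq)]
    while stack:
        for item in stack[-1]:
            if isinstance(item, Iterable) and not isinstance(item, exclude):
                # Descend: suspend the current iterator and start on the child.
                stack.append(iter(item))
                break
            yield item
        else:
            # The current iterator is exhausted; resume its parent.
            stack.pop()

result = list(flatten_stack(["junk", ["nested stuff"], [], [[[], ["deep"]]]]))
print(result)
```

Since the nesting depth is tracked in a list rather than on the call stack, this form is not bounded by Python's recursion limit.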

kalgasnik answered Nov 15 '22 18:11