Python 3 brings the yield from semantics. As far as I understand, it's supposed to yield to the outermost generator, in which case I'd expect this code to be linear in N.
from collections.abc import Iterable

def flatten(L):
    for e in L:
        if isinstance(e, Iterable):
            yield from flatten(e)
        else:
            yield e
N = 100
L = [-1]
for i in range(N):
    L = [i, [L], i]

for i in range(100):
    f = list(flatten(L))

print(len(f))
If I set N=200, however, the computation time is about four times longer, suggesting flatten is quadratic in the length of L. I can't understand why it would be, since the code visits each element only once and the yield from keyword should be sending values directly from the inner generator to where the terms are collected into the list. Is this a bug, simply not intended, or am I using it wrong? Is there a nice way to do O(N) flattening in Python?
What is yield in Python? The yield keyword is similar to a return statement, with one key difference: a function containing yield hands its caller a generator object instead of simply returning a value. yield from yields each item of an iterable, whereas a plain yield of that iterable would yield it as a single value. You can assign the generator to a variable and drive it with next(): the code in the function runs up to a yield, at which point execution is suspended and the yielded value is returned to the caller; the next call resumes from where it left off. A yield statement can therefore produce values many times, while a return statement ends the function and hands back a value once.
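A quick illustration of the difference (a made-up example, not from the original post):

def yields_each():
    yield from [1, 2, 3]   # delegates: yields 1, then 2, then 3

def yields_whole():
    yield [1, 2, 3]        # yields the whole list as a single value

print(list(yields_each()))    # [1, 2, 3]
print(list(yields_whole()))   # [[1, 2, 3]]

g = yields_each()
print(next(g))  # 1: runs up to the first yielded value, then suspends
print(next(g))  # 2: resumes from where it left off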
yield from, just like for item in x: yield item, is linear. However, function calls are slow, and because of the nesting in your L, when you double N you're not merely doubling the number of terms, you're also doubling the number of generator calls needed: every value produced at nesting depth d has to be handed up through d stacked generator frames before it reaches the caller. Anything which scales with the number of calls (function call overhead itself, especially due to the recursion, any yield from overhead, for loop initialization overhead, whatever) would therefore cause a problem. If this is right, then we'd expect that a list with the same number of elements and no nesting would flatten in linear time, mid-nesting would be in between, and lots of nesting would be super-slow. And that's what we see:
import timeit

s = """
from collections.abc import Iterable

def flatten(l):
    for e in l:
        if isinstance(e, Iterable):
            yield from flatten(e)
        else:
            yield e

def build_lots(N):
    l = [-1]
    for i in range(N):
        l = [i, [l], i]
    return l

def build_some(N):
    l = [-1]
    for i in range(N):
        l = [i] + l + [i] if i % 2 else [i, [l], i]
    return l

def build_none(N):
    return range(2*N+1)
"""
def get_time(size, which_build, n=100):
    setup = s + "l = build_{}({});".format(which_build, size)
    ans = timeit.Timer(setup=setup, stmt="z=list(flatten(l))").timeit(n)
    return ans

print('length', 'none', 'some', 'lots')
for w in range(0, 500, 50):
    print(2*w+1,
          get_time(w, 'none'),
          get_time(w, 'some'),
          get_time(w, 'lots'))
produces
length none some lots
1 0.0012789969332516193 0.0006600483320653439 0.000653265044093132
101 0.030214487109333277 0.06863495009019971 0.10554128512740135
201 0.05980244372040033 0.188231083098799 0.3237948380410671
301 0.08960435865446925 0.3752179825678468 0.6493003228679299
401 0.11986956000328064 0.6066137161105871 1.147628225851804
501 0.14737469609826803 0.9323012446984649 1.7087256000377238
601 0.18555369088426232 1.2575508910231292 2.2957410947419703
701 0.20820995513349771 1.712264522910118 3.094764341134578
801 0.23618148919194937 2.100640726275742 4.079551971051842
901 0.26863432209938765 2.617169467266649 4.858607416041195
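The per-element cost of deep yield from chains can also be seen in isolation. Here is a small sketch of my own (the chain helper is not from the original post): in CPython at least, each next() on the outermost generator is forwarded down through every yield from frame in the chain, so producing the same number of values takes longer the deeper the chain is.

import timeit

def chain(depth, n):
    # Produce n values at the bottom of a yield from chain `depth` levels deep.
    if depth == 0:
        yield from range(n)
    else:
        yield from chain(depth - 1, n)

# Same number of values each run; only the delegation depth changes.
for depth in (0, 100, 200, 400):
    t = timeit.timeit(lambda: list(chain(depth, 10000)), number=10)
    print(depth, t)

The reported times should grow roughly linearly with depth, which is exactly the per-call overhead that makes the recursive flatten quadratic overall.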
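As for the question's last part, getting O(N) flattening: one option is to replace the recursion with an explicit stack of iterators, so every element is yielded through a single generator frame instead of being handed up a chain of them. This is a sketch of mine rather than code from the thread; the str/bytes guard is an extra precaution (the recursive flatten above would loop forever on strings, since iterating a one-character string yields that string again).

from collections.abc import Iterable

def flatten_iter(l):
    stack = [iter(l)]
    while stack:
        for e in stack[-1]:
            if isinstance(e, Iterable) and not isinstance(e, (str, bytes)):
                stack.append(iter(e))  # descend into the nested iterable
                break                  # resume it as the new stack top
            yield e
        else:
            stack.pop()  # this level is exhausted

Each element is pushed and popped at most once, so list(flatten_iter(L)) scales linearly with the total number of elements, even for the deeply nested L built in the question.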