Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generator speed in python 3

I am going through a link about generators that someone posted. In the beginning he compares the two functions below. On his setup he showed a speed increase of 5% with the generator.

I'm running windows XP, python 3.1.1, and cannot seem to duplicate the results. I keep showing the "old way"(logs1) as being slightly faster when tested with the provided logs and up to 1GB of duplicated data.

Can someone help me understand whats happening differently?

Thanks!

def logs1():
    wwwlog = open("big-access-log")
    total = 0
    for line in wwwlog:
        bytestr = line.rsplit(None,1)[1]
        if bytestr != '-':
            total += int(bytestr)
    return total

def logs2():
    wwwlog = open("big-access-log")
    bytecolumn = (line.rsplit(None,1)[1] for line in wwwlog)
    getbytes      = (int(x) for x in bytecolumn if x != '-')
    return sum(getbytes)

*edit, spacing messed up in copy/paste

like image 467
Will Avatar asked Mar 08 '10 04:03

Will


1 Answers

For what it's worth, the main purpose of the speed comparison in the presentation was to point out that using generators does not introduce a huge performance overhead. Many programmers, when first seeing generators, might start wondering about the hidden costs. For example, is there all sorts of fancy magic going on behind the scenes? Is using this feature going to make my program run twice as slow?

In general that's not the case. The example is meant to show that a generator solution can run at essentially the same speed, if not slightly faster in some cases (although it depends on the situation, version of Python, etc.). If you are observing huge differences in performance between the two versions though, then that would be something worth investigating.

like image 151
David Beazley Avatar answered Oct 21 '22 12:10

David Beazley