tee() function from itertools library

Here is a simple example that gets the min, max, and median values from a list. The two functions below return the same result. What is the difference between them? Why use itertools.tee(), and what advantage does it provide?

from statistics import median
from itertools import tee

purchases = [1, 2, 3, 4, 5]

def process_purchases(purchases):
    min_, max_, avg = tee(purchases, 3)
    return min(min_), max(max_), median(avg)

def _process_purchases(purchases):
    return min(purchases), max(purchases), median(purchases)

def main():
    stats = process_purchases(purchases=purchases)
    print("Result:", stats)
    stats = _process_purchases(purchases=purchases)
    print("Result:", stats)

if __name__ == '__main__':
    main()
Yen asked May 17 '20 16:05

1 Answer

Iterators can be iterated only once in Python. After that they are "exhausted" and return no more values.

You can see this with the iterators returned by map(), zip(), filter(), and many other functions:

purchases = [1, 2, 3, 4, 5]

double = map(lambda n: n*2, purchases)

print(list(double))
# [2, 4, 6, 8, 10]

print(list(double))
# [] <-- can't use it twice

With a list, both functions return the same result, because a list can be iterated as many times as you like. You can see the difference between your two functions if you pass them an iterator, such as the return value from map(). In this case _process_purchases() fails because min() exhausts the iterator and leaves no values for max() and median().
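A minimal sketch of why the list in the question hides the difference: a list hands out a fresh iterator every time it is iterated, while an iterator returns itself from iter() and keeps a single position. The variable names first and second are illustrative only.

```python
purchases = [1, 2, 3, 4, 5]

# A list produces a new, independent iterator on every call to iter().
first = iter(purchases)
second = iter(purchases)
print(first is second)                 # False
print(min(purchases), max(purchases))  # 1 5

# An iterator returns itself from iter(), so every consumer shares one position.
double = map(lambda n: n * 2, purchases)
print(iter(double) is double)          # True
print(min(double))                     # 2
print(list(double))                    # [] because min() consumed everything
```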

tee() takes a single iterable and returns two or more independent iterators over it, so the values passed into the function can be consumed more than once:

from itertools import tee
from statistics import median

purchases = [1, 2, 3, 4, 5]

def process_purchases(purchases):
    min_, max_, avg = tee(purchases, 3)
    return min(min_), max(max_), median(avg)


def _process_purchases(purchases):
    return min(purchases), max(purchases), median(purchases)

double = map(lambda n: n*2, purchases)
_process_purchases(double)
# ValueError: max() arg is an empty sequence

double = map(lambda n: n*2, purchases)
process_purchases(double)
# (2, 10, 6)
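
One caveat worth noting: tee() buffers every value that one of its iterators has produced and the others have not yet consumed. Here min() runs its iterator to the end before max() starts, so the whole input ends up buffered anyway, and the itertools documentation notes that in this situation list() is faster than tee(). A hedged alternative, using a hypothetical name process_purchases_list that is not part of the original code:

```python
from statistics import median

def process_purchases_list(purchases):
    # Materialize the iterable once so it can be traversed repeatedly.
    values = list(purchases)
    return min(values), max(values), median(values)

double = map(lambda n: n * 2, [1, 2, 3, 4, 5])
result = process_purchases_list(double)
print(result)  # (2, 10, 6)
```

tee() remains the better fit when the resulting iterators are consumed roughly in step with each other, so that only a small window of values needs to be buffered.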
Mark answered Oct 11 '22 17:10