
zip iterators asserting for equal length in python

I am looking for a nice way to zip several iterables, raising an exception if their lengths are not equal.

When the iterables are lists, or otherwise support len(), this solution is clean and easy:

    def zip_equal(it1, it2):
        if len(it1) != len(it2):
            raise ValueError("Lengths of iterables are different")
        return zip(it1, it2)

However, if it1 and it2 are generators, the previous function fails because their length is not defined: TypeError: object of type 'generator' has no len().
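The failure can be reproduced with a minimal sketch like the one below. The name `zip_equal_len` is hypothetical and only used here to keep the length-based version separate from the later one; the variable names are likewise just for illustration.

```python
def zip_equal_len(it1, it2):
    # Same length check as the function above; the name is hypothetical.
    if len(it1) != len(it2):
        raise ValueError("Lengths of iterables are different")
    return zip(it1, it2)

# Works for sized containers such as lists.
pairs = list(zip_equal_len(['a', 'b'], [0, 1]))

# Fails for generators, which do not implement __len__.
gen1 = (x for x in ['a', 'b'])
gen2 = (x for x in [0, 1])
error = None
try:
    zip_equal_len(gen1, gen2)
except TypeError as exc:
    error = exc  # len() is not defined for generator objects
```

The TypeError is raised before any element is consumed, so the generators are still untouched after the failed call.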

I imagine the itertools module offers a simple way to implement that, but so far I have not been able to find it. I have come up with this home-made solution:

    def zip_equal(it1, it2):
        exhausted = False
        while True:
            try:
                el1 = next(it1)
                if exhausted: # in a previous iteration it2 was exhausted but it1 still has elements
                    raise ValueError("it1 and it2 have different lengths")
            except StopIteration:
                exhausted = True
                # it2 must be exhausted too.
            try:
                el2 = next(it2)
                # here it2 is not exhausted.
                if exhausted:  # it1 was exhausted => raise
                    raise ValueError("it1 and it2 have different lengths")
            except StopIteration:
                # here it2 is exhausted
                if not exhausted:
                    # but it1 was not exhausted => raise
                    raise ValueError("it1 and it2 have different lengths")
                exhausted = True
            if not exhausted:
                yield (el1, el2)
            else:
                return

The solution can be tested with the following code:

    it1 = (x for x in ['a', 'b', 'c'])  # it1 has length 3
    it2 = (x for x in [0, 1, 2, 3])     # it2 has length 4
    list(zip_equal(it1, it2))           # len(it1) < len(it2) => raise

    it1 = (x for x in ['a', 'b', 'c'])  # it1 has length 3
    it2 = (x for x in [0, 1, 2, 3])     # it2 has length 4
    list(zip_equal(it2, it1))           # len(it2) > len(it1) => raise

    it1 = (x for x in ['a', 'b', 'c', 'd'])  # it1 has length 4
    it2 = (x for x in [0, 1, 2, 3])          # it2 has length 4
    list(zip_equal(it1, it2))                # like zip (or izip in python2)

Am I overlooking any alternative solution? Is there a simpler implementation of my zip_equal function?

Update:

  • Requiring python 3.10 or newer, see Asocia's answer
  • Thorough performance benchmarking and best performing solution on python<3.10: Stefan's answer
  • Simple answer without external dependencies: Martijn Pieters' answer (please check the comments for a bugfix in some corner cases)
  • More complex than Martijn's, but with better performance: cjerdonek's answer
  • If you don't mind a package dependency, see pylang's answer
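For Python versions older than 3.10, one dependency-free approach can be sketched with itertools.zip_longest and a unique sentinel as the fill value. This is an illustrative sketch under that assumption, not a reproduction of any of the answers listed above; the name `zip_equal_sketch` is hypothetical.

```python
from itertools import zip_longest

def zip_equal_sketch(*iterables):
    # A fresh object() can never be an element of the inputs,
    # so seeing it in a tuple means some iterable ran out early.
    sentinel = object()
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        # Identity check, so elements with an unusual __eq__ cannot confuse it.
        if any(item is sentinel for item in combo):
            raise ValueError("Iterables have different lengths")
        yield combo

equal = list(zip_equal_sketch(iter('abc'), iter(range(3))))

mismatch_error = None
try:
    list(zip_equal_sketch(iter('ab'), iter(range(3))))
except ValueError as exc:
    mismatch_error = exc
```

Because this is a generator, the check is lazy: the ValueError only surfaces once iteration reaches the point where the lengths diverge, and all earlier tuples have already been yielded by then.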
asked Oct 05 '15 17:10 by zeehio



1 Answer

An optional boolean keyword argument, strict, is introduced for the built-in zip function in PEP 618.

Quoting What’s New In Python 3.10:

The zip() function now has an optional strict flag, used to require that all the iterables have an equal length.

When enabled, a ValueError is raised if one of the arguments is exhausted before the others.

    >>> list(zip('ab', range(3)))
    [('a', 0), ('b', 1)]
    >>> list(zip('ab', range(3), strict=True))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: zip() argument 2 is longer than argument 1
answered Oct 20 '22 09:10 by Asocia