Python's PEP 484 type annotation for Generator Expression

What is the correct type annotation for a function that returns a generator expression?

e.g.:

def foo():
    return (x*x for x in range(10))

I can't figure out if this is -> Iterator[int], -> Iterable[int], -> Generator[int, None, None], or something else.

If there should be one-- and preferably only one --obvious way to do it, then what is the obvious way here?

Chen Levy asked Feb 14 '17 13:02




2 Answers

All three forms you mention in the question are listed as valid alternatives in the documentation. A generator expression simply creates a generator that only yields values.

Quote 1:

A generator can be annotated by the generic type Generator[YieldType, SendType, ReturnType].

Quote 2:

If your generator will only yield values, set the SendType and ReturnType to None

Quote 3:

Alternatively, annotate your generator as having a return type of either Iterable[YieldType] or Iterator[YieldType]:
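
Applied to the foo from the question, the three annotations quoted above might look like the sketch below. The function names are hypothetical and chosen only to tell the variants apart; the body is the same generator expression in each case.

```python
from typing import Generator, Iterable, Iterator


def squares_as_generator() -> Generator[int, None, None]:
    # Most specific: also promises send(), throw() and close().
    return (x * x for x in range(10))


def squares_as_iterator() -> Iterator[int]:
    # Promises only next() and iter().
    return (x * x for x in range(10))


def squares_as_iterable() -> Iterable[int]:
    # Least specific: promises only that the result can be looped over.
    return (x * x for x in range(10))


print(list(squares_as_iterator()))
```

All three return the same kind of object at runtime; the annotation only changes what callers are told they may rely on.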

Łukasz Rogalski answered Sep 28 '22 12:09



Quick note: your function is a "regular function which returns a generator", not a "generator function". To understand the distinction, read this answer.
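
A small sketch of that distinction, using the standard inspect module. The name foo_gen is hypothetical; foo is the function from the question with an annotation added.

```python
import inspect
from typing import Iterator


def foo() -> Iterator[int]:
    # Regular function: the body runs on call and returns a generator object.
    return (x * x for x in range(10))


def foo_gen() -> Iterator[int]:
    # Generator function: it contains `yield`, so calling it runs no body code yet.
    for x in range(10):
        yield x * x


print(inspect.isgeneratorfunction(foo))      # False
print(inspect.isgeneratorfunction(foo_gen))  # True

# Both calls still hand back a generator object.
print(inspect.isgenerator(foo()), inspect.isgenerator(foo_gen()))  # True True
```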

For your foo, I suggest using -> Iterator[int].

Explanation

It boils down to what kind of interface you want.

First, familiarize yourself with this page in the Python documentation, where the hierarchy of the most important Python types is defined.

You can see there that these expressions return True:

import typing as t
issubclass(t.Iterator, t.Iterable)
issubclass(t.Generator, t.Iterator)

You should also notice on the same page that Generator has methods that Iterator doesn't have: send, throw and close. They allow you to do more with generators than a simple single pass of iteration. See this question for examples of what generators make possible: python generator "send" function purpose?
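
As a rough illustration of when send matters, here is a generator that actually consumes the sent value. The running_total function is a hypothetical example and not part of the original answer.

```python
from typing import Generator


def running_total() -> Generator[int, int, None]:
    # YieldType is int, SendType is int, and nothing is returned.
    total = 0
    while True:
        received = yield total  # the value passed to send() lands here
        total += received


acc = running_total()
print(next(acc))     # 0, runs the generator up to the first yield
print(acc.send(5))   # 5
print(acc.send(10))  # 15
acc.close()          # stops the generator
```

Here the second type parameter is not None, so annotating the function as a plain Iterator[int] would hide the part of the interface that callers need.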

Going back to choosing an interface: if you want others to use the result of your generator function as a generator, e.g.

from typing import Generator

def gen(limit: int) -> Generator[int, None, None]:
    for x in range(limit):
        yield x

g = gen(3)
next(g)  # => 0
g.send(10)  # => 1 (the sent value is ignored)

Then you should specify -> Generator[int, None, None].

But notice that the example above is pointless. You can in fact call send, but it doesn't change the execution, because gen does nothing with the sent value (there is nothing like x = yield). Knowing that, you can limit what users of gen need to know and annotate it as -> Iterator[int]. This way you make a contract with users: "my function returns an iterator of integers and you should use it as such". If you later change the implementation to, e.g.

from typing import Iterator

def gen(limit: int) -> Iterator[int]:
    return iter(list(range(limit)))

Those who used the returned object as a Generator (because they peeked at the implementation) would have their code broken. However, that shouldn't bother you, because they used the object in a way your contract did not specify. As such, this kind of breakage is not your responsibility.
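
The breakage can be sketched by putting the two implementations side by side. The names gen_with_yield and gen_with_list are hypothetical stand-ins for the "before" and "after" versions of gen.

```python
from typing import Iterator


def gen_with_yield(limit: int) -> Iterator[int]:
    # Original implementation: a generator function.
    for x in range(limit):
        yield x


def gen_with_list(limit: int) -> Iterator[int]:
    # Later implementation: a plain list iterator.
    return iter(list(range(limit)))


# Code that sticks to the Iterator contract works with both versions.
print(list(gen_with_yield(3)) == list(gen_with_list(3)))  # True

# Code that relied on generator-only methods breaks with the second one.
print(hasattr(gen_with_yield(3), "send"))  # True
print(hasattr(gen_with_list(3), "send"))   # False
```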

Put simply, if you end up with Generator[Something, None, None] (two Nones) then consider Iterable[Something] or Iterator[Something].

The same goes for Iterator vs Iterable. If you want your users to use your object only with the iter function (and thus in an iteration context, e.g. [x for x in g]), use Iterable. If you want them to be able to use both next and iter on the object, use Iterator.
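
A quick way to see the difference at runtime is to compare a list with the iterator built from it. This sketch uses the abstract base classes from collections.abc, which the typing generics correspond to; the variable names are illustrative only.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]           # a list is Iterable but not an Iterator
numbers_iter = iter(numbers)  # iter() produces an Iterator from it

print(isinstance(numbers, Iterable), isinstance(numbers, Iterator))            # True False
print(isinstance(numbers_iter, Iterable), isinstance(numbers_iter, Iterator))  # True True

print(next(numbers_iter))  # 1

try:
    next(numbers)  # a list does not support next() directly
except TypeError as exc:
    print("next() on a list fails:", exc)
```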

Note

This line of thought applies mostly to the annotated type of returned values. In the case of parameters, you should specify the types according to what interface (read: methods/functions) you want to use on that object inside your function.
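
For parameters, a minimal sketch of that rule could look like this. Both functions are hypothetical examples: the first only loops over its argument, the second calls next on it.

```python
from typing import Iterable, Iterator, Tuple


def total(values: Iterable[int]) -> int:
    # Only a for loop is used, so Iterable is enough.
    result = 0
    for value in values:
        result += value
    return result


def first_two(values: Iterator[int]) -> Tuple[int, int]:
    # next() is called directly, so the parameter must be an Iterator.
    return next(values), next(values)


print(total([1, 2, 3]))                   # 6
print(total(x * x for x in range(4)))     # 14
print(first_two(iter([1, 2, 3])))         # (1, 2)
```

Asking for Iterable in total keeps it usable with lists, tuples, sets and generators alike, while first_two is honest about needing an object that supports next.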

WloHu answered Sep 28 '22 10:09
