I'm using strict type checks via pyright.
When I have a method that returns a PyTorch DataLoader, pyright complains about my type definition:
Declared return type, "DataLoader[Unknown]", is partially unknown Pyright (reportUnknownVariableType)
Taking a look at the type stub for PyTorch's DataLoader (reduced to the important parts):
class DataLoader(Generic[T_co]):
    dataset: Dataset[T_co]
    @overload
    def __init__(self, dataset: Dataset[T_co], ...
As far as I can see, the generic type T_co of the DataLoader should be defined by the dataset parameter of __init__.
Pyright also complains about my Dataset type definition:
Type of parameter "dataset" is partially unknown Parameter type is "Dataset[Unknown]" Pyright (reportUnknownParameterType)
The Dataset type stub:
class Dataset(Generic[T_co]):
    def __getitem__(self, index: int) -> T_co: ...
suggests to me that the type should be inferred from the return type of __getitem__.
My dataset's __getitem__ signature looks like this:
def __getitem__(self, index: int) -> Tuple[Tensor, Tensor]:
Based on this I would expect Dataset and DataLoader to be inferred as Dataset[Tuple[Tensor, Tensor]] and DataLoader[Tuple[Tensor, Tensor]], but that is not the case.
My guess is that pyright fails to statically infer the types here.
I thought I could define the type signature myself like this:
Dataset[Tuple[Tensor, Tensor]]
but that actually results in my Python script crashing with:
TypeError: 'type' object is not subscriptable
How can I properly define the type for Dataset and DataLoader?
Since there was no reply to this question, I was not sure whether this is actually a bug in pyright. I therefore opened this issue on the GitHub repository: https://github.com/microsoft/pyright/issues/698
Eric Traut explained in detail what the issue is and that pyright is working as designed. I will try to summarize the main points here.
Pyright attempts to infer return types if they are not provided. If they are provided, as in this case, they need to be fully typed: pyright does not fill in missing parts of a given type annotation.
For example, pyright will try to infer the return type for the following function definition:
def get_dataset():
But if the return type is given as Dataset, then that is the return type pyright expects:
def get_dataset() -> Dataset:
In this case Dataset is declared as generic in the type stub, but the class itself does not support subscripting such as Dataset[int] at runtime.
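The difference between an inferred, a partially declared, and a fully declared return type can be sketched as follows. This is only an illustration: `Box` is a hypothetical generic class used in place of Dataset so that the snippet runs without PyTorch, and the comments describe how I understand a strict type checker to treat each case.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Box(Generic[T]):
    """Hypothetical generic container, standing in for Dataset."""

    def __init__(self, item: T) -> None:
        self.item = item


def inferred():
    # No annotation: the checker is free to infer the return type (Box[int]).
    return Box(1)


def partially_declared() -> Box:
    # Type argument omitted: the declared type is taken as written and is
    # treated as Box[Unknown]; the checker does not fill in the missing part.
    return Box(1)


def fully_declared() -> Box[int]:
    # Every type argument is spelled out, so nothing is left unknown.
    return Box(1)
```

All three functions behave identically at runtime; only the declared types differ.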
In Python 3.7 (which we are using), the interpreter evaluates these type annotations when the function is defined, which leads to the exception mentioned above.
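The runtime failure can be reproduced without PyTorch. In this minimal sketch, `Dataset` is a hypothetical stand-in: a plain class that, like the real one in our setup, does not support subscripting at runtime. The snippet evaluates the same expression that an eagerly evaluated annotation would.

```python
class Dataset:
    """Hypothetical stand-in: a plain class that is not subscriptable."""

    def __getitem__(self, index: int) -> int:
        # Indexing an *instance* works; subscripting the *class* does not.
        return index


def subscript_fails() -> bool:
    """Return True if evaluating Dataset[int] at runtime raises TypeError."""
    try:
        Dataset[int]  # what an eagerly evaluated annotation would execute
    except TypeError:
        return True
    return False


print(subscript_fails())
```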
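PEP 563 originally planned for Python 3.10 to stop evaluating type annotations at definition time by default, but that change was postponed and did not ship in 3.10. With postponed evaluation, the following type annotation just works: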
def get_dataset() -> Dataset[int]:
As of Python 3.7 it is possible to enable this behavior via the following import:
from __future__ import annotations
This is documented in PEP 563. You will also need to disable pylint's rule E1136 so that it does not warn about "unsubscriptable-object".
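A minimal sketch of the import in action, again with a hypothetical stand-in `Dataset` class that is not subscriptable at runtime. The import has to be the first statement of the module.

```python
from __future__ import annotations  # must be the first statement in the module


class Dataset:
    """Hypothetical stand-in for a class that is not subscriptable at runtime."""


def get_dataset() -> Dataset[int]:
    # With the import above, this annotation is stored as a string
    # and is not evaluated when the function is defined.
    return Dataset()


print(get_dataset.__annotations__["return"])
```

Type checkers still read the annotation as `Dataset[int]`; only the runtime evaluation is skipped.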
Another workaround is to quote the type definition like this:
def get_dataset() -> "Dataset[int]":
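Applied to the types from the question, the quoted form might look like the sketch below. `Tensor`, `Dataset`, `DataLoader`, and `PairDataset` are hypothetical stand-ins defined here so the snippet runs on its own; in real code the first three would be imported from PyTorch.

```python
from typing import Tuple


class Tensor:
    """Hypothetical stand-in for torch.Tensor."""


class Dataset:
    """Hypothetical stand-in: not subscriptable at runtime."""


class DataLoader:
    """Hypothetical stand-in that only stores the dataset."""

    def __init__(self, dataset: "Dataset[Tuple[Tensor, Tensor]]") -> None:
        self.dataset = dataset


class PairDataset(Dataset):
    def __getitem__(self, index: int) -> Tuple[Tensor, Tensor]:
        return Tensor(), Tensor()


def get_dataset() -> "Dataset[Tuple[Tensor, Tensor]]":
    # Quoted annotation: stored as a string, never evaluated at definition time.
    return PairDataset()


def get_loader(
    dataset: "Dataset[Tuple[Tensor, Tensor]]",
) -> "DataLoader[Tuple[Tensor, Tensor]]":
    return DataLoader(dataset)


loader = get_loader(get_dataset())
first, second = loader.dataset[0]
```

The quotes only need to wrap the annotations that subscript the non-subscriptable classes; the unquoted `Tuple[Tensor, Tensor]` is fine because `typing.Tuple` supports subscripting at runtime.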