
When to use IO[str]/IO[bytes] and TextIO/BinaryIO in Python type hinting?

The documentation says:

Generic type IO[AnyStr] and its subclasses TextIO(IO[str]) and BinaryIO(IO[bytes]) represent the types of I/O streams such as returned by open().

— Python Docs: typing.IO

The docs do not specify when BinaryIO/TextIO should be used instead of their counterparts IO[bytes] and IO[str].

A quick search of the Python typeshed source turns up only 30 hits for BinaryIO, compared with 109 hits for IO[bytes].

I tried switching from IO[bytes] to BinaryIO for better compatibility with sphinx-autodoc-typehints, but the switch broke many type checks, because functions like tempfile.NamedTemporaryFile are typed as IO[bytes] rather than BinaryIO.

From a design standpoint, when is it correct to use each of these IO type hints?

Eana Hufwe, asked Jan 11 '20 09:01



1 Answer

BinaryIO and TextIO directly subclass IO[bytes] and IO[str] respectively, and add on a few extra methods -- see the definitions in typeshed for the specifics.

So if you need these extra methods, use BinaryIO/TextIO. Otherwise, it's probably best to use IO[...] for maximum flexibility. For example, if you annotate a method as accepting an IO[str], it's a bit easier for the end-user to provide an instance of that object.
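As a rough illustration of that trade-off, here is a minimal sketch. The function names `count_lines` and `describe_encoding` are hypothetical, not taken from the question. The first function only iterates over the stream, so the broader IO[str] annotation is enough; the second reads `encoding`, one of the attributes TextIO adds on top of IO[str], so it asks for the narrower type. A type checker would be expected to reject passing a value that is only known to be IO[str] to the second function, which is the kind of breakage the question describes.

```python
import io
from typing import IO, TextIO


def count_lines(stream: IO[str]) -> int:
    """Only iterates over the stream, so the broad IO[str] hint suffices."""
    return sum(1 for _ in stream)


def describe_encoding(stream: TextIO) -> str:
    """Reads `encoding`, which TextIO declares in addition to the IO[str] API."""
    return f"encoding={stream.encoding!r}"


# An in-memory text stream is accepted wherever IO[str] is expected.
line_count = count_lines(io.StringIO("a\nb\nc\n"))

# A text wrapper with an explicit encoding, used as a stand-in for a real file.
wrapped = io.TextIOWrapper(io.BytesIO(b"hello\n"), encoding="utf-8")
summary = describe_encoding(wrapped)
```

Annotating parameters with IO[str] keeps callers free to pass whatever other APIs hand them, while return types can be as specific as the function can guarantee.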

That said, the IO classes in general are kind of messy at present: they define a lot of methods that not all functions will actually need. So, the typeshed maintainers are actually considering breaking up the IO class into smaller Protocols. You could perhaps do the same, if you're so inclined. This approach is mostly useful if you want to define your own IO-like classes but don't want the burden of implementing the full typing.IO[...] API -- or if you're using some class that's almost IO-like, but not quite.
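A small Protocol along those lines might look like the sketch below. The names `SupportsRead`, `checksum`, and `ChunkSource` are made up for illustration, and the sketch assumes Python 3.8 or later, where `typing.Protocol` is available. The function only requires a `read()` method returning bytes, so both a real binary stream and a small class that is "almost IO-like" satisfy the annotation.

```python
import io
from typing import Protocol


class SupportsRead(Protocol):
    """Hypothetical protocol: anything with a bytes-returning read()."""

    def read(self, size: int = ...) -> bytes: ...


def checksum(source: SupportsRead) -> int:
    """Sum all byte values modulo 256; needs nothing beyond read()."""
    return sum(source.read()) % 256


class ChunkSource:
    """Not a full IO implementation: only read() is defined."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    def read(self, size: int = -1) -> bytes:
        # A negative size means "read everything that is left".
        if size < 0:
            size = len(self._data)
        chunk, self._data = self._data[:size], self._data[size:]
        return chunk


from_stream = checksum(io.BytesIO(b"\x01\x02\x03"))
from_custom = checksum(ChunkSource(b"\x01\x02\x03"))
```

Because Protocols are matched structurally, `ChunkSource` does not need to inherit from anything or implement `seek`, `write`, `fileno`, and the rest of the IO surface.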

In the end, all three approaches -- using BinaryIO/TextIO, IO[...], or defining more compact custom Protocols -- are perfectly valid. If the sphinx extension doesn't seem to be able to handle one particular approach for some reason, that's probably a bug on their end.

Michael0x2a, answered Oct 21 '22 09:10