
How to type-hint / type-check a dictionary (at runtime) for an arbitrary number of arbitrary key/value pairs?

I am defining a class roughly as follows:

from numbers import Number
from typing import Dict

from typeguard import typechecked

Data = Dict[str, Number]

@typechecked
class Foo:
    def __init__(self, data: Data):
        self._data = dict(data)
    @property
    def data(self) -> Data:
        return self._data

I am using typeguard. My intention is to restrict the types that can go into the data dictionary. typeguard checks the entire dictionary when it is passed into a function or returned from one. Once the dictionary is "exposed" directly, however, checking types becomes the dictionary's own "responsibility", and a plain dict does no such checking:

bar = Foo({'x': 2, 'y': 3}) # ok

bar = Foo({'x': 2, 'y': 3, 'z': 'not allowed'}) # error as expected

bar.data['z'] = 'should also be not allowed but still is ...' # no error, but should cause one

PEP 589 introduces typed dictionaries, but for a fixed set of keys (similar to struct-like constructs in other languages). In contrast, I need this for a flexible number of arbitrary keys.

My best bad idea is to go "old-school": subclass dict, re-implement every part of the API through which data can go into (or out of) the dictionary, and add type checks to each:

from typing import Union

@typechecked
class TypedDict(dict): # just a sketch
    def __init__(
        self,
        other: Union[Data, None] = None,
        **kwargs: Number,
    ):
        pass # TODO
    def __setitem__(self, key: str, value: Number):
        pass # TODO
    # TODO

Is there a valid alternative that does not require the "old-school" approach?

Asked by s-m-e on Oct 13 '21 at 11:10



1 Answer

There seem to be several parts to your question.


(1) Creating a type-checked dictionary at runtime


As @juanpa.arrivillaga says in the comments, this has everything to do with type-checking, but doesn't seem to have anything to do with type-hinting. However, it's fairly trivial to design your own custom type-checked data structure. You can do it like this using collections.UserDict:

from collections import UserDict
from numbers import Number

class StrNumberDict(UserDict):
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError(
                f'Invalid type for dictionary key: '
                f'expected "str", got "{type(key).__name__}"'
            )
        if not isinstance(value, Number):
            raise TypeError(
                f'Invalid type for dictionary value: '
                f'expected "Number", got "{type(value).__name__}"'
            )
        super().__setitem__(key, value)

In usage:

>>> d = StrNumberDict()
>>> d['foo'] = 5
>>> d[5] = 6
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 5, in __setitem__
TypeError: Invalid type for dictionary key: expected "str", got "int"
>>> d['bar'] = 'foo'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 10, in __setitem__
TypeError: Invalid type for dictionary value: expected "Number", got "str"
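To tie this back to the class in the question, here is a minimal sketch of how Foo might hold a StrNumberDict instead of a plain dict, so that writes through the exposed property are checked as well. StrNumberDict is repeated in shortened form so the block runs on its own, and typeguard is left out for brevity; this Foo is an illustration, not the asker's final code.

```python
from collections import UserDict
from numbers import Number


class StrNumberDict(UserDict):
    """Shortened copy of the class above: str keys, Number values."""

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError(f'expected "str" key, got "{type(key).__name__}"')
        if not isinstance(value, Number):
            raise TypeError(f'expected "Number" value, got "{type(value).__name__}"')
        super().__setitem__(key, value)


class Foo:
    def __init__(self, data):
        # Copying into StrNumberDict validates every initial item.
        self._data = StrNumberDict(data)

    @property
    def data(self):
        # The exposed object checks its own writes.
        return self._data


bar = Foo({'x': 2, 'y': 3})
bar.data['z'] = 4.5  # accepted

try:
    bar.data['w'] = 'not allowed'
except TypeError as exc:
    rejected = str(exc)  # the write was refused
```

The trade-off is that bar.data is no longer a real dict, so code that insists on isinstance(obj, dict) will not accept it.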

If you wanted to generalise this kind of thing, you could do it like this:

from collections import UserDict

class TypeCheckedDict(UserDict):
    def __init__(self, key_type, value_type, initdict=None):
        self._key_type = key_type
        self._value_type = value_type
        super().__init__(initdict)

    def __setitem__(self, key, value):
        if not isinstance(key, self._key_type):
            raise TypeError(
                f'Invalid type for dictionary key: '
                f'expected "{self._key_type.__name__}", '
                f'got "{type(key).__name__}"'
            )
        if not isinstance(value, self._value_type):
            raise TypeError(
                f'Invalid type for dictionary value: '
                f'expected "{self._value_type.__name__}", '
                f'got "{type(value).__name__}"'
            )
        super().__setitem__(key, value)

In usage:

>>> from numbers import Number
>>> d = TypeCheckedDict(key_type=str, value_type=Number, initdict={'baz': 3.14})
>>> d['baz']
3.14
>>> d[5] = 5
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 9, in __setitem__
TypeError: Invalid type for dictionary key: expected "str", got "int"
>>> d['foo'] = 'bar'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 15, in __setitem__
TypeError: Invalid type for dictionary value: expected "Number", got "str"
>>> d['foo'] = 5
>>> d['foo']
5

Note that you don't need to type-check the dictionary you pass to super().__init__(). UserDict.__init__ calls self.update(), which in turn calls self.__setitem__ for every item, and you've already overridden that. So if you pass an invalid dictionary to TypeCheckedDict.__init__, an exception is raised in just the same way as if you tried to add an invalid key or value after the dictionary has been constructed:

>>> from numbers import Number
>>> d = TypeCheckedDict(str, Number, {'foo': 'bar'})
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 5, in __init__
  line 985, in __init__
    self.update(dict)
  line 842, in update
    self[key] = other[key]
  File "<string>", line 16, in __setitem__
TypeError: Invalid type for dictionary value: expected "Number", got "str"

UserDict is specifically designed for easy subclassing in this way, which is why it is a better base class in this instance than dict.
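The difference shows up in a small comparison. In CPython, dict.__init__ and dict.update do not call an overridden __setitem__, whereas UserDict routes both through it. The class names and the helper below are hypothetical and exist only for this illustration.

```python
from collections import UserDict


def _require_str(key):
    # Hypothetical helper shared by both subclasses.
    if not isinstance(key, str):
        raise TypeError(f'expected "str" key, got "{type(key).__name__}"')


class CheckedPlainDict(dict):
    def __setitem__(self, key, value):
        _require_str(key)
        super().__setitem__(key, value)


class CheckedUserDict(UserDict):
    def __setitem__(self, key, value):
        _require_str(key)
        super().__setitem__(key, value)


# dict subclass: the constructor and update() skip the override.
plain = CheckedPlainDict({1: 'a'})
plain.update({2: 'b'})

# UserDict subclass: both paths go through __setitem__ and are rejected.
user_init_rejected = False
try:
    CheckedUserDict({1: 'a'})
except TypeError:
    user_init_rejected = True

user = CheckedUserDict({'ok': 1})
user_update_rejected = False
try:
    user.update({2: 'b'})
except TypeError:
    user_update_rejected = True
```

With a dict base class you would have to override __init__, update, setdefault and so on yourself, which is the "old-school" approach the question wants to avoid.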

If you wanted to add type hints to TypeCheckedDict, you'd do it like this:

from collections import UserDict
from collections.abc import Mapping, Hashable
from typing import TypeVar, Optional

K = TypeVar('K', bound=Hashable)
V = TypeVar('V')

class TypeCheckedDict(UserDict[K, V]):
    def __init__(
        self, 
        key_type: type[K], 
        value_type: type[V], 
        initdict: Optional[Mapping[K, V]] = None
    ) -> None:
        self._key_type = key_type
        self._value_type = value_type
        super().__init__(initdict)

    def __setitem__(self, key: K, value: V) -> None:
        if not isinstance(key, self._key_type):
            raise TypeError(
                f'Invalid type for dictionary key: '
                f'expected "{self._key_type.__name__}", '
                f'got "{type(key).__name__}"'
            )
        if not isinstance(value, self._value_type):
            raise TypeError(
                f'Invalid type for dictionary value: '
                f'expected "{self._value_type.__name__}", '
                f'got "{type(value).__name__}"'
            )
        super().__setitem__(key, value)

(The above passes MyPy.)

Note, however, that adding type hints has no relevance at all to how this data structure works at runtime.
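As a quick illustration of that point, here is a sketch of a dictionary whose __setitem__ carries annotations but no isinstance() checks. The class name is hypothetical. Python stores the annotations without enforcing them, so the "wrong" assignment succeeds at runtime.

```python
from collections import UserDict


class HintOnlyDict(UserDict):
    # Annotated, but with no runtime checks in the body.
    def __setitem__(self, key: str, value: float) -> None:
        super().__setitem__(key, value)


d = HintOnlyDict()
d[5] = 'not a float'  # a static checker would flag this; Python does not

# The hints are only metadata attached to the function object.
hints = HintOnlyDict.__setitem__.__annotations__
```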


(2) Type-hinting dictionaries "for a flexible number of arbitrary keys"


I'm not quite sure what you mean by this, but if you want MyPy to raise an error if you add a string value to a dictionary you only want to have numeric values, you could do it like this:

from typing import SupportsFloat

d: dict[str, SupportsFloat] = {}
d['a'] = 5  # passes MyPy 
d['b'] = 4.67 # passes MyPy
d[5] = 6 # fails MyPy
d['baz'] = 'foo' # fails Mypy 

If you want MyPy static checks and runtime checks, your best bet (in my opinion) is to use the type-hinted version of TypeCheckedDict above:

d = TypeCheckedDict(str, SupportsFloat) # type: ignore[misc]
d['a'] = 5  # passes MyPy 
d['b'] = 4.67  # passes MyPy 
d[5] = 6  # fails Mypy 
d['baz'] = 'foo'  # fails Mypy

Mypy isn't too happy about us passing an abstract type (SupportsFloat is a protocol) as an argument to TypeCheckedDict.__init__, so you have to add a # type: ignore[misc] when instantiating the dict. (That feels like a Mypy bug to me.) Other than that, however, it works fine.

(See my previous answer for caveats about using SupportsFloat to hint numeric types. Use typing.Dict instead of dict for type-hinting if you're on Python <= 3.8.)
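The runtime half of this works because typing.SupportsFloat is a runtime-checkable protocol, so isinstance() accepts it and tests for the presence of a __float__ method. A quick sketch of what that check accepts and rejects (the sample values are arbitrary):

```python
from fractions import Fraction
from typing import SupportsFloat

samples = [5, 4.67, Fraction(1, 3), 'foo', None]

# isinstance() against a runtime-checkable protocol only looks for the
# required method (__float__ here); it does not call it.
results = {repr(value): isinstance(value, SupportsFloat) for value in samples}

for name, ok in results.items():
    print(f'{name}: {ok}')
```

Because the check is purely structural, any object that defines __float__ is accepted, whether or not you would consider it a number.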


(3) Using typeguard


Since you're using typeguard, you could simplify the logic in my StrNumberDict class a little, like so:

from collections import UserDict
from typeguard import typechecked
from typing import SupportsFloat

class StrNumberDict(UserDict[str, SupportsFloat]):
    @typechecked
    def __setitem__(self, key: str, value: SupportsFloat) -> None:
        super().__setitem__(key, value)

However, I don't think there's a way of doing this with typeguard if you want to have a more generic TypeCheckedDict that can be instantiated with arbitrary type-checking. The following does not work:

### THIS DOES NOT WORK ###

from typing import TypeVar, SupportsFloat
from collections.abc import Hashable
from collections import UserDict
from typeguard import typechecked

K = TypeVar('K', bound=Hashable)
V = TypeVar('V')

class TypeCheckedDict(UserDict[K, V]):
    @typechecked
    def __setitem__(self, key: K, value: V) -> None:
        super().__setitem__(key, value)

d = TypeCheckedDict[str, SupportsFloat]()
d[5] = 'foo'  # typeguard raises no error here.

It may also be worth noting that, at the time of writing, typeguard does not appear to be actively maintained, so there is a certain amount of risk involved in relying on that particular library.

Answered by Alex Waygood on Oct 21 '22 at 17:10