
Create dataclass instance from union type based on string literal

I'm trying to strongly type our code base. A big part of the code handles events that come from external devices and forwards them to different handlers. These events all have a value attribute, but this value can have different types. The value type is determined by the event name: a temperature event always has an int value, and a register event always has a RegisterInfo as its value.

So I would like to map the event name to the value type, but we are struggling with the implementation.

This setup comes the closest to what we want:

from dataclasses import dataclass
from typing import Any, Literal, TypeAlias

@dataclass
class EventBase:
    name: str
    value: Any
    value_type: str

@dataclass
class RegisterEvent(EventBase):
    value: RegisterInfo
    name: Literal["register"]
    value_type: Literal["RegisterInfo"] = "RegisterInfo"


@dataclass
class NumberEvent(EventBase):
    value: float | int
    name: Literal["temperature", "line_number"]
    value_type: Literal["number"] = "number"

@dataclass
class StringEvent(EventBase):
    value: str
    name: Literal["warning", "status"]
    value_type: Literal["string"] = "string"


Events: TypeAlias = RegisterEvent | NumberEvent | StringEvent

With this setup mypy will flag incorrect code like:

def handle_event(event: Events):
    if event.name == "temperature":
        event.value.upper()

(mypy sees that a temperature event has a numeric value, float | int, and that type doesn't have an upper() method.)

But creating the events becomes ugly this way. I don't want a big if statement that maps each event name to a specific event class. We have lots of different event types, and this mapping info is already inside these classes.

Ideally I would like it to look like this:

def handle_device_message(message_info):
    event_name = message_info["event_name"]
    event_value = message_info["event_value"]

    event = Events(event_name, event_value)

Is a "one-liner" like this possible?

I feel like we are running into a wall here. Could it be that the code is architecturally wrong?

Quint van Dijk asked Jan 22 '26 10:01


1 Answer

UPDATE: Using Pydantic v2

If you are willing to switch to Pydantic instead of dataclasses, you can define a discriminated union via typing.Annotated and use the TypeAdapter as a "universal" constructor that is able to discriminate between distinct Event subtypes based on the provided name string.

Here is what I would suggest:

from typing import Annotated, Any, Literal

from pydantic import BaseModel, Field, TypeAdapter


class EventBase(BaseModel):
    name: str
    value: Any


class NumberEvent(EventBase):
    name: Literal["temperature", "line_number"]
    value: float


class StringEvent(EventBase):
    name: Literal["warning", "status"]
    value: str


Event = TypeAdapter(Annotated[
    NumberEvent | StringEvent,
    Field(discriminator="name"),
])


event_temp = Event.validate_python({"name": "temperature", "value": 3.14})
event_status = Event.validate_python({"name": "status", "value": "spam"})

print(repr(event_temp))    # NumberEvent(name='temperature', value=3.14)
print(repr(event_status))  # StringEvent(name='status', value='spam')

An invalid name would of course cause a validation error, just like a completely wrong type for value (one that cannot be coerced). Example:

from pydantic import ValidationError

try:
    Event.validate_python({"name": "temperature", "value": "foo"})
except ValidationError as err:
    print(err.json(indent=4))

try:
    Event.validate_python({"name": "foo", "value": "bar"})
except ValidationError as err:
    print(err.json(indent=4))

Output:

[
    {
        "type": "float_parsing",
        "loc": [
            "temperature",
            "value"
        ],
        "msg": "Input should be a valid number, unable to parse string as a number",
        "input": "foo",
        "url": "https://errors.pydantic.dev/2.1/v/float_parsing"
    }
]
[
    {
        "type": "union_tag_invalid",
        "loc": [],
        "msg": "Input tag 'foo' found using 'name' does not match any of the expected tags: 'temperature', 'line_number', 'warning', 'status'",
        "input": {
            "name": "foo",
            "value": "bar"
        },
        "ctx": {
            "discriminator": "'name'",
            "tag": "foo",
            "expected_tags": "'temperature', 'line_number', 'warning', 'status'"
        },
        "url": "https://errors.pydantic.dev/2.1/v/union_tag_invalid"
    }
]

Original Answer: Using Pydantic v1

If you are willing to switch to Pydantic instead of dataclasses, you can define a discriminated union via typing.Annotated and use the parse_obj_as function as a "universal" constructor that is able to discriminate between distinct Event subtypes based on the provided name string.

Here is what I would suggest:

from typing import Annotated, Any, Literal

from pydantic import BaseModel, Field, parse_obj_as


class EventBase(BaseModel):
    name: str
    value: Any


class NumberEvent(EventBase):
    name: Literal["temperature", "line_number"]
    value: float


class StringEvent(EventBase):
    name: Literal["warning", "status"]
    value: str


Event = Annotated[
    NumberEvent | StringEvent,
    Field(discriminator="name"),
]


event_temp = parse_obj_as(Event, {"name": "temperature", "value": "3.14"})
event_status = parse_obj_as(Event, {"name": "status", "value": -10})

print(repr(event_temp))    # NumberEvent(name='temperature', value=3.14)
print(repr(event_status))  # StringEvent(name='status', value='-10')

In this usage demo I purposefully used the "wrong" types for the respective value fields to show that Pydantic will automatically try to coerce them to the right types, once it determines the correct model based on the provided name.

An invalid name would of course cause a validation error, just like a completely wrong type for value (one that cannot be coerced). Example:

from pydantic import ValidationError

try:
    parse_obj_as(Event, {"name": "temperature", "value": "foo"})
except ValidationError as err:
    print(err.json(indent=4))

try:
    parse_obj_as(Event, {"name": "foo", "value": "bar"})
except ValidationError as err:
    print(err.json(indent=4))

Output:

[
    {
        "loc": [
            "__root__",
            "NumberEvent",
            "value"
        ],
        "msg": "value is not a valid float",
        "type": "type_error.float"
    }
]
[
    {
        "loc": [
            "__root__"
        ],
        "msg": "No match for discriminator 'name' and value 'foo' (allowed values: 'temperature', 'line_number', 'warning', 'status')",
        "type": "value_error.discriminated_union.invalid_discriminator",
        "ctx": {
            "discriminator_key": "name",
            "discriminator_value": "foo",
            "allowed_values": "'temperature', 'line_number', 'warning', 'status'"
        }
    }
]

Side notes

An alias for a union of types like NumberEvent | StringEvent should still have a singular name, i.e. Event rather than Events because semantically the annotation e: Event indicates e should be an instance of one of those types, whereas e: Events would suggest e will be multiple instances (a collection) of either of those types.

Also, the union float | int is almost always equivalent to float, because type checkers by convention accept an int wherever a float is expected.
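
This convention applies to static analysis only. A small illustration, with halve as a made-up function name:

```python
def halve(x: float) -> float:
    # Type checkers accept an int argument here because of the
    # int-to-float promotion convention described in PEP 484.
    return x / 2


print(halve(3))                 # 1.5
print(isinstance(3, float))     # False: the promotion is static-only
print(issubclass(int, float))   # False
```

So code that checks isinstance(value, float) at runtime will still reject plain integers, even though the annotation float admits them statically.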

Daniil Fajnberg answered Jan 24 '26 23:01


