
How to ignore invalid values when creating model instance

Given a sample pydantic v2+ model:

from pydantic import BaseModel

class Foo(BaseModel):
    age: int | None
    name: str | None

I want my model to accept the input but ignore invalid values, so that I always receive an instance. For example, Foo(age="I", name="Jim") should, instead of raising a ValidationError, automatically discard the value for the age field and result in Foo(age=None, name='Jim').

I could manually loop over the ValidationErrors and drop the corresponding data, or loop over the values and use validate_assignment, but I suspect I am missing something built-in.

djangonaut asked Jun 17 '26 08:06


2 Answers

Pydantic 2.x

This is the prime use case for the WrapValidator metadata class.

You can define a function to be called around the actual validator. That way you can try to apply the actual validator and upon failure return None instead.

On a per-field basis

For your example age field, you would do it like this:

from collections.abc import Callable
from typing import Annotated, Any

from pydantic import BaseModel, ValidationError, WrapValidator


def invalid_to_none(v: Any, handler: Callable[[Any], Any]) -> Any:
    try:
        return handler(v)
    except ValidationError:
        return None


class Foo(BaseModel):
    age: Annotated[int | None, WrapValidator(invalid_to_none)]
    name: str | None

Demo:

instance = Foo(age="invalid", name="Jim")
print(instance.model_dump())  # {'age': None, 'name': 'Jim'}

The drawback, of course, is that the annotation has to be repeated for every field that should behave this way.
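A middle ground that cuts down the per-field boilerplate without touching class construction is a generic type alias. The following is only a sketch: the alias name NoneIfInvalid is made up for illustration, and it assumes Pydantic 2.x, where Annotated aliases parameterized with a TypeVar are supported.

```python
from collections.abc import Callable
from typing import Annotated, Any, Optional, TypeVar

from pydantic import BaseModel, ValidationError, WrapValidator

T = TypeVar("T")


def invalid_to_none(v: Any, handler: Callable[[Any], Any]) -> Any:
    # Run the regular validation; fall back to None if it fails.
    try:
        return handler(v)
    except ValidationError:
        return None


# Hypothetical alias: NoneIfInvalid[int] expands to
# Annotated[Optional[int], WrapValidator(invalid_to_none)].
NoneIfInvalid = Annotated[Optional[T], WrapValidator(invalid_to_none)]


class Foo(BaseModel):
    age: NoneIfInvalid[int]
    name: NoneIfInvalid[str]


instance = Foo(age="invalid", name="Jim")
print(instance.model_dump())  # {'age': None, 'name': 'Jim'}
```

Each field still has to opt in, but the validator is written once and the intent is visible in the annotation.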

Applying to all model fields

To avoid repeating the Annotated type definition for every single field (assuming you want it to apply to all fields), we need to get a bit creative.

One option is to hook into class construction via __init_subclass__ and patch every field annotation with the same WrapValidator. Here is a somewhat crude example of a custom base model that you could use:

from collections.abc import Callable
from typing import Annotated, Any, get_args, get_origin

from pydantic import BaseModel, ValidationError, WrapValidator


def invalid_to_none(v: Any, handler: Callable[[Any], Any]) -> Any:
    try:
        return handler(v)
    except ValidationError:
        return None


class CustomBaseModel(BaseModel):
    def __init_subclass__(cls, **kwargs: Any) -> None:
        for name, annotation in cls.__annotations__.items():
            if name.startswith("_"):  # exclude protected/private attributes
                continue
            validator = WrapValidator(invalid_to_none)
            if get_origin(annotation) is Annotated:
                # Star-unpacking inside a subscript requires Python 3.11+.
                cls.__annotations__[name] = Annotated[
                    *get_args(annotation),
                    validator,
                ]
            else:
                cls.__annotations__[name] = Annotated[annotation, validator]

Note that I am consciously using __annotations__ directly rather than calling typing.get_type_hints here because it would attempt to resolve forward references and I want to interfere as little as possible with the Pydantic model construction algorithm.

I am simply ignoring every name in the class namespace that starts with an underscore because those are typically reserved. I am also taking care to re-create already existing Annotated type hints and append the WrapValidator to the end of any metadata already present.

Demo:

...

class Foo(CustomBaseModel):
    age: int | None
    name: str | None
    bar: Annotated[float | None, "other metadata"]


instance = Foo(age="invalid", name="Jim", bar=object())
print(instance.model_dump())  # {'age': None, 'name': 'Jim', 'bar': None}

As you can see, both the age and the bar fields were assigned None and no ValidationError was actually raised.

I should emphasize that this method of patching the field annotations is just an example to illustrate the overall approach. I am not claiming that I thought of every possible pitfall. Make sure to test it with your specific use cases.


Pydantic 1.x

The solution proposed by @larsks with a root_validator is very reasonable in principle. You just need to be careful with the type checks because the field annotations can be very tricky.

One fool-proof but inefficient approach is to just call ModelField.validate for all fields inside the custom root validator and see if it returns errors. If it does, assign None and move on.

Here is a working example:

from pydantic import BaseModel, root_validator


class CustomBaseModel(BaseModel):
    @root_validator(pre=True)
    def invalid_to_none(cls, values: dict[str, object]) -> dict[str, object]:
        validated_values: dict[str, object] = {}
        for name, value in values.items():
            field = cls.__fields__.get(name)
            if field is None:  # must be extra data
                continue
            validated_value, errors = field.validate(
                value,
                validated_values,
                loc="__root__",
                cls=cls,  # type: ignore[arg-type]
            )
            validated_values[name] = validated_value
            if errors:
                values[name] = None
        return values


class Foo(CustomBaseModel):
    age: int | None
    name: str | None


instance = Foo(age="invalid", name="Jim", extra=object())
print(instance.dict())  # {'age': None, 'name': 'Jim'}

As you can see, it works as expected.

The reason this is inefficient though is that it will effectively call all validators for all fields twice -- once in that custom root validator and once in the "regular" validation cycle.

It is still idempotent because we don't actually do anything with the pre-validated values. We discard them and just return the non-validated values from the root validator. Still, this can be a deal-breaker if you have many custom validation functions and performance is critical.

Daniil Fajnberg answered Jun 19 '26 22:06


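Create a pre-validator to convert non-int values to None. The example uses the pydantic 1.x validator decorator; in pydantic 2.x it is deprecated in favor of field_validator with mode="before":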

from pydantic import BaseModel, validator
from typing import Optional

class Foo(BaseModel):
    age: Optional[int]
    name: Optional[str]

    @validator('age', pre=True)
    def validate_age(cls, v):
        if not isinstance(v, int):
            v = None
        return v

With the above code:

>>> Foo(age='I', name='alice')
Foo(age=None, name='alice')

Unrelated to your question, but if you're using Python 3.10 or later you can replace Optional with the X | None union syntax. The following code is equivalent:

from pydantic import BaseModel, validator

class Foo(BaseModel):
    age: int | None
    name: str | None

    @validator('age', pre=True)
    def validate_age(cls, v):
        if not isinstance(v, int):
            v = None
        return v

Update

You can validate all fields using a root_validator (pydantic 1.x) or a model_validator (pydantic 2.x). For pydantic 2.x, that might look like:

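from typing import Any

from pydantic import BaseModel, model_validator


class Foo(BaseModel):
    age: int | None
    name: str | None

    @model_validator(mode="before")
    @classmethod  # "before" model validators must be class methods
    def validate_all(cls, v: Any) -> Any:
        for name, spec in cls.model_fields.items():
            # Skip missing keys so they are reported by the regular validation.
            if name in v and not isinstance(v[name], spec.annotation):
                v[name] = None

        return v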

The code iterates over the field definitions, checks whether each value in v is an instance of the field's annotated type, and replaces the value with None if it is not.

larsks answered Jun 19 '26 21:06


