Given a sample pydantic v2+ model:
from pydantic import BaseModel


class Foo(BaseModel):
    age: int | None
    name: str | None
I want the model to accept the input but discard any invalid values, so that I always get an instance. For example, Foo(age="I", name="Jim") should, instead of raising a ValidationError, automatically discard the value for the age field and produce Foo(age=None, name='Jim').
I could manually loop over the errors in the ValidationError and drop the corresponding data, or loop over the values and use validate_assignment, but I suspect I am missing something built-in.
2.x
This is the prime use case for the WrapValidator metadata class.
You can define a function to be called around the actual validator. That way you can try to apply the actual validator and upon failure return None instead.
For your example age field, you would do it like this:
from collections.abc import Callable
from typing import Annotated, Any

from pydantic import BaseModel, ValidationError, WrapValidator


def invalid_to_none(v: Any, handler: Callable[[Any], Any]) -> Any:
    try:
        return handler(v)
    except ValidationError:
        return None


class Foo(BaseModel):
    age: Annotated[int | None, WrapValidator(invalid_to_none)]
    name: str | None
Demo:
instance = Foo(age="invalid", name="Jim")
print(instance.model_dump()) # {'age': None, 'name': 'Jim'}
The problem of course is that it is applied on a per-field basis.
To avoid repeating the Annotated type definition for every single field (assuming you want it to apply to all fields), we need to get a bit creative.
One option is to hook into class construction via __init_subclass__ and patch every field annotation with the same WrapValidator. Here is a somewhat crude example of a custom base model that you could use:
from collections.abc import Callable
from typing import Annotated, Any, get_args, get_origin

from pydantic import BaseModel, ValidationError, WrapValidator


def invalid_to_none(v: Any, handler: Callable[[Any], Any]) -> Any:
    try:
        return handler(v)
    except ValidationError:
        return None


class CustomBaseModel(BaseModel):
    def __init_subclass__(cls, **kwargs: Any) -> None:
        for name, annotation in cls.__annotations__.items():
            if name.startswith("_"):  # exclude protected/private attributes
                continue
            validator = WrapValidator(invalid_to_none)
            if get_origin(annotation) is Annotated:
                # Star-unpacking inside a subscript requires Python 3.11+.
                cls.__annotations__[name] = Annotated[
                    *get_args(annotation),
                    validator,
                ]
            else:
                cls.__annotations__[name] = Annotated[annotation, validator]
Note that I am consciously using __annotations__ directly rather than calling typing.get_type_hints here because it would attempt to resolve forward references and I want to interfere as little as possible with the Pydantic model construction algorithm.
I am simply ignoring every name in the class namespace that starts with an underscore because those are typically reserved. I am also taking care to re-create already existing Annotated type hints and append the WrapValidator to the end of any metadata already present.
Demo:
...
class Foo(CustomBaseModel):
    age: int | None
    name: str | None
    bar: Annotated[float | None, "other metadata"]


instance = Foo(age="invalid", name="Jim", bar=object())
print(instance.model_dump()) # {'age': None, 'name': 'Jim', 'bar': None}
As you can see, both the age and the bar fields were assigned None and no ValidationError was actually raised.
I should emphasize that this method of patching the field annotations is just an example to illustrate the overall approach. I am not claiming that I thought of every possible pitfall. Make sure to test it with your specific use cases.
1.x
The solution proposed by @larsks with a root_validator is very reasonable in principle. You just need to be careful with the type checks because the field annotations can be very tricky.
One fool-proof but inefficient approach is to just call ModelField.validate for all fields inside the custom root validator and see if it returns errors. If it does, assign None and move on.
Here is a working example:
from pydantic import BaseModel, root_validator


class CustomBaseModel(BaseModel):
    @root_validator(pre=True)
    def invalid_to_none(cls, values: dict[str, object]) -> dict[str, object]:
        validated_values: dict[str, object] = {}
        for name, value in values.items():
            field = cls.__fields__.get(name)
            if field is None:  # must be extra data
                continue
            validated_value, errors = field.validate(
                value,
                validated_values,
                loc="__root__",
                cls=cls,  # type: ignore[arg-type]
            )
            validated_values[name] = validated_value
            if errors:
                values[name] = None
        return values


class Foo(CustomBaseModel):
    age: int | None
    name: str | None


instance = Foo(age="invalid", name="Jim", extra=object())
print(instance.dict()) # {'age': None, 'name': 'Jim'}
As you can see, it works as expected.
The reason this is inefficient though is that it will effectively call all validators for all fields twice -- once in that custom root validator and once in the "regular" validation cycle.
It is still idempotent because we don't actually do anything with the pre-validated values. We discard them and just return the non-validated values from the root validator. Even so, this can be a deal-breaker if you have many custom validation functions and performance is critical.
Create a pre-validator to convert non-int values to None:
from typing import Optional

from pydantic import BaseModel, validator


class Foo(BaseModel):
    age: Optional[int]
    name: Optional[str]

    @validator('age', pre=True)
    def validate_age(cls, v):
        if not isinstance(v, int):
            v = None
        return v
With the above code:
>>> Foo(age='I', name='alice')
Foo(age=None, name='alice')
Unrelated to your question, but if you're using Python 3.10 or later you can replace Optional[X] with X | None. The following code is equivalent:
from pydantic import BaseModel, validator


class Foo(BaseModel):
    age: int | None
    name: str | None

    @validator('age', pre=True)
    def validate_age(cls, v):
        if not isinstance(v, int):
            v = None
        return v
You can validate all fields using a root_validator (pydantic 1.x) or a model_validator (pydantic 2.x). For pydantic 2.x, that might look like:
from pydantic import BaseModel, model_validator


class Foo(BaseModel):
    age: int | None
    name: str | None

    @model_validator(mode="before")
    @classmethod
    def validate_all(cls, v):
        for name, spec in cls.model_fields.items():
            # Skip keys that are absent so pydantic can report them itself.
            if name in v and not isinstance(v[name], spec.annotation):
                v[name] = None
        return v
Hopefully the above code is relatively obvious: we're iterating over the field definitions, checking if a given value in v matches the type annotation for the field, and if not, replacing the value with None.