I want to create a Pydantic model with a list field that defaults to an empty list when left uninitialized. Is there an idiomatic way to do this?
For Python's built-in dataclasses you can use field(default_factory=list); however, in my own experiments this seems to prevent my Pydantic models from being pickled. A naive implementation might look something like this:
from typing import Sequence
from pydantic import BaseModel

class Foo(BaseModel):
    defaulted_list_field: Sequence[str] = []  # Bad!
But we all know not to use a mutable value like the empty-list literal as a default.
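For readers who have not run into it, here is a minimal plain-Python sketch of the pitfall being referred to. The function name append_item is purely illustrative:

```python
def append_item(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # and is then shared by every call that omits `bucket`.
    bucket.append(item)
    return bucket


first = append_item("a")
second = append_item("b")

print(first is second)  # both names refer to the same shared list
print(second)
```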
So what's the correct way to give a Pydantic list-field a default value?
For Pydantic models you can use a mutable default value directly, like this:
from typing import List
from pydantic import BaseModel

class Foo(BaseModel):
    defaulted_list_field: List[str] = []

f1, f2 = Foo(), Foo()
f1.defaulted_list_field.append("hey!")
print(f1)  # defaulted_list_field=['hey!']
print(f2)  # defaulted_list_field=[]
Pydantic handles this correctly: it copies the default value (a deep copy where needed) for each new instance, so every model instance gets its own empty list.
Pydantic also has a default_factory parameter. For an empty list the result is identical; default_factory is more useful when the default value should be dynamic, i.e. computed separately for each instance:
from typing import List
from pydantic import BaseModel, Field
from uuid import UUID, uuid4

class Foo(BaseModel):
    defaulted_list_field: List[str] = Field(default_factory=list)
    uid: UUID = Field(default_factory=uuid4)
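To see the effect in practice, the following minimal sketch (assuming Pydantic is installed; the model name Item and its fields are illustrative and not part of the answer above) checks that the factory is called once per instance:

```python
from typing import List
from uuid import UUID, uuid4

from pydantic import BaseModel, Field


class Item(BaseModel):  # hypothetical model, for illustration only
    tags: List[str] = Field(default_factory=list)
    uid: UUID = Field(default_factory=uuid4)


a, b = Item(), Item()
a.tags.append("x")

print(b.tags)          # b keeps its own, still empty list
print(a.uid == b.uid)  # each instance got a freshly generated UUID
```

A factory is the natural choice whenever the default cannot be expressed as a single constant, such as an identifier or a timestamp.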
While reviewing a colleague's merge request I saw a mutable object used as a default value and pointed it out. To my surprise, it works as if a deepcopy of the object had been made. I found an example in the project's README, but without any explanation. I then realized that the developers have been ignoring this question for a long time (see links at the bottom).
Indeed, you can write something like this and expect correct behavior:
from typing import List
from pydantic import BaseModel

class Foo(BaseModel):
    defaulted_list_field: List[str] = []
But what happens under the hood? We need to go deeper...
After a quick search through the source code I found this:
class ModelField(Representation):
    ...
    def get_default(self) -> Any:
        return smart_deepcopy(self.default) if self.default_factory is None else self.default_factory()
The smart_deepcopy function, in turn, is:
def smart_deepcopy(obj: Obj) -> Obj:
    """
    Return type as is for immutable built-in types
    Use obj.copy() for built-in empty collections
    Use copy.deepcopy() for non-empty collections and unknown objects
    """
    obj_type = obj.__class__
    if obj_type in IMMUTABLE_NON_COLLECTIONS_TYPES:
        return obj  # fastest case: obj is immutable and not collection therefore will not be copied anyway
    try:
        if not obj and obj_type in BUILTIN_COLLECTIONS:
            # faster way for empty collections, no need to copy its members
            return obj if obj_type is tuple else obj.copy()  # type: ignore  # tuple doesn't have copy method
    except (TypeError, ValueError, RuntimeError):
        # do we really dare to catch ALL errors? Seems a bit risky
        pass

    return deepcopy(obj)  # slowest way when we actually might need a deepcopy
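The three branches can be tried out without Pydantic using the standalone sketch below. It is a simplified re-implementation, not the library's code: the two type sets are abbreviated stand-ins I am assuming for illustration (the excerpt above does not show the real definitions), the name smart_deepcopy_sketch is hypothetical, and the try/except is dropped by testing the type before the truthiness.

```python
from copy import deepcopy
from typing import Any

# Assumed, abbreviated stand-ins for the library's internal sets.
IMMUTABLE_NON_COLLECTIONS_TYPES = {int, float, complex, str, bool, bytes, type(None)}
BUILTIN_COLLECTIONS = {list, set, tuple, frozenset, dict}


def smart_deepcopy_sketch(obj: Any) -> Any:
    obj_type = obj.__class__
    if obj_type in IMMUTABLE_NON_COLLECTIONS_TYPES:
        return obj  # immutable scalar: safe to share between instances
    if obj_type in BUILTIN_COLLECTIONS and not obj:
        # empty built-in collection: a shallow copy is enough
        return obj if obj_type is tuple else obj.copy()
    return deepcopy(obj)  # anything else gets a full deep copy


empty_default = []
nested_default = [["a"]]

empty_copy = smart_deepcopy_sketch(empty_default)
nested_copy = smart_deepcopy_sketch(nested_default)

print(empty_copy is empty_default)          # a new list object is returned
print(nested_copy[0] is nested_default[0])  # inner lists are copied as well
```

This is why appending to one instance's list never shows up in another instance: every call to get_default hands out a fresh object.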
Also, as mentioned in the comments, you cannot use mutable defaults directly in dataclass attribute declarations (use default_factory instead). So this example is not valid:
from pydantic.dataclasses import dataclass

@dataclass
class Foo:
    bar: list = []
And gives:
ValueError: mutable default <class 'list'> for field bar is not allowed: use default_factory
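A minimal sketch of the default_factory fix follows. It uses the standard-library dataclasses module so that it runs without extra dependencies; the same field(...) declaration is assumed to work under pydantic.dataclasses.dataclass as well.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Foo:
    # default_factory builds a new list for every instance
    bar: List[str] = field(default_factory=list)


a, b = Foo(), Foo()
a.bar.append("x")

print(a.bar, b.bar)  # the two instances do not share a list
```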
Links to open discussions (no answers so far):