
Django DRY Model/Form/Serializer Validation

I'm having some issues figuring out the best (read: DRY & maintainable) place for introducing validation logic in Django, namely between models, forms, and DRF serializers.

I've worked with Django for several years and have been following the various conventions for handling model, form, and REST API endpoint validation. I've tried a lot of variations for ensuring overall data integrity, but I've hit a bit of a stumbling block recently. Here is a brief list of what I've tried after looking through many articles, SO posts, and tickets:

  1. Validation at the model level; namely, ensuring all of my custom constraints are met before calling myModel.save() by overriding myModel.clean() (as well as the field-specific and unique-together validation methods). To do this, I ensured myModel.full_clean() was called in myForm.clean() (for forms -- and the admin panel actually already does this) and in mySerializer.validate() (for DRF serializers).

  2. Validation at the form and serializer level, calling a shared method for maintainable, DRY code.

  3. Validation at the form and serializer level, with a distinct method for each to ensure maximum flexibility (i.e. for when forms and endpoints have different constraints).

Method one seems the most intuitive to me for when forms and serializers have identical constraints, but is a bit messy in practice; first, data is automatically cleaned and validated by the form or serializer, then the model entity is instantiated, and more validation is run again -- which is a little convoluted and can get complicated.

Method three is what Django Rest Framework recommends as of version 3.0; they eliminated a lot of their model.save() hooks and prefer to leave validation to the user-facing aspects of your application. This makes some sense to me, since Django's base model.save() implementation doesn't call model.full_clean() anyway.

So, method two seems to be the best overall generalized outcome to me; validation lives in a distinct place -- before the model is ever touched -- and the codebase is less cluttered / more DRY due to the shared validation logic.
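
A minimal, framework-free sketch of method two, assuming a hypothetical booking record with start, end, and guests fields. The names validate_booking, form_clean, and serializer_validate are illustrative and not Django or DRF APIs; in a real project the two wrappers would roughly correspond to Form.clean() and Serializer.validate(), each raising its own framework's ValidationError.

```python
from datetime import date


def validate_booking(data):
    """Shared rule set: return a dict of field -> error message (empty if valid)."""
    errors = {}
    start, end = data.get("start"), data.get("end")
    if start is not None and end is not None and end < start:
        errors["end"] = "End date must not be before start date."
    if data.get("guests", 1) < 1:
        errors["guests"] = "At least one guest is required."
    return errors


def form_clean(cleaned_data):
    """Stand-in for Form.clean(): raise if any shared rule fails."""
    errors = validate_booking(cleaned_data)
    if errors:
        # A real form would raise django.core.exceptions.ValidationError here.
        raise ValueError(errors)
    return cleaned_data


def serializer_validate(attrs):
    """Stand-in for Serializer.validate(): same rules, same entry point."""
    errors = validate_booking(attrs)
    if errors:
        # A real serializer would raise rest_framework's ValidationError here.
        raise ValueError(errors)
    return attrs
```

Both entry points delegate to one function, so a new rule is added in exactly one place.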

Unfortunately, most of the trouble I've encountered is with getting Django Rest Framework's serializers to cooperate. All three approaches work well for forms, and in fact work well for most HTTP methods (most notably when POSTing for entity creation) -- but none seem to play well when updating an existing entity (PUT, PATCH).

Long story short, it has proved rather difficult to validate incoming data when it is incomplete (but otherwise valid -- often the case for PATCH). The request data may only contain some fields -- those that contain different / new information -- and the model instance's existing information is maintained for all other fields. In fact, DRF issue #4306 perfectly sums up this particular challenge.
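
To make the partial-update problem concrete: the rules have to run against the merged state, not the request payload alone. Here is a framework-free sketch, assuming the stored values come from something like serializer.instance and the payload from a PATCH body. The names merged_state and validate_range are hypothetical.

```python
def merged_state(existing, incoming):
    """Overlay the fields present in the request on top of the stored values."""
    state = dict(existing)  # copy, so the stored values are not mutated
    state.update(incoming)
    return state


def validate_range(state):
    """Cross-field rule that needs both values, whichever source they came from."""
    if state["minimum"] > state["maximum"]:
        raise ValueError("minimum must not exceed maximum")
    return state


stored = {"minimum": 1, "maximum": 10, "label": "default"}
patch = {"maximum": 0}  # fine on its own, invalid once combined with the stored minimum

try:
    validate_range(merged_state(stored, patch))
    result = "accepted"
except ValueError:
    result = "rejected"
```

Validating only the patch dict would miss this case, since the payload never mentions minimum.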

I've also considered running custom model validation at the viewset level (after serializer.validated_data is populated and serializer.instance exists, but before serializer.save() is called), but I'm still struggling to come up with a clean, generalized approach due to the complexities of handling updates.

TL;DR Django Rest Framework makes it a bit hard to write clean, maintainable validation logic in an obvious place, especially for partial updates that rely on a blend of existing model data and incoming request data.

I'd love to have some Django gurus weigh in on what they've gotten to work, because I'm not seeing any convenient solution.

Thanks.

SamuelMS asked Nov 19 '16 20:11


3 Answers

I agree, the link between models/serializers/validation is broken.

The best DRY solution I've found is to keep validation in the model: specify validators on the fields and, if needed, add model-level validation by overriding clean().

Then, in the serializer, override validate() and call the model's clean(), e.g. in MySerializer:

def validate(self, data):
    instance = FooModel(**data)
    instance.clean()
    return data

It's not nice, but I prefer this to 2-level validation in serializer and model.

jmoz answered Oct 16 '22 21:10


Just realized I never posted my solution back to this question. I ended up writing a model mixin to always run validation before saving; it's a bit inconvenient as validation will technically be run twice when going through Django's forms (e.g. in the admin panel), but it lets me guarantee that validation is run -- regardless of what triggers a model save. I generally don't use Django's forms, so this doesn't have much impact on my applications.

Here's a quick snippet that does the trick:

class ValidatesOnSaveModelMixin:
    """ ValidatesOnSaveModelMixin
    A mixin that ensures valid model state prior to saving.
    """
    def save(self, **kwargs):
        self.full_clean()
        super(ValidatesOnSaveModelMixin, self).save(**kwargs)

Here is how you'd use it:

class ImportantModel(ValidatesOnSaveModelMixin, models.Model):
    """ Will always ensure its fields pass validation prior to saving. """

There is one important caveat: any of Django's direct-to-database operations (e.g. ImportantModel.objects.update()) don't call a model's save() method, and therefore will not be validated. There's not much to do about this, since these methods exist to optimize performance by issuing SQL directly instead of loading and saving each instance -- so just be aware of their impact if you use them.
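
The caveat can be seen in miniature without a database. In this plain-Python sketch, FakeModel and bulk_update are stand-ins (assumptions, not Django APIs) for models.Model and a queryset-level update(). The mixin's save() validates, while the bulk path writes attributes directly and never reaches save().

```python
class FakeModel:
    """Stand-in for models.Model with one validated field."""

    def __init__(self, quantity):
        self.quantity = quantity
        self.saved = False

    def full_clean(self):
        if self.quantity < 0:
            raise ValueError("quantity must be non-negative")

    def save(self, **kwargs):
        self.saved = True


class ValidatesOnSave:
    def save(self, **kwargs):
        self.full_clean()       # validate first
        super().save(**kwargs)  # then defer to the next class in the MRO


class Item(ValidatesOnSave, FakeModel):  # mixin listed first so its save() wins
    pass


def bulk_update(items, **values):
    """Mimics a direct-to-database update: sets values without calling save()."""
    for item in items:
        for name, value in values.items():
            setattr(item, name, value)


item = Item(quantity=1)
item.quantity = -5
try:
    item.save()
    outcome = "saved"
except ValueError:
    outcome = "rejected"

other = Item(quantity=1)
bulk_update([other], quantity=-5)  # no validation runs on this path
```

The same ordering applies to the real mixin: it has to precede models.Model in the base list, otherwise Model.save() is resolved first and the validation step is skipped.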

SamuelMS answered Oct 16 '22 19:10


Just wanted to add to SamuelMS's answer. If you use F() expressions and similar query expressions, the mixin as written will fail, as explained here. This variant lets you exclude those fields from validation:

class ValidatesOnSaveModelMixin:
    """ ValidatesOnSaveModelMixin
    A mixin that ensures valid model state prior to saving.
    """
    def save(self, **kwargs):
        # Pop the custom kwarg so it is never passed on to Model.save().
        # full_clean(exclude=None) validates every field, so no branch is needed.
        self.full_clean(exclude=kwargs.pop('clean_on_save_exclude', None))
        super(ValidatesOnSaveModelMixin, self).save(**kwargs)

Then use it the same way he explained. When saving an instance whose fields hold query expressions, pass the names of those fields:

instance.save(clean_on_save_exclude=['field_name'])

This mirrors the exclude argument you would pass when calling full_clean() directly to skip the fields that hold query expressions. See https://docs.djangoproject.com/en/2.2/ref/models/instances/#django.db.models.Model.full_clean

sanchaz answered Oct 16 '22 20:10