
Preemptive Validation or Exception handling?

I am trying to decide between 2 patterns regarding data validation:

  1. I try to follow the nominal workflow and catch exceptions thrown by my models and services: unique/foreign constraint violations, empty fields, invalid arguments, etc. (Note: I catch only the exceptions I know I should catch.)

    • pros: Very little code to write in my Controllers and Services: I just have to handle exceptions and translate them into user-understandable messages. The code is very simple and readable.

    • cons: I need to write specific exceptions, which can amount to a lot of different exceptions. I also need to catch and parse generic PDO/Doctrine exceptions for database errors (constraint violations, etc.) to translate them into exceptions that are meaningful (e.g. DuplicateEntryException). Finally, I can't bypass some validation: say an object of my model is marked as locked; trying to delete it will raise an exception, but I may want to force its deletion (with a confirmation popup, for example). I won't be able to bypass the exception here.

  2. I test and pre-validate everything explicitly with code and DB queries. For example, I'll check that a value is not null and is an integer before setting it as an attribute on my model, or I'll run a DB query to check that I am not about to create a duplicate entry.

    • pros: No need to write specific exceptions: since I pre-validate everything, I shouldn't need many try/catch blocks anyway. I can also bypass some validation when I want to.

    • cons: Lots of tests and validation to write in the controllers, services and models, and more queries to perform (the validation part). The DB already validates foreign keys, unique constraints and NOT NULL columns; I shouldn't ignore that and re-implement it myself. It also leads to very tedious code!
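To make the contrast concrete, here is a rough sketch of the two patterns. This is language-agnostic Python standing in for the original PHP context, and all names (`DuplicateEntryException`, the in-memory `db` dict) are hypothetical illustrations, not an actual implementation:

```python
# Illustrative sketch only: an in-memory dict stands in for the database.

class DuplicateEntryException(Exception):
    """Domain-specific exception translated from a low-level DB error."""

db = {"users": {"alice@example.com": {"locked": True}}}

# Pattern 1: attempt the operation and translate low-level failures
# into meaningful domain exceptions.
def create_user_exceptions(email):
    if email in db["users"]:          # stands in for a UNIQUE constraint firing
        raise DuplicateEntryException(email)
    db["users"][email] = {"locked": False}

# Pattern 2: pre-validate with explicit checks before acting.
def create_user_prevalidated(email):
    errors = []
    if not email or "@" not in email:
        errors.append("invalid email")
    if email in db["users"]:          # extra lookup replacing the constraint
        errors.append("email already taken")
    if errors:
        return errors                 # report to the user; nothing is raised
    db["users"][email] = {"locked": False}
    return []
```

Pattern 1 keeps the happy path short but requires a vocabulary of exceptions; pattern 2 returns plain error lists that are easy to display and to selectively skip, at the cost of duplicating checks the database would do anyway.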

I would rather use one pattern or the other, not a mix, in order to keep things as simple as possible.

The first solution seems the best to me, but I'm afraid it might be some kind of anti-pattern, or that behind its theoretical simplicity it hides situations that are very hard to handle.

Matthieu Napoli asked Oct 25 '12 14:10

1 Answer

I would suggest that data validation should happen at the perimeter of an application. That is to say, any data coming in should be checked to make sure it meets your expectations. Once allowed into the application, it's no longer validated, but it is always escaped according to context (DB, email, etc.). This allows you to keep all of the validation together and avoids the potential duplication of validation work (it's easy to come up with examples where data could be validated twice by two models that both use it). Joe Armstrong promotes this approach in his book on Erlang, and the software he's written for telecom stations runs for years without restarting, so it does seem to work well :)
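A minimal sketch of the perimeter idea, assuming a hypothetical registration form (the function and field names are illustrative): raw input is checked once at the boundary, and code inside the perimeter only escapes per output context.

```python
import html

def validate_registration(raw):
    """Boundary check: reject anything that doesn't meet expectations."""
    errors = {}
    email = (raw.get("email") or "").strip()
    age = raw.get("age")
    if "@" not in email:
        errors["email"] = "must be a valid email address"
    if not (isinstance(age, int) and 0 < age < 150):
        errors["age"] = "must be a plausible integer age"
    # Return the errors (if any) plus the cleaned data the app may now trust.
    return (errors or None), {"email": email, "age": age}

def render_profile(user):
    """Inside the perimeter: no re-validation, only context-aware escaping."""
    return "<p>{}</p>".format(html.escape(user["email"]))
```

Everything past `validate_registration` can treat the data as trusted, so the models stay free of defensive checks.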

Additionally, model expectations don't always perfectly line up with the expectations established by a particular interface (maybe the form is only showing a subset of the potential options, or maybe the interface had a dropdown of US states and the model stores states from many different countries, etc.) Sometimes complex interfaces can integrate several different model objects in a manner that enhances the user experience. While nice for the user, the interaction of these models using the exception approach can be very difficult to handle because some of the inputs may be hybrid inputs that neither model alone can validate. You always want to ensure validation matches the expectations of the UI first and foremost, and the second approach allows you to do this in even the most complex interfaces.

Also, exception handling is relatively expensive in terms of CPU cycles. Validation failures can be quite frequent, and I'd avoid using such an expensive operation to handle issues that have the potential to be so common.

Last, some validation isn't really necessary for the model, but it's there to prevent attacks. While you can add this to the model, the added functionality can quickly muddy the model code.
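As a hedged illustration of that last point, consider an allow-list check on a user-supplied sort column (a common defense against injection through `ORDER BY`). The names here are hypothetical; the point is that this check belongs at the perimeter, not in the model:

```python
# Security-focused validation kept out of the model layer.
ALLOWED_SORT_COLUMNS = {"name", "created_at", "email"}

def safe_sort_column(requested):
    """Allow-list check performed at the perimeter, keeping the model clean."""
    return requested if requested in ALLOWED_SORT_COLUMNS else "created_at"
```

The model never needs to know that hostile input exists; it only ever sees one of the allowed column names.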

So, of these two approaches, I would suggest the second approach because:

  1. You can craft a clear perimeter to your app.
  2. All of the validation is in one place and can be shared.
  3. There's no duplication of validation if two or more models make use of the same input.
  4. The models can focus on what they're good at: mapping knowledge of abstract entities to application state.
  5. Even the most complex UIs can be appropriately validated.
  6. Preemption likely will be more efficient.
  7. Security-focused validation tasks that don't really belong in any model can be cleanly added to the app.
AdamJonR answered Oct 10 '22 12:10