
How to best validate JSON on the server-side

When handling POST, PUT, and PATCH requests on the server-side, we often need to process some JSON to perform the requests.

Clearly we need to validate this JSON (e.g. its structure, permitted/expected keys, and value types) in some way, and I can see at least two ways:

  1. Upon receiving the JSON, validate the JSON upfront as it is, before doing anything with it to complete the request.

  2. Take the JSON as it is, start processing it (e.g. accessing its various keys and values), validate it on the go while performing the business logic, and possibly use exception handling to deal with invalid data.

The 1st approach seems more robust compared to the 2nd, but probably more expensive (in time cost) because every request will be validated (and hopefully most of them are valid so the validation is sort of redundant).

The 2nd approach may save the compulsory validation on valid requests, but mixing the checks within business logic might be buggy or even risky.

Which of the two above is better? Or, is there yet a better way?

skyork asked May 10 '15 20:05


1 Answer

Handling POST, PUT, and PATCH requests sounds like you are implementing a REST API. Depending on your back-end platform, you can use libraries that map JSON to objects, which is very powerful and performs much of that validation for you. In Java, you can use Jersey, Spring, or Jackson. If you are using .NET, you can use Json.NET.
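To make the idea of mapping JSON to objects concrete, here is a minimal sketch in Python using only the standard library. It is not how Jackson or Json.NET work internally; `CreateUser` and `parse_as` are hypothetical names invented for this illustration, and it only handles flat objects with simple field types.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class CreateUser:  # hypothetical request type, for illustration only
    name: str
    age: int


def parse_as(cls, raw):
    """Parse JSON text and map it onto the dataclass `cls`, or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    expected = {f.name: f.type for f in fields(cls)}
    unknown = sorted(set(data) - set(expected))
    missing = sorted(set(expected) - set(data))
    if unknown or missing:
        raise ValueError(f"unknown keys {unknown}, missing keys {missing}")

    for key, typ in expected.items():
        value = data[key]
        # bool is a subclass of int in Python, so reject it for int fields.
        if not isinstance(value, typ) or (typ is int and isinstance(value, bool)):
            raise ValueError(f"{key!r} must be of type {typ.__name__}")
    return cls(**data)
```

Calling `parse_as(CreateUser, body)` either returns a typed object the business logic can trust or raises `ValueError`, which a request handler could translate into a 400 response.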

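If efficiency is your goal and you want to validate every single request, it would be ideal to also check the JSON on the front end before it is sent. If you are using JavaScript, you can use json2.js, which provides JSON.parse and JSON.stringify for older browsers. The server still has to validate whatever it receives, since requests can bypass your front end.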

As for comparing your two methods, here is a list of pros and cons.

Method #1: Upon Request

Pros

  1. The integrity of the business logic is maintained. As you mentioned, validating while processing business logic can produce checks that reject valid data or accept invalid data, and the validation itself could inadvertently affect the business logic.
  2. As Norbert mentioned, catching the errors beforehand improves efficiency. Why spend time processing a request that contains errors in the first place?
  3. The code will be cleaner. Keeping validation separate from business logic makes both easier to read and maintain.

Cons

  1. It could result in redundant processing, and therefore longer computing time, because valid requests are checked too.
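As a rough sketch of Method #1, the handler below validates the whole payload before any business logic runs. The schema, the `stock` dictionary, and the function names are assumptions made up for this example, not part of any particular framework.

```python
import json

# Hypothetical schema: key -> Python type expected after json.loads.
ORDER_SCHEMA = {"item_id": int, "quantity": int, "note": str}


def validate(payload, schema):
    """Return a list of problems; an empty list means the payload is valid."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    errors = []
    for key in sorted(payload.keys() - schema.keys()):
        errors.append(f"unexpected key: {key}")
    for key, typ in schema.items():
        if key not in payload:
            errors.append(f"missing key: {key}")
        # No field in this schema is a bool, and bool is a subclass of int,
        # so booleans are rejected explicitly.
        elif not isinstance(payload[key], typ) or isinstance(payload[key], bool):
            errors.append(f"{key} must be {typ.__name__}")
    return errors


def handle_order(raw, stock):
    """Validate first; touch `stock` only once the payload is known to be good."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return 400, ["malformed JSON"]
    errors = validate(payload, ORDER_SCHEMA)
    if errors:
        return 400, errors
    if payload["item_id"] not in stock:
        return 404, ["unknown item"]
    stock[payload["item_id"]] -= payload["quantity"]  # business logic
    return 200, []
```

Because `validate` collects every problem instead of stopping at the first, the caller can report all errors in a single response, and `stock` is never modified for a rejected request.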

Method #2: Validation on the Go

Pros

  1. In theory it is more efficient, since validation and processing happen in a single pass.

Cons

  1. In reality, the processing time saved is likely negligible (as mentioned by Norbert), because you are still performing the validation checks either way. In addition, the processing done up to that point is wasted if an error is found.
  2. Data integrity can be compromised. The data may be left in a corrupt or half-processed state if an error surfaces partway through.
  3. The code is not as clear. With validation logic mixed in, it may be less apparent what the business logic is doing.
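One way to read the data integrity concern is shown in the sketch below, where validation happens implicitly through exception handling. The `accounts` dictionary and the payload keys are hypothetical; the point is only that a state change can happen before a later key turns out to be missing.

```python
import json


def handle_transfer_on_the_go(raw, accounts):
    """Validate-as-you-go: checks are interleaved with state changes."""
    try:
        payload = json.loads(raw)
        # The debit is applied here...
        accounts[payload["from"]] -= payload["amount"]
        # ...before the "to" key is ever looked at.
        accounts[payload["to"]] += payload["amount"]
        return 200
    except (json.JSONDecodeError, KeyError, TypeError):
        # The request is rejected, but the debit above is not rolled back.
        return 400


accounts = {"a": 100, "b": 0}
status = handle_transfer_on_the_go('{"from": "a", "amount": 30}', accounts)
```

After this call `status` is 400, yet account `"a"` has already been debited. Avoiding that requires either explicit rollback code or checking every key before the first state change, which amounts to validating upfront.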

What it really boils down to is accuracy versus speed, which generally trade off against each other. The more thoroughly you validate your JSON, the more speed you may have to give up. Computers are fast enough these days that this is really only noticeable with large data sets. It is up to you to decide which matters more, given how accurate you expect the incoming data to be and whether the extra processing time is crucial. In some cases (e.g. stock market and healthcare applications, where milliseconds matter) both are highly important. In those cases, when you increase one, for example accuracy, you may have to recover the speed by moving to a higher-performance machine.
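If you want numbers rather than intuition, you could measure the cost of the validation step against the cost of parsing for your own payloads. The sketch below uses Python's `timeit` with a made-up payload and a deliberately simple `validate` function; the results depend entirely on your data and hardware, so no figures are given here.

```python
import json
import timeit


def validate(payload):
    """A deliberately simple structural check on a hypothetical payload."""
    return (
        isinstance(payload, dict)
        and isinstance(payload.get("id"), int)
        and isinstance(payload.get("name"), str)
    )


raw = json.dumps({"id": 1, "name": "example"})
payload = json.loads(raw)
runs = 100_000

# Time parsing and validation separately over the same number of runs.
parse_seconds = timeit.timeit(lambda: json.loads(raw), number=runs)
validate_seconds = timeit.timeit(lambda: validate(payload), number=runs)

print(f"parse:    {parse_seconds / runs * 1e6:.3f} microseconds per request")
print(f"validate: {validate_seconds / runs * 1e6:.3f} microseconds per request")
```

Comparing the two figures for a realistic payload size is a more reliable basis for the decision than a general rule.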

Hope this helps.

jth_92 answered Oct 18 '22 01:10