JSON schema for data description vs data validation vs input validation

In what I can find about using JSON schema, there seems to be a confusing conflation of (or at least a lack of distinction among) the tasks of describing valid data, validating stored data, and validating input data.

A typical example looks like:

var schema = {
    type: 'object',
    properties: {
        id: { type: 'integer', required: true },
        name: { type: 'string', required: true },
        description: { type: 'string', required: false }
    }
};

This works well for describing what valid data in a data store should look like, and therefore for validating it (the latter isn't terribly useful: if it's in a store, it should be valid already):

var storedData = {
    id: 123,
    name: 'orange',
    description: 'delicious'
};

It doesn't work that well for validating input. id is most likely left for the application to generate and not for the user to provide as part of the input. The following input fails validation because it lacks the id which the schema declares to be required:

var inputData = {
    name: 'orange',
    description: 'delicious'
};
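
To make the failure concrete, here is a minimal sketch of a checker that understands only the two keywords used above: type and the per-property required flag. The checkDraft3 function is hypothetical and written purely for illustration; it is not the API of any real validator library, and it ignores everything else in the spec.

```javascript
// Hypothetical helper: checks only `type` and the per-property `required: true`
// flag used in the question. Not a complete JSON Schema validator.
function checkDraft3(schema, data) {
    var errors = [];
    Object.keys(schema.properties).forEach(function (key) {
        var rule = schema.properties[key];
        var present = Object.prototype.hasOwnProperty.call(data, key);
        if (!present) {
            if (rule.required) {
                errors.push('missing required property: ' + key);
            }
            return;
        }
        var value = data[key];
        // 'integer' has no typeof equivalent, so it is handled separately.
        var ok = rule.type === 'integer'
            ? Number.isInteger(value)
            : typeof value === rule.type;
        if (!ok) {
            errors.push('wrong type for property: ' + key);
        }
    });
    return errors;
}

var schema = {
    type: 'object',
    properties: {
        id: { type: 'integer', required: true },
        name: { type: 'string', required: true },
        description: { type: 'string', required: false }
    }
};

// Stored data carries an id, so no errors are reported.
var storedErrors = checkDraft3(schema, { id: 123, name: 'orange', description: 'delicious' });

// Direct input has no id yet, so the required check reports it as missing.
var inputErrors = checkDraft3(schema, { name: 'orange', description: 'delicious' });

console.log(storedErrors);
console.log(inputErrors);
```

The same schema accepts the stored record and rejects the input, which is exactly the mismatch described above.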

Fine, one might say: the schema isn't meant to validate direct input, and validation should only occur after the application has added an id and the data is in the form that is meant to be stored.

If the schema isn't meant to validate direct input, however, what is 1) the point of JavaScript validators running in the browser, presumably being fed direct input and 2) the point of the obviously input-oriented readonly schema feature in the spec?

Ground gets shakier when thinking of properties that can be set once but not updated (like a username), as well as different access levels (e.g. the admin and the owner of the orange should be able to change the description, while for other users it should stay readonly).

What is the best (or at least working) practice to deal with this? A different schema for each use case, like below?

var baseSchema = {
    type: 'object',
    properties: {
        id: { type: 'integer', required: true },
        name: { type: 'string', required: true },
        description: { type: 'string', required: false }
    }
};

var ownerUpdateSchema = {
    type: 'object',
    properties: {
        id: { type: 'integer', required: false, readonly: true },
        name: { type: 'string', required: true },
        description: { type: 'string', required: false }
    }
};

var userUpdateSchema = {
    type: 'object',
    properties: {
        id: { type: 'integer', required: false, readonly: true },
        name: { type: 'string', required: false, readonly: true },
        description: { type: 'string', required: false, readonly: true }
    }
};

Or something else?

Asked by mmr, Feb 24 '13 at 02:02


1 Answer

Side note: in v4, "required" is now an array on the parent element, and "readOnly" is capitalised differently. I'll be using that form for my examples.

I agree that validating the stored data is pretty rare. And if you're just describing the data, then you don't need to specify that "id" is required.

Another thing to say is that these schemas should all have URIs at which they can be referenced (e.g. /schemas/baseSchema). At that point, you can extend the schemas to make "id" required in some of them:

var ownerInputSchema = {
    type: 'object',
    properties: {
        id: {type: 'integer', readOnly: true},
        name: {type: 'string'},
        description: {type: 'string'}
    },
    required: ['name']
};

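// Assumption: ownerInputSchema above is the schema published at /schemas/inputSchema.
var userInputSchema = {
    allOf: [{"$ref": "/schemas/inputSchema"}],
    properties: {
        // Ordinary users may not rename the item.
        name: {readOnly: true}
    }
};

var storedSchema = {
    allOf: [{"$ref": "/schemas/inputSchema"}],
    // A stored record must always carry its id.
    required: ["id"]
};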
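
To see what the extended schemas amount to, here is a rough sketch that collapses allOf and $ref into a single flat view. It assumes that ownerInputSchema is the schema served at /schemas/inputSchema, which the snippets above imply but do not state. The registry object and the flatten helper are hypothetical names made up for this illustration. A real validator resolves $ref itself and evaluates each allOf branch separately rather than merging, so treat this only as a way to inspect the combined constraints.

```javascript
// Hypothetical in-memory registry keyed by schema URI.
// Assumption: ownerInputSchema is what lives at /schemas/inputSchema.
var registry = {
    '/schemas/inputSchema': {
        type: 'object',
        properties: {
            id: {type: 'integer', readOnly: true},
            name: {type: 'string'},
            description: {type: 'string'}
        },
        required: ['name']
    }
};

var userInputSchema = {
    allOf: [{'$ref': '/schemas/inputSchema'}],
    properties: {
        name: {readOnly: true}
    }
};

var storedSchema = {
    allOf: [{'$ref': '/schemas/inputSchema'}],
    required: ['id']
};

// Hypothetical helper: merges `type`, `properties` and `required` from every
// allOf branch and then from the schema itself. This approximates allOf for
// inspection purposes; it is not how a validator evaluates it.
function flatten(schema) {
    var result = {properties: {}, required: []};
    var parts = (schema.allOf || []).map(function (part) {
        if (part.$ref && !registry[part.$ref]) {
            throw new Error('unknown schema: ' + part.$ref);
        }
        return flatten(part.$ref ? registry[part.$ref] : part);
    });
    parts.push(schema);
    parts.forEach(function (part) {
        if (part.type) {
            result.type = part.type;
        }
        Object.keys(part.properties || {}).forEach(function (key) {
            // Later parts add keywords to a property without dropping earlier ones.
            result.properties[key] = Object.assign({}, result.properties[key], part.properties[key]);
        });
        (part.required || []).forEach(function (key) {
            if (result.required.indexOf(key) === -1) {
                result.required.push(key);
            }
        });
    });
    return result;
}

var userView = flatten(userInputSchema);
var storedView = flatten(storedSchema);

console.log(userView.properties.name);
console.log(storedView.required);
```

In the flattened user view, name keeps its string type and gains readOnly, while the stored view requires both name and id.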

Although, as I said above, I'm not sure storedSchema should be necessary. What you end up with is one "owner" schema that describes the data format (as served, and as editable by the data owner), and you have a secondary schema that extends that to declare readOnly on an additional property.
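
As far as I know, readOnly is a hint rather than a constraint, so a validator will not normally reject an update just because it changes a readOnly property; the application has to compare the update against the stored record itself. Below is a minimal sketch of such a check. The readOnlyViolations helper is hypothetical, and the two schemas are written out flat (without allOf) for brevity.

```javascript
// Flat versions of the owner and user schemas, written out by hand.
var ownerSchema = {
    properties: {
        id: {type: 'integer', readOnly: true},
        name: {type: 'string'},
        description: {type: 'string'}
    }
};

var userSchema = {
    properties: {
        id: {type: 'integer', readOnly: true},
        name: {type: 'string', readOnly: true},
        description: {type: 'string'}
    }
};

// Hypothetical helper: lists the readOnly properties whose value in `update`
// differs from the stored record. Uses strict equality, so it only suits
// primitive values such as the numbers and strings used here.
function readOnlyViolations(schema, stored, update) {
    return Object.keys(update).filter(function (key) {
        var rule = schema.properties[key];
        return Boolean(rule && rule.readOnly && update[key] !== stored[key]);
    });
}

var stored = {id: 123, name: 'orange', description: 'delicious'};
var update = {id: 123, name: 'apple', description: 'crunchy'};

// The owner may rename the item; the unchanged id is not a violation.
var ownerViolations = readOnlyViolations(ownerSchema, stored, update);

// An ordinary user may not rename it, so `name` is reported.
var userViolations = readOnlyViolations(userSchema, stored, update);

console.log(ownerViolations);
console.log(userViolations);
```

Echoing back an unchanged readOnly value is allowed here, which keeps round-tripping a fetched record simple; a stricter policy could reject any update that mentions a readOnly key at all.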

Answered by cloudfeet, Oct 01 '22 at 01:10