One of the benefits of XML is being able to validate a document against an XSD. YAML doesn't have this feature, so how can I validate that the YAML document I open is in the format expected by my application?
Validate the YAML document against a corresponding schema to check basic data types. Custom validations, such as IP addresses or constrained strings, can be added to the schema as well.
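As a rough illustration of combining a basic type check with a custom IP address check, here is a minimal sketch using only the standard library. It assumes the YAML has already been parsed into a Python mapping (for example with PyYAML's `yaml.safe_load`); the `validate_config` function and the `name` and `host` keys are hypothetical and not taken from any particular schema library.

```python
import ipaddress


def validate_config(doc):
    """Return a list of error messages; an empty list means the document passed."""
    if not isinstance(doc, dict):
        return ["top level must be a mapping"]

    errors = []

    # Basic data type check (hypothetical 'name' key).
    if not isinstance(doc.get("name"), str):
        errors.append("name: expected a string")

    # Custom check (hypothetical 'host' key): must parse as an IPv4 or IPv6 address.
    host = doc.get("host")
    if not isinstance(host, str):
        errors.append("host: expected a string")
    else:
        try:
            ipaddress.ip_address(host)
        except ValueError:
            errors.append(f"host: {host!r} is not a valid IP address")

    return errors
```

Calling `validate_config({"name": "web-1", "host": "192.168.0.10"})` returns an empty list, while an invalid address produces one message for the `host` key.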
yaml", "r") as file:
    for line_number, line in enumerate(file, start=1):
        if class_indentation in line:
            print(f"Indentation for '{class_indentation}' is valid: {line_number}")
            break
    else:
        print(f"Indentation for '{class_indentation}' is NOT valid: {line_number}")
print("Search completed.")
YAML Validator works on Windows, macOS, and Linux, and in Chrome, Firefox, Edge, and Safari. This YAML linter helps developers who work with YAML data test and verify their documents.
Given that JSON and YAML are pretty similar beasts, you could make use of JSON-Schema to validate a sizable subset of YAML. Here's a code snippet (you'll need PyYAML and jsonschema installed):
from jsonschema import validate
import yaml

schema = """
type: object
properties:
  testing:
    type: array
    items:
      enum:
        - this
        - is
        - a
        - test
"""

good_instance = """
testing: ['this', 'is', 'a', 'test']
"""

# safe_load is used because plain yaml.load() requires an explicit Loader in current PyYAML
validate(yaml.safe_load(good_instance), yaml.safe_load(schema))  # passes

# Now let's try a bad instance...
bad_instance = """
testing: ['this', 'is', 'a', 'bad', 'test']
"""

validate(yaml.safe_load(bad_instance), yaml.safe_load(schema))

# Fails with:
# ValidationError: 'bad' is not one of ['this', 'is', 'a', 'test']
#
# Failed validating 'enum' in schema['properties']['testing']['items']:
#     {'enum': ['this', 'is', 'a', 'test']}
#
# On instance['testing'][3]:
#     'bad'
One problem with this is that if your schema spans multiple files and you use "$ref" to reference the other files, then those other files will need to be JSON, I think. But there are probably ways around that. In my own project, I'm playing with specifying the schema using JSON files whilst the instances are YAML.
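The mixed setup mentioned above (schema kept as JSON, instances written in YAML) might look roughly like the sketch below. It needs PyYAML and jsonschema installed. The `SCHEMA_JSON` string stands in for the contents of a schema file, and the `is_valid` helper is a hypothetical name introduced only for illustration.

```python
import json

import yaml  # PyYAML
from jsonschema import ValidationError, validate

# Hypothetical schema, written as JSON text to stand in for a schema file on disk.
SCHEMA_JSON = """
{
  "type": "object",
  "properties": {
    "testing": {
      "type": "array",
      "items": {"enum": ["this", "is", "a", "test"]}
    }
  },
  "required": ["testing"]
}
"""


def is_valid(yaml_text, schema_json=SCHEMA_JSON):
    """Parse the schema as JSON and the instance as YAML, then validate."""
    schema = json.loads(schema_json)
    instance = yaml.safe_load(yaml_text)
    try:
        validate(instance, schema)
    except ValidationError:
        return False
    return True
```

Both parsers produce ordinary Python dicts and lists, which is why the validator does not care which text format each side started in.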
I find Cerberus to be very reliable with great documentation and straightforward to use.
Here is a basic implementation example:
my_yaml.yaml:

name: 'my_name'
date: 2017-10-01
metrics:
  percentage:
    value: 87
    trend: stable

Defining the validation schema in schema.py:
{
    'name': {
        'required': True,
        'type': 'string'
    },
    'date': {
        'required': True,
        'type': 'date'
    },
    'metrics': {
        'required': True,
        'type': 'dict',
        'schema': {
            'percentage': {
                'required': True,
                'type': 'dict',
                'schema': {
                    'value': {
                        'required': True,
                        'type': 'number',
                        'min': 0,
                        'max': 100
                    },
                    'trend': {
                        'type': 'string',
                        'nullable': True,
                        'regex': '(?i)^(down|equal|up)$'
                    }
                }
            }
        }
    }
}
Using PyYAML to load a YAML document:
import ast

import yaml
from cerberus import Validator


def load_doc():
    # safe_load avoids constructing arbitrary Python objects from the YAML
    with open('./my_yaml.yaml', 'r') as stream:
        return yaml.safe_load(stream)


# Now, validating the YAML file is straightforward:
# schema.py holds a plain dict literal, so literal_eval can parse it without eval()
with open('./schema.py', 'r') as schema_file:
    schema = ast.literal_eval(schema_file.read())

v = Validator(schema)
doc = load_doc()
print(v.validate(doc, schema))
print(v.errors)
Keep in mind that Cerberus is a format-agnostic data validation tool: it validates Python data structures, so it can be used with formats other than YAML, such as JSON or XML, once they have been parsed.