Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python lxml: how to validate file using xml schema file defined in the xml file

Tags:

python

xml

lxml

xsd

I'm using Python lxml library to parse xml files. I need to validate a xml file against a xml schema. lxml supports XML schema validation, but the xml schema filepath/content must be provided (information available here: http://lxml.de/validation.html). However, I don't know the xml schema filepath beforehand, it should be parsed from the xml file header tags. I can't find a way to access these tags.

lxml supports this use case somehow?

like image 336
Alan Evangelista Avatar asked Feb 15 '26 03:02

Alan Evangelista


1 Answers

If the schema is linked using attributes on the root element, in the http://www.w3.org/2001/XMLSchema-instance namespace, you can retrieve these with lxml by prefixing the attribute name with the namespace URL in curly braces:

XMLSchemaNamespace = '{http://www.w3.org/2001/XMLSchema-instance}'
document = lxml.parse(xmlfile)
schemaLink = document.get(XMLSchemaNamespace + 'schemaLocation')
if schemaLink is None:
    schemaLink = document.get(XMLSchemaNamespace + 'noNamespaceSchemaLocation')

and then use a URL library to load the schema from the location referred. See lxml namespace handling for more information.

like image 88
Martijn Pieters Avatar answered Feb 16 '26 15:02

Martijn Pieters



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!