
In Java, how do I parse an xml schema (xsd) to learn what's valid at a given element?

Tags: java, xsd

I'd like to be able to read in an XML schema (i.e. an XSD) and, as I walk through it, know which attributes, child elements, and values are valid.

For example, let's say I have an xsd that this xml will validate against:

<root>
  <element-a type="something">
    <element-b>blah</element-b>
    <element-c>blahblah</element-c>
  </element-a>
</root>

I've tinkered with several libraries and I can confidently get <root> as the root element. Beyond that I'm lost.

Given an element I need to know what child elements are required or allowed, attributes, facets, choices, etc. Using the above example I'd want to know that element-a has an attribute type and may have children element-b and element-c...or must have children element-b and element-c...or must have one of each...you get the picture I hope.

I've looked at numerous libraries such as XSOM, Eclipse XSD, Apache XmlSchema and found they're all short on good sample code. My search of the Internet has also been unsuccessful.

Does anyone know of a good example or even a book that demonstrates how to go through an XML schema and find out what would be valid options at a given point in a validated XML document?

clarification

I'm not looking to validate a document; rather, I'd like to know the options at a given point to assist in creating or editing a document. If I know "I am here" in a document, I'd like to determine what I can do at that point: "Insert one of element A, B, or C" or "attach attribute 'description'".

Paul asked Nov 26 '11 23:11


3 Answers

This is a good question. Although it is old, I did not find an acceptable answer. The existing libraries I am aware of (XSOM, Apache XmlSchema) are designed as object models. Their implementors did not intend to provide utility methods for this, so you should consider implementing them yourself on top of the provided object model.

Let's see how to query context-specific elements using Apache XmlSchema.

You can use their tutorial as a starting point. In addition, the Apache CXF framework provides the XmlSchemaUtils class, which contains lots of handy code examples.

First of all, read the XmlSchemaCollection as illustrated by the library's tutorial:

XmlSchemaCollection xmlSchemaCollection = new XmlSchemaCollection();
xmlSchemaCollection.read(inputSource, new ValidationEventHandler());

Now, XML Schema defines two kinds of data types:

  • Simple types
  • Complex types

Simple types are represented by the XmlSchemaSimpleType class. Handling them is easy. Read the documentation: https://ws.apache.org/commons/XmlSchema/apidocs/org/apache/ws/commons/schema/XmlSchemaSimpleType.html. But let's see how to handle complex types. Let's start with a simple method:

@Override
public List<QName> getChildElementNames(QName parentElementName) {
    XmlSchemaElement element = xmlSchemaCollection.getElementByQName(parentElementName);
    XmlSchemaType type = element != null ? element.getSchemaType() : null;

    List<QName> result = new LinkedList<>();
    if (type instanceof XmlSchemaComplexType) {
        addElementNames(result, (XmlSchemaComplexType) type);
    }
    return result;
}

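An XmlSchemaComplexType may represent either a standalone type or an extension of a base type. For resolving the base type, see the public static QName getBaseType(XmlSchemaComplexType type) method of the XmlSchemaUtils class. Note that the snippet below assumes a local getBaseType helper that returns the resolved XmlSchemaComplexType rather than a QName.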

private void addElementNames(List<QName> result, XmlSchemaComplexType type) {
    XmlSchemaComplexType baseType = getBaseType(type);
    XmlSchemaParticle particle = baseType != null ? baseType.getParticle() : type.getParticle();

    addElementNames(result, particle);
}

When you handle an XmlSchemaParticle, remember that it has several subclasses, each of which needs its own branch. See: https://ws.apache.org/commons/XmlSchema/apidocs/org/apache/ws/commons/schema/XmlSchemaParticle.html

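private void addElementNames(List<QName> result, XmlSchemaParticle particle) {
    if (particle instanceof XmlSchemaAny) {
        // Wildcard (xs:any): there is no fixed element name to report.
    } else if (particle instanceof XmlSchemaElement) {
        // A single element declaration: add its QName to the result.
    } else if (particle instanceof XmlSchemaGroupBase) {
        // xs:sequence, xs:choice or xs:all: recurse into each member particle.
    } else if (particle instanceof XmlSchemaGroupRef) {
        // Reference to a named group: resolve it and recurse into its particle.
    }
}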
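To make the dispatch above concrete without depending on a particular Apache XmlSchema version, here is a minimal sketch that does the same kind of walk over the schema file itself, using only the JDK's DOM parser. This is a swapped-in technique, not the Apache XmlSchema API: the class XsdChildSketch, its methods, and the sample schema are all hypothetical, and the sketch only handles inline anonymous complex types (no named types, group references, extensions, imports, or wildcards).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of Apache XmlSchema or CXF.
class XsdChildSketch {
    static final String XS = "http://www.w3.org/2001/XMLSchema";

    // Assumed sample schema matching the <root>/<element-a> document in the question.
    static final String SAMPLE =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='root'><xs:complexType><xs:sequence>"
      + "<xs:element name='element-a'><xs:complexType>"
      + "<xs:sequence>"
      + "<xs:element name='element-b' type='xs:string'/>"
      + "<xs:element name='element-c' type='xs:string'/>"
      + "</xs:sequence>"
      + "<xs:attribute name='type' type='xs:string'/>"
      + "</xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Parses schema text into a namespace-aware DOM tree. */
    static Document parse(String xsd) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xsd)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse schema", e);
        }
    }

    /** Returns the direct children of parent that belong to the XSD namespace. */
    static List<Element> xsChildren(Element parent) {
        List<Element> out = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && XS.equals(n.getNamespaceURI())) {
                out.add((Element) n);
            }
        }
        return out;
    }

    /** Collects element names from a particle, recursing into sequence/choice/all. */
    static void addElementNames(List<String> result, Element particle) {
        String kind = particle.getLocalName();
        if ("element".equals(kind)) {
            result.add(particle.hasAttribute("name")
                    ? particle.getAttribute("name")
                    : particle.getAttribute("ref"));
        } else if ("sequence".equals(kind) || "choice".equals(kind) || "all".equals(kind)) {
            for (Element member : xsChildren(particle)) {
                addElementNames(result, member);
            }
        }
        // xs:any, xs:group references, xs:attribute and annotations are ignored here.
    }

    /** Lists child element names of the first declaration named elementName. */
    static List<String> childElementNames(Document schema, String elementName) {
        List<String> result = new ArrayList<>();
        NodeList decls = schema.getElementsByTagNameNS(XS, "element");
        for (int i = 0; i < decls.getLength(); i++) {
            Element decl = (Element) decls.item(i);
            if (!elementName.equals(decl.getAttribute("name"))) continue;
            for (Element type : xsChildren(decl)) {
                if (!"complexType".equals(type.getLocalName())) continue;
                for (Element particle : xsChildren(type)) {
                    addElementNames(result, particle);
                }
            }
            break; // only the first matching declaration is inspected
        }
        return result;
    }

    public static void main(String[] args) {
        Document schema = parse(SAMPLE);
        System.out.println(childElementNames(schema, "root"));      // prints [element-a]
        System.out.println(childElementNames(schema, "element-a")); // prints [element-b, element-c]
    }
}
```

The three-way split (element, group, everything else) corresponds to the XmlSchemaElement and XmlSchemaGroupBase branches above. A real implementation on top of the object model would also need the XmlSchemaAny and XmlSchemaGroupRef cases.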

The other thing to bear in mind is that elements can be either abstract or concrete. Again, the JavaDocs are the best guidance.
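As a rough illustration of why the abstract/concrete distinction matters when offering "what can I insert here": an abstract element cannot appear in a document itself, so an editor would offer the concrete members of its substitution group instead. The sketch below again uses plain JDK DOM rather than Apache XmlSchema. The class AbstractElementSketch, the sample schema, and the simplifications (prefixes are stripped instead of resolved, and substitution is not followed transitively) are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of any schema library.
class AbstractElementSketch {
    static final String XS = "http://www.w3.org/2001/XMLSchema";

    // Assumed sample: "shape" is abstract, "circle" and "square" substitute for it.
    static final String SAMPLE =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='shape' abstract='true'/>"
      + "<xs:element name='circle' substitutionGroup='shape'/>"
      + "<xs:element name='square' substitutionGroup='shape'/>"
      + "<xs:element name='note'/>"
      + "</xs:schema>";

    /** Parses schema text into a namespace-aware DOM tree. */
    static Document parse(String xsd) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xsd)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse schema", e);
        }
    }

    /** Strips a namespace prefix; a real implementation would resolve it instead. */
    static String localPart(String qname) {
        int colon = qname.indexOf(':');
        return colon < 0 ? qname : qname.substring(colon + 1);
    }

    /**
     * Returns the concrete element names that may stand in for the given name:
     * the element itself if it is not abstract, plus every non-abstract element
     * that names it as its substitutionGroup head (direct members only).
     */
    static List<String> concreteCandidates(Document schema, String name) {
        List<String> out = new ArrayList<>();
        NodeList decls = schema.getElementsByTagNameNS(XS, "element");
        for (int i = 0; i < decls.getLength(); i++) {
            Element decl = (Element) decls.item(i);
            String flag = decl.getAttribute("abstract");
            if ("true".equals(flag) || "1".equals(flag)) continue; // abstract: never offered
            boolean isSelf = name.equals(decl.getAttribute("name"));
            boolean isMember = name.equals(localPart(decl.getAttribute("substitutionGroup")));
            if (isSelf || isMember) {
                out.add(decl.getAttribute("name"));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Document schema = parse(SAMPLE);
        System.out.println(concreteCandidates(schema, "shape")); // prints [circle, square]
        System.out.println(concreteCandidates(schema, "note"));  // prints [note]
    }
}
```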

shapiy answered Nov 09 '22 10:11


Many of the solutions for validating XML in Java use the JAXB API. There's an extensive tutorial available here. The basic recipe for doing what you're looking for with JAXB is as follows:

  1. Obtain or create the XML schema to validate against.
  2. Generate Java classes to bind the XML to using xjc, the JAXB compiler.
  3. Write Java code to:
    1. Open the XML content as an input stream.
    2. Create a JAXBContext and Unmarshaller.
    3. Pass the input stream to the Unmarshaller's unmarshal method.

The parts of the tutorial you can read for this are:

  1. Hello, world
  2. Unmarshalling XML

Paul Morie answered Nov 09 '22 11:11


I see you have tried Eclipse XSD. Have you tried the Eclipse Modeling Framework (EMF)? You can follow these guides:

  • Generating an EMF Model using XML Schema (XSD)
  • Create a dynamic instance from your metamodel (3.1 With the dynamic instance creation tool)

This is for exploring the XSD. You can create a dynamic instance of the root element, then right-click the element and create a child element. There you will see what the possible child elements are, and so on.

As for saving the created EMF model to XML that complies with the XSD: I would have to look it up. I think you can use JAXB for that (How to use EMF to read XML file?).


Some refs:

EMF: Eclipse Modeling Framework, 2nd Edition (written by creators)
Eclipse Modeling Framework (EMF)
Discover the Eclipse Modeling Framework (EMF) and Its Dynamic Capabilities
Creating Dynamic EMF Models From XSDs and Loading its Instances From XML as SDOs

user802421 answered Nov 09 '22 10:11