
Handling different XSD versions when binding XML to Java classes

I want to expose an object that represents an XML file to my users. Many libraries can do this (XMLBeans, JAXB, ...), and that works fine until I have to support different versions of that XML file (the evolving schema problem) for backward compatibility.

I want this to be completely transparent to my users, meaning it is my system that needs to decide which version of the XML file to use at a given point in time.

Here's some short pseudocode for what I want to achieve:

public VersionIndependantObject getVersionSpecificXmlBindedObject() {
    // Determine the XSD version and bind the XML file to a Java object
    return javaObject;
}

VersionIndependantObject - this is an object representation of the XML file found on the system at that time (it could be v1, v2, ...).

Is there a way of doing this with one of the existing libraries for XML -> Java object binding?

esper asked Dec 17 '13 13:12


1 Answer

XML Schema is Backwards Compatible

XML schemas often evolve in a way that keeps them backwards compatible. This is done by adding only new attributes and elements that are optional, which means an old XML document will still be valid against the new schema. When this strategy is used you simply need to regenerate the model against the new XML schema.
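
To illustrate the point about optional additions, here is a minimal sketch that validates an old and a new document against a newer schema using the JDK's built-in javax.xml.validation API. The schema, the element names (order, id, note) and the class name CompatibilityCheck are all made up for the example; they are not from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class CompatibilityCheck {

    // Hypothetical "version 2" schema: the note element is new and optional (minOccurs='0').
    static final String SCHEMA_V2 =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='order'><xs:complexType><xs:sequence>"
        + "<xs:element name='id' type='xs:int'/>"
        + "<xs:element name='note' type='xs:string' minOccurs='0'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    /** Returns true if the given XML document validates against SCHEMA_V2. */
    static boolean isValidAgainstV2(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(SCHEMA_V2)));
        } catch (SAXException e) {
            // The schema itself failed to compile; that is a bug in this sketch, not in the input.
            throw new IllegalStateException("Schema could not be compiled", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // malformed XML or a schema violation
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A document written for the older schema (no note element).
        System.out.println(isValidAgainstV2("<order><id>1</id></order>"));
        // A document that uses the newly added optional element.
        System.out.println(isValidAgainstV2("<order><id>1</id><note>rush</note></order>"));
    }
}
```

Both documents should be accepted here, because the only difference between the imagined old schema and the new one is an optional element. A document that drops the required id element would be rejected.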

XML Schema is Not Backwards Compatible

If the schema is not backwards compatible then things are more complicated.

Generating the Model

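You could generate a model for each version of the XML schema. If the namespace doesn't change between versions, the generated classes would end up in the same package, so you will need to override the default package name for each version (for example with the -p option of xjc).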

Unmarshalling

Then, instead of unmarshalling the XML directly, you could parse it with a StAX parser. You can use the XMLStreamReader to get the version attribute and determine the model to be used, and then unmarshal the XMLStreamReader into that model.
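
The detection step could look roughly like the following sketch, which uses only the StAX API that ships with the JDK. It assumes the version is carried in a "version" attribute on the root element; the class name VersionSniffer and the sample documents are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class VersionSniffer {

    /** Returns the value of the root element's "version" attribute, or null if it has none. */
    static String detectVersion(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                // Advance from the start of the document to the root start element.
                reader.nextTag();
                // A null namespace URI means the attribute's namespace is not checked.
                return reader.getAttributeValue(null, "version");
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not read the root element", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(detectVersion("<config version=\"2\"><item/></config>"));
    }
}
```

This sketch closes the reader after reading the attribute to stay self-contained. In the approach described above you would instead keep the reader open, still positioned on the root element, and hand it to JAXB via Unmarshaller.unmarshal(XMLStreamReader) using the JAXBContext for the model that matches the detected version.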


UPDATE

I've already done that (although my logic for choosing which model to use when unmarshalling is a bit different). The problem is transparency towards the user (the return type). It's not until runtime that I know which model will be returned. How would you handle that?

You need either a generic return type (i.e. Object) that can hold any of the generated models, or, as in your question, version-specific methods that each return their corresponding generated model. I would also investigate which schema evolution strategy is being used. Many people try to stay backwards compatible (since it helps their own processing too).
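
A rough sketch of the two options, to make the trade-off concrete. OrderV1 and OrderV2 are hypothetical stand-ins for classes generated from two schema versions, and the loader simply constructs them instead of unmarshalling real XML.

```java
final class ModelLoader {

    // Hypothetical stand-ins for the models generated from schema v1 and v2.
    static final class OrderV1 { final int id = 1; }
    static final class OrderV2 { final int id = 2; final String note = "rush"; }

    /** Option 1: a single method with a generic return type. */
    static Object load(String version) {
        switch (version) {
            case "1": return new OrderV1();
            case "2": return new OrderV2();
            default: throw new IllegalArgumentException("Unsupported version: " + version);
        }
    }

    /** Option 2: version-specific methods, each returning its own generated model. */
    static OrderV1 loadV1() { return new OrderV1(); }
    static OrderV2 loadV2() { return new OrderV2(); }

    /** With option 1 the caller has to inspect the runtime type before using the model. */
    static String describe(Object model) {
        if (model instanceof OrderV2) {
            OrderV2 v2 = (OrderV2) model;
            return "v2 order " + v2.id + " (" + v2.note + ")";
        }
        if (model instanceof OrderV1) {
            return "v1 order " + ((OrderV1) model).id;
        }
        throw new IllegalArgumentException("Unknown model type: " + model.getClass());
    }

    public static void main(String[] args) {
        System.out.println(describe(load("1")));
        System.out.println(describe(load("2")));
    }
}
```

Option 1 keeps the API to a single method but pushes the type check onto the caller. Option 2 is type-safe but requires the caller to know the version up front.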

bdoughan answered Sep 28 '22 13:09