Parsing XML with references to previous tags, and with children corresponding to subtypes of some class

Tags: java, xml-parsing, xml-deserialization, xstream

I have to deal with (a variation of) the following scenario. My model classes are:

class Car {
    String brand;
    Engine engine;
}

abstract class Engine {
}

class V12Engine extends Engine {
    int horsePowers;
}

class V6Engine extends Engine {
    String fuelType;
}

And I have to deserialize (no need for serialization support ATM) the following input:

<list>

    <brand id="1">
        Volvo
    </brand>

    <car>
        <brand>BMW</brand>
        <v12engine horsePowers="300" />
    </car>

    <car>
        <brand refId="1" />
        <v6engine fuel="unleaded" />
    </car>

</list>

What I've tried / issues:

I've tried using XStream, but it expects me to write tags such as:

<engine class="cars.V12Engine">
    <horsePowers>300</horsePowers>
</engine>

etc. (I don't want an <engine>-tag, I want a <v6engine>-tag or a <v12engine>-tag.)

Also, I need to be able to refer back to "predefined" brands based on identifiers, as shown with the brand-id above (for instance by maintaining a Map<Integer, String> predefinedBrands during the deserialization). I don't know if XStream is well suited for such a scenario.

I realize that this could be done "manually" with a push or pull parser (such as SAX or StAX) or a DOM library. I would however prefer to have some more automation. Ideally, I should be able to add classes (such as new Engines) and start using them in the XML right away. (XStream is by no means a requirement; the most elegant solution wins the bounty.)


asked Dec 27 '12 11:12

aioobe

1 Answer

JAXB (javax.xml.bind) can do everything you're after, though some bits are easier than others. For the sake of simplicity I'm going to assume that all your XML files have a namespace - it's trickier if they don't but can be worked around using the StAX APIs.

<list xmlns="http://example.com/cars">

    <brand id="1">
        Volvo
    </brand>

    <car>
        <brand>BMW</brand>
        <v12engine horsePowers="300" />
    </car>

    <car>
        <brand refId="1" />
        <v6engine fuel="unleaded" />
    </car>

</list>

and assume a corresponding package-info.java of

@XmlSchema(namespace = "http://example.com/cars",
           elementFormDefault = XmlNsForm.QUALIFIED)
package cars;
import javax.xml.bind.annotation.*;
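
For files that have no namespace, one possible shape for the StAX workaround mentioned above is to wrap the reader so that it reports the expected namespace for every unqualified element. This is only a sketch: DefaultNamespaceReader and NamespaceDemo are hypothetical names, and it assumes the JAXB implementation asks the reader for getNamespaceURI() (an implementation that consults getName() instead would need that method overridden too).

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.util.StreamReaderDelegate;

class NamespaceDemo {
    // Hypothetical helper: reports a fixed namespace for elements that have none.
    static class DefaultNamespaceReader extends StreamReaderDelegate {
        private final String namespace;

        DefaultNamespaceReader(XMLStreamReader reader, String namespace) {
            super(reader);
            this.namespace = namespace;
        }

        @Override
        public String getNamespaceURI() {
            String uri = super.getNamespaceURI();
            // parsers may report "no namespace" as either null or ""
            return (uri == null || uri.isEmpty()) ? namespace : uri;
        }
    }

    // Returns the namespace that the wrapped reader reports for the root element.
    public static String rootNamespace(String xml) {
        try {
            XMLStreamReader raw = XMLInputFactory.newFactory()
                    .createXMLStreamReader(new StringReader(xml));
            XMLStreamReader xsr = new DefaultNamespaceReader(raw, "http://example.com/cars");
            xsr.nextTag(); // move to the root element
            String uri = xsr.getNamespaceURI();
            xsr.close();
            return uri;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://example.com/cars
        System.out.println(rootNamespace("<list><car/></list>"));
    }
}
```

The wrapped reader would then be handed to Unmarshaller.unmarshal(XMLStreamReader) in place of the raw one. Elements that already carry a namespace are left untouched.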

Engine type by element name

This is simple, using @XmlElementRef:

package cars;
import javax.xml.bind.annotation.*;

@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
public class Car {
    String brand;
    @XmlElementRef
    Engine engine;
}

@XmlRootElement
abstract class Engine {
}

@XmlRootElement(name = "v12engine")
@XmlAccessorType(XmlAccessType.FIELD)
class V12Engine extends Engine {
    @XmlAttribute
    int horsePowers;
}

@XmlRootElement(name = "v6engine")
@XmlAccessorType(XmlAccessType.FIELD)
class V6Engine extends Engine {
    // override the default attribute name, which would be fuelType
    @XmlAttribute(name = "fuel")
    String fuelType;
}

The various types of Engine are all annotated @XmlRootElement and marked with appropriate element names. At unmarshalling time the element name found in the XML is used to decide which of the Engine subclasses to use. So given XML of

<car xmlns="http://example.com/cars">
    <brand>BMW</brand>
    <v12engine horsePowers="300" />
</car>

and unmarshalling code

JAXBContext ctx = JAXBContext.newInstance(Car.class, V6Engine.class, V12Engine.class);
Unmarshaller um = ctx.createUnmarshaller();
Car c = (Car)um.unmarshal(new File("file.xml"));

assert "BMW".equals(c.brand);
assert c.engine instanceof V12Engine;
assert ((V12Engine)c.engine).horsePowers == 300;

To add a new type of Engine simply create the new subclass, annotate it with @XmlRootElement as appropriate, and add this new class to the list passed to JAXBContext.newInstance().

Cross-references for brands

JAXB has a cross-referencing mechanism based on @XmlID and @XmlIDREF, but these require that the ID attribute be a valid XML ID, i.e. an XML name. An XML name cannot start with a digit, so an all-digit value such as id="1" does not qualify. But it's not too difficult to keep track of the cross references yourself, as long as you don't require "forward" references (i.e. a <car> that refers to a <brand> that has not yet been "declared").

The first step is to define a JAXB class to represent the <brand>:

package cars;

import javax.xml.bind.annotation.*;

@XmlRootElement
public class Brand {
  @XmlValue // i.e. the simple content of the <brand> element
  String name;

  // optional id and refId attributes (optional because they're
  // Integer rather than int)
  @XmlAttribute
  Integer id;

  @XmlAttribute
  Integer refId;
}

Now we need a "type adapter" to convert between the Brand object and the String required by Car, and to maintain the id/ref mapping:

package cars;

import javax.xml.bind.annotation.adapters.*;
import java.util.*;

public class BrandAdapter extends XmlAdapter<Brand, String> {
  private final Map<Integer, Brand> brandCache = new HashMap<Integer, Brand>();

  @Override
  public Brand marshal(String s) {
    // serialization is not needed here, so this is left unimplemented
    return null;
  }

  @Override
  public String unmarshal(Brand b) {
    if(b.id != null) {
      // this is a <brand id="..."> - cache it
      brandCache.put(b.id, b);
    }
    if(b.refId != null) {
      // this is a <brand refId="..."> - pull it from the cache
      b = brandCache.get(b.refId);
    }

    // and extract the name; b is null if the refId was never declared
    return (b == null || b.name == null) ? null : b.name.trim();
  }
}
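
The id/refId bookkeeping does not depend on JAXB, so its behaviour can be checked in isolation. The sketch below restates it in plain Java under a hypothetical BrandResolver class (not part of the answer's code), mainly to show what happens with a "forward" reference: a refId that has not been declared yet resolves to null.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, JAXB-free restatement of the adapter's id/refId bookkeeping.
class BrandResolver {
    private final Map<Integer, String> namesById = new HashMap<>();

    // id declares a brand, refId looks one up, and the element text is trimmed.
    public String resolve(Integer id, Integer refId, String text) {
        String name = (text == null) ? null : text.trim();
        if (id != null) {
            namesById.put(id, name);
        }
        if (refId != null) {
            name = namesById.get(refId); // null if that id has not been seen yet
        }
        return name;
    }

    public static void main(String[] args) {
        BrandResolver resolver = new BrandResolver();
        // forward reference, nothing declared yet: prints null
        System.out.println(resolver.resolve(null, 2, null));
        // <brand id="1"> with surrounding whitespace, as in the input file
        resolver.resolve(1, null, "\n        Volvo\n    ");
        // <brand refId="1" />: prints Volvo
        System.out.println(resolver.resolve(null, 1, null));
        // plain <brand>BMW</brand>: prints BMW
        System.out.println(resolver.resolve(null, null, "BMW"));
    }
}
```

If forward references had to be supported, one option would be to collect the unresolved refIds and patch them up after the whole file has been read, at the cost of a second pass over the Car objects.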

We link the adapter to the brand field of Car using another annotation:

@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
public class Car {
    @XmlJavaTypeAdapter(BrandAdapter.class)
    String brand;
    @XmlElementRef
    Engine engine;
}

The final part of the puzzle is to ensure that <brand> elements found at the top level get saved in the cache. Here is a complete example:

package cars;

import javax.xml.bind.*;
import java.io.File;
import java.util.*;

import javax.xml.stream.*;
import javax.xml.transform.stream.StreamSource;

public class Main {
  public static void main(String[] argv) throws Exception {
    List<Car> cars = new ArrayList<Car>();

    JAXBContext ctx = JAXBContext.newInstance(Car.class, V12Engine.class, V6Engine.class, Brand.class);
    Unmarshaller um = ctx.createUnmarshaller();

    // create an adapter, and register it with the unmarshaller
    BrandAdapter ba = new BrandAdapter();
    um.setAdapter(BrandAdapter.class, ba);

    // create a StAX XMLStreamReader to read the XML file
    XMLInputFactory xif = XMLInputFactory.newFactory();
    XMLStreamReader xsr = xif.createXMLStreamReader(new StreamSource(new File("file.xml")));

    xsr.nextTag(); // root <list> element
    xsr.nextTag(); // first <brand> or <car> child

    // read each <brand>/<car> in turn
    while(xsr.getEventType() == XMLStreamConstants.START_ELEMENT) {
      Object obj = um.unmarshal(xsr);

      // unmarshal from an XMLStreamReader leaves the reader pointing at
      // the event *after* the closing tag of the element we read.  If there
      // was a text node between the closing tag of this element and the opening
      // tag of the next then we will need to skip it.
      if(xsr.getEventType() != XMLStreamConstants.START_ELEMENT && xsr.getEventType() != XMLStreamConstants.END_ELEMENT) xsr.nextTag();

      if(obj instanceof Brand) {
        // top-level <brand> - hand it to the BrandAdapter so it can be
        // cached if necessary
        ba.unmarshal((Brand)obj);
      }
      if(obj instanceof Car) {
        cars.add((Car)obj);
      }
    }
    xsr.close();

    // at this point, cars contains all the Car objects we found, with
    // any <brand> refIds resolved.
  }
}
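
The cursor handling in that loop can be tried out without JAXB. In the sketch below, TopLevelWalk and its skipElement helper are hypothetical; skipElement stands in for um.unmarshal(xsr) by consuming one element and leaving the reader on the event after its closing tag, which is the position the Unmarshaller documentation describes. The loop then only records the name of each top-level child.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class TopLevelWalk {
    // Hypothetical stand-in for um.unmarshal(xsr): consumes the current element
    // and leaves the reader on the event after its closing tag.
    static void skipElement(XMLStreamReader xsr) throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int event = xsr.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
        xsr.next(); // step past the closing tag
    }

    // Returns the local names of the root element's direct children, in order.
    public static List<String> childNames(String xml) {
        try {
            XMLStreamReader xsr = XMLInputFactory.newFactory()
                    .createXMLStreamReader(new StringReader(xml));
            List<String> names = new ArrayList<>();
            xsr.nextTag(); // root element
            xsr.nextTag(); // first child, or the root's end tag if it has none
            while (xsr.getEventType() == XMLStreamConstants.START_ELEMENT) {
                names.add(xsr.getLocalName());
                skipElement(xsr);
                // skip any whitespace between this element and the next tag
                if (xsr.getEventType() != XMLStreamConstants.START_ELEMENT
                        && xsr.getEventType() != XMLStreamConstants.END_ELEMENT) {
                    xsr.nextTag();
                }
            }
            xsr.close();
            return names;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<list>\n"
                + "  <brand id=\"1\">Volvo</brand>\n"
                + "  <car><brand>BMW</brand></car>\n"
                + "  <car><brand refId=\"1\"/></car>\n"
                + "</list>";
        System.out.println(childNames(xml)); // prints [brand, car, car]
    }
}
```

The loop ends when the reader reaches the closing tag of the root element, because that event is an END_ELEMENT rather than a START_ELEMENT.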


answered Oct 22 '22 23:10

Ian Roberts
