
StAX - Setting the version and encoding using XMLStreamWriter

I am using StAX to create XML files and then validating each file against an XSD.

I am getting an error while creating the XML file:

javax.xml.stream.XMLStreamException: Underlying stream encoding 'Cp1252' and input paramter for writeStartDocument() method 'UTF-8' do not match.
        at com.sun.xml.internal.stream.writers.XMLStreamWriterImpl.writeStartDocument(XMLStreamWriterImpl.java:1182)

Here is the code snippet:

XMLOutputFactory xof = XMLOutputFactory.newInstance();

try {
  XMLStreamWriter xtw = xof.createXMLStreamWriter(new FileWriter(fileName));
  xtw.writeStartDocument("UTF-8", "1.0");
} catch (XMLStreamException e) {
  e.printStackTrace();
} catch (IOException ie) {
  ie.printStackTrace();
}

I am running this code on Unix. Does anybody know how to set the version and encoding style?

Anurag asked May 31 '10 12:05


3 Answers

I would try the createXMLStreamWriter() overload that takes an OutputStream together with an encoding parameter.

[EDIT] Tried it; it works after changing the createXMLStreamWriter line:

XMLStreamWriter xtw = xof.createXMLStreamWriter(new FileOutputStream(fileName), "UTF-8");

[EDIT 2] For the record, here is a slightly more complex test:

String fileName = "Test.xml";
XMLOutputFactory xof =  XMLOutputFactory.newInstance();
XMLStreamWriter xtw = null;
try
{
  xtw = xof.createXMLStreamWriter(new FileOutputStream(fileName), "UTF-8");
  xtw.writeStartDocument("UTF-8", "1.0");
  xtw.writeStartElement("root");
  xtw.writeComment("This is an attempt to create an XML file with StAX");

  xtw.writeStartElement("foo");
  xtw.writeAttribute("order", "1");
    xtw.writeStartElement("meuh");
    xtw.writeAttribute("active", "true");
      xtw.writeCharacters("The cows are flying high this Spring");
    xtw.writeEndElement();
  xtw.writeEndElement();

  xtw.writeStartElement("bar");
  xtw.writeAttribute("order", "2");
    xtw.writeStartElement("tcho");
    xtw.writeAttribute("kola", "K");
      xtw.writeCharacters("Content of tcho tag");
    xtw.writeEndElement();
  xtw.writeEndElement();

  xtw.writeEndElement();
  xtw.writeEndDocument();
}
catch (XMLStreamException e)
{
  e.printStackTrace();
}
catch (IOException ie)
{
  ie.printStackTrace();
}
finally
{
  if (xtw != null)
  {
    try
    {
      xtw.close();
    }
    catch (XMLStreamException e)
    {
      e.printStackTrace();
    }
  }
}
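
A variant of the test above, as a minimal sketch: the Javadoc for XMLStreamWriter.close() states that it does not close the underlying output stream, so here the stream is managed with try-with-resources. The class name StaxUtf8Example, the method writeDocument, and the temp-file name are made up for illustration; the exact formatting of the output depends on the StAX implementation in use.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical example class, not part of the original answer.
public class StaxUtf8Example {

    // Writes a tiny document, passing the same encoding to the factory
    // and to writeStartDocument() so the two cannot disagree.
    static void writeDocument(Path target) throws IOException, XMLStreamException {
        XMLOutputFactory xof = XMLOutputFactory.newInstance();
        // try-with-resources closes the file stream; xtw.close() alone would not.
        try (OutputStream out = Files.newOutputStream(target)) {
            XMLStreamWriter xtw = xof.createXMLStreamWriter(out, "UTF-8");
            try {
                xtw.writeStartDocument("UTF-8", "1.0");
                xtw.writeStartElement("root");
                xtw.writeCharacters("caf\u00e9"); // non-ASCII content to exercise the encoding
                xtw.writeEndElement();
                xtw.writeEndDocument();
            } finally {
                xtw.close(); // flushes and releases the writer's own resources
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("stax-test", ".xml");
        writeDocument(target);
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```
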
PhiLho answered Nov 15 '22 04:11


This should work:

// ...
Writer writer = new OutputStreamWriter(new FileOutputStream(fileName), "UTF-8");
XMLStreamWriter xtw = xof.createXMLStreamWriter(writer);
xtw.writeStartDocument("UTF-8", "1.0");
// ...
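
On Java 11 or later, FileWriter itself has constructors that accept a Charset, so the same idea can be sketched without the extra FileOutputStream wrapper. This is only an illustration under that assumption; the class name FileWriterCharsetExample and the method write are hypothetical, and the snippet was not part of the original answer.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical example class; requires Java 11+ for FileWriter(File, Charset).
public class FileWriterCharsetExample {

    static void write(File file) throws IOException, XMLStreamException {
        XMLOutputFactory xof = XMLOutputFactory.newInstance();
        // The writer encodes as UTF-8 regardless of the platform default.
        try (Writer writer = new FileWriter(file, StandardCharsets.UTF_8)) {
            XMLStreamWriter xtw = xof.createXMLStreamWriter(writer);
            xtw.writeStartDocument("UTF-8", "1.0"); // matches the writer's encoding
            xtw.writeStartElement("root");
            xtw.writeEndElement();
            xtw.writeEndDocument();
            xtw.close(); // does not close the underlying Writer; try-with-resources does
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("stax-writer", ".xml");
        write(file);
        System.out.println("Wrote " + file.length() + " bytes to " + file);
    }
}
```
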
chris answered Nov 15 '22 04:11


It is hard to tell from the code alone, but if you are relying on the default Stax implementation bundled with JDK 1.6 (Sun's SJSXP), I would recommend switching to Woodstox. It is known to be less buggy than SJSXP, supports the whole Stax2 API, and has been actively developed and supported (whereas the Sun version was written once and has received only a limited number of bug fixes).

But the bug in your code is this:

XMLStreamWriter xtw = xof.createXMLStreamWriter(new FileWriter(fileName));

You are relying on the default platform encoding (which must be CP-1252 here; Windows?). You should always specify the encoding you are using explicitly. The stream writer is just verifying that you are not doing something dangerous, and it spotted an inconsistency that could produce a corrupt document. Pretty smart, which actually suggests that this is not the default Stax processor. :-)
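
To see which encoding a writer will actually use, one option is OutputStreamWriter.getEncoding(), which FileWriter inherits. The sketch below is only an illustration (the class WriterEncodingCheck and its methods are made-up names). Note that getEncoding() returns the historical charset name where one exists, which is why the exception in the question says 'Cp1252'. On Java 18 and later the platform default is UTF-8 unless overridden, so the mismatch may not reproduce there.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for inspecting writer encodings.
public class WriterEncodingCheck {

    // FileWriter(File) picks up the platform default charset.
    static String fileWriterEncoding(File f) throws IOException {
        try (FileWriter fw = new FileWriter(f)) {
            return fw.getEncoding();
        }
    }

    // An OutputStreamWriter created with an explicit charset ignores the default.
    static String explicitUtf8Encoding() throws IOException {
        try (OutputStreamWriter w =
                 new OutputStreamWriter(new ByteArrayOutputStream(), StandardCharsets.UTF_8)) {
            return w.getEncoding();
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("encoding-check", ".txt");
        System.out.println("Platform default charset: " + Charset.defaultCharset());
        System.out.println("FileWriter encoding:      " + fileWriterEncoding(tmp));
        System.out.println("Explicit writer encoding: " + explicitUtf8Encoding());
    }
}
```

If the first two lines print anything other than a UTF-8 name, a FileWriter created without a charset will not match a "UTF-8" declaration passed to writeStartDocument().
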

(The other answer points out a correct fix, too: just pass an OutputStream and the encoding, and let the XMLStreamWriter do the right thing.)

StaxMan answered Nov 15 '22 03:11