
How do I run an XQuery against XML in a String?

I have a String representation of some XML, and I want to run an XQuery on it in memory. I've been playing with Saxon and came up with a solution, but to make it work I did an ugly, ugly thing. I have a feeling it's because of my lack of experience with Saxon. Here is some code that works:

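import java.io.StringReader;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.Configuration;
import net.sf.saxon.s9api.*;
import org.xml.sax.InputSource;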

public class XmlTest {
  public static void main(String[] args) {
    try {
      final String tableXml = 
        "<table>" + 
        "  <columns>" + 
        "    <column>Foo</column><column>Bar</column>" + 
        "  </columns>" + 
        "  <rows>" + 
        "    <row><cell>Foo1</cell><cell>Bar1</cell></row>" + 
        "    <row><cell>Foo2</cell><cell>Bar2</cell></row>" + 
        "  </rows>" + 
        "</table>";

      Configuration saxonConfig = new Configuration();
      Processor processor = new Processor(saxonConfig);

      XQueryCompiler xqueryCompiler = processor.newXQueryCompiler();
      XQueryExecutable xqueryExec = xqueryCompiler
              .compile("<result>{"
                       + "doc('')/table/rows/row/cell/text()='Foo2'"
                       + "}</result>");

      XQueryEvaluator xqueryEval = xqueryExec.load();
      xqueryEval.setSource(new SAXSource(new InputSource(
          new StringReader(tableXml))));

      XdmDestination destination = new XdmDestination();

      xqueryEval.setDestination(destination);

      // Avert your eyes!
      xqueryEval.setURIResolver(new URIResolver() {
        @Override
        public Source resolve(String href, String base) throws TransformerException {
            return new StreamSource(new StringReader(tableXml));
        }
      });

      xqueryEval.run();

      System.out.println(destination.getXdmNode());

    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

The issue I was having was with the base URI of the XML document. Since the document is in memory, there is no base document to reference. I know the XML will always be self-contained, so I decided to override the URIResolver to just pass back the XML wrapped in a Source object. I know this is wrong, but it works. If I don't do it, I get a "Content not allowed in prolog" error. From the rest of the error message, it looks like it's trying to read the current directory as an XML file. That part is a little cryptic to me, but I'm willing to learn! Is there a correct way to do what I want to do?

Rob Heiser asked Oct 04 '22 19:10


1 Answer

If you want to access the source document using doc('') then this is the way to do it. However, it's much simpler if you write your query to access the source document as the value of the context item. So you change your query to

"<result>{/table/rows/row/cell='Foo2'}</result>"

You're already supplying the context item using setSource(), even though you aren't using it, so this is the only change you need to make.

(I've also cut out the "/text()" from the query, because it's much better to test the value of the element directly - it means your query will still work if the source document contains comments).
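
To see why dropping "/text()" matters, here is a minimal sketch that runs both forms of the comparison against XML held in a String. Note that it uses the JDK's built-in XPath 1.0 engine (javax.xml.xpath) rather than Saxon, purely so that it runs without extra dependencies; the class name, the matches helper, and the sample documents are made up for illustration. The general comparison should behave the same way in XQuery for this case, but treat that as an assumption to check against your Saxon version.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the original question or answer.
final class CellComparisonSketch {

    // Parses the XML string in memory and evaluates the expression with the
    // parsed document as the context node, so no URI has to be resolved.
    public static boolean matches(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return (Boolean) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Same shape as the question's table, trimmed to one row.
        String plain =
                "<table><rows><row><cell>Foo2</cell></row></rows></table>";
        // A comment inside the cell splits its content into two text nodes.
        String commented =
                "<table><rows><row><cell>Foo<!-- note -->2</cell></row></rows></table>";

        String byText = "/table/rows/row/cell/text()='Foo2'";
        String byValue = "/table/rows/row/cell='Foo2'";

        System.out.println("plain,     text():  " + matches(plain, byText));
        System.out.println("plain,     element: " + matches(plain, byValue));
        System.out.println("commented, text():  " + matches(commented, byText));
        System.out.println("commented, element: " + matches(commented, byValue));
    }
}
```

With the commented document, the text() form compares 'Foo2' against the separate text nodes "Foo" and "2" and finds no match, while the element form compares against the cell's full string value and still matches.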

Michael Kay answered Oct 13 '22 09:10