Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Simple Xpath query in scala

Tags:

xml

scala

xpath

I'm trying to run a XPath query with scala and it doesn't seem to work. My Xml looks like ( simplified):

<application>
  <process type="output" size ="23"> 
     <channel offset="0"/>
      ....
     <channel offset="4"/>
  </process>
  <process type="input" size ="16"> 
     <channel offset="20"/>
      ....
     <channel offset="24"/>
  </process>
</application>

I want to retrieve the process with the input attribute and for that i use this XPath query:

//process[@type='input']

This should work, i verified it with xpathtester Now, my scala code looks like:

import scala.xml._
val x = XML.loadFile("file.xml")

val process = (x \\ "process[@type='input']")  // will return empty NodeSeq() !!!

The process ends up empty, it does't capture what I want. I worked it around like this:

val process = (x \\ "process" filter( _ \"@type" contains Text("input")))

which is much uglier. Any known reason why my original query shouldn't work?

like image 311
Chirlo Avatar asked Apr 01 '13 12:04

Chirlo


People also ask

What is an XPath query?

XPath (XML Path Language) is a query language that can be used to query data from XML documents. In RUEI, XPath queries can be used for content scanning of XML documents. A complete specification of XPath is available at http://www.w3.org/TR/xpath .

What is XPath syntax?

What Is XPath? XPath is defined as XML path. It is a syntax or language for finding any element on the web page using the XML path expression. XPath is used to find the location of any element on a webpage using HTML DOM structure.

What is XPath selector?

XPath stands for XML Path Language. It uses a non-XML syntax to provide a flexible way of addressing (pointing to) different parts of an XML document. It can also be used to test addressed nodes within a document to determine whether they match a pattern or not.

How we can use and operator in XPath?

The "and " operator is used to combining two different conditions or attributes to identify any element from a webpage using XPath efficiently. For example, if we have two attributes, a and b, we can combine both to uniquely identify an element on the webpage using the "and " operator.


2 Answers

"XPath" should not be used to describe what the Scala standard library supports. XPath is a full-fledged expression language, with so far two final versions and a third in the works:

  • XPath 1.0 from 1999
  • XPath 2.0 from 2007 (2nd edition 2010)
  • XPath 3.0 from 2013 (candidate recommendation)

At best you could say that Scala has a very small subset of XPath-inspired operations. So you can't expect to take XPath expressions and directly paste them to Scala without doing a bit more work.

Third-party libraries can give you better support for actual XPath expressions, including:

  • Scales Xml
    • Scala library
    • "provides a far more XPath like experience than the normal Scala XML, Paths look like XPaths and work like them too (with many of the same functions and axes)"
    • it's still not actual XPath if I understand well
    • designed to integrate well with Scala
  • Saxon
    • Java library
    • open source
    • complete and conformant support for XPath 2 (and XSLT 2)
    • has an XPath API which works on DOM and other data models, but no specific Scala support at this time
like image 193
ebruchez Avatar answered Sep 28 '22 09:09

ebruchez


One way to do that would be to use kantan.xpath:

import kantan.xpath._
import kantan.xpath.implicits._

val input = """
     | <application>
     |   <process type="output" size ="23">
     |      <channel offset="0"/>
     |      <channel offset="4"/>
     |   </process>
     |   <process type="input" size ="16">
     |      <channel offset="20"/>
     |      <channel offset="24"/>
     |   </process>
     | </application>
     | """.stripMargin

val inputs = input.evalXPath[List[Node]](xp"//process[@type='input']")

This yields a List[Node], but you could retrieve values with more interesting types - the list of channel offsets, for example:

input.evalXPath[List[Int]](xp"//process[@type='input']/channel/@offset")
// This yields Success(List(20, 24))
like image 35
Nicolas Rinaudo Avatar answered Sep 28 '22 09:09

Nicolas Rinaudo