Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Changing XML Namespace with Scala

Tags:

xml

scala

I am using scala to load a XML file from file via the scala.xml.XML.loadFile() method. The documents I'm working with have namespaces already defined and I wish to change the namespace to something else using scala. For example, a document has a xmlns of "http://foo.com/a" with a prefix of "a" - I would like to change the namespace and prefix for the document to "http://foo.com/b" and "b" respectively.

Seems easy and I feel like I'm missing something obvious here. I do not have a problem getting the namespace from the return Elem from the referenced loadFile() method.

like image 521
toddk Avatar asked Jul 07 '09 20:07

toddk


People also ask

How do I change the default namespace in XML?

You can change the default namespace within a particular element by adding an xmlns attribute to the element. Example 4-4 is an XML document that initially sets the default namespace to http://www.w3.org/1999/xhtml for all the XHTML elements.

How do I add a namespace to an XML file?

XML Namespaces - The xmlns Attribute When using prefixes in XML, a namespace for the prefix must be defined. The namespace can be defined by an xmlns attribute in the start tag of an element. The namespace declaration has the following syntax. xmlns:prefix="URI".

Can XML have multiple namespaces?

When you use multiple namespaces in an XML document, you can define one namespace as the default namespace to create a cleaner looking document. The default namespace is declared in the root element and applies to all unqualified elements in the document. Default namespaces apply to elements only, not to attributes.

How does namespace work in XML?

An XML namespace is a collection of names that can be used as element or attribute names in an XML document. The namespace qualifies element names uniquely on the Web in order to avoid conflicts between elements with the same name.


1 Answers

Here it is. Since NamespaceBinding is nested (each ns has a parent, except TopScope), we need to recurse to fix that. Also, each ns has an URI and a prefix, and we need to change both.

The function below will change just one particular URI and prefix, and it will check all namespaces, to see if either prefix or URI needs changing. It will change a prefix or a URI independent of each other, which might not be what is wanted. Not a big deal fixing that, though.

As for the rest, just pattern match on Elem to recurse into each part of the XML. Ah, yes, it changes the prefix of the elements too. Again, if that's not what is wanted, it's easy to change.

The code assumes there is no need to recurse into "other" parts of XML -- the rest will usually be Text elements. Also, it assumes there is no namespace elsewhere. I'm no expert on XML, so I could be wrong on both counts. Once more, it should be easy to change that -- just follow the pattern.

def changeNS(el: Elem, 
             oldURI: String, newURI: String, 
             oldPrefix: String, newPrefix: String): Elem = {
  def replace(what: String, before: String, after: String): String =
    if (what == before) after else what

  def fixScope(ns: NamespaceBinding): NamespaceBinding =
    if(ns == TopScope)
      TopScope
    else new NamespaceBinding(replace(ns.prefix, oldPrefix, newPrefix),
                              replace(ns.uri, oldURI, newURI),
                              fixScope(ns.parent))

  def fixSeq(ns: Seq[Node]): Seq[Node] = for(node <- ns) yield node match {
    case Elem(prefix, label, attribs, scope, children @ _*) => 
      Elem(replace(prefix, oldPrefix, newPrefix), 
           label, 
           attribs, 
           fixScope(scope), 
           fixSeq(children) : _*)
    case other => other
  }
  fixSeq(el.theSeq)(0).asInstanceOf[Elem]
}

This produces an unexpected result, though. The scope is getting added to all elements. That's because NamespaceBinding does not define a equals method, thus using reference equality. I have opened a ticket for it, 2138, which has already been closed, so Scala 2.8 won't have this problem.

Meanwhile, the following code will work properly. It keeps a cache of namespaces. It also decomposes NamespaceBinding into a list before handling it.

  def changeNS(el: Elem, 
             oldURI: String, newURI: String, 
             oldPrefix: String, newPrefix: String): Elem = {
  val namespaces = scala.collection.mutable.Map.empty[List[(String, String)],NamespaceBinding]

  def replace(what: String, before: String, after: String): String =
    if (what == before) after else what

  def unfoldNS(ns: NamespaceBinding): List[(String, String)] = ns match {
    case TopScope => Nil
    case _ => (ns.prefix, ns.uri) :: unfoldNS(ns.parent)
    }

  def foldNS(unfoldedNS: List[(String, String)]): NamespaceBinding = unfoldedNS match {
    case knownNS if namespaces.isDefinedAt(knownNS) => namespaces(knownNS)
    case (prefix, uri) :: tail =>
      val newNS = new NamespaceBinding(prefix, uri, foldNS(tail))
      namespaces(unfoldedNS) = newNS
      newNS
    case Nil => TopScope
  }

  def fixScope(ns: NamespaceBinding): NamespaceBinding =
    if(ns == TopScope)
      ns
    else {
      val unfoldedNS = unfoldNS(ns)
      val fixedNS = for((prefix, uri) <- unfoldedNS) 
                    yield (replace(prefix, oldPrefix, newPrefix), replace(uri, oldURI, newURI))

      if(!namespaces.isDefinedAt(unfoldedNS))
        namespaces(unfoldedNS) = ns  // Save for future use

      if(fixedNS == unfoldedNS)
        ns
      else 
        foldNS(fixedNS)
    }

  def fixSeq(ns: Seq[Node]): Seq[Node] = for(node <- ns) yield node match {
    case Elem(prefix, label, attribs, scope, children @ _*) => 
      Elem(replace(prefix, oldPrefix, newPrefix), 
           label, 
           attribs, 
           fixScope(scope), 
           fixSeq(children) : _*)
    case other => other
  }
  fixSeq(el.theSeq)(0).asInstanceOf[Elem]
}
like image 145
Daniel C. Sobral Avatar answered Oct 09 '22 12:10

Daniel C. Sobral