Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dealing with namespaces while parsing XML in Go

I am trying to parse a piece if XML in Go:

package main

import (
    "encoding/xml"
    "fmt"
)

type XML struct {
    Foo string `xml:"foo"`
}

func main() {
    rawXML := []byte(`
<xml>
  <foo>A</foo>
  <ns:foo>B</ns:foo>
</xml>`)

    x := new(XML)
    xml.Unmarshal(rawXML, x)
    fmt.Printf("foo: %s\n", x.Foo)
}

This outputs:

foo: B

While I expected it to produce:

foo: A

How do I get content of the first foo tag (i.e. one without namespace)?

like image 747
Victor Deryagin Avatar asked Dec 16 '22 15:12

Victor Deryagin


2 Answers

I don't think the xml decoder can specify an element should have no namespace with struct tags. But I do know that it can retrieve the information about the namespaces for you and you could then post process the data after to get the same result:

package main

import (
    "encoding/xml"
    "fmt"
)

type Foo struct {
    XMLName xml.Name
    Data string `xml:",chardata"`
}

type XML struct {
    Foo []Foo `xml:"foo"`
}

func main() {
    rawXML := []byte(`
<xml>
  <foo>A</foo>
  <ns:foo>B</ns:foo>
</xml>`)

    x := new(XML)
    xml.Unmarshal(rawXML, x)
    //fmt.Printf("foo: %#v\n", x)
    for _, el := range x.Foo {
       if el.XMLName.Space == "" {
          fmt.Printf("non namespaced foo %q", el.Data)
      }
    }
}

http://play.golang.org/p/aDEFPmHPc0

like image 152
Jeremy Wall Avatar answered Jan 08 '23 07:01

Jeremy Wall


You have two values in series in your xml document. You only have room for one value in your struct. The xml parser is parsing the first one and then overwriting it with the second one.

Change Foo to a slice in the struct and then you'll get both values.

http://play.golang.org/p/BRgsuMQ7rK

package main

import (
    "encoding/xml"
    "fmt"
)

type XML struct {
    Foo []string `xml:"foo"`
}

func main() {
    rawXML := []byte(`
<xml>
  <foo>A</foo>
  <ns:foo>B</ns:foo>
</xml>`)

    x := new(XML)
    xml.Unmarshal(rawXML, x)
    fmt.Printf("foo: %s\n", x.Foo[0])
    fmt.Printf("both: %v\n", x.Foo)
}
like image 23
Daniel Avatar answered Jan 08 '23 05:01

Daniel