How can you parse a huge XML file that contains various elements (i.e. not the same element repeated multiple times)?
Example:
    <stuff>
        <header>...</header>
        <item>...</item>
        ...
        <item>...</item>
        <something>...</something>
    </stuff>
I want to write a script in Go that splits this file into multiple smaller files, each with a specific number of tags. All the examples of parsing XML with Go seem to rely on knowing which elements the file contains.
Can the file be parsed without knowing that? Something like iterating over each element in the XML, no matter which element it is (header, item, something, etc.).
Use the standard library's xml.Decoder (package encoding/xml).
Call Token to read tokens one by one. When a start element of interest is found, call DecodeElement to decode the element to a Go value.
Here's a sketch of how to use the decoder:
    // r is any io.Reader, e.g. an *os.File opened on the XML file.
    d := xml.NewDecoder(r)
    for {
        t, tokenErr := d.Token()
        if tokenErr != nil {
            if tokenErr == io.EOF {
                break // end of input
            }
            // handle error somehow
            return fmt.Errorf("decoding token: %v", tokenErr)
        }
        switch t := t.(type) {
        case xml.StartElement:
            // Match only the element of interest; everything else is skipped.
            if t.Name.Space == "foo" && t.Name.Local == "bar" {
                var b bar
                // DecodeElement reads through the matching end tag.
                if err := d.DecodeElement(&b, &t); err != nil {
                    // handle error somehow
                    return fmt.Errorf("decoding element %q: %v", t.Name.Local, err)
                }
                // do something with b
            }
        }
    }