I have a 100 GB XML file and parse it SAX-style in Go with this code:
file, err := os.Open(filename)
handle(err)
defer file.Close()
buffer := bufio.NewReaderSize(file, 1024*1024*256) // 268435456 bytes (256 MiB)
decoder := xml.NewDecoder(buffer)
for {
    t, err := decoder.Token()
    if err == io.EOF {
        break // end of input
    }
    handle(err) // do not silently ignore malformed XML or read errors
    switch se := t.(type) {
    case xml.StartElement:
        if se.Name.Local == "House" {
            house := House{}
            err := decoder.DecodeElement(&house, &se)
            handle(err)
        }
    }
}
But Go runs very slowly, judging by execution time and disk usage. My HDD can read data at around 100-120 MB/s, but the Go program reads only 10-13 MB/s. As an experiment, I rewrote this code in C#:
using (XmlReader reader = XmlReader.Create(filename))
{
    while (reader.Read())
    {
        switch (reader.NodeType)
        {
            case XmlNodeType.Element:
                if (reader.Name == "House")
                {
                    //Code
                }
                break;
        }
    }
}
With this version the HDD is fully loaded: C# reads data at 100-110 MB/s, and the execution time is around 10 times lower.
How can I improve XML parsing performance in Go?
These changes can help increase speed when using the encoding/xml library:
(Tested against XMB with 75k entries, 20 MB.)
- Implement xml.Unmarshaler on all your structures.
- Replace d.DecodeElement(&foo, &token) with foo.UnmarshalXML(d, token).
- Use d.RawToken() instead of d.Token().
- If you use d.Skip(), reimplement it using d.RawToken().
I reduced time and allocations by about 40% for my specific use case, at the cost of more code, more boilerplate, and potentially worse handling of corner cases. My inputs are fairly consistent, so that trade-off is acceptable, but the result is still not fast enough.
benchstat first.bench.txt parseraw.bench.txt
name          old time/op    new time/op    delta
Unmarshal-16     1.06s ± 6%     0.66s ± 4%  -37.55%  (p=0.008 n=5+5)

name          old alloc/op   new alloc/op   delta
Unmarshal-16     461MB ± 0%     280MB ± 0%  -39.20%  (p=0.029 n=4+4)

name          old allocs/op  new allocs/op  delta
Unmarshal-16     8.42M ± 0%     5.03M ± 0%  -40.26%  (p=0.016 n=4+5)
In my experiments, the lack of memoization in the XML parser is the main reason for the high time and allocation counts. It slows parsing down significantly, mostly because Go copies values.