I have a massive JSON array stored in a file ("file.json"). I need to iterate through the array and do some operation on each element.
err = json.Unmarshal(dat, &all_data)
This causes an out-of-memory error - I'm guessing because it loads everything into memory first.
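For context, this is roughly what I'm doing now (a sketch; os.ReadFile and the element type are just placeholders for my actual code):

dat, err := os.ReadFile("file.json") // reads the entire file into memory
if err != nil {
    log.Fatal(err)
}
var all_data []map[string]interface{} // placeholder element type
err = json.Unmarshal(dat, &all_data) // builds the whole array in memory as well
if err != nil {
    log.Fatal(err)
}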
Is there a way to stream the JSON element by element?
There is an example of this sort of thing in the encoding/json documentation:
package main

import (
    "encoding/json"
    "fmt"
    "log"
    "strings"
)

func main() {
    const jsonStream = `
    [
        {"Name": "Ed", "Text": "Knock knock."},
        {"Name": "Sam", "Text": "Who's there?"},
        {"Name": "Ed", "Text": "Go fmt."},
        {"Name": "Sam", "Text": "Go fmt who?"},
        {"Name": "Ed", "Text": "Go fmt yourself!"}
    ]
`
    type Message struct {
        Name, Text string
    }

    dec := json.NewDecoder(strings.NewReader(jsonStream))

    // read open bracket
    t, err := dec.Token()
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("%T: %v\n", t, t)

    // while the array contains values
    for dec.More() {
        var m Message
        // decode an array value (Message)
        err := dec.Decode(&m)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%v: %v\n", m.Name, m.Text)
    }

    // read closing bracket
    t, err = dec.Token()
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("%T: %v\n", t, t)
}
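Applied to your case, here is a sketch of the same approach reading directly from the file (assuming file.json holds a single top-level JSON array of objects; the map element type is a placeholder for whatever your elements actually look like):

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "os"
)

func main() {
    f, err := os.Open("file.json")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    dec := json.NewDecoder(f)

    // read the opening "[" of the top-level array
    if _, err := dec.Token(); err != nil {
        log.Fatal(err)
    }

    // decode one array element at a time; only the current element is held in memory
    for dec.More() {
        var elem map[string]interface{} // or a concrete struct matching your elements
        if err := dec.Decode(&elem); err != nil {
            log.Fatal(err)
        }
        // do some operation on elem here
        fmt.Println(elem)
    }

    // read the closing "]"
    if _, err := dec.Token(); err != nil {
        log.Fatal(err)
    }
}

This way the decoder streams from the file and never materializes the whole array, which is what avoids the out-of-memory problem.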
So, as commenters suggested, you could use the streaming API of "encoding/json" to read one string at a time:
r := ... // get some io.Reader (e.g. open the big array file)
d := json.NewDecoder(r)

// read "["
d.Token()

// read strings one by one
for d.More() {
    s, _ := d.Token()
    // do something with s which is the newly read string
    fmt.Printf("read %q\n", s)
}

// (optionally) read "]"
d.Token()
Note that for simplicity I've left out error handling, which needs to be implemented in real code.
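For completeness, the same loop might look like this with error handling added (a sketch assuming the array elements are plain strings; adapt the handling of s to your data):

d := json.NewDecoder(r)

// read "[" (the opening delimiter of the array)
if _, err := d.Token(); err != nil {
    log.Fatal(err)
}

// read strings one by one
for d.More() {
    s, err := d.Token()
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("read %q\n", s)
}

// read "]" (the closing delimiter)
if _, err := d.Token(); err != nil {
    log.Fatal(err)
}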