
How to write a custom SplitFunc for bufio.Scanner that scans JSON objects

Tags: json, go

I have code like this:

scanner := bufio.NewScanner(reader)
scanner.Split(splitJSON)

for scanner.Scan() {
    bb := scanner.Bytes()
}

I would like to get only valid JSON objects from the Scanner, one at a time. In some cases the Scanner's buffer may hold bytes like this, where the last object has not fully arrived yet:

{
    "some_object": "name",
    "some_fileds": {}
}
{
    "some_object": 
}

I need only the first part of this

{
    "some_object": "name",
    "some_fileds": {}
}

For the rest, I should wait until the end of the JSON object arrives.

I have a function like this, but it's horrible and doesn't work:

func splitJSON(
    bb []byte, atEOF bool,
) (advance int, token []byte, err error) {
    print(string(bb))
    if len(bb) < 10 {
        return 0, nil, nil
    }

    var nested, from, to int
    var end bool

    for i, b := range bb {
        if string(b) == "{" {
            if end {
                to = i

                break
            }

            if nested == 0 {
                from = i
            }

            nested++
        }

        if string(b) == "}" {
            nested--
            if nested == 0 {
                to = i
                end = true
            }
        }
    }

    if atEOF {
        return len(bb), bb, nil
    }

    return len(bb[from:to]), bb[from:to], nil
}

UPD: I solved it with this splitFunc:

func splitJSON(data []byte, atEOF bool) (advance int, token []byte, err error) {
    if atEOF && len(data) == 0 {
        return 0, nil, nil
    }

    reader := bytes.NewReader(data)
    dec := json.NewDecoder(reader)

    var raw json.RawMessage
    if err := dec.Decode(&raw); err != nil {
        return 0, nil, nil
    }

    return len(raw) + 1, raw, nil
}
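
The splitFunc above advances by len(raw) + 1, which assumes exactly one separator byte after each object and no leading whitespace, and it returns nil for every decode error. A possible variant, offered only as a sketch, takes the advance from Decoder.InputOffset (available since Go 1.15) and reports malformed input once no more data can arrive. The names splitJSONValues and scanValues are hypothetical, and the sketch assumes the top-level values are objects or arrays, since a bare number split across two reads could be tokenized too early.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// splitJSONValues is a hypothetical bufio.SplitFunc that yields one complete
// top-level JSON value per token.
func splitJSONValues(data []byte, atEOF bool) (advance int, token []byte, err error) {
	dec := json.NewDecoder(bytes.NewReader(data))

	var raw json.RawMessage
	if err := dec.Decode(&raw); err != nil {
		// io.EOF: only whitespace so far; io.ErrUnexpectedEOF: a value was cut off.
		incomplete := errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF)
		if incomplete && !atEOF {
			return 0, nil, nil // ask the Scanner to read more data
		}
		if errors.Is(err, io.EOF) {
			return len(data), nil, nil // nothing but whitespace left; consume it
		}
		return 0, nil, err // malformed or truncated input
	}

	// InputOffset counts every byte the decoder consumed for this value,
	// including any leading whitespace.
	return int(dec.InputOffset()), raw, nil
}

// scanValues is a hypothetical helper that collects every token as a string.
func scanValues(src string) ([]string, error) {
	sc := bufio.NewScanner(strings.NewReader(src))
	sc.Split(splitJSONValues)

	var out []string
	for sc.Scan() {
		out = append(out, sc.Text())
	}
	return out, sc.Err()
}

func main() {
	values, err := scanValues("{\"some_object\": \"name\"}\n{\"some_object\": \"foo\"}")
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	for _, v := range values {
		fmt.Println("Next:", v)
	}
}
```

With this version a truncated final object surfaces through scanner.Err() instead of being dropped silently. Objects larger than the Scanner's default 64 KB token limit would still need a bigger buffer, set with scanner.Buffer.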
Asked by Mark Eaton, May 21 '26 07:05



1 Answer

Use json.Decoder for this. Each Decoder.Decode() call decodes the next JSON-encoded value from the input, which in your case is a JSON object.

If you don't want to decode the JSON objects and just need the raw JSON data (a byte slice), unmarshal into a json.RawMessage.

For example:

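package main

import (
    "encoding/json"
    "fmt"
    "io"
    "strings"
)

func main() {
    reader := strings.NewReader(src)
    dec := json.NewDecoder(reader)

    for {
        // RawMessage keeps the value's bytes as-is instead of decoding them.
        var raw json.RawMessage
        if err := dec.Decode(&raw); err != nil {
            if err == io.EOF {
                break // clean end of input
            }
            fmt.Println("Error:", err)
            return
        }
        fmt.Println("Next:", string(raw))
    }
}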

const src = `{
    "some_object": "name",
    "some_fileds": {}
}
{
    "some_object": "foo"
}`

This will output:

Next: {
    "some_object": "name",
    "some_fileds": {}
}
Next: {
    "some_object": "foo"
}
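
If the objects should be decoded rather than kept as raw bytes, the same loop can decode into a struct instead of a json.RawMessage. The sketch below assumes a hypothetical record type whose fields mirror the keys in the sample input (including the key spelled some_fileds); the decodeRecords helper is likewise an illustrative name, not part of the original code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// record is a hypothetical type matching the sample objects.
type record struct {
	SomeObject string                 `json:"some_object"`
	SomeFields map[string]interface{} `json:"some_fileds"`
}

// decodeRecords reads consecutive JSON objects from src until the input ends.
func decodeRecords(src string) ([]record, error) {
	dec := json.NewDecoder(strings.NewReader(src))

	var out []record
	for {
		var r record
		if err := dec.Decode(&r); err != nil {
			if errors.Is(err, io.EOF) {
				return out, nil // clean end of input
			}
			return out, err // malformed or truncated object
		}
		out = append(out, r)
	}
}

func main() {
	recs, err := decodeRecords(`{"some_object": "name", "some_fileds": {}} {"some_object": "foo"}`)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	for _, r := range recs {
		fmt.Println("Next:", r.SomeObject)
	}
}
```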
Answered by icza, May 23 '26 21:05
