
Slice chunking in Go

Tags: slice, go, chunking

I have a slice with ~2.1 million log strings in it, and I would like to create a slice of slices with the strings being as evenly distributed as possible.

Here is what I have so far:

// logs is a slice with ~2.1 million strings in it.
var divided = make([][]string, 0)
NumCPU := runtime.NumCPU()
ChunkSize := len(logs) / NumCPU
for i := 0; i < NumCPU; i++ {
    temp := make([]string, 0)
    idx := i * ChunkSize
    end := i * ChunkSize + ChunkSize
    for x := range logs[idx:end] {
        temp = append(temp, logs[x])
    }
    if i == NumCPU {
        for x := range logs[idx:] {
            temp = append(temp, logs[x])
        }
    }
    divided = append(divided, temp)
}

The idx := i * ChunkSize will give me the current "chunk start" for the logs index, and end := i * ChunkSize + ChunkSize will give me the "chunk end", or the end of the range of that chunk. I couldn't find any documentation or examples on how to chunk/split a slice or iterate over a limited range in Go, so this is what I came up with. However, it only copies the first chunk multiple times, so it doesn't work.

How do I chunk a slice in Go as evenly as possible?

SiennaD. asked Feb 03 '16 14:02


4 Answers

You don't need to allocate new slices and copy the strings; just append sub-slices of logs to the divided slice.

http://play.golang.org/p/vyihJZlDVy

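var divided [][]string

// numCPU is the number of chunks wanted, e.g. runtime.NumCPU().
// Rounding the division up means at most numCPU chunks are produced.
chunkSize := (len(logs) + numCPU - 1) / numCPU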

for i := 0; i < len(logs); i += chunkSize {
    end := i + chunkSize

    if end > len(logs) {
        end = len(logs)
    }

    divided = append(divided, logs[i:end])
}

fmt.Printf("%#v\n", divided)
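
As for why the loop in the question repeats the first chunk: for x := range logs[idx:end] yields indices relative to the sub-slice, so x starts at 0 for every chunk and logs[x] always reads from the front of logs. (Separately, i == NumCPU can never be true inside a loop bounded by i < NumCPU, so the remainder is never picked up.) A minimal sketch with made-up data; relativeCopy and valueCopy are hypothetical names used only for this illustration:

```go
package main

import "fmt"

// relativeCopy mimics the question's inner loop: x is an index into the
// sub-slice, but it is used to index the full slice.
func relativeCopy(logs []string, idx, end int) []string {
    var out []string
    for x := range logs[idx:end] {
        out = append(out, logs[x]) // reads logs[0], logs[1], ... regardless of idx
    }
    return out
}

// valueCopy ranges over the values of the sub-slice instead.
func valueCopy(logs []string, idx, end int) []string {
    var out []string
    for _, s := range logs[idx:end] {
        out = append(out, s)
    }
    return out
}

func main() {
    logs := []string{"a", "b", "c", "d"} // made-up sample data
    fmt.Println(relativeCopy(logs, 2, 4)) // [a b]
    fmt.Println(valueCopy(logs, 2, 4))    // [c d]
}
```

Appending logs[i:end] directly, as in the answer above, avoids both the indexing mistake and the copy.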
JimB answered Nov 16 '22 15:11


Another variant, which preallocates the result slice. In my benchmarks it runs about 2.5 times faster than the one proposed by JimB. The tests and benchmarks are at the link below.

https://play.golang.org/p/WoXHqGjozMI

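// chunks splits xs into consecutive sub-slices of at most chunkSize elements.
// chunkSize must be positive.
func chunks(xs []string, chunkSize int) [][]string {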
    if len(xs) == 0 {
        return nil
    }
    divided := make([][]string, (len(xs)+chunkSize-1)/chunkSize)
    prev := 0
    i := 0
    till := len(xs) - chunkSize
    for prev < till {
        next := prev + chunkSize
        divided[i] = xs[prev:next]
        prev = next
        i++
    }
    divided[i] = xs[prev:]
    return divided
}
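
One caveat: chunks divides by chunkSize, so a chunkSize of zero panics with an integer divide-by-zero, and negative values are not meaningful either. If chunkSize comes from user input or a computed value, a guard may be worth adding. The following is only a sketch under that assumption; safeChunks and errChunkSize are hypothetical names, not part of the benchmarked code:

```go
package main

import (
    "errors"
    "fmt"
)

// errChunkSize is a hypothetical sentinel error for this sketch.
var errChunkSize = errors.New("chunkSize must be positive")

// safeChunks rejects non-positive chunk sizes, then splits xs into
// sub-slices of at most chunkSize elements, preallocating the result.
func safeChunks(xs []string, chunkSize int) ([][]string, error) {
    if chunkSize <= 0 {
        return nil, errChunkSize
    }
    if len(xs) == 0 {
        return nil, nil
    }
    divided := make([][]string, 0, (len(xs)+chunkSize-1)/chunkSize)
    for chunkSize < len(xs) {
        divided = append(divided, xs[:chunkSize])
        xs = xs[chunkSize:]
    }
    return append(divided, xs), nil
}

func main() {
    got, err := safeChunks([]string{"a", "b", "c", "d", "e"}, 2)
    fmt.Println(got, err) // [[a b] [c d] [e]] <nil>

    _, err = safeChunks([]string{"a"}, 0)
    fmt.Println(err) // chunkSize must be positive
}
```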
SIREN answered Nov 16 '22 15:11


Using generics (Go version >=1.18):

func chunkBy[T any](items []T, chunkSize int) (chunks [][]T) {
    for chunkSize < len(items) {
        items, chunks = items[chunkSize:], append(chunks, items[0:chunkSize:chunkSize])
    }
    return append(chunks, items)
}
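
The three-index form items[0:chunkSize:chunkSize] used above is a full slice expression: it caps the capacity of each chunk at chunkSize. Without the cap, appending to a chunk could silently overwrite the first element of the next chunk, because all chunks share one backing array. A small sketch of the difference, using hypothetical helpers looseChunk and tightChunk:

```go
package main

import "fmt"

// looseChunk returns the first n items; its capacity reaches to the end of items.
func looseChunk(items []int, n int) []int {
    return items[0:n]
}

// tightChunk returns the first n items with capacity capped at n.
func tightChunk(items []int, n int) []int {
    return items[0:n:n]
}

func main() {
    items := []int{1, 2, 3, 4, 5, 6}

    tight := tightChunk(items, 3)
    tight = append(tight, 99) // no spare capacity: append copies to a new array
    fmt.Println(items)        // [1 2 3 4 5 6]
    fmt.Println(tight)        // [1 2 3 99]

    loose := looseChunk(items, 3)
    loose = append(loose, 99) // spare capacity: append writes into items[3]
    fmt.Println(items)        // [1 2 3 99 5 6]
    fmt.Println(loose)        // [1 2 3 99]
}
```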


Or, if you want to preallocate the capacity of the result yourself:

func chunkBy[T any](items []T, chunkSize int) [][]T {
    chunks := make([][]T, 0, (len(items)/chunkSize)+1)
    for chunkSize < len(items) {
        items, chunks = items[chunkSize:], append(chunks, items[0:chunkSize:chunkSize])
    }
    return append(chunks, items)
}


Alfonso M. García Astorga answered Nov 16 '22 15:11


func chunkSlice(items []int32, chunkSize int32) (chunks [][]int32) {
    // While more items remain than fit in one chunk...
    for chunkSize < int32(len(items)) {
        // ...append the first chunkSize items to chunks as one chunk
        chunks = append(chunks, items[0:chunkSize])
        // and advance items past those elements.
        items = items[chunkSize:]
    }
    // Finally, append the remaining items as the last chunk and return.
    return append(chunks, items)
}

Visual example

Say we want to split a slice into chunks of 3:

items:  [1,2,3,4,5,6,7]
chunks: []

items:  [1,2,3,4,5,6,7]
chunks: [[1,2,3]]

items:  [4,5,6,7]
chunks: [[1,2,3]]

items:  [4,5,6,7]
chunks: [[1,2,3],[4,5,6]]

items:  [7]
chunks: [[1,2,3],[4,5,6]]

items:  [7]
chunks: [[1,2,3],[4,5,6],[7]]
return
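
Note that the trace above ends with a short last chunk ([7]). If the goal is the one stated in the question, a fixed number of chunks whose sizes differ by at most one, the leftover items can be spread over the first chunks instead. The following is a sketch under that assumption; splitEven is a hypothetical helper, and n would be runtime.NumCPU() in the question's setting. It uses generics, so it needs Go 1.18 or later:

```go
package main

import "fmt"

// splitEven divides items into n chunks whose lengths differ by at most one.
// The chunks share the backing array of items; nothing is copied.
func splitEven[T any](items []T, n int) [][]T {
    if n <= 0 {
        return nil
    }
    chunks := make([][]T, 0, n)
    base, extra := len(items)/n, len(items)%n
    start := 0
    for i := 0; i < n; i++ {
        size := base
        if i < extra {
            size++ // the first extra chunks take one leftover item each
        }
        // Cap the capacity so appending to one chunk cannot touch the next.
        chunks = append(chunks, items[start:start+size:start+size])
        start += size
    }
    return chunks
}

func main() {
    fmt.Println(splitEven([]int{1, 2, 3, 4, 5, 6, 7}, 3)) // [[1 2 3] [4 5] [6 7]]
}
```

If there are fewer items than chunks, the trailing chunks are simply empty.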
Omkesh Sajjanwar answered Nov 16 '22 16:11