If application does some heavy lifting with multiple file descriptors (e.g., opening - writing data - syncing - closing), what actually happens to Go runtime? Does it block all the goroutines at the time when expensive syscall occures (like syscall.Fsync
)? Or only the calling goroutine is blocked while the others are still operating?
So does it make sense to write programs with multiple workers that do a lot of user space - kernel space context switching? Does it make sense to use multithreading patterns for disk input?
package main
import (
"log"
"os"
"sync"
)
var data = []byte("some big data")
func worker(filenamechan chan string, wg *sync.waitgroup) {
defer wg.done()
for {
filename, ok := <-filenamechan
if !ok {
return
}
// open file is a quite expensive operation due to
// the opening new descriptor
f, err := os.openfile(filename, os.o_create|os.o_wronly, os.filemode(0644))
if err != nil {
log.fatal(err)
continue
}
// write is a cheap operation,
// because it just moves data from user space to the kernel space
if _, err := f.write(data); err != nil {
log.fatal(err)
continue
}
// syscall.fsync is a disk-bound expensive operation
if err := f.sync(); err != nil {
log.fatal(err)
continue
}
if err := f.close(); err != nil {
log.fatal(err)
}
}
}
func main() {
// launch workers
filenamechan := make(chan string)
wg := &sync.waitgroup{}
for i := 0; i < 2; i++ {
wg.add(1)
go worker(filenamechan, wg)
}
// send tasks to workers
filenames := []string{
"1.txt",
"2.txt",
"3.txt",
"4.txt",
"5.txt",
}
for i := range filenames {
filenamechan <- filenames[i]
}
close(filenamechan)
wg.wait()
}
https://play.golang.org/p/O0omcPBMAJ
In Go, goroutines are cheap to create and efficient to schedule. The Go runtime has been written for programs with tens of thousands of goroutines as the norm, hundreds of thousands are not unexpected. But goroutines do have a finite cost in terms of memory footprint; you cannot create an infinite number of them.
They are cheaper in: memory consumption: A thread starts with a large memory as opposed to a few Kb. Threads are scheduled preemptively, and during a thread switch, the scheduler needs to save/restore ALL registers.
A goroutine is created with initial only 2KB of stack size. Each function in go already has a check if more stack is needed or not and the stack can be copied to another region in memory with twice the original size. This makes goroutine very light on resources.
Threads are hardware dependent. Goroutines have easy communication medium known as channel. Thread does not have easy communication medium. Due to the presence of channel one goroutine can communicate with other goroutine with low latency.
If a syscall blocks, the Go runtime will launch a new thread so that the number of threads available to run goroutines remains the same.
A fuller explanation can be found here: https://morsmachine.dk/go-scheduler
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With