
Does it make sense to make expensive syscalls from different goroutines?

If an application does some heavy lifting with multiple file descriptors (e.g., opening, writing data, syncing, closing), what actually happens in the Go runtime? Does it block all goroutines while an expensive syscall (like syscall.Fsync) is in progress? Or is only the calling goroutine blocked while the others keep running?

So does it make sense to write programs with multiple workers that do a lot of context switching between user space and kernel space? Does it make sense to use multithreading patterns for disk I/O?

package main

import (
    "log"
    "os"
    "sync"
)

var data = []byte("some big data")

func worker(filenamechan chan string, wg *sync.WaitGroup) {
    defer wg.Done()
    // the loop ends when filenamechan is closed and drained
    for filename := range filenamechan {
        // opening a file is a fairly expensive operation
        // because it allocates a new descriptor
        f, err := os.OpenFile(filename, os.O_CREATE|os.O_WRONLY, os.FileMode(0644))
        if err != nil {
            // log.Fatal would exit the whole process, so log and move on instead
            log.Print(err)
            continue
        }

        // write is a cheap operation,
        // because it just moves data from user space to kernel space
        if _, err := f.Write(data); err != nil {
            log.Print(err)
            f.Close()
            continue
        }

        // f.Sync calls fsync, a disk-bound expensive operation
        if err := f.Sync(); err != nil {
            log.Print(err)
            f.Close()
            continue
        }

        if err := f.Close(); err != nil {
            log.Print(err)
        }
    }
}

func main() {
    // launch workers
    filenamechan := make(chan string)
    wg := &sync.WaitGroup{}
    for i := 0; i < 2; i++ {
        wg.Add(1)
        go worker(filenamechan, wg)
    }

    // send tasks to workers
    filenames := []string{
        "1.txt",
        "2.txt",
        "3.txt",
        "4.txt",
        "5.txt",
    }
    for _, filename := range filenames {
        filenamechan <- filename
    }
    close(filenamechan)

    // wait until every worker has finished
    wg.Wait()
}

https://play.golang.org/p/O0omcPBMAJ

asked Mar 18 '17 10:03 by Vitaly Isaev


1 Answer

If a syscall blocks, the Go runtime will launch a new thread so that the number of threads available to run goroutines remains the same. Only the goroutine that made the call is blocked; the others keep running.

A fuller explanation can be found here: https://morsmachine.dk/go-scheduler

answered Sep 24 '22 04:09 by Colin Stewart