golang design pattern for cancelling routines inflight

Tags:

go

I am a golang newbie who is trying to understand the correct design pattern for this problem. My current solution seems very verbose, and I'm not sure what the better approach would be.

I am trying to design a system that:

  1. executes N goroutines
  2. returns the result of each goroutine as soon as it is available
  3. if a goroutine returns a particular value, the other goroutines should be cancelled.

The goal: I want to kick off a number of goroutines, but I want to cancel the routines if one routine returns a particular result.

I'm trying to understand if my code is super "smelly" or if this is the prescribed way of doing things. I still don't have a great feeling for go, so any help would be appreciated.

Here is what I've written:

package main

import (
    "context"
    "fmt"
    "time"
)

func main() {

    ctx := context.Background()
    ctx, cancel := context.WithCancel(ctx)

    fooCheck := make(chan bool)
    barCheck := make(chan bool)

    go foo(ctx, 3000, fooCheck)
    go bar(ctx, 5000, barCheck)

    for fooCheck != nil ||
        barCheck != nil {
        select {
        case res, ok := <-fooCheck:
            if !ok {
                fooCheck = nil
                continue
            }
            if res == false {
                cancel()
            }
            fmt.Printf("result of foocheck: %t\n", res)
        case res, ok := <-barCheck:
            if !ok {
                barCheck = nil
                continue
            }
            fmt.Printf("result of barcheck: %t\n", res)
        }
    }
    fmt.Printf("here we are at the end of the loop, ready to do some more processing...")
}

func foo(ctx context.Context, pretendWorkTime int, in chan<- bool) {
    fmt.Printf("simulate doing foo work and pass ctx down to cancel down the calltree\n")
    time.Sleep(time.Millisecond * time.Duration(pretendWorkTime))

    select {
    case <-ctx.Done():
        fmt.Printf("\n\nWe cancelled this operation!\n\n")
        break
    default:
        fmt.Printf("we have done some foo work!\n")
        in <- false
    }
    close(in)
}

func bar(ctx context.Context, pretendWorkTime int, in chan<- bool) {
    fmt.Printf("simulate doing bar work and pass ctx down to cancel down the calltree\n")
    time.Sleep(time.Millisecond * time.Duration(pretendWorkTime))

    select {
    case <-ctx.Done():
        fmt.Printf("\n\nWe cancelled the bar operation!\n\n")
        break
    default:
        fmt.Printf("we have done some bar work!\n")
        in <- true
    }
    close(in)
}

(play with the code here: https://play.golang.org/p/HAA-LIxWNt0)

The output works as expected, but I'm afraid I'm making some decision which will blow off my foot later.

clo_jur asked May 17 '19 22:05


1 Answer

I would use a single channel to communicate results, so it's much easier to gather the results and it "scales" automatically by its nature. If you need to identify the source of a result, simply use a wrapper which includes the source. Something like this:

type Result struct {
    ID     string
    Result bool
}

To simulate "real" work, the workers should use a loop doing their work in an iterative manner, and in each iteration they should check the cancellation signal. Something like this:

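func foo(ctx context.Context, pretendWorkMs int, resch chan<- Result) {
    log.Printf("foo started...")
    // Do the "work" in 1 ms slices so that cancellation is noticed promptly.
    for i := 0; i < pretendWorkMs; i++ {
        time.Sleep(time.Millisecond)
        select {
        case <-ctx.Done():
            // Cancelled: stop without sending a result.
            log.Printf("foo terminated.")
            return
        default:
            // Not cancelled: carry on with the next slice of work.
        }
    }
    log.Printf("foo finished")
    resch <- Result{ID: "foo", Result: false}
}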

In our example, bar() is identical: just replace every occurrence of foo with bar.
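If duplicating foo() as bar() feels repetitive, one option is a single parametrized worker. The following is only a sketch under assumptions: worker(), its id and outcome parameters, and runDemo() are names made up here for illustration and are not part of the code above, and the work times are chosen just to keep the demo short.

```go
package main

import (
	"context"
	"log"
	"time"
)

// Result is the same wrapper type as in the answer.
type Result struct {
	ID     string
	Result bool
}

// worker is a hypothetical generalisation of foo() and bar(): the ID and the
// value to report are passed in instead of being hard-coded.
func worker(ctx context.Context, id string, pretendWorkMs int, outcome bool, resch chan<- Result) {
	log.Printf("%s started...", id)
	for i := 0; i < pretendWorkMs; i++ {
		time.Sleep(time.Millisecond)
		select {
		case <-ctx.Done():
			log.Printf("%s terminated.", id)
			return
		default:
		}
	}
	log.Printf("%s finished", id)
	resch <- Result{ID: id, Result: outcome}
}

// runDemo (also hypothetical) starts two workers and returns the first result.
func runDemo() Result {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // tells the slower worker to stop once we return

	resch := make(chan Result, 2) // buffered: one slot per worker
	go worker(ctx, "foo", 20, false, resch)
	go worker(ctx, "bar", 2000, true, resch)
	return <-resch
}

func main() {
	r := runDemo()
	log.Printf("Result of %s: %v", r.ID, r.Result)
}
```

With this shape, adding a third job is one more go worker(...) line plus a larger channel buffer, rather than another copied function.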

And now executing the jobs, and terminating the rest early as soon as one returns the result we are watching for (here, false), looks like this:

ctx, cancel := context.WithCancel(context.Background())
defer cancel()

resch := make(chan Result, 2)

log.Println("Kicking off workers...")
go foo(ctx, 3000, resch)
go bar(ctx, 5000, resch)

for i := 0; i < cap(resch); i++ {
    result := <-resch
    log.Printf("Result of %s: %v", result.ID, result.Result)
    if !result.Result {
        cancel()
        break
    }
}
log.Println("Done.")

Running this app will output (try it on the Go Playground):

2009/11/10 23:00:00 Kicking off workers...
2009/11/10 23:00:00 bar started...
2009/11/10 23:00:00 foo started...
2009/11/10 23:00:03 foo finished
2009/11/10 23:00:03 Result of foo: false
2009/11/10 23:00:03 Done.

Some things to note. If we terminate early because of an unexpected result, cancel() is called and we break out of the loop. The remaining workers may complete their work concurrently and send their results anyway; that is not a problem because we used a buffered channel, so their sends will not block and they will end properly. And if they have not completed yet, they check ctx.Done() in their loop and terminate early, so the goroutines are cleaned up nicely.
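To see why the buffer matters, here is a small standalone sketch (not part of the solution itself). The helper sendCompletes() is a name invented for this illustration, and the 100 ms timeout is an arbitrary assumption. It sends one value on a channel that nobody reads from and reports whether the send went through.

```go
package main

import (
	"fmt"
	"time"
)

// sendCompletes is a hypothetical helper: it starts a goroutine that sends one
// value on a channel of the given capacity while nobody receives, and reports
// whether that send finished within 100 ms.
func sendCompletes(capacity int) bool {
	ch := make(chan int, capacity)
	done := make(chan struct{})
	go func() {
		// With capacity 0 this blocks forever because there is no receiver;
		// the goroutine is deliberately left stuck to show the problem.
		ch <- 1
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(100 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println("buffered send completes:", sendCompletes(1))
	fmt.Println("unbuffered send completes:", sendCompletes(0))
}
```

The buffered case reports true and the unbuffered case reports false. In the solution above, a worker that finishes after main() has stopped receiving is in the same position as the sender here, which is why resch has one buffer slot per worker.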

Also note that the output of the above code does not include "bar terminated.". This is because the main() function returns right after the loop, and once main() returns, the program exits without waiting for other goroutines to complete. For details, see "No output from goroutine in Go". If the app did not terminate immediately, we would see that line printed too. For example, if we add a time.Sleep() at the end of main():

log.Println("Done.")
time.Sleep(3 * time.Millisecond)

Output will be (try it on the Go Playground):

2009/11/10 23:00:00 Kicking off workers...
2009/11/10 23:00:00 bar started...
2009/11/10 23:00:00 foo started...
2009/11/10 23:00:03 foo finished
2009/11/10 23:00:03 Result of foo: false
2009/11/10 23:00:03 Done.
2009/11/10 23:00:03 bar terminated.

Now if you must wait for all workers to end either "normally" or "early" before moving on, you can achieve that in many ways.

One way is to use a sync.WaitGroup. For an example, see "Prevent the main() function from terminating before goroutines finish in Golang". Another way is to have each worker send a Result no matter how it ends, where Result could contain the termination condition, e.g. normal or aborted. The main() goroutine could then continue the receive loop until it has received n values from resch. If you choose this solution, you must ensure each worker sends a value even if a panic occurs (e.g. by sending from a deferred function), so that main() is not left blocked.
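A minimal sketch of the second approach might look like the following. It rests on several assumptions: the Status type, its three values, the extra Status field on Result, and the worker() and runAll() functions are all hypothetical names introduced here, and the work times are shortened for the demo. The deferred function also recovers from a panic, because otherwise the panic would still crash the whole program after the send.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Status is a hypothetical termination condition carried in each Result.
type Status string

const (
	Finished Status = "finished"
	Aborted  Status = "aborted"
	Panicked Status = "panicked"
)

// Result extends the answer's wrapper with a Status field (an assumption).
type Result struct {
	ID     string
	Result bool
	Status Status
}

// worker always sends exactly one Result, however it ends.
func worker(ctx context.Context, id string, pretendWorkMs int, outcome bool, resch chan<- Result) {
	res := Result{ID: id, Status: Aborted}
	defer func() {
		if r := recover(); r != nil {
			res.Status = Panicked
		}
		resch <- res // runs on normal return, early return and panic
	}()
	for i := 0; i < pretendWorkMs; i++ {
		time.Sleep(time.Millisecond)
		select {
		case <-ctx.Done():
			return // res still says Aborted
		default:
		}
	}
	res.Result, res.Status = outcome, Finished
}

// runAll starts the workers and returns only after every one has reported.
func runAll() []Result {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	const n = 2
	resch := make(chan Result, n)
	go worker(ctx, "foo", 20, false, resch)
	go worker(ctx, "bar", 2000, true, resch)

	results := make([]Result, 0, n)
	for i := 0; i < n; i++ {
		r := <-resch
		if r.Status == Finished && !r.Result {
			cancel() // ask the others to stop, but keep receiving
		}
		results = append(results, r)
	}
	return results
}

func main() {
	for _, r := range runAll() {
		fmt.Printf("%s: status=%s result=%v\n", r.ID, r.Status, r.Result)
	}
}
```

With these timings, foo should report finished with result false, after which bar should report aborted. The main difference from the earlier loop is that it no longer breaks after cancel(); it keeps receiving until all n workers have checked in.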

icza answered Nov 15 '22 06:11