
Scheduled polling task in Go

Tags:

go

I have written some code that will concurrently poll URLs every 30 minutes:

func (obj *MyObj) Poll() {
    for {
        for _, url := range obj.UrlList {
            //Download the current contents of the URL and do something with it
        }
        time.Sleep(30 * time.Minute)
    }
}

//Start the routine in another function
go obj.Poll()

How would I then add to obj.UrlList elsewhere in the code and ensure that, the next time the URLs are polled, the UrlList seen by the Poll goroutine has also been updated and so includes the new URL?

I understand that in Go memory should be shared by communicating rather than the other way round, and I've investigated channels, but I'm not sure how to apply them in this example.

Thomas Denney asked Jun 03 '13 18:06



3 Answers

Here's an untested but safe model for periodically fetching some URLs, with the ability to add new URLs to the list dynamically. What would be required to remove a URL as well should be apparent from the same pattern.

type harvester struct {
    ticker *time.Ticker // periodic ticker
    add    chan string  // new URL channel
    urls   []string     // current URLs
}

func newHarvester() *harvester {
    rv := &harvester{
        ticker: time.NewTicker(time.Minute * 30),
        add:    make(chan string),
    }
    go rv.run()
    return rv
}

func (h *harvester) run() {
    for {
        select {
        case <-h.ticker.C:
            // When the ticker fires, it's time to harvest
            for _, u := range h.urls {
                harvest(u)
            }
        case u := <-h.add:
            // At any time (other than when we're harvesting),
            // we can process a request to add a new URL
            h.urls = append(h.urls, u)
        }
    }
}

func (h *harvester) AddURL(u string) {
    // Adding a new URL is as simple as tossing it onto a channel.
    h.add <- u
}
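
As a sketch of how removal could fit the same pattern (this is not part of the original answer): the variant below handles a second channel in the same select, so only the run goroutine ever touches the slice. The names urlPoller, RemoveURL, URLs, Stop, the fetch callback and demo are all hypothetical, introduced purely for illustration.

```go
package main

import "time"

// urlPoller is a hypothetical variant of the harvester that also supports
// removing URLs, reading the current list, and shutting down.
type urlPoller struct {
	ticker *time.Ticker
	add    chan string
	remove chan string
	snap   chan chan []string // requests for a copy of the current list
	done   chan struct{}
	urls   []string
	fetch  func(string) // stand-in for the real download step
}

func newURLPoller(interval time.Duration, fetch func(string)) *urlPoller {
	p := &urlPoller{
		ticker: time.NewTicker(interval),
		add:    make(chan string),
		remove: make(chan string),
		snap:   make(chan chan []string),
		done:   make(chan struct{}),
		fetch:  fetch,
	}
	go p.run()
	return p
}

// run owns p.urls; no other goroutine reads or writes the slice directly.
func (p *urlPoller) run() {
	defer p.ticker.Stop()
	for {
		select {
		case <-p.ticker.C:
			for _, u := range p.urls {
				p.fetch(u)
			}
		case u := <-p.add:
			p.urls = append(p.urls, u)
		case u := <-p.remove:
			// Drop the first matching entry, if there is one.
			for i, v := range p.urls {
				if v == u {
					p.urls = append(p.urls[:i], p.urls[i+1:]...)
					break
				}
			}
		case reply := <-p.snap:
			// Hand back a copy so callers never share the backing array.
			reply <- append([]string(nil), p.urls...)
		case <-p.done:
			return
		}
	}
}

func (p *urlPoller) AddURL(u string)    { p.add <- u }
func (p *urlPoller) RemoveURL(u string) { p.remove <- u }
func (p *urlPoller) Stop()              { close(p.done) }

func (p *urlPoller) URLs() []string {
	reply := make(chan []string)
	p.snap <- reply
	return <-reply
}

// demo adds two URLs, removes one, and returns what is left.
func demo() []string {
	p := newURLPoller(time.Hour, func(string) {})
	defer p.Stop()
	p.AddURL("https://example.com/a")
	p.AddURL("https://example.com/b")
	p.RemoveURL("https://example.com/a")
	return p.URLs()
}
```

Because the channels are unbuffered, AddURL and RemoveURL block while a harvest is in progress. If that matters, buffering the channels or running the fetches in their own goroutines are possible adjustments.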
Dustin answered Oct 27 '22 11:10



If you need to poll at regular intervals, you should not use time.Sleep but a time.Ticker instead (or a relative such as time.After). The reason is that a sleep is just a sleep: it takes no account of the time spent on the real work in your loop, so each cycle lasts the sleep duration plus the work duration and the schedule drifts. A Ticker, by contrast, delivers ticks on a channel at a fixed interval regardless of how long the loop body takes, provided the work finishes within one interval (ticks are dropped for a receiver that falls behind).

Here's an example that's similar to yours. I put in a random jitter to illustrate the benefit of using a Ticker.

package main

import (
    "fmt"
    "math/rand"
    "time"
)

func Poll() {
    r := rand.New(rand.NewSource(99))
    // time.Tick returns only the channel, so this ticker cannot be stopped;
    // that is acceptable here because Poll never returns.
    c := time.Tick(10 * time.Second)
    for range c {
        //Download the current contents of the URL and do something with it
        fmt.Printf("Grab at %s\n", time.Now())
        // add a bit of jitter
        jitter := time.Duration(r.Int31n(5000)) * time.Millisecond
        time.Sleep(jitter)
    }
}

func main() {
    //go obj.Poll()
    Poll()
}

When I ran this, I found that it kept to a strict 10-second cycle in spite of the jitter.

Rick-777 answered Oct 27 '22 11:10



// Type with queue through a channel.
type MyType struct {
    queue chan []*net.URL
}

func (t *MyType) poll() {
    for urls := range t.queue {
        ...
        time.Sleep(30 * time.Minute)
    }
}

// Create instance with buffered queue.
t := MyType{make(chan []*net.URL, 25)}

go t.poll()
themue answered Oct 27 '22 09:10
