Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using Google PubSub with Golang. The most efficient (cost) way to poll the service

Tags:

We're in the process of moving from AMQP to Google's Pubsub.

The docs suggest that pull might be the best choice for us since we're using compute engine and can't open our workers to receive via the push service.

It also says that pull might incur additional costs depending on usage:

If polling is used, high network usage may be incurred if you are opening connections frequently and closing them immediately.

We'd created a test subscriber in go that runs in a loop as so:

func main() {
    jsonKey, err := ioutil.ReadFile("pubsub-key.json")
    if err != nil {
        log.Fatal(err)
    }
    conf, err := google.JWTConfigFromJSON(
        jsonKey,
        pubsub.ScopeCloudPlatform,
        pubsub.ScopePubSub,
    )
    if err != nil {
        log.Fatal(err)
    }
    ctx := cloud.NewContext("xxx", conf.Client(oauth2.NoContext))

    msgIDs, err := pubsub.Publish(ctx, "topic1", &pubsub.Message{
        Data: []byte("hello world"),
    })

    if err != nil {
        log.Println(err)
    }

    log.Printf("Published a message with a message id: %s\n", msgIDs[0])

    for {
        msgs, err := pubsub.Pull(ctx, "subscription1", 1)
        if err != nil {
            log.Println(err)
        }

        if len(msgs) > 0 {
            log.Printf("New message arrived: %v, len: %d\n", msgs[0].ID, len(msgs))
            if err := pubsub.Ack(ctx, "subscription1", msgs[0].AckID); err != nil {
                log.Fatal(err)
            }
            log.Println("Acknowledged message")
            log.Printf("Message: %s", msgs[0].Data)
        }
    }
}

The question I have though is really whether this is the correct / recommended way to go about pulling messages.

We recieve about 100msg per second throughout the day. I'm not sure if running it in an endless loop is going to bankrupt us and can't find any other decent go examples.

like image 377
simonmorley Avatar asked Apr 11 '16 15:04

simonmorley


1 Answers

In general, the key to pull subscribers in Cloud Pub/Sub is to make sure you always have at least a few outstanding Pull requests with max_messages set to a value that works well for:

  • the rate at which you publish messages,
  • the size of those messages, and
  • the rate of messages your subscriber can process messages.

As soon as a pull request returns, you should issue another one. That means processing and acking the messages returned to you in the pull response asynchronously (or starting up the new pull request asynchronously). If you ever find that throughput or latency isn't what you expect, the first thing to do is add more concurrent pull requests.

The statement "if polling is used, high network usage may be incurred if you are opening connections frequently and closing them immediately" applies if your publish rate is extremely low. Imagine you only publish two or three messages in a day, but you constantly poll with pull requests. Every one of those pull requests incurs a cost for making the request, but you won't get any messages to process except for the few times when you actually have a message, so the "cost per message" is fairly high. If you are publishing at a pretty steady rate and your pull requests are returning a non-zero number of messages, then the network usage and costs will be in line with the message rate.

like image 180
Kamal Aboul-Hosn Avatar answered Oct 13 '22 01:10

Kamal Aboul-Hosn