
Explain Cost of Google Cloud PubSub when used with Cloud Dataflow

The documentation on Pub/Sub pricing is very minimal. Can someone explain the costs for the scenario below?

  • Size of the data per event = 0.5 KB
  • Size of data per day = 1 TB

There is only one publisher app, and there are two Dataflow pipeline subscriptions.

The very rough estimate I can come up with is:

  • 1x publishing
  • 2x subscription (1x for each subscription)
  • 2x acknowledgment (1x for each subscription ack)

The questions are:

  1. Is the total data volume per month 150 TB (30 * 1 TB * 5x)? The price calculator puts that at $8,000 per month.
  2. Does the 1 KB minimum size used in the calculation also apply to acknowledging a message?
  3. Dataflow handles subscribe/acknowledge in bundles of ParDos. But is each message in a bundle acknowledged separately?
asked Jan 12 '18 17:01 by mmziyad





1 Answer

You do not pay for acknowledgements in Google Cloud Pub/Sub, only for publishes, pulls, and pushes. With messages of size 0.5KB, the amount you are charged depends on batching because of the 1KB minimum size. If every request carried at least 1KB, the total cost for publishing and delivering messages to two subscribers would be:

1TB/day * 30 days * 3 = 92,160GB/month

10GB * $0 + 92,150GB * $0.04 = $3,686
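The arithmetic above can be reproduced with a minimal Python sketch. It assumes the figures quoted in this answer (first 10GB free, $0.04 per GB afterwards, 1TB counted as 1024GB) and the best case where every request reaches the 1KB minimum; the function name and parameters are hypothetical, and current Pub/Sub prices may differ.

```python
GB_PER_TB = 1024  # the answer's figure of 92,160GB implies 1TB = 1024GB


def monthly_pubsub_cost(tb_per_day, days=30, operations=3,
                        free_gb=10, price_per_gb=0.04):
    """Return (total GB per month, estimated cost in dollars).

    operations=3 counts one publish plus one delivery to each of the
    two subscriptions; acknowledgements are not billed.
    free_gb and price_per_gb are the values quoted in this answer,
    not necessarily the current price list.
    """
    total_gb = tb_per_day * GB_PER_TB * days * operations
    billable_gb = max(total_gb - free_gb, 0)
    return total_gb, billable_gb * price_per_gb


total_gb, cost = monthly_pubsub_cost(1)
print(f"{total_gb:,} GB/month -> ${cost:,.0f}")  # 92,160 GB/month -> $3,686
```

Changing `operations` to 5 would reproduce the question's estimate that also counts acknowledgements, which is why that figure comes out higher.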

If some messages were not batched, the price could go up because of the 1KB minimum. The Google Cloud Pub/Sub client library batches published messages by default, so unless your messages are published very sporadically (too infrequently to be batched together), your requests should reach the 1KB minimum. With this amount of data, you will probably end up with batching on the subscribe side as well.
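To illustrate why batching matters, here is a rough Python sketch of the 1KB minimum. It assumes the minimum is applied per request, that 1KB means 1024 bytes, and that a 0.5KB message is 512 bytes; the helper names are hypothetical and this is not the official billing formula.

```python
MIN_BILLABLE_BYTES = 1024  # assumed: 1KB minimum applied per request


def billable_bytes(message_bytes, messages_per_request):
    """Bytes billed for one request under the assumed 1KB minimum."""
    request_bytes = message_bytes * messages_per_request
    return max(request_bytes, MIN_BILLABLE_BYTES)


def inflation_factor(message_bytes, messages_per_request):
    """Ratio of billed bytes to bytes actually sent in one request."""
    sent = message_bytes * messages_per_request
    return billable_bytes(message_bytes, messages_per_request) / sent


# A single 512-byte message per request is billed as 1024 bytes.
print(inflation_factor(512, 1))   # 2.0
# Two or more 512-byte messages per request are billed at actual size.
print(inflation_factor(512, 2))   # 1.0
```

Under these assumptions, the worst case (no batching at all) roughly doubles the billed volume, while any batch of two or more 0.5KB messages is billed at its real size.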

answered Nov 16 '22 23:11 by Kamal Aboul-Hosn
