I'm relatively new to GCP and just starting to set up and evaluate my organization's architecture on GCP.
Scenario:
Data will flow into a Pub/Sub topic (high frequency, low volume of data). The goal is to move that data into Bigtable. From my understanding, you can do that either by having a Cloud Function trigger on the topic or with Dataflow.
Now, I have previous experience with Cloud Functions, which I am satisfied with, so that would be my pick.
But I fail to see the benefit of choosing one over the other. So my question is: when should I choose which of these products?
Thanks
Dataproc should be used if the processing has any dependencies on tools in the Hadoop ecosystem. Dataflow/Beam provides a clear separation between processing logic and the underlying execution engine.
The key benefits of the Cloud Dataflow service include:

- Elimination of operational overhead for data-engineering workloads.
- Low latency for building streaming data pipelines.
- Cost optimization for sudden spikes in workload.
Dataflow templates allow you to easily share your pipelines with team members and across your organization or take advantage of many Google-provided templates to implement simple but useful data processing tasks. This includes Change Data Capture templates for streaming analytics use cases.
The Apache Beam SDK is an open source programming model that enables you to develop both batch and streaming pipelines. You create your pipelines with an Apache Beam program and then run them on the Dataflow service.
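For illustration, here is a minimal sketch of what such a streaming pipeline could look like in the Beam Python SDK. The project, topic, instance, and table names are placeholders, and the message format (a JSON payload carrying a `row_key` and a `value`) is an assumption about your data:

```python
import json

import apache_beam as beam
from apache_beam.io.gcp.bigtableio import WriteToBigTable
from apache_beam.options.pipeline_options import PipelineOptions
from google.cloud.bigtable.row import DirectRow


def to_bigtable_row(message: bytes) -> DirectRow:
    """Convert a Pub/Sub message payload into a Bigtable row mutation."""
    payload = json.loads(message)  # assumed: JSON with "row_key" and "value"
    row = DirectRow(row_key=payload["row_key"].encode())
    row.set_cell("events", b"value", payload["value"].encode())
    return row


options = PipelineOptions(
    streaming=True,
    runner="DataflowRunner",      # or "DirectRunner" for local testing
    project="my-project",         # placeholder
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
            topic="projects/my-project/topics/my-topic")
        | "ToBigtableRow" >> beam.Map(to_bigtable_row)
        | "WriteToBigtable" >> WriteToBigTable(
            project_id="my-project",
            instance_id="my-instance",
            table_id="my-table")
    )
```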
Both solutions could work. Dataflow will scale better if your Pub/Sub traffic grows to large amounts of data, but Cloud Functions should work fine for low amounts of data; I would look at this page (especially the rate-limits section) to ensure that you fit within the Cloud Functions quotas: https://cloud.google.com/functions/quotas
Another thing to consider is that Dataflow can guarantee exactly-once processing of your data, so that no duplicates end up in Bigtable. Cloud Functions will not do this for you out of the box: Pub/Sub delivery is at-least-once, so the function may be invoked more than once for the same message. If you go with a Functions approach, you will want to make sure that the Pub/Sub message deterministically selects which Bigtable cell is written to; that way, if the function gets retried several times, the same data will simply overwrite the same Bigtable cell.
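As a rough sketch of that idempotent pattern, a Cloud Function along these lines derives both the row key and the cell timestamp from the message itself, so a retried delivery rewrites exactly the same cell instead of creating a duplicate. The `user_id`, `event_id`, and `event_ts` fields are assumptions about your payload, and the project/instance/table IDs are placeholders:

```python
import base64
import datetime
import json

import functions_framework
from google.cloud import bigtable

# Placeholders: substitute your own project, instance, and table IDs.
client = bigtable.Client(project="my-project")
table = client.instance("my-instance").table("my-table")


@functions_framework.cloud_event
def pubsub_to_bigtable(cloud_event):
    # Pub/Sub delivers the payload base64-encoded inside the CloudEvent.
    payload = json.loads(base64.b64decode(cloud_event.data["message"]["data"]))

    # Deterministic row key AND cell timestamp, both derived from the
    # message: a redelivery writes the identical cell, so duplicates
    # collapse into a single value.
    row_key = f"{payload['user_id']}#{payload['event_id']}".encode()
    ts = datetime.datetime.fromtimestamp(
        payload["event_ts"], tz=datetime.timezone.utc)

    row = table.direct_row(row_key)
    row.set_cell("events", b"payload", json.dumps(payload).encode(), timestamp=ts)
    row.commit()
```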
Your needs sound relatively straightforward, and Dataflow may be overkill for what you're trying to do. If Cloud Functions do what you need, then maybe stick with that. Often I find that simplicity is key when it comes to maintainability.
However, when you need to perform transformations, like merging these events by user before storing them in Bigtable, that's where Dataflow really shines:
https://beam.apache.org/documentation/programming-guide/#groupbykey
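As a self-contained sketch of what GroupByKey buys you (Beam Python SDK, with a bounded toy input standing in for the parsed Pub/Sub stream), all events sharing a user key get collected together before the downstream write:

```python
import apache_beam as beam

with beam.Pipeline() as pipeline:
    (
        pipeline
        # Toy bounded input; in the real pipeline this would be the
        # parsed Pub/Sub stream, keyed by user.
        | "CreateEvents" >> beam.Create([
            ("alice", {"page": "/home"}),
            ("alice", {"page": "/cart"}),
            ("bob", {"page": "/home"}),
        ])
        # In a streaming pipeline you would window first, e.g.
        # beam.WindowInto(beam.window.FixedWindows(60)), since GroupByKey
        # over an unbounded source requires windowing or triggers.
        | "GroupByUser" >> beam.GroupByKey()
        | "ToList" >> beam.MapTuple(lambda user, events: (user, list(events)))
        # -> ("alice", [{...}, {...}]), ("bob", [{...}])
        | "Print" >> beam.Map(print)
    )
```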