I have a batch job that takes a couple of hours to run. How can I run this in a serverless way on Google Cloud?
App Engine, Cloud Functions, and Cloud Run are limited to 10-15 minutes. I don't want to rewrite my code in Apache Beam.
Is there an equivalent to AWS Batch on Google Cloud?
Try Google Cloud Batch, a fully managed service for running batch jobs and the closest equivalent to AWS Batch on Google Cloud. Batch processing is as old as computing itself, with the term 'batch' dating back to the punchcards used by early mainframes; it uses resources very efficiently and remains the preferred way of running jobs that don't need much human interaction.
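Batch runs your container on managed Compute Engine resources with no cluster for you to operate, and jobs can run for hours. Here is a minimal sketch of submitting a containerized job, assuming you have already pushed an image to gcr.io; the job name, region, image URI, resource sizes, and timeout below are placeholders, not values from the question:

# Describe the job: one task that runs your container, with a generous timeout
cat > job.json <<'EOF'
{
  "taskGroups": [
    {
      "taskCount": 1,
      "taskSpec": {
        "runnables": [
          { "container": { "imageUri": "gcr.io/PROJECT_ID/my-batch-job:latest" } }
        ],
        "computeResource": { "cpuMilli": 2000, "memoryMib": 4096 },
        "maxRunDuration": "14400s"
      }
    }
  ],
  "logsPolicy": { "destination": "CLOUD_LOGGING" }
}
EOF

# Submit it; Batch provisions the VMs, runs the container, and tears everything down
gcloud batch jobs submit my-batch-job \
--location=us-central1 \
--config=job.json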
What if you want to run a long-running batch job in a serverless way? Put your code in a Docker container, run it using AI Platform (now Vertex AI) Training, and schedule it with Cloud Scheduler. AI Platform Training can run any arbitrary Docker container; it doesn't have to be a machine learning job.
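Containerizing the job is just standard Docker plus a push to your project's registry. A minimal sketch, where the project ID and image name are placeholders:

# One-time: let Docker authenticate to Google Container Registry
gcloud auth configure-docker

# Build the image from the directory containing your Dockerfile and batch code
docker build -t gcr.io/PROJECT_ID/my-batch-job:v1 .

# Push it so the managed service can pull it
docker push gcr.io/PROJECT_ID/my-batch-job:v1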
Note: Cloud Run and Cloud Functions can now last up to 60 minutes. The answer below remains a viable approach if you have a multi-hour job.
Vertex AI Training is serverless and supports long-running jobs. Wrap your batch-processing code in a Docker container, push it to gcr.io, and then do:
gcloud ai custom-jobs create \
--region=LOCATION \
--display-name=JOB_NAME \
--worker-pool-spec=machine-type=MACHINE_TYPE,replica-count=REPLICA_COUNT,executor-image-uri=EXECUTOR_IMAGE_URI,local-package-path=WORKING_DIRECTORY,script=SCRIPT_PATH
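Note that the worker-pool-spec above is the autopackaging form: gcloud builds a container for you from a local script, using the executor image as the base. If you have already built and pushed your own image to gcr.io as described, the spec can instead point straight at that image. A minimal sketch, with the machine type and image URI as placeholder assumptions:

gcloud ai custom-jobs create \
--region=us-central1 \
--display-name=my-batch-job \
--worker-pool-spec=machine-type=n1-standard-4,replica-count=1,container-image-uri=gcr.io/PROJECT_ID/my-batch-job:v1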
You can run any arbitrary Docker container — it doesn’t have to be a machine learning job. For details, see:
https://cloud.google.com/vertex-ai/docs/training/create-custom-job#create_custom_job-gcloud
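For the Cloud Scheduler part of the recipe, one option is an HTTP scheduler job that calls the Vertex AI customJobs REST endpoint with an OAuth token. A rough sketch, assuming a service account with permission to create custom jobs; every name, region, and the image URI below are placeholders:

# request.json: the CustomJob to create on each run
cat > request.json <<'EOF'
{
  "displayName": "nightly-batch-job",
  "jobSpec": {
    "workerPoolSpecs": [
      {
        "machineSpec": { "machineType": "n1-standard-4" },
        "replicaCount": 1,
        "containerSpec": { "imageUri": "gcr.io/PROJECT_ID/my-batch-job:v1" }
      }
    ]
  }
}
EOF

# Run every night at 02:00, POSTing the job spec to the Vertex AI API
gcloud scheduler jobs create http nightly-batch-trigger \
--location=us-central1 \
--schedule="0 2 * * *" \
--uri="https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/customJobs" \
--http-method=POST \
--headers=Content-Type=application/json \
--oauth-service-account-email=SA_NAME@PROJECT_ID.iam.gserviceaccount.com \
--message-body-from-file=request.json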
Google Cloud does not offer a comparable product to AWS Batch (see https://cloud.google.com/docs/compare/aws/#service_comparisons).
Instead, you'll need to use Cloud Tasks or Pub/Sub to delegate the work to another product, such as Compute Engine, but that approach is not fully serverless.
This answer to "How to make GCE instance stop when its deployed container finishes?" will work for you as well:
In short: start a Compute Engine instance whose startup script runs your container and then shuts the instance down when it finishes.
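A minimal sketch of that pattern, assuming a Container-Optimized OS image and a container already pushed to gcr.io; the instance name, zone, and image URI are placeholders:

# Create a VM whose startup script runs the container and then powers the VM off
gcloud compute instances create batch-worker \
--zone=us-central1-a \
--image-family=cos-stable \
--image-project=cos-cloud \
--scopes=cloud-platform \
--metadata=startup-script='#! /bin/bash
docker-credential-gcr configure-docker
docker run gcr.io/PROJECT_ID/my-batch-job:v1
shutdown -h now'

The instance stops (rather than being deleted) when the container exits, so you stop paying for compute while keeping the disk around for inspection.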