
How do I run a serverless batch job in Google Cloud

I have a batch job that takes a couple of hours to run. How can I run this in a serverless way on Google Cloud?

AppEngine, Cloud Functions, and Cloud Run are limited to 10-15 minutes. I don't want to rewrite my code in Apache Beam.

Is there an equivalent to AWS Batch on Google Cloud?

Lak asked Sep 30 '19 06:09



3 Answers

Note: Cloud Run and Cloud Functions can now last up to 60 minutes. The answer below remains a viable approach if you have a multi-hour job.

Vertex AI Training is serverless and supports long-running jobs. Wrap your batch processing code in a Docker container, push it to gcr.io, and then run:

gcloud ai custom-jobs create \
  --region=LOCATION \
  --display-name=JOB_NAME \
  --worker-pool-spec=machine-type=MACHINE_TYPE,replica-count=REPLICA_COUNT,container-image-uri=CUSTOM_CONTAINER_IMAGE_URI

You can run any arbitrary Docker container — it doesn’t have to be a machine learning job. For details, see:

https://cloud.google.com/vertex-ai/docs/training/create-custom-job#create_custom_job-gcloud
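
The "wrap and push" step above could look roughly like the following sketch. It assumes a Dockerfile in the current directory and an authenticated gcloud installation; `PROJECT_ID`, `IMAGE_NAME`, and the `build_and_push` helper are hypothetical placeholders, not part of the original answer.

```shell
#!/bin/bash
# Assumed placeholders: set PROJECT_ID and IMAGE_NAME to your own values.
IMAGE_URI="gcr.io/${PROJECT_ID:-PROJECT_ID}/${IMAGE_NAME:-IMAGE_NAME}"

build_and_push() {
  # Let docker authenticate to the registry through gcloud,
  # build the image from ./Dockerfile, then push it.
  # Each step runs only if the previous one succeeded.
  gcloud auth configure-docker --quiet &&
    docker build -t "$IMAGE_URI" . &&
    docker push "$IMAGE_URI"
}

# Call build_and_push once the placeholders are filled in.
```

After `build_and_push` succeeds, `$IMAGE_URI` is the image to reference when creating the custom job.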

Lak answered Sep 20 '22 01:09


Google Cloud does not offer a comparable product to AWS Batch (see https://cloud.google.com/docs/compare/aws/#service_comparisons).

Instead, you'll need to use Cloud Tasks or Pub/Sub to delegate the work to another product, such as Compute Engine, but that approach is not "serverless".

Dustin Ingram answered Sep 23 '22 01:09


The answer to "How to make GCE instance stop when its deployed container finishes?" will work for you as well:

In short:

  • First dockerize your batch process.
  • Then, create an instance:
    • Using a container-optimized image
    • And using a startup script that pulls your Docker image, runs it, and shuts down the machine at the end.
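
The startup script described above might be sketched as follows. This is a minimal illustration, not the linked answer's exact script: `IMAGE` and `run_batch_job` are hypothetical names, and it assumes the VM can already pull the image (a public image, or registry credentials configured on the instance).

```shell
#!/bin/bash
# Assumed placeholder: replace with the path of your own batch image.
IMAGE="${IMAGE:-gcr.io/PROJECT_ID/IMAGE_NAME}"

run_batch_job() {
  local status=0
  # Pull the image and run the batch job to completion;
  # remember the exit code of whichever step failed, if any.
  docker pull "$IMAGE" && docker run --rm "$IMAGE" || status=$?
  # Power off the VM whether the job succeeded or failed,
  # so the instance stops once the work is over.
  shutdown -h now
  return "$status"
}

# In the real startup script, end the file with a call to: run_batch_job
```

One way to attach such a script, again as an assumption, is `gcloud compute instances create` with `--metadata-from-file=startup-script=FILE` and a Container-Optimized OS image (`--image-family=cos-stable --image-project=cos-cloud`).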

Iñigo González answered Sep 19 '22 01:09