
Why are there insufficient accelerators when I run gcloud ml-engine jobs?

I'm trying to run a machine learning job on Google Cloud, but it always tells me that there are insufficient accelerators available. I've tried the --scale-tier parameter with each of BASIC, BASIC_GPU, STANDARD_1, and PREMIUM_1, and the result is the same.

Here is the command and result:

gcloud ml-engine jobs submit training object_detection_`date +%s` \
    --job-dir=gs://${TRAIN_DIR} \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
    --module-name object_detection.train \
    --region us-central1 \
    --config ${PATH_TO_LOCAL_YAML_FILE} \
    -- \
    --train_dir=gs://${TRAIN_DIR} \
    --pipeline_config_path=gs://${PIPELINE_CONFIG_PATH}
ERROR: (gcloud.ml-engine.jobs.submit.training) RESOURCE_EXHAUSTED: Field: scale_tier Error: Insufficient accelerators are available in region us-central1 to schedule the job which requests 6 K80 accelerators. Please wait and try again or else try submitting your job to a different region.
- '@type': type.googleapis.com/google.rpc.BadRequest
  fieldViolations:
  - description: Insufficient accelerators are available in region us-central1 to
      schedule the job which requests 6 K80 accelerators. Please wait and try again
      or else try submitting your job to a different region.
    field: scale_tier
Isidro Martínez asked Jan 04 '23 14:01


1 Answer

GPUs are in high demand in us-central1. I suggest running your job in us-east1, if possible, in the near term until more GPUs become available.
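As a rough sketch of that suggestion, the submit command from the question can be wrapped in a loop that tries one region after another until a submission is accepted. The function name submit_with_fallback and the REGIONS and GCLOUD variables are hypothetical names introduced here for illustration; the flags are copied from the question, and the environment variables (TRAIN_DIR, PIPELINE_CONFIG_PATH, PATH_TO_LOCAL_YAML_FILE) are assumed to be set as in the original command. Whether a given region has free GPUs at any moment is not something this script can know in advance.

```shell
# Regions to try, in order. Both appear in this thread; adjust as needed.
# (REGIONS and GCLOUD are illustrative variables, not gcloud settings.)
REGIONS="${REGIONS:-us-east1 us-central1}"
GCLOUD="${GCLOUD:-gcloud}"

# Try each region in turn; print the region that accepted the job.
submit_with_fallback() {
  local region job_name
  for region in ${REGIONS}; do
    # Fresh job name per attempt, same pattern as in the question.
    job_name="object_detection_$(date +%s)"
    echo "Submitting ${job_name} to ${region}" >&2
    # Send gcloud's own output to stderr so stdout carries only the region.
    if "${GCLOUD}" ml-engine jobs submit training "${job_name}" \
        --job-dir="gs://${TRAIN_DIR}" \
        --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
        --module-name object_detection.train \
        --region "${region}" \
        --config "${PATH_TO_LOCAL_YAML_FILE}" \
        -- \
        --train_dir="gs://${TRAIN_DIR}" \
        --pipeline_config_path="gs://${PIPELINE_CONFIG_PATH}" >&2; then
      echo "${region}"
      return 0
    fi
  done
  echo "No region in '${REGIONS}' accepted the job" >&2
  return 1
}
```

After defining the function, running submit_with_fallback submits the job to the first region in the list that does not reject it. Listing us-east1 first follows the advice above; keeping us-central1 as a second choice is just an assumption that capacity there may free up later.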

rhaertel80 answered May 09 '23 13:05