
Unable to pull Artifact Registry private images in newly created GKE cluster

I cannot pull Artifact Registry images into a newly created GKE cluster that was provisioned with Terraform and uses a user-defined service account.

The Terraform used to stand up the cluster is below.

locals {
  service         = "example"
  resource_prefix = format("%s-%s", local.service, var.env)
  location        = format("%s-b", var.gcp_region)
}

resource "google_service_account" "main" {
  account_id   = format("%s-sa", local.resource_prefix)
  display_name = format("%s-sa", local.resource_prefix)
  project      = var.gcp_project
}

resource "google_container_cluster" "main" {
  name                     = local.resource_prefix
  description              = format("Cluster primarily servicing the service %s", local.service)
  location                 = local.location
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "main" {
  name       = format("%s-node-pool", local.resource_prefix)
  location   = local.location
  cluster    = google_container_cluster.main.name
  node_count = var.gke_cluster_node_count

  node_config {
    preemptible  = true
    machine_type = var.gke_node_machine_type
    # Google recommends custom service accounts that have cloud-platform scope and permissions granted via IAM Roles.
    service_account = google_service_account.main.email
    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
      "https://www.googleapis.com/auth/cloud-platform",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/servicecontrol",
      "https://www.googleapis.com/auth/service.management.readonly",
      "https://www.googleapis.com/auth/trace.append"
    ]
  }

  autoscaling {
    min_node_count = var.gke_cluster_autoscaling_min_node_count
    max_node_count = var.gke_cluster_autoscaling_max_node_count
  }
}

I run a Helm deployment to deploy an application and get the following error.

default       php-5996c7fbfd-d6xf5                                             0/1     ImagePullBackOff             0          37m
Normal   Pulling    36m (x4 over 37m)      kubelet            Pulling image "europe-docker.pkg.dev/example-999999/eu.gcr.io/example-php-fpm:latest"
  Warning  Failed     36m (x4 over 37m)      kubelet            Failed to pull image "europe-docker.pkg.dev/example-999999/eu.gcr.io/example-php-fpm:latest": rpc error: code = Unknown desc = failed to pull and unpack image "europe-docker.pkg.dev/example-999999/eu.gcr.io/example-php-fpm:latest": failed to resolve reference "europe-docker.pkg.dev/example-999999/eu.gcr.io/example-php-fpm:latest": failed to authorize: failed to fetch oauth token: unexpected status: 403 Forbidden
  Warning  Failed     36m (x4 over 37m)      kubelet            Error: ErrImagePull
  Warning  Failed     35m (x6 over 37m)      kubelet            Error: ImagePullBackOff

It seems to me that I've missed something to do with the service account. Using cloud SSH I am able to generate an OAuth token, but pulling with crictl using that token does not work either.

UPDATE: issue resolved

I have been able to resolve my problem with the following additional terraform code.

resource "google_project_iam_member" "artifact_role" {
  role    = "roles/artifactregistry.reader"
  member  = "serviceAccount:${google_service_account.main.email}"
  project = var.gcp_project
}
David asked Oct 19 '25 08:10



2 Answers

As the error says: unexpected status: 403 Forbidden

You might be having an issue with the Deployment's image pull secret.

For GKE you can use a service account JSON key.

Ref doc: https://cloud.google.com/container-registry/docs/advanced-authentication#json-key

With Terraform, you can create a secret in GKE and then use it in the Deployment:

resource "kubernetes_secret" "gcr" {
  type = "kubernetes.io/dockerconfigjson"

  metadata {
    name      = "gcr-image-pull"
    namespace = "default"
  }

  data = {
    ".dockerconfigjson" = jsonencode({
      auths = {
        # The key must match the registry host of the image being pulled
        # (for the image in the question that would be europe-docker.pkg.dev).
        "gcr.io" = {
          username = "_json_key"
          # private_key is base64-encoded, so decode it to get the JSON key file.
          password = base64decode(google_service_account_key.myaccount.private_key)
          email    = google_service_account.main.email
          auth     = base64encode("_json_key:${base64decode(google_service_account_key.myaccount.private_key)}")
        }
      }
    })
  }
}
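
The secret above refers to google_service_account_key.myaccount, which the snippet does not define. A minimal sketch of what that resource might look like, assuming the key is issued for the same service account as in the question (the resource name "myaccount" is only chosen to match the reference above):

```hcl
# Assumption: the key belongs to the node service account from the question.
# The resource name "myaccount" simply matches the reference in the secret.
resource "google_service_account_key" "myaccount" {
  service_account_id = google_service_account.main.name
}
```

Keep in mind that the generated private key ends up in the Terraform state, so the state should be stored somewhere access-controlled.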

Or use kubectl to create the secret:

kubectl create secret docker-registry gcr \
    --docker-server=gcr.io \
    --docker-username=_json_key \
    --docker-password="$(cat google-service-account-key.json)" \
    --docker-email=<Email address>

Now, if you have a Pod or Deployment, you can reference the secret in its YAML config like this:

apiVersion: v1
kind: Pod
metadata:
  name: uses-private-registry
spec:
  containers:
  - name: hello-app
    image: <image URI>
  imagePullSecrets:
  - name: secret-that-you-created

Update:

As per Guillaume's suggestion, for GKE/GCP you can follow the *Workload Identity* option as best practice; with other external repositories it might not work.

Create the IAM service account in GCP:

gcloud iam service-accounts create gke-workload-indentity \
    --project=<project-id>

Create a service account in the K8s cluster:

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: gke-workload-indentity@PROJECT_ID.iam.gserviceaccount.com
  name: gke-sa-workload
  namespace: default

For the policy binding, run the gcloud command below:

gcloud iam service-accounts add-iam-policy-binding gke-workload-indentity@PROJECT_ID.iam.gserviceaccount.com \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:PROJECT_ID.svc.id.goog[default/K8s_SANAME]"

Now you can create the Deployment or Pod with an image in GCR/Artifact Registry; just update the serviceAccountName:

spec:
      serviceAccountName: gke-sa-workload
      containers:
      - name: container
        image: IMAGE
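
Since the question manages everything with Terraform, the gcloud and YAML steps above could also be expressed there. The following is only a sketch under assumptions: the resource labels "workload" and "workload_identity" are made up for illustration, var.gcp_project is taken from the question, and the Kubernetes provider is assumed to be configured against the cluster.

```hcl
# Assumption: Workload Identity is enabled on the cluster by adding this block
# inside resource "google_container_cluster" "main":
#
#   workload_identity_config {
#     workload_pool = "${var.gcp_project}.svc.id.goog"
#   }
#
# The node pool may additionally need, inside node_config:
#
#   workload_metadata_config {
#     mode = "GKE_METADATA"
#   }

# IAM service account that the Kubernetes service account will act as.
resource "google_service_account" "workload" {
  account_id = "gke-workload-indentity" # same name as in the gcloud command above
  project    = var.gcp_project
}

# Allow the Kubernetes service account default/gke-sa-workload to impersonate it.
resource "google_service_account_iam_member" "workload_identity" {
  service_account_id = google_service_account.workload.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:${var.gcp_project}.svc.id.goog[default/gke-sa-workload]"
}

# Kubernetes service account annotated with the IAM service account's email.
resource "kubernetes_service_account" "workload" {
  metadata {
    name      = "gke-sa-workload"
    namespace = "default"
    annotations = {
      "iam.gke.io/gcp-service-account" = google_service_account.workload.email
    }
  }
}
```

As far as I know, this governs the credentials of code running inside the Pod; the image pull itself is performed by the kubelet, which authenticates as the node's service account, so that account still needs read access to the repository.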

Read more at : https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/

Harsh Manvar answered Oct 20 '25 23:10



Turning my comment into an answer, as it resolved @David's issue.

Because a user-defined service account is being used for the node pool, the appropriate roles need to be bound to that service account.

In this case: roles/artifactregistry.reader

See: Configuring Artifact Registry permissions

Best practice is to grant the minimum required roles.
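
One way to narrow the grant further is to bind the reader role on the single repository rather than on the whole project. This is a sketch, not the asker's actual fix: the repository name "eu.gcr.io" and location "europe" are assumptions inferred from the image path europe-docker.pkg.dev/example-999999/eu.gcr.io/example-php-fpm, and the resource label "node_reader" is hypothetical.

```hcl
# Assumption: repository "eu.gcr.io" in the "europe" multi-region,
# inferred from the image path shown in the question's error output.
resource "google_artifact_registry_repository_iam_member" "node_reader" {
  project    = var.gcp_project
  location   = "europe"
  repository = "eu.gcr.io"
  role       = "roles/artifactregistry.reader"
  member     = "serviceAccount:${google_service_account.main.email}"
}
```

With this in place, the node service account can pull from that one repository only, whereas the project-level google_project_iam_member binding grants read access to every repository in the project.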

GorginZ answered Oct 20 '25 22:10



