How does the GKE metadata server work in Workload Identity

I've recently been making use of the GKE Workload Identity feature. I'd be interested to know in more detail how the gke-metadata-server component works.

1. GCP client code (gcloud or the language SDKs) falls through to the GCE metadata method.
2. A request is made to http://metadata.google.internal/path.
3. (guess) Setting GKE_METADATA_SERVER on my node pool configures this to resolve to the gke-metadata-server pod on that node.
4. (guess) The gke-metadata-server pod, running with --privileged and host networking, has a means of determining the source (pod IP?) and then looks up the pod and its service account to check for the iam.gke.io/gcp-service-account annotation.
5. (guess) The proxy calls the metadata server with the pod's 'pseudo' identity set (e.g. [PROJECT_ID].svc.id.goog[[K8S_NAMESPACE]/[KSA_NAME]]) to get a token for the service account annotated on its Kubernetes service account.
6. If this account has token creator / workload ID user rights on the service account, presumably the response from GCP is a success and contains a token, which is then packaged and sent back to the calling pod for authenticated calls to other Google APIs.
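
For illustration, the first two items in the list above (the client falling through to the metadata endpoint) can be sketched in Python roughly as follows. The URL is the standard Compute Engine metadata path for the default service account's token, and that API requires the Metadata-Flavor header. The function names are hypothetical, and real client libraries wrap this call with caching and retries.

```python
import json
import urllib.request

# Standard Compute Engine metadata path for the default service account's token.
TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_token_request(url: str = TOKEN_URL) -> urllib.request.Request:
    """Build the metadata request; the Metadata-Flavor header is mandatory."""
    return urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})


def fetch_access_token(timeout: float = 5.0) -> str:
    """Fetch a token. This only resolves from inside a GCE VM or a GKE pod."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        payload = json.load(resp)
    # The JSON response carries access_token, expires_in and token_type.
    return payload["access_token"]
```

From the pod's point of view nothing else is needed: it sends no Kubernetes credential of its own, which is why the identification of the caller has to happen on the receiving side.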

I guess the main puzzle for me right now is the verification of the calling pod's identity. Originally I thought this would use the TokenReview API, but now I'm not sure how the Google client tools would know to use the service account token mounted into the pod...

Edit: follow-up questions:

Q1: In between step 2 and 3, is the request to metadata.google.internal routed to the GKE metadata proxy by the setting GKE_METADATA_SERVER on the node pool?

Q2: Why does the metadata server pod need host networking?

Q3: In the video here: https://youtu.be/s4NYEJDFc0M?t=2243 it's taken as a given that the pod makes a GCP call. How does the GKE metadata server identify the pod making the call to start the process?

asked Nov 01 '19 20:11

Charlie Egan

1 Answer

Before going into details, please familiarize yourself with these components:

OIDC provider: Runs on Google’s infrastructure, provides cluster-specific metadata and signs authorized JWTs.

GKE metadata server: Runs as a DaemonSet, meaning one instance on every node. It exposes a pod-specific metadata server (which provides backwards compatibility with old client libraries) and emulates the existing node metadata server.

Google IAM: Issues access tokens, validates bindings and validates OIDC signatures.

Google Cloud: Accepts access tokens and does pretty much anything.

JWT: JSON Web token

mTLS: Mutual Transport Layer Security

The steps below explain how GKE metadata server components work:

Step 1: An authorized user binds the cluster to the identity namespace.

Step 2: The workload tries to access a Google Cloud service using client libraries.

Step 3: The GKE metadata server requests an OIDC-signed JWT from the control plane. That connection is authenticated using mutual TLS (mTLS) with the node's credential.

Step 4: The GKE metadata server then uses that OIDC-signed JWT to request an access token for the [identity namespace]/[Kubernetes service account] from IAM. IAM validates that the appropriate bindings exist on the identity namespace and in the OIDC provider.

Step 5: IAM also validates that the JWT was signed by the cluster’s correct OIDC provider. It then returns an access token for the [identity namespace]/[Kubernetes service account].

Step 6: The metadata server sends the access token it just received back to IAM. IAM exchanges it for a short-lived GCP service account token after validating the appropriate bindings.

Step 7: The GKE metadata server returns the GCP service account token to the workload.

Step 8: The workload can then use that token to make calls to any Google Cloud service.
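
To make the chain of exchanges in steps 3 to 7 easier to follow, here is a toy in-memory model in Python. It is only a sketch: ToyIam, control_plane_sign, metadata_server_token and the token strings are all hypothetical stand-ins, and nothing here calls a real Google API or performs real signature checks. The identity namespace is assumed to have the PROJECT_ID.svc.id.goog form mentioned in the question.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIam:
    """In-memory stand-in for IAM; not a real Google API."""
    identity_namespace: str
    # Maps "namespace/ksa" to the GCP service account it is bound to.
    bindings: dict = field(default_factory=dict)

    def federated_token(self, jwt: dict) -> str:
        # Steps 4-5: check who issued the JWT, then return a token for the
        # [identity namespace]/[Kubernetes service account] identity.
        if jwt["iss"] != "toy-oidc-provider":
            raise PermissionError("JWT not issued by the cluster's OIDC provider")
        return f"federated:{self.identity_namespace}[{jwt['sub']}]"

    def gsa_token(self, federated: str) -> str:
        # Step 6: exchange for a short-lived GCP service account token,
        # but only if a binding exists for this Kubernetes service account.
        subject = federated.split("[", 1)[1].rstrip("]")
        gsa = self.bindings.get(subject)
        if gsa is None:
            raise PermissionError(f"no binding for {subject}")
        return f"gsa-token:{gsa}"


def control_plane_sign(namespace: str, ksa: str) -> dict:
    # Step 3: the control plane returns an OIDC-signed JWT (modelled as a dict).
    return {"iss": "toy-oidc-provider", "sub": f"{namespace}/{ksa}"}


def metadata_server_token(iam: ToyIam, namespace: str, ksa: str) -> str:
    jwt = control_plane_sign(namespace, ksa)   # step 3
    federated = iam.federated_token(jwt)       # steps 4-5
    return iam.gsa_token(federated)            # steps 6-7
```

The point of the model is that the workload never sees the intermediate tokens: only the final GCP service account token is handed back, and a Kubernetes service account without a binding fails at the second exchange.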

I also found a video regarding Workload Identity which you will find useful.

EDIT: Answers to your follow-up questions:

Q1: In between step 2 and 3, is the request to metadata.google.internal routed to the gke metadata proxy by the setting GKE_METADATA_SERVER on the node pool?

You are right, GKE_METADATA_SERVER is set on the node pool. It exposes a metadata API to workloads that is compatible with the V1 Compute Metadata APIs. When a workload tries to access a Google Cloud service, the GKE metadata server first performs a lookup (it checks whether its list of pods contains one whose IP matches the source IP of the incoming request) and only then requests the OIDC token from the control plane.

Keep in mind that the GKE_METADATA_SERVER setting can only be enabled on a node pool if Workload Identity is enabled at the cluster level.
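
The source-IP lookup described in this answer can be sketched as below, assuming pods are represented as plain dictionaries. The function name find_pod_by_ip and the field names (name, pod_ip) are hypothetical and chosen only for illustration; this is not the actual gke-metadata-server code.

```python
from typing import Optional


def find_pod_by_ip(pods: list[dict], source_ip: str) -> Optional[dict]:
    """Return the pod whose IP matches the request's source IP, if any."""
    for pod in pods:
        if pod.get("pod_ip") == source_ip:
            return pod
    # No pod with that IP: the request cannot be tied to a workload.
    return None
```

For example, with pods at 10.0.0.5 and 10.0.0.6, a request arriving from 10.0.0.6 resolves to the second pod, while one from an unknown address resolves to nothing and would not proceed to the OIDC token request.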

Q2: Why does the metadata server pod need host networking?

The gke-metadata-server intercepts all GCE metadata server requests from pods. Requests from pods that use the host network, however, are not intercepted.

Q3: How does the GKE metadata server identify the pod making the call to start the process?

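The pods are identified using iptables rules: requests from pods to the metadata endpoint are intercepted on the node, and the metadata server matches the source IP of each request to a pod, as described under Q1.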

answered Oct 24 '22 10:10

Md Daud Walizarif

Tags:

google-cloud-platform

kubernetes

google-kubernetes-engine

google-iam
