GCP Workload Identity Federation With External Workload

Over the past few years I have worked a lot on GKE-based workload identity. It's a feature that allows your GKE-based workloads to access GCP resources via API without storing GCP credentials in the workload.

For my recent work I've been running workloads on bare-metal Kubernetes clusters instead of GKE, so there is no more metadata service magic for me. Luckily GCP also supports workload identity federation with external workloads, meaning I can use the pre-existing cluster's OIDC issuer to federate with GCP without having to store GCP credentials as Kubernetes secrets.

Prerequisites

  • A Kubernetes cluster with OIDC discovery and service account token volume projection enabled (see the quick check after this list)
  • Access to the GCP workload identity pool
  • The examples below use Terraform for managing GCP and Kubernetes resources, but you can use other tools/methods as well.
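
If you are not sure whether token projection works on your cluster, a quick sanity check is to ask the API server to mint an audience-scoped token; this is a minimal sketch, assuming kubectl 1.24+ and a default service account in the current namespace:

# Prints a short-lived JWT if service account token projection is available
kubectl create token default --audience=https://example.com/test --duration=10m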

How it works

Workload identity federation supports two types of authentication:

  1. Direct resource access - which is what we are using here, where the workload exchanges its JWT token with the STS service for a short-lived federated access token and uses that token directly to access the resource.
  2. IAM service account impersonation - where the workload uses the JWT token to get an STS token, and then uses the STS token to impersonate a service account and obtain an access token for accessing the resource.

Here is the diagram that shows how the service account impersonation works:

sequenceDiagram
    title Workload Identity Federation: Service Account Impersonation
    participant kubelet as Kubelet
    participant P as Pod

    participant STS as GCP STS Service
    participant IAM as GCP IAM Service
    participant GCP as GCP Resources

    kubelet->>+P: Project the JWT token with audience
    kubelet->>+P: Mount the credential configuration (with no sensitive data)

    P->>+STS: Use JWT ID token to request for federated access token
    STS->>+STS: Validate the JWT token
    STS-->>-P: Return the federated access token

    P->>+IAM: Request for the GCP access token
    IAM-->>-P: Return the GCP access token

    P->>+GCP: Access resource with token
    GCP-->>-P: Return requested resource

Here is the diagram that shows how the direct resource access works:

sequenceDiagram
    title Workload Identity Federation: Direct Resource Access
    participant kubelet as Kubelet
    participant P as Pod

    participant STS as GCP STS Service
    participant GCP as GCP Resources

    kubelet->>+P: Project the JWT token with audience
    kubelet->>+P: Mount the credential configuration (with no sensitive data)

    P->>+STS: Use JWT ID token to request for federated access token
    STS->>+STS: Validate the JWT token
    STS-->>-P: Return the federated access token


    P->>+GCP: Access resource with the federated access token directly
    GCP-->>-P: Return requested resource

As you can see, direct resource access involves far fewer round trips, and this is what we are going to use here.

Steps to enable workload identity federation

It's pretty much based on this doc.

You can pretty much use the following config:

variable "project_id" {
  description = "The project ID to create the workload identity pool in"
}

variable "pool_id" {
  description = "The ID of the workload identity pool"
}

variable "provider_id" {
  description = "The ID of the workload identity pool provider"
}

variable "oidc_issuer_uri" {
  description = "The OIDC issuer URI"
}

variable "jwks_json" {
  description = "The JWKS JSON"
}

resource "google_iam_workload_identity_pool" "pool" {
  project                   = var.project_id
  workload_identity_pool_id = var.pool_id
  display_name              = var.pool_id
  description               = "Workload identity pool for on-prem kubernetes"
}

resource "google_iam_workload_identity_pool_provider" "provider" {
  project                            = var.project_id
  workload_identity_pool_id          = google_iam_workload_identity_pool.pool.workload_identity_pool_id
  workload_identity_pool_provider_id = var.provider_id

  oidc {
    issuer_uri = var.oidc_issuer_uri
    jwks_json  = var.jwks_json
  }

  attribute_mapping = {
    "google.subject"                 = "assertion.sub"
    "attribute.namespace"            = "assertion['kubernetes.io']['namespace']"
    "attribute.service_account_name" = "assertion['kubernetes.io']['serviceaccount']['name']"
    "attribute.pod"                  = "assertion['kubernetes.io']['pod']['name']"
  }
}

output "pool_name" {
  value = google_iam_workload_identity_pool.pool.name
}

output "provider_name" {
  value = google_iam_workload_identity_pool_provider.provider.name
}

You can get the JWKS JSON via:

kubectl get --raw /openid/v1/jwks | jq

And the OIDC issuer URI via:

kubectl get --raw /.well-known/openid-configuration | jq -r .issuer

The Terraform module can be deployed via:

module "workload_identity" {
  source = "./workload-identity"
  // ...
}
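
In case it helps, here is a hypothetical filled-in version of that module call; the values are placeholders for your own project, and jwks.json is assumed to contain the output of the kubectl command above:

module "workload_identity" {
  source = "./workload-identity"

  project_id      = "your-project"
  pool_id         = "on-prem-k8s"
  provider_id     = "on-prem-k8s"
  oidc_issuer_uri = "https://your-cluster-oidc-issuer" // the issuer from the openid-configuration output above
  jwks_json       = file("${path.module}/jwks.json")   // the JWKS saved from the kubectl command above
}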

Deploy the workload

Here is an example of a pod workload that uses workload identity federation:

locals {
  namespace      = "default"
  sa_name        = "storage-sa"
  mapped_subject = "system:serviceaccount:${local.namespace}:${local.sa_name}"
}

// create the k8s service account
resource "kubernetes_service_account" "storage_sa" {
  metadata {
    name      = local.sa_name
    namespace = local.namespace
  }
}

// iam member for direct resource access from the workload
resource "google_project_iam_member" "storage_sa" {
  project = "your-project"
  role    = "roles/storage.admin"
  member  = "principal://iam.googleapis.com/${module.workload_identity.pool_name}/subject/${local.mapped_subject}"
}

// credential configuration for the workload as a config map
resource "kubernetes_config_map" "credential_configuration" {
  metadata {
    name      = "credential-configuration"
    namespace = local.namespace
  }
  data = {
    "credential-configuration.json" = jsonencode({
      universe_domain    = "googleapis.com"
      type               = "external_account"
      audience           = "//iam.googleapis.com/${module.workload_identity.provider_name}"
      subject_token_type = "urn:ietf:params:oauth:token-type:jwt"
      token_url          = "https://sts.googleapis.com/v1/token"
      credential_source = {
        file = "/var/run/workload-identity-federation/token"
        format = {
          type = "text"
        }
      }
      token_info_url = "https://sts.googleapis.com/v1/introspect"
    })
  }
}

// the workload pod
resource "kubernetes_pod" "example" {
  metadata {
    name      = "example"
    namespace = local.namespace
  }
  spec {
    service_account_name = kubernetes_service_account.storage_sa.metadata[0].name
    container {
      name  = "example"
      image = "google/cloud-sdk:alpine"
      command = [
        "/bin/sh",
        "-c",
        "gcloud auth login --cred-file=$GOOGLE_APPLICATION_CREDENTIALS --project=$GOOGLE_PROJECT_ID && sleep infinity" // auth login
      ]
      volume_mount {
        name       = "token"
        mount_path = "/var/run/workload-identity-federation"
        read_only  = true
      }
      volume_mount {
        name       = "credential-configuration"
        mount_path = "/etc/workload-identity"
        read_only  = true
      }
      env {
        name  = "GOOGLE_APPLICATION_CREDENTIALS"
        value = "/etc/workload-identity/credential-configuration.json"
      }
      env {
        name  = "GOOGLE_PROJECT_ID"
        value = "your-project-id"
      }
    }
    volume {
      name = "token"
      projected { // projected volume for the workload identity token
        sources {
          service_account_token {
            audience           = "https://iam.googleapis.com/${module.workload_identity.provider_name}" // audience for the token
            path               = "token"
            expiration_seconds = 3600 // jwt token expiration time
          }
        }
      }
    }
    volume {
      name = "credential-configuration"
      config_map {
        name = kubernetes_config_map.credential_configuration.metadata[0].name
      }
    }
  }
}
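
The IAM binding above grants access to a single mapped subject. Since the provider also maps attribute.namespace and attribute.service_account_name, you can grant access to a whole group of identities with a principalSet member instead; a hypothetical sketch:

// Hypothetical: read-only storage access for every workload in the "default"
// namespace, via the attribute.namespace mapping defined on the provider.
resource "google_project_iam_member" "namespace_readers" {
  project = "your-project"
  role    = "roles/storage.objectViewer"
  member  = "principalSet://iam.googleapis.com/${module.workload_identity.pool_name}/attribute.namespace/default"
}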

The configuration is fairly laborious, but it can easily be automated, either via Terraform or a magic mutating webhook injection.

Testing

kubectl exec -it example -- bash
apk add curl jq

SUBJECT_TOKEN=$(cat /var/run/workload-identity-federation/token)

PROJECT_NAME="<the-project-name>"
PROJECT_NUMBER="<the-project-number>"
POOL_ID="<the-oidc-pool-id>"
PROVIDER_ID="<the-oidc-provider-id>"

ACCESS_TOKEN=$(curl https://sts.googleapis.com/v1/token \
    --data-urlencode "audience=//iam.googleapis.com/projects/${PROJECT_NUMBER}/locations/global/workloadIdentityPools/${POOL_ID}/providers/${PROVIDER_ID}" \
    --data-urlencode "grant_type=urn:ietf:params:oauth:grant-type:token-exchange" \
    --data-urlencode "requested_token_type=urn:ietf:params:oauth:token-type:access_token" \
    --data-urlencode "scope=https://www.googleapis.com/auth/cloud-platform" \
    --data-urlencode "subject_token_type=urn:ietf:params:oauth:token-type:jwt" \
    --data-urlencode "subject_token=$SUBJECT_TOKEN" | jq -r .access_token)

curl "https://storage.googleapis.com:443/storage/v1/b?alt=json&maxResults=1000&project=${PROJECT_NAME}&projection=full" \
    --header "Authorization: Bearer $ACCESS_TOKEN"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3059  100  1352  100  1707   6656   8404 --:--:-- --:--:-- --:--:-- 15143
{
  "kind": "storage#buckets",
  "items": [
    {
      "kind": "storage#bucket",
# ...
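
Alternatively, since the pod's startup command already ran gcloud auth login with the mounted credential configuration, you can verify access through gcloud itself; a minimal sketch, assuming the storage permissions granted earlier (replace the project placeholder):

# Uses the credentials picked up from the mounted credential configuration
kubectl exec -it example -- gcloud storage ls --project="<the-project-name>"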

Bonus: It's not just for Kubernetes :)

Workload identity federation also works for other external workloads such as GitHub Actions. It's as easy as:

// service account for the github actions
resource "google_service_account" "github_actions" {
  account_id   = "github-actions"
  display_name = "github-actions"
}

// create the workload identity pool
resource "google_iam_workload_identity_pool" "github_actions" {
  project                   = "your-project"
  workload_identity_pool_id = "github-actions"
  display_name              = "github-actions"
  description               = "Workload identity pool for github-actions"
}

// create the workload identity pool provider
resource "google_iam_workload_identity_pool_provider" "repox" {
  project                            = "your-project"
  workload_identity_pool_id          = google_iam_workload_identity_pool.github_actions.workload_identity_pool_id
  workload_identity_pool_provider_id = "repox-github-actions"
  display_name                       = "repox-github-actions"

  attribute_mapping = {
    "google.subject"       = "assertion.sub"
    "attribute.actor"      = "assertion.actor"
    "attribute.repository" = "assertion.repository"
    "attribute.org"        = "assertion.repository_owner"
    "attribute.ref"        = "assertion.ref"
  }

  attribute_condition = "attribute.repository == 'username/repox'"

  oidc {
    issuer_uri = "https://token.actions.githubusercontent.com"
  }
}

// bind the service account to the workload identity pool
resource "google_service_account_iam_member" "workload_identity" {
  service_account_id = google_service_account.github_actions.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "principalSet://iam.googleapis.com/${google_iam_workload_identity_pool.github_actions.name}/attribute.repository/username/repox"
}

// make sure that you have given the service account the necessary permissions
// ...
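
As a hypothetical example of that last step, a grant letting the service account push images or charts to Artifact Registry (matching the registry used in the workflow below) might look like this; adjust the role to whatever your pipeline actually needs:

// Hypothetical: allow the GitHub Actions service account to push to Artifact Registry
resource "google_project_iam_member" "github_actions_ar_writer" {
  project = "your-project"
  role    = "roles/artifactregistry.writer"
  member  = "serviceAccount:${google_service_account.github_actions.email}"
}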

In your GitHub repository username/repox, you can now add a workflow that authenticates to GCP using workload identity federation. It looks like this:

name: Helm Chart Publish

on:
  push:
    branches:
      - main

permissions:
  contents: read # read is required otherwise the checkout step will fail
  id-token: write

jobs:
  helm-publish:
    name: Run on Ubuntu
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code # make sure the checkout happens before the auth step
        uses: actions/checkout@v4
      - id: auth
        name: Authenticate to GCP
        uses: 'google-github-actions/auth@v2'
        with:
          create_credentials_file: 'true'
          workload_identity_provider: projects/YOUR_GCP_PROJECT_NUMBER/locations/global/workloadIdentityPools/github-actions/providers/repox-github-actions
          service_account: github-actions@your-project.iam.gserviceaccount.com
      - name: docker login
        run: |
          gcloud auth login --brief --cred-file="${{ steps.auth.outputs.credentials_file_path }}"
          gcloud auth configure-docker europe-west1-docker.pkg.dev
      # later you can build and push the container image or helm chart to the registry

Conclusion

Running on a bare-metal Kubernetes cluster is cheap in terms of compute cost; however, the setup is not as straightforward as using commoditized cloud-based Kubernetes offerings. The workload identity federation feature makes it a lot easier to use GCP resources, for the following reasons:

  • No need for manual GCP credential provisioning.
  • No need to store GCP credentials in the Kubernetes cluster as secrets, which means a lot less operational overhead, especially when it comes to rotating credentials.
  • The federated STS token is short-lived, which is a lot more secure than long-lived GCP credentials.

Overall it is just a very elegant solution :)