
Setting up CI/CD authentication for AWS using OpenID

2024-07-09

OpenID

Conventionally, developers create secrets in the service they want to authenticate with and store them somewhere the calling application can read them. This is a security risk, because long-lived secrets can leak if they are not handled carefully. One way of mitigating this risk is OpenID Connect (OIDC), where authentication is outsourced to an OIDC provider. The service that requires authentication is registered with the OIDC provider and relies on short-lived, signed tokens to verify the identity of the user. For a recent project, I wanted to authenticate a GitLab CI runner (the user) with my AWS account so that the runner could deploy AWS resources using Terraform. Deploying infrastructure from a single place (CI/CD in this case) is generally a good idea if you're working on a code base with several people.

  sequenceDiagram
    participant GitLab CI Runner
    participant AWS OIDC as AWS OIDC Provider
    participant AWS Resource as AWS Account

    GitLab CI Runner->>AWS Resource: Attempt to access AWS resource
    AWS Resource->>GitLab CI Runner: Redirect to AWS OIDC Provider for authentication
    GitLab CI Runner->>AWS OIDC: Authentication request with OIDC token
    AWS OIDC-->>GitLab CI Runner: Authentication response
    GitLab CI Runner->>AWS Resource: Return with authentication token
    AWS Resource->>AWS OIDC: Validate authentication token
    AWS OIDC-->>AWS Resource: Token validation response
    AWS Resource-->>GitLab CI Runner: Grant access to AWS resource

IaC (Infrastructure as code)

There are several ways to manage infrastructure with declarative code, but what I like about Terraform is that it is cloud provider agnostic. This means that you can use the same tool and configuration language to provision resources in AWS, GCP and Azure, although the resource definitions themselves are provider-specific. While the three big cloud providers are the most common use case, many SaaS solutions that have an API also have a Terraform provider. Terraform uses state to keep track of your current stack and to determine changes. By default, state is stored in a terraform.tfstate file, which can live on your local machine but is better kept in remote storage. Maintaining the state remotely allows you to work on the same IaC project with others, and it's generally safer.

You can terraform refresh to update the state with the actual state of your cloud environment (recent Terraform versions deprecate this command in favor of terraform apply -refresh-only). You can terraform import cloud resources that are not yet tracked in the state. You can terraform plan to compare your Terraform code with the state, which results in an overview of which resources will be created, updated, or destroyed. You can terraform apply this plan to let the previously reported changes take effect.

OpenID in the CI

Usually you don't want to apply Terraform changes from your local machine, but from a CI (continuous integration) pipeline that only runs after code has been reviewed and merged. The CI runner is the environment that executes the sequence of commands described in the CI pipeline. To deploy anything in a cloud environment, the CI runner has to be authenticated and authorized. However, since this itself requires additional cloud resources, you often have to bootstrap your Terraform project by deploying some things with Terraform from your local machine. This is not an issue, since you can afterwards move your local state to the remote backend that is used by the CI runner, for example with terraform init -migrate-state once the backend is configured. GitLab even offers a managed remote state backend.
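
As a minimal sketch of what the remote backend could look like, assuming the GitLab-managed state is the one being used: GitLab exposes it through Terraform's generic http backend. The block is left empty on purpose, on the assumption that the address and credentials are supplied at init time (through environment variables or -backend-config arguments) rather than committed to the repository.

```hcl
# Sketch: partial backend configuration for a GitLab-managed Terraform state.
# Nothing is hard-coded here; address, username and password are expected
# to be provided when running init.
terraform {
  backend "http" {}
}
```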

I want to deploy AWS infrastructure from a GitLab CI, so the first step is to create a new OIDC provider.

variable "gitlab_url" {
  type    = string
  default = "https://gitlab.com"
}

data "tls_certificate" "gitlab" {
  url = var.gitlab_url
}

resource "aws_iam_openid_connect_provider" "gitlab" {
  url = var.gitlab_url

  client_id_list = [
    var.gitlab_url,
  ]

  thumbprint_list = [data.tls_certificate.gitlab.certificates[0].sha1_fingerprint]
}
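
The snippet above relies on two providers: aws for the OIDC provider resource and tls for the tls_certificate data source. A hedged sketch of the matching provider requirements follows; the version constraints are assumptions and should be replaced by whatever versions you have tested against.

```hcl
# Sketch: declare the providers used by the OIDC setup.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint, adjust as needed
    }
    tls = {
      source  = "hashicorp/tls"
      version = "~> 4.0" # assumed constraint, adjust as needed
    }
  }
}
```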

Next, we need a role that is allowed to request temporary credentials. This role also gets an assume role policy to establish 'trust' with the OIDC provider. With the condition we make sure that the CI runner can only get temporary credentials if the CI run is on the main branch of the project specified in the project_path variable. This will prevent someone from deploying infrastructure from a feature branch. You can change this condition if your branching strategy is different.

# Expected format: <group>/<project>:ref_type:branch:ref:<branch>
variable "project_path" {
  type    = string
  default = "some_project_id/some_repo_name:ref_type:branch:ref:main"
}

resource "aws_iam_role" "gitlab" {
  name = "gitlab"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Principal = {
          Federated = aws_iam_openid_connect_provider.gitlab.arn
        }
        Action = "sts:AssumeRoleWithWebIdentity"
        Condition = {
          StringEquals = {
            "gitlab.com:sub" = "project_path:${var.project_path}"
          }
        }
      }
    ]
  })
}
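
If your branching strategy is different, the condition could be loosened along the following lines. This is only a sketch under stated assumptions: the variable project, the release/* branch pattern and the data source name gitlab_trust are hypothetical, and the block reuses the aws_iam_openid_connect_provider.gitlab resource defined earlier. It uses StringLike, which supports * wildcards, where StringEquals requires an exact match.

```hcl
# Hypothetical variant: trust the main branch and any release/* branch.
variable "project" {
  type    = string
  default = "some_group/some_project" # placeholder, not a real project
}

data "aws_iam_policy_document" "gitlab_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.gitlab.arn]
    }

    # Multiple values in one condition are OR'ed: either pattern may match.
    condition {
      test     = "StringLike"
      variable = "gitlab.com:sub"
      values = [
        "project_path:${var.project}:ref_type:branch:ref:main",
        "project_path:${var.project}:ref_type:branch:ref:release/*",
      ]
    }
  }
}
```

The role would then set assume_role_policy = data.aws_iam_policy_document.gitlab_trust.json instead of the inline jsonencode call.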

This role has an ARN (Amazon Resource Name), which is a unique identifier that every AWS resource has. The CI runner needs this ARN to tell AWS which role it wants to assume as a trusted entity. You can pass it through a CI pipeline variable, such as ROLE_ARN.
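
One way to look up the ARN after the bootstrap apply is a Terraform output. The output name gitlab_role_arn is an assumption made for this sketch.

```hcl
# Sketch: expose the role ARN so it can be copied into the ROLE_ARN CI variable.
output "gitlab_role_arn" {
  description = "ARN of the role that the GitLab CI job assumes"
  value       = aws_iam_role.gitlab.arn
}
```

After applying, terraform output -raw gitlab_role_arn prints the value without quotes.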

auth:
  image: python:3.12-slim
  stage: auth
  id_tokens:
    GITLAB_OIDC_TOKEN:
      aud: https://gitlab.com
  script:
    - pip install awscli
    - >
      STS=($(aws sts assume-role-with-web-identity
      --role-arn $ROLE_ARN
      --role-session-name "gitlab-${CI_PROJECT_ID}-${CI_PIPELINE_ID}"
      --web-identity-token ${GITLAB_OIDC_TOKEN}
      --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]'
      --output text))

    - export AWS_ACCESS_KEY_ID="${STS[0]}"
    - export AWS_SECRET_ACCESS_KEY="${STS[1]}"
    - export AWS_SESSION_TOKEN="${STS[2]}"

    - echo "AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}" >> config.env
    - echo "AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}" >> config.env
    - echo "AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN}" >> config.env
    - aws sts get-caller-identity
  artifacts:
    reports:
      dotenv: config.env

At the beginning, an ID token is added to the CI job through id_tokens. The job exchanges this token with AWS STS for temporary credentials, and AWS validates it against the OIDC provider that was registered earlier. I'm using a Python image on which I install the awscli that is needed to interact with an AWS account. The credentials are assigned to environment variables, and these variables are bundled in a dotenv artifact that is picked up by the subsequent jobs. It's important to note that this is not an ideal way to pass credentials between jobs, because artifacts can be downloaded. You can, however, restrict who may download artifacts by setting artifacts:access. Since the credentials fetched in the job above are temporary and my project is private, I didn't bother to do this.

By default, the AWS provider picks up the environment variables that were set in the auth job. So in the subsequent CI job we can terraform init to initialize the project and install the AWS provider.

dev_init:
  image: registry.gitlab.com/gitlab-org/terraform-images/stable:latest
  stage: init
  needs: [auth]
  variables:
    TF_ADDRESS: ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/dev
  script:
    - gitlab-terraform validate
    - gitlab-terraform init
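
For the environment-based credentials to take effect, the provider block itself can stay free of credentials. A minimal sketch, where the variable aws_region and its default are assumptions:

```hcl
variable "aws_region" {
  type    = string
  default = "eu-west-1" # assumed region, change to your own
}

provider "aws" {
  region = var.aws_region
  # No access_key, secret_key or token arguments: the provider reads
  # AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN
  # from the environment of the CI job.
}
```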