
[EKS] GitHub Action Runner Controller

The GitHub Action Runner Controller (ARC) is a Kubernetes operator that automates the management of self-hosted GitHub Actions runners in a Kubernetes cluster. It works very well together with Karpenter for EKS.

Important:
We recommend using ARC only if your organization uses Kubernetes. Otherwise, we recommend the Philips Labs GitHub Runner approach instead.

By default, GitHub Actions run in the cloud on GitHub-hosted machines, but we can opt to use "Self-Hosted" GitHub Action Runners instead. Historically, we've deployed an Auto Scaling Group that gives each run a dedicated and customized instance. Now that we've deployed EKS, we can save money by using the actions-runner-controller to deploy runners as containers inside of EKS and run GitHub Actions from these containers. These runners will be fully customizable, scale automatically, and be cheaper than both GitHub-hosted runners and ASG instances.

Quick Start

Steps | Example
1. Generate GitHub Private Key | ssm_github_secret_path: "/github_runners/controller_github_app_secret"
2. Generate GitHub Webhook Secret Token | ssm_github_webhook_secret_token_path: "/github_runners/github_webhook_secret_token"
3. Connect to the VPN |
4. Deploy cluster and resources into the auto stack | atmos workflow deploy/github-runners -f github
5. Set up Webhook Driven Scaling | Click Ops

Requirements

In order to deploy Self-Hosted GitHub Runners on EKS, follow the steps outlined in the EKS setup doc. Those steps will complete the EKS requirements.

Overview

  • We'll begin by generating the required secrets, which is a manual process.
  • AWS SSM will be used to store and retrieve secrets.
  • Then we need to decide on the SSM path for the GitHub secret (Application private key) and GitHub webhook secret.

1 GitHub Application Private Key

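Since the secret is automatically scoped by AWS to the account and region where the secret is stored, we recommend the secret be stored at /github_runners/controller_github_app_secret, which matches the path configured below and the chamber command used later in this step.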

stacks/catalog/eks/actions-runner-controller.yaml:

ssm_github_secret_path: "/github_runners/controller_github_app_secret"

The preferred way to authenticate is by creating and installing a GitHub App. This is the recommended approach because it allows for much more restricted access than a personal access token, at least until fine-grained personal access token permissions are generally available. Follow the instructions here to create and install the GitHub App.

At the creation stage, you will be asked to generate a private key. This is the private key that will be used to authenticate the Action Runner Controller. Download the file and store the contents in SSM using the following command, adjusting the profile and file name. The profile should be the admin role in the account to which you are deploying the runner controller. The file name should be the name of the private key file you downloaded.

AWS_PROFILE=acme-core-use1-auto-admin chamber write github_runners controller_github_app_secret -- "$(cat APP_NAME.DATE.private-key.pem)"

You can verify the file was correctly written to SSM by matching the private key fingerprint reported by GitHub with:

AWS_PROFILE=acme-core-use1-auto-admin chamber read -q github_runners controller_github_app_secret | openssl rsa -in - -pubout -outform DER | openssl sha256 -binary | openssl base64
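
Before uploading, it can also help to compute the same fingerprint from the downloaded file, so that a damaged or wrong file is caught early. The following is a minimal sketch that reuses the openssl pipeline from the command above. The function name pem_fingerprint is a hypothetical helper introduced here for illustration, and the file name in the usage comment is a placeholder.

```shell
# Hypothetical helper: print the base64-encoded SHA-256 fingerprint of the
# public key derived from a local RSA private key file. It uses the same
# openssl pipeline as the SSM verification command above.
pem_fingerprint() {
  openssl rsa -in "$1" -pubout -outform DER 2>/dev/null \
    | openssl sha256 -binary \
    | openssl base64
}

# Usage (placeholder file name, replace with the key you downloaded):
#   pem_fingerprint APP_NAME.DATE.private-key.pem
```

If the value printed for the local file matches both the value read back from SSM and the fingerprint reported by GitHub, the key was stored intact.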

At this stage, record the Application ID and the private key fingerprint in your secrets manager (e.g. 1Password). You will need the Application ID to configure the runner controller, and you will want the fingerprint to verify the private key.

Proceed to install the GitHub App in the organization or repository you want to use the runner controller for, and record the Installation ID (the final numeric part of the URL, as explained in the instructions linked above) in your secrets manager. You will need the Installation ID to configure the runner controller.

In your stack configuration, set the following variables, making sure to quote the values so they are treated as strings, not numbers.

github_app_id: "12345"
github_app_installation_id: "12345"

2 GitHub Webhook Secret Token

If you use Webhook Driven autoscaling (recommended), generate a random string to use as the Secret when creating the webhook in GitHub.

Generate the string using 1Password (no special characters, length 45) or by running:

dd if=/dev/random bs=1 count=33  2>/dev/null | base64
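
The dd command above produces base64 output, which may contain the characters + and /. If you prefer a token that matches the 1Password suggestion (no special characters, length 45), the following is one possible sketch. The function name gen_webhook_secret and the variable WEBHOOK_SECRET are hypothetical names chosen for this example.

```shell
# Hypothetical helper: print a random alphanumeric token of the requested
# length (default 45), using bytes read from /dev/urandom.
gen_webhook_secret() {
  length="${1:-45}"
  # 1024 random bytes yield roughly 250 alphanumeric characters, which is
  # more than enough for short tokens; cut keeps only the first $length.
  head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c "1-${length}"
}

WEBHOOK_SECRET="$(gen_webhook_secret 45)"
echo "generated a token of ${#WEBHOOK_SECRET} characters"
```

The sketch only prints the token length, not the token itself, to avoid leaving the secret in terminal scrollback.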

Store this key in AWS SSM at the path specified by ssm_github_webhook_secret_token_path.

stacks/catalog/eks/actions-runner-controller.yaml:

ssm_github_webhook_secret_token_path: "/github_runners/github_webhook_secret_token"
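
By analogy with the chamber command used for the private key, the SSM path splits into a chamber service (github_runners) and key (github_webhook_secret_token). The sketch below shows that mapping and prints the resulting command as a dry run instead of executing it. The helper name ssm_path_to_chamber_args is hypothetical, WEBHOOK_SECRET is a placeholder for the string you generated, and the profile name is assumed to be the same admin profile used for the private key.

```shell
# Hypothetical helper: split an SSM path such as
# /github_runners/github_webhook_secret_token into the "service" and "key"
# arguments used with chamber.
ssm_path_to_chamber_args() {
  path="${1#/}"          # drop the leading slash
  service="${path%/*}"   # everything before the last slash
  key="${path##*/}"      # everything after the last slash
  printf '%s %s\n' "$service" "$key"
}

# Dry run: print the command rather than running it. The profile name is an
# assumption copied from the private key step and may differ in your setup.
args="$(ssm_path_to_chamber_args "/github_runners/github_webhook_secret_token")"
echo "AWS_PROFILE=acme-core-use1-auto-admin chamber write ${args} -- \"\$WEBHOOK_SECRET\""
```

Review the printed command, then run it yourself once the profile and secret value are correct.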

Deploy

Automation has a unique set of components compared to the plat clusters and therefore has its own Atmos Workflow. Notably, auto includes the eks/actions-runner-controller component, which is used to create the self-hosted runners for the GitHub Repository or Organization.

Tip:

The three steps below are all included in the following workflow:

See the deploy/arc-github-runners workflow in the examples/snippets/stacks/workflows/github.yaml file for the commands it runs.

6 iam-service-linked-roles Component

At this point we assume that the iam-service-linked-roles component is already deployed for core-auto. If not, deploy this component now with the following command:

atmos terraform apply iam-service-linked-roles -s core-gbl-auto

7 Deploy Automation Cluster and Resources

Deploy the cluster with the same commands as plat cluster deployments:

See the deploy/cluster and deploy/resources workflows in the examples/snippets/stacks/workflows/eks.yaml file for the commands they run.

Validate the core-auto deployment using Echo Server. For example: https://echo.use1.auto.core.acme-svc.com/

8 Deploy the Actions Runner Controller

Finally, deploy the actions-runner-controller component with the following command:

atmos terraform deploy eks/actions-runner-controller -s core-use1-auto

9 Using Webhook Driven Autoscaling (Click Ops)

To use Webhook Driven autoscaling, you must also install the GitHub organization-level webhook after deploying the component (specifically, the webhook server). The URL for the webhook is determined by the webhook.hostname_template setting and by where the component is deployed. The recommended URL is https://gha-webhook.[environment].[stage].[tenant].[service-discovery-domain], which for this organization would be https://gha-webhook.use1.auto.core.acme-svc.com.
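
To make the recommended pattern concrete, here is a small sketch that assembles the URL from its four parts. The function name gha_webhook_url is hypothetical; the authoritative value is still the Terraform output webhook_payload_url of the deployed component.

```shell
# Hypothetical helper: assemble the recommended webhook URL from the
# environment, stage, tenant, and service discovery domain.
gha_webhook_url() {
  environment="$1"
  stage="$2"
  tenant="$3"
  domain="$4"
  printf 'https://gha-webhook.%s.%s.%s.%s\n' "$environment" "$stage" "$tenant" "$domain"
}

# Prints https://gha-webhook.use1.auto.core.acme-svc.com
gha_webhook_url use1 auto core acme-svc.com
```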

As a GitHub organization admin, go to https://github.com/organizations/acme/settings/hooks, and then:

  • Click "Add webhook" and create a new webhook with the following settings:
    • Payload URL: copy from Terraform output webhook_payload_url
    • Content type: application/json
    • Secret: the webhook secret token you generated and stored in SSM above
    • Which events would you like to trigger this webhook:
      • Select "Let me select individual events"
      • Uncheck everything ("Pushes" is likely the only thing already selected)
      • Check "Workflow jobs"
    • Ensure that "Active" is checked (should be checked by default)
    • Click "Add webhook" at the bottom of the settings page

After the webhook is created, select "edit" for the webhook and go to the "Recent Deliveries" tab and verify that there is a delivery (of a "ping" event) with a green check mark. If not, verify all the settings and consult the logs of the actions-runner-controller-github-webhook-server pod.
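
One possible cause of a failed delivery is a mismatch between the Secret entered in GitHub and the token stored in SSM. GitHub signs each payload with HMAC-SHA256 using the webhook secret and sends the result in the X-Hub-Signature-256 header in the form sha256=<hex digest>. The sketch below computes that value locally so it can be compared with the header shown for a delivery. The function name github_signature and the example values are hypothetical.

```shell
# Hypothetical helper: compute the value GitHub would send in the
# X-Hub-Signature-256 header for a given secret and payload.
# Note: passing the secret via -hmac exposes it in the process list, so this
# is intended for local troubleshooting only.
github_signature() {
  secret="$1"
  payload="$2"
  digest="$(printf '%s' "$payload" \
    | openssl dgst -sha256 -hmac "$secret" \
    | awk '{print $NF}')"
  printf 'sha256=%s\n' "$digest"
}

# Example with placeholder values:
github_signature "example-secret" '{"zen":"Keep it simple."}'
```

The payload must be byte-for-byte identical to the request body GitHub sent, otherwise the digest will differ even when the secret is correct.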
