
Decide on Self-Hosted Runner Architecture

Decide how to operate the self-hosted runners that run your GitHub Actions workflows. These runners can be set up in various ways; they avoid platform fees and run CI jobs in private infrastructure, enabling access to VPC resources. This approach is ideal for private repositories: it gives you control over instance size and architecture, and lets you control costs by leveraging spot instances. The right choice depends on your platform, whether you're using predominantly EKS, ECS, or Lambda.

Problem

When using GitHub Actions, you can opt for both GitHub Cloud-hosted and self-hosted runners, and they can complement each other. In some cases, self-hosted runners are essential—particularly for accessing resources within a VPC, such as databases, Kubernetes API endpoints, or Kafka servers, which is common in GitOps workflows.

However, while self-hosted runners are ideal for private infrastructure, they pose risks in public or open-source repositories due to potential exposure of sensitive resources. If your organization maintains open-source projects, this should be a critical consideration, and we recommend using cloud-hosted runners for those tasks.

The hosting approach for self-hosted runners should align with your infrastructure. If you use Kubernetes, it's generally best to run your runners on Kubernetes. Conversely, if your infrastructure relies on ECS or Lambdas, you may want to avoid unnecessary Kubernetes dependencies and opt for alternative hosting methods.

In Kubernetes-based setups, configuring node pools with Karpenter is key to maintaining stability and ensuring effective auto-scaling across a mix of spot and on-demand instances. Tuning this setup can be challenging, however, especially after recent changes to the Actions Runner Controller (ARC): the newer version does not support multiple labels for runner groups, a trade-off the community remains divided over. We provide multiple deployment options for self-hosted runners, including EKS, Philips Labs' solution, and Auto Scaling Groups (ASG), tailored to your specific runner management needs.
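
For example, a dedicated Karpenter node pool can isolate runner workloads and mix spot with on-demand capacity. The sketch below assumes Karpenter's v1 NodePool API; the pool name, taint, and limits are illustrative, not part of any Cloud Posse component.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: github-runners # illustrative name
spec:
  template:
    spec:
      requirements:
        # Allow both spot and on-demand so jobs still run when spot is scarce
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
      # Taint the pool so only runner pods (with a matching toleration) schedule here
      taints:
        - key: dedicated
          value: github-runners
          effect: NoSchedule
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default # assumes an existing EC2NodeClass
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 30s
  limits:
    cpu: "256" # cap the total size of the pool
```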

Considered Options

Option 1: EC2 Instances in an Auto Scaling Group (github-runners)

The first option is to deploy EC2 instances in an Auto Scaling Group; this is the simplest approach. We can use the github-runners component to deploy the runners. However, this option does not scale as well as the others.
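
As a sketch, the component is configured through an Atmos stack manifest like any other Cloud Posse component. The variable names below are illustrative assumptions; check the component's documented inputs before relying on them.

```yaml
components:
  terraform:
    github-runners:
      vars:
        # Variable names are illustrative assumptions, not verified inputs
        instance_type: "t3.large"
        min_size: 1   # keep one warm runner for fast job pickup
        max_size: 10  # cap scale-out
```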

Option 2: Actions Runner Controller on EKS (eks/actions-runner-controller)

The second option is to deploy the Actions Runner Controller on EKS. Since many implementations already run EKS, this option is a good way to reuse existing infrastructure.

We can use the eks/actions-runner-controller component to deploy the runners, which is built with the Actions Runner Controller helm chart.
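
As a rough sketch, runner pools are declared as a map in the stack configuration. The shape below is modeled on the component's chart-driven inputs, but treat the exact keys as assumptions and confirm them against the component's README.

```yaml
components:
  terraform:
    eks/actions-runner-controller:
      vars:
        runners:
          # The key names in this map are assumptions
          default:
            type: repository          # or "organization"
            scope: my-org/my-repo     # hypothetical repository
            min_replicas: 1
            max_replicas: 20
            labels:
              - self-hosted
              - linux
```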

Option 3: GitHub Actions Runner on EKS (eks/github-actions-runner)

Alternatively, we can deploy the GitHub Actions Runner on EKS. This option is similar to the previous one, but it uses the GitHub Actions Runner instead of the Actions Runner Controller.

This component deploys self-hosted GitHub Actions Runners and a Controller on an EKS cluster, using "runner scale sets".

This solution is supported by GitHub and supersedes the actions-runner-controller developed by Summerwind and deployed by Cloud Posse's actions-runner-controller component.

However, there are some limitations to the official Runner Sets implementation:

  • Limited set of packages

    The runner image used by Runner Sets contains no more packages than are necessary to run the runner. This is in contrast to the Summerwind implementation, which contains some commonly needed packages like build-essential, curl, wget, git, and jq, and to the GitHub-hosted images, which contain a robust set of tools. (This is a limitation of the official Runner Sets implementation, not this component per se.)

    You will need to install any tools you need in your workflows, either as part of your workflow (recommended), by maintaining a custom runner image, or by running such steps in a separate container that has the tools pre-installed. Many tools have publicly available actions to install them, such as actions/setup-node to install NodeJS or dcarbone/install-jq-action to install jq. You can also install packages using awalsh128/cache-apt-pkgs-action, which can skip the installation when a package is already present, so the same workflow runs efficiently on both GitHub-hosted and self-hosted runners. See the workflow sketch after this list.

    Feature Requests:

    There are (as of this writing) open feature requests to add some commonly needed packages to the official Runner Sets runner image. Upvoting those requests helps get them implemented.

  • Docker in Docker (dind) mode only

    In the current version of this component, only "dind" (Docker in Docker) mode has been tested. Support for "kubernetes" mode is provided, but has not been validated.

  • Limited configuration options

    Many elements in the Controller chart are not directly configurable by named inputs. To configure them, you can use the controller.chart_values input or create a resources/values-controller.yaml file in the component to supply values.

    Almost all of the features of the Runner Scale Set chart are configurable by named inputs. For the few exceptions, create a resources/values-runner.yaml file in the component and set values as shown in the chart's default Helm values.yaml; these values will be applied to all runners.

  • Component limitations

    Furthermore, the Cloud Posse component has some additional limitations. In particular:

    • The controller and all runners and listeners share the Image Pull Secrets. You cannot use different ones for different runners.
    • All the runners use the same GitHub secret (app or PAT). Using a GitHub app is preferred anyway, and the single GitHub app serves the entire organization.
    • Only one controller is supported per cluster, though it can have multiple replicas.

These limitations could be addressed if there is demand. Contact Cloud Posse Professional Services if you would be interested in sponsoring the development of any of these features.
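
To make the "limited set of packages" limitation concrete, here is a minimal workflow sketch that installs its own tools at run time, using the actions mentioned above. The runner scale set name and action versions are illustrative assumptions.

```yaml
# Illustrative workflow; the runs-on value and action versions are assumptions.
name: build
on: push

jobs:
  build:
    runs-on: my-runner-set # hypothetical runner scale set name
    steps:
      - uses: actions/checkout@v4
      # The Runner Sets image ships with almost no tools, so install them here
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # Skips installation when the package is already present, so the same
      # workflow also runs efficiently on GitHub-hosted runners
      - uses: awalsh128/cache-apt-pkgs-action@latest
        with:
          packages: jq
      - run: node --version && jq --version
```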

Option 4: Philips Labs Runners (philips-labs-github-runners)

If we are not deploying EKS, it's not worth the additional effort to set up self-hosted runners on EKS. Instead, we deploy self-hosted runners on EC2 instances. GitHub webhook events are delivered to an API Gateway, which writes pending-job events to a queue; a Lambda function then scales the number of runners up or down based on the number of pending jobs in that queue.
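
A rough stack sketch follows. The Philips Labs solution authenticates as a GitHub App, so the inputs below stand in for those credentials; every variable name here is a hypothetical placeholder, not a verified input of the component.

```yaml
components:
  terraform:
    philips-labs-github-runners:
      vars:
        # Hypothetical placeholders for the GitHub App credentials that the
        # webhook -> API Gateway -> queue -> Lambda scaling pipeline uses
        github_app_id: "123456"
        github_app_key_ssm_path: "/github/app/private-key"
        runner_labels: ["self-hosted", "linux"]
```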

For more on this option, see the Philips Labs GitHub Runner documentation.

Option 5: Managed Runners

There are a number of services that offer managed runners. These still have an advantage over GitHub Cloud-hosted runners, as they can be managed within your private VPCs.

One option to consider is runs-on.com, which is very inexpensive.

Recommendation

At this time, Cloud Posse recommends the Actions Runner Controller on EKS (eks/actions-runner-controller) if you are using EKS, and the Philips Labs Runners (philips-labs-github-runners) if you are not.