Close Menu

Autoscale GitLab CI runners and save 90% on EC2 costs

Autoscale GitLab CI runners and save 90% on EC2 costs

At Substrakt Health, we use continuous integration workers to test our software every time new code is written and pushed but that computing capacity can be expensive and hard to predict. This tutorial shows you how to set up an autoscaling cluster of GitLab CI runners using docker-machine and AWS.

Code quality is always a top priority for us. We want to know that our code works every time and when it stops working we want to know immediately. We use GitLab CI to run our tests every time we push new code and before every deployment. GitLab CI lets us split this work across multiple servers and scale up and down capacity as required to keep costs down for us. This tutorial will show you how to set up an autoscaling CI cluster for GitLab and save up to 90% on costs using AWS EC2 Spot Instances.

GitLab CI allows us to split our jobs across multiple machines. By default, each new worker node requires some setup work to provision and attach it to our GitLab instance, but we can also use the autoscaling mode to provision a single machine and let that machine decide how much capacity is required and then spin up or down further instances as required.

A warning: This tutorial will not be covered entirely by the AWS free usage tier. It’s going to cost money to try this out.

Creating the spawner

First off, we need a spawner machine. This runs 24/7 and checks that GitLab CI has enough capacity to runs the jobs currently in the queue. It doesn’t run any jobs itself.

We use Ubuntu 16.04 LTS for our internal tooling, so just create an EC2 instance (t2.micro is enough and is included in the free tier.) Setting up VPCs and related subnets is out of the scope of this article, we’ll assume that you’re working in the default VPC. Then we need to install a bunch of software on our machine to set it up.

Installing gitlab-runner

gitlab-runner is the main software we need to complete this task. Installing it on ubuntu is really easy.

curl -L | sudo bash 
sudo apt-get install gitlab-ci-multi-runner

Once you’ve done that, register the runner on your GitLab instance. Do this as you normally would with any other GitLab CI runner but choose docker+machine as the runner. Docker Machine is the software required to spin up new virtual machines and install Docker on them.

Installing Docker Machine

Docker Machine is a handy bit of software that allows one host running Docker to spin up and provision other machines running Docker. Installing it is even easier:

curl -L`uname -s`-`uname -m` >/tmp/docker-machine &&
chmod +x /tmp/docker-machine &&
sudo cp /tmp/docker-machine /usr/local/bin/docker-machine

This will install the docker-machine binary in your PATH.

Configuring gitlab-runner

By default, gitlab-runner will not work in the autoscaling mode we want. It’ll just run a job by default and then stop. We want to configure this machine to no longer run tests but to spin up new docker machines as and when necessary. Open your gitlab-runner config file, usually found in /etc/gitlab-runner/config.toml and make some changes. This is our example (with sensitive information removed). Let’s go through some of the important lines.

concurrent = 12

This tells GitLab CI that in total, it should attempt to run 12 jobs simultaneously across all child workers.

limit = 6

This tells GitLab CI that in total, it should use a maximum of 6 worker nodes to run the number of jobs defined above. This means, in practice, that each node will run 2 jobs simultaneously. You’ll need to tweak this value depending on the resources your jobs need and the resources of your child nodes. There’s no right answer here but generally we found it wasn’t a good idea to have more than the number of CPUs – 1 of jobs running per node but again this is a bit of a ‘finger-in-the-air’ calculation as it really depends on your tech stack.

IdleCount = 0

This tells GitLab CI not to run any machines constantly (whilst idle.) This means when nobody is running a job, or no jobs are queued to spin down all of the worker nodes after an amount of time (IdleTime at the bottom of the file). We power our nodes down after half an hour of no use. This does have the consequence of there being a short wait when we start our day but it saves us money as we’re not using computing power when it’s not required.

MachineOptions = ["amazonec2-access-key=XXXX", "amazonec2-secret-key=XXXX", "amazonec2-ssh-user=ubuntu", "amazonec2-region=eu-west-2", "amazonec2-instance-type=m4.xlarge", "amazonec2-ami=ami-996372fd", "amazonec2-vpc-id=vpc-xxxxx", "amazonec2-subnet-id=subnet-xxxxx", "amazonec2-zone=a", "amazonec2-root-size=32", "amazonec2-request-spot-instance=true", "amazonec2-spot-price=0.03"]

This is where the magic happens. This is where we set our options for Docker Machine. It defines the size, type and price of our runners. I’ll run through each of the non-obvious options.

amazonec2-vpc-id=vpc-xxxxx & amazonec2-subnet-id=subnet-xxxxx

This is the VPC & associated subnet ID. Generally, you’d want this in your default VPC in a public subnet. We run our jobs in a private VPC with internal peering connections to other VPCs due to regulatory constraints.


This is the AWS region. We run all of our infrastructure in the EU (London) region.


This is the size of the instance we want for each of our runners. This setting can have massive implications on cost and it can be a tricky balancing act. Choose too small and your jobs take forever to run due to a lack of resources (more time = more money) but choose too large and you have unused compute capacity which costs you money you don’t need to spend. Again, there’s no right answer here, it’s about what works for your workload. We found m4.xlarge works for us.

Save up to 90% on EC2 costs using Spot Instances

Spot instances are magic. They allow us to bid for unused capacity in the AWS infrastructure and often can mean that EC2 costs can be dramatically lower. We’re currently seeing discounts of around 85% on our EC2 bills due to using spot instances. Setting them up for use on GitLab CI is really easy too. There is (of course) a downside. If our bid price for VMs is exceeded, then our instances shut down with only a few minutes notice. But as long as our bid is high enough, this isn’t an issue. Pricing in the spot market is insanely complex but in eu-west-2 at least, prices for m4.large and xlarge instances appear to have been static for months so a bid 10-20% higher than the current spot price appears to be a safe bet. Just keep your eyes peeled. The current spot price for an m4.xlarge instance is $0.026. We’ve set our maximum price at $0.03 to give us some wiggle room. At time of writing, the on-demand price is $0.232. The numbers speak for themselves.

Note: Spot pricing can vary significantly between instance sizes, regions and even availability zones in the same region. This guide assumes that spot pricing won’t vary massively or that you’ve set a good buffer above the current spot price to avoid outages.

amazonec2-request-spot-instance=true & amazonec2-spot-price=0.03

This tells GitLab CI that instead of just spawning new EC2 instances at full price, that it should request spot instances at the current spot price setting a maximum bid that it should not exceed per hour, in USD (regardless of what currency you’re billed in. We’re billed in GBP, but spot instances are still calculated in USD.) The maximum bid is whatever you’re comfortable paying. We tend to set it close to the on-demand price because we’re looking for any discount. As long as we’re not paying more than we otherwise would, it’s fine with us. Your financial constraints may affect your decisions differently.

Update: From October, AWS will charge in seconds, rather than hours used making the potential savings even higher for unused partial-hours.

We’d love to see how you get along with this so please let us know. You can contact me max [at] substrakthealth [dot] com. For us, it’s saved us time and money and that’s never a bad thing.

Can you help us test new designs?

We develop user centred designs which means we're always looking for actual people (like you) who can help us test new interfaces. This can take the form of a face to face session or small user tests sent by email which can be completed online in no time