Cloud Native CI/CD with GitLab

From Commit to Production Ready

Cloud Native GitLab Runners on Kubernetes: Scalability and Caching

Using a Distributed Cache to Share Data Between Runners

Imagine your Kubernetes cluster is configured to autoscale its nodes based on the load (Cluster Autoscaler) and your GitLab Runner is configured to autoscale its pods based on the number of jobs in the queue (Runner Autoscaler). This combination is powerful and enables you to run multiple jobs in parallel on different nodes in the cluster. However, there is a challenge when it comes to caching data between runners.

There are some scenarios where you need to share data between runners that run on different nodes or even different clusters. This is where a distributed cache comes into play. GitLab supports different types of distributed caches, including S3-compatible storage like AWS S3 and Minio, Google Cloud Storage, and Azure Blob Storage.
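To make the idea concrete, here is a sketch of how a pipeline might declare a shared cache. The job names, stages, and paths below are illustrative, not part of any real project: two jobs that may land on runners on different nodes resolve to the same cache key, so the second job can restore the archive the first one uploaded to the distributed cache backend.

```shell
# Hypothetical .gitlab-ci.yml: both jobs compute the same cache
# key from the lock file, so they share one cache archive even
# when they run on runner pods scheduled on different nodes.
cat <<'EOF' > .gitlab-ci.yml
default:
  cache:
    # Same key => same cache archive, wherever the job runs
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/

install:
  stage: build
  script:
    - npm ci

test:
  stage: test
  script:
    - npm test
EOF
```

With a distributed cache configured on the runner, the archive for `node_modules/` lives in object storage rather than on the node's local disk, which is what makes this sharing possible.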

If you have an AWS account, you can try the S3 cache. Start by creating a bucket:

# Create a random bucket name
BUCKET_NAME=my-gl-cache-$(base64 /dev/urandom | tr -dc 'a-z0-9' | head -c 8)

# The location of the bucket
# Change it if needed
BUCKET_LOCATION=eu-west-3

# Your S3 access key
ACCESS_KEY=
SECRET_KEY=

# Install the AWS CLI
pip install awscli==1.44.29 \
  --break-system-packages --ignore-installed

# Configure the AWS CLI credentials
aws configure set aws_access_key_id $ACCESS_KEY
aws configure set aws_secret_access_key $SECRET_KEY
aws configure set default.region $BUCKET_LOCATION

# Create the bucket
aws s3 mb s3://$BUCKET_NAME --region $BUCKET_LOCATION

Next, create a user and attach a policy to it. First, save the following policy to a file:

cat <<EOF > /tmp/s3-gitlab-runner-cache-policy.json
{
  "Version":"2012-10-17",
  "Statement":[
    {
      "Sid":"GitlabRunnerCachePolicy",
      "Effect":"Allow",
      "Action":[
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:DeleteObject"
      ],
      "Resource":[
        "arn:aws:s3:::$BUCKET_NAME/*",
        "arn:aws:s3:::$BUCKET_NAME"
      ]
    }
  ]
}
EOF

Create the gitlab-runner-cache-user user:

aws iam create-user --user-name gitlab-runner-cache-user

Attach the policy to the user:

aws iam put-user-policy \
  --user-name gitlab-runner-cache-user \
  --policy-name GitlabRunnerCachePolicy \
  --policy-document file:///tmp/s3-gitlab-runner-cache-policy.json

Verify that the policy is attached to the user:

aws iam list-user-policies \
  --user-name gitlab-runner-cache-user

Generate credentials for the user and export them to environment variables (access key and secret key):

GITLAB_RUNNER_CREDENTIALS=$(\
aws iam create-access-key --user-name gitlab-runner-cache-user)

GITLAB_RUNNER_ACCESS_KEY=$(echo $GITLAB_RUNNER_CREDENTIALS \
  | jq -r '.AccessKey.AccessKeyId')

GITLAB_RUNNER_SECRET_KEY=$(echo $GITLAB_RUNNER_CREDENTIALS | \
  jq -r '.AccessKey.SecretAccessKey')

Now, create a Kubernetes secret to store the S3 access and secret keys:

kubectl create secret generic gitlab-runner-s3-cache \
  --from-literal=accesskey="$GITLAB_RUNNER_ACCESS_KEY" \
  --from-literal=secretkey="$GITLAB_RUNNER_SECRET_KEY"
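With the secret in place, the runner can be pointed at the bucket. The snippet below is a sketch of Helm values for the official `gitlab/gitlab-runner` chart, assuming that chart's `runners.cache.secretName` convention (which expects a secret with `accesskey` and `secretkey` keys, as created above); the `Path` value and file location are illustrative, and `$BUCKET_NAME` and `$BUCKET_LOCATION` come from the earlier steps:

```shell
# Sketch of Helm values wiring the runner cache to the S3 bucket.
# Assumes the official gitlab/gitlab-runner chart; adjust to your setup.
cat <<EOF > /tmp/runner-cache-values.yaml
runners:
  config: |
    [[runners]]
      [runners.cache]
        Type = "s3"
        Path = "runner-cache"
        Shared = true
        [runners.cache.s3]
          ServerAddress = "s3.amazonaws.com"
          BucketName = "$BUCKET_NAME"
          BucketLocation = "$BUCKET_LOCATION"
  cache:
    # The Kubernetes secret created above, holding accesskey/secretkey
    secretName: gitlab-runner-s3-cache
EOF

# Apply with, for example:
# helm upgrade --install gitlab-runner gitlab/gitlab-runner \
#   --namespace gitlab-runner \
#   -f /tmp/runner-cache-values.yaml
```

Setting `Shared = true` lets jobs from different projects served by this runner reuse the same cache namespace; leave it `false` if you want per-project isolation.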
