Active/Passive Load Balancing with Kubernetes Services


You likely know that Services are how Kubernetes provides load balancing, exposing an application running on a set of Pods as a network service, and providing a single DNS name for that application.

If you have used a non-Kubernetes load balancer, you will also know that while active/active load balancing is probably the most common approach, active/passive load balancing is also a thing in that world. There can be a number of reasons one might want to do active/passive load balancing, for example licensing constraints, application requirements for consistent writes to local file systems, etc.

At first glance, however, it would appear that Kubernetes Services support only active/active approaches. There are ways to do active/passive load-balancing, however, if you are sufficiently motivated — but first you should consider some alternatives:

  1. Try to eliminate the “requirements” — change the application so that active/passive load balancing is no longer needed. Some applications use active/passive load balancing because they are doing things like storing non-replicated session state in memory or on the file system of the application server. This is rarely a wise application design pattern; storing that state in a highly-available database or in-memory data grid or store is usually a better choice. This also has the advantage of allowing you to horizontally scale your application, instead of being restricted to vertical scaling. Of course, if licensing is a particular concern, this may not be viable…
  2. Don’t load balance at all — Consider running only one replica and use Kubernetes features and/or cloud features to compensate for the lack of redundancy. If the container crashes or is otherwise impaired and you have a reasonable liveness probe, Kubernetes will restart the container. If the worker node goes down (and you haven’t used a local-storage PersistentVolumeClaim or defined overly restrictive nodeSelectors or other similar constructs), Kubernetes will attempt to start the Pod on a different worker node (assuming, of course, that you have another worker). This is often a pretty easy approach but may not work well if the Pod starts slowly or requires local storage (unless your Kubernetes platform can move that storage with the Pod or share it between the nodes).
  3. If you can’t do any of the above, then do active/passive load balancing with Kubernetes — just be aware that there are reasons this is the last entry on this list. Make sure that you have fully considered and thoughtfully ruled out the preceding options before continuing with an active/passive load balancing approach.
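To illustrate option 2, a single-replica Deployment with a liveness probe might look something like the following sketch (the image name, port, and probe path are hypothetical — substitute your application's values):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
spec:
  replicas: 1                 # no load balancing; rely on Kubernetes to restart/reschedule
  selector:
    matchLabels:
      app: helloworld
  template:
    metadata:
      labels:
        app: helloworld
    spec:
      containers:
      - name: helloworld
        image: example/helloworld:1.0    # hypothetical image
        ports:
        - containerPort: 8080
        livenessProbe:                   # lets the kubelet restart a hung container
          httpGet:
            path: /healthz               # hypothetical health endpoint
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 15
```

With this in place, a failed liveness probe triggers a container restart, and a node failure causes the Pod to be rescheduled elsewhere (subject to the caveats about local storage and scheduling constraints noted above).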

So, how can we do active/passive load-balancing? One of the easiest and most effective ways relies on 3 key facts:

  1. Services use labels to select the Pods that they will load balance traffic amongst
  2. Even though Pods are typically created by Deployments or StatefulSets, Pod labels can be updated by patching the Pod(s) directly rather than updating the Deployment or StatefulSet that is the “parent” of the Pod
  3. Roles and RoleBindings can be used to give Pods the ability to update either their own or other Pod labels
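As a sketch of the third point, the RBAC objects might look like this (names are hypothetical; `list` is included alongside `get` and `patch` so the health-checker can discover which Pod currently carries the active label):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: loadbalancer-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-labeler
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "patch"]   # list is an assumption, needed to find the active Pod
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-labeler-binding
subjects:
- kind: ServiceAccount
  name: loadbalancer-sa
roleRef:
  kind: Role
  name: pod-labeler
  apiGroup: rbac.authorization.k8s.io
```

Because these are namespaced Role/RoleBinding objects (rather than ClusterRole/ClusterRoleBinding), the label-toggling Pod can only patch Pods in its own namespace, which is usually the appropriate scope here.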

What does the solution look like?

  1. Create your application Deployment or StatefulSet as normal
  2. Create an “active-passive” Service that selects all of the normal labels that you would select for an active/active configuration — then add an additional label to the selection that is only for load-balancing purposes (in this example, we’ve used role: active). This label is NOT to be added to the podSpec in the Deployment or StatefulSet. This is the service that other applications will use to connect to your active/passive application.
  3. If load-balancing a StatefulSet, create an “all-pods” Service that does not include the additional label (e.g. role: active). This service is only so that the Deployment that we’ll create later can access the StatefulSet’s Pods for health check purposes.
  4. Next, create a Role that allows get and patch of Pods — and a RoleBinding that assigns that Role to a ServiceAccount. The ServiceAccount does not necessarily need to be the same ServiceAccount that is used to run your Deployment or StatefulSet
  5. Now, create a Deployment that runs a Pod using the ServiceAccount that you bound the Role to. This Deployment’s sole purpose is to check the status of your application Pods and, based on that check, toggle the role: active label between the Pods (in this example, the image used for this Deployment needs to include kubectl; however, you could easily implement this with Java or C# or any other language that allows access to the Kubernetes APIs). Consider the following pseudo-code example for a 2-replica StatefulSet (a fully working example can be found here):

- bash
- -c
- |
  while true; do
     # If neither pod is active, then arbitrarily set the first
     # Pod to active (we'll check its status and adjust if
     # necessary immediately after)
     if there is no active pod; then
        use kubectl to patch pod helloworld-0 to set label role: active
        use kubectl to patch pod helloworld-1 to set label role: passive
     fi
     # Detect if the active pod is unhealthy. This logic will likely
     # be application-dependent in your implementation. In this
     # example, the pod is considered healthy only if the response
     # from the HTTP request is 200.
     active_http_response=$(use curl to get status of active pod)
     echo "Active pod HTTP response: $active_http_response"
     passive_http_response=$(use curl to get status of passive pod)
     echo "Passive pod HTTP response: $passive_http_response"
     if [ "$active_http_response" = "200" ]; then
        echo "Active node passed healthcheck"
     else
        echo "Detected application health failure"
        if [ "$passive_http_response" = "200" ]; then
           echo "Enabling $passive"
           use kubectl to patch pod $active to set label role: passive
           use kubectl to patch pod $passive to set label role: active
        else
           echo "Passive pod failing health check as well... leaving active pod enabled"
        fi
     fi
     # Sleep for however long you want to wait between polling
     # Pod status
     echo "Sleeping for 60 seconds..."
     sleep 60
  done

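For reference, the “active-passive” Service from step 2 might look like the following sketch (names and port are illustrative; clusterIP: None matches the headless Service shown in the describe output later in this article):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: helloworld-svc
spec:
  clusterIP: None       # headless, as in the describe output below
  selector:
    app: helloworld     # the normal application label
    role: active        # only the Pod currently labeled role=active receives traffic
  ports:
  - name: http
    port: 8080
    targetPort: 8080
```

The health-checking script can flip the label with something like `kubectl label pod helloworld-0 role=active --overwrite`, and the Service's endpoints update automatically to reflect the new label values.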
A word of caution — the example above is meant to be illustrative of the concept, not to be a production-ready implementation:

  • The example does not include any livenessProbes, resource constraints, etc.
  • The script that toggles the Pod labels might be better included in the Docker image instead of being contained in the Deployment manifest.
  • The script also doesn’t really handle situations where the health check times out and doesn’t return a value at all.
  • Your application may not be able to use curl and may require a different approach to detect if the active Pod is unhealthy.
  • This example only supports 2 replicas, although it would not take much effort to extend it to handle more than 2 replicas, to load-balance a Deployment’s Pods instead of a StatefulSet’s Pods, to require multiple consecutive failures before the Pod is considered failed, etc.

Hopefully, however, the provided code illustrates the general concept.

After deploying and waiting for just a moment or two, the helloworld-svc has one endpoint (which happens to correspond to Pod helloworld-statefulset-0):

> kubectl -n local-demo-activepassive4-ns describe svc helloworld-svc
Name:              helloworld-svc
Namespace:         local-demo-activepassive4-ns
Labels:            <none>
Annotations:       <none>
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                None
IPs:               None
Port:              http  8080/TCP
TargetPort:        8080/TCP
Endpoints:         <helloworld-statefulset-0’s IP>
Session Affinity:  None
Events:            <none>

After a while, the loadbalancer pod shows the following log:

helloworld-0 is the active pod
helloworld-1 is the passive pod
Active pod HTTP response: 500
Passive pod HTTP response: 200
Detected application health failure
Enabling helloworld-statefulset-1
pod/helloworld-statefulset-0 patched
pod/helloworld-statefulset-1 patched
helloworld-1 is the active pod
helloworld-0 is the passive pod
Active pod HTTP response: 200
Passive pod HTTP response: 200
Active node passed healthcheck

and describing the helloworld-svc again shows the following:

> kubectl -n local-demo-activepassive4-ns describe svc helloworld-svc
Name:              helloworld-svc
Namespace:         local-demo-activepassive4-ns
Labels:            <none>
Annotations:       <none>
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                None
IPs:               None
Port:              http  8080/TCP
TargetPort:        8080/TCP
Endpoints:         <helloworld-statefulset-1’s IP>
Session Affinity:  None
Events:            <none>

As you can see, initially Pod helloworld-statefulset-0 was active. Its health check page at some point returned a 500 to signal that it was no longer healthy (in this example, that is done randomly, but presumably you would replace that health check with a real implementation of something more applicable to your application). The script then toggled the label on the previously-active pod to role: passive, thereby removing it from the Service, and on the passive pod to role: active, thereby adding it to the Service.

The full source code for the example is available over on GitHub.

Paul Dally

Architect, Sun Life Financial
