Microservices Deployment Strategies: Custom Scheduling

Affinity and Anti-Affinity: Advanced Node Scheduling

Unlike nodeSelector, which only supports equality-based requirements, and unlike Taints and Tolerations, which repel Pods from nodes, Node Affinity allows you to define explicit, more complex, and custom scheduling rules based on node labels.

This is the third mechanism we are going to explore in this section. It's the most complete and flexible one, and it's recommended for advanced use cases.
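To make the difference concrete, here is a minimal sketch (reusing the disktype=ssd label from the earlier SSD example) of the same constraint expressed both ways. nodeAffinity accepts operators such as In, NotIn, Exists, DoesNotExist, Gt, and Lt, whereas nodeSelector can only match exact key/value pairs:

      # nodeSelector: exact key=value equality only
      nodeSelector:
        disktype: ssd

      # nodeAffinity: the same constraint, expressed with an operator
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: disktype
                operator: In
                values:
                - ssd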

Node Affinity: A Practical Example

In this example, our aim is to deploy Nginx on a specific node in the cluster based on its hostname.

Suppose we have a cluster of three nodes:

  • prod-f72fj
  • prod-f72fo
  • backstage-f8wx9

Our objective is to deploy Nginx on the node with the hostname backstage-f8wx9. Since Kubernetes automatically adds the label kubernetes.io/hostname to each node with its hostname, we can leverage this label for our scheduling needs. This is what the manifest looks like:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - backstage-f8wx9

The block we added is the affinity section:

      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - backstage-f8wx9
  • We start by defining nodeAffinity, which indicates that we are specifying rules for node selection.

  • The requiredDuringSchedulingIgnoredDuringExecution field indicates that the rules defined within it must be satisfied during the scheduling phase: when Kubernetes first places the Pod, it will only choose a node that meets the criteria defined in this section, and if no node matches, the Pod will not be scheduled. The IgnoredDuringExecution part means that once the Pod is running, if the node's labels change and no longer meet the criteria, the Pod will continue to run on that node. In other words, we are telling Kubernetes to enforce these rules only when a scheduling decision is made (a Pod is created, a Deployment is updated, etc.), without evicting running Pods if the node's labels no longer match.

  • The nodeSelectorTerms field contains a list of terms that define the selection criteria.

  • Each term consists of one or more matchExpressions, which are conditions that must be met for a node to be considered suitable for scheduling the Pod.

  • In our case, we have a single matchExpression that checks if the node has the label kubernetes.io/hostname with a value of backstage-f8wx9. The In operator means that the value of the label must be in the specified list (in this case, just one value).
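
To verify the result, we can apply the manifest and check which node each replica landed on. Here is a quick sketch, assuming the manifest above is saved as nginx-affinity.yaml (the filename is arbitrary):

# Create (or update) the Deployment
kubectl apply -f nginx-affinity.yaml

# The NODE column should show backstage-f8wx9 for all five replicas
kubectl get pods -l app=nginx -o wide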

Since values is a list, we can add as many entries to it as we want. For example, the following block allows the Pods to be scheduled on either backstage-f8wx9 or prod-f72fj:

      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - backstage-f8wx9
                - prod-f72fj

Here is another example, this time using the topology.kubernetes.io/zone label:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - FRA
                - LON

The example above schedules the Pods on nodes in either the FRA (Frankfurt) or LON (London) zone. It uses the label topology.kubernetes.io/zone, a well-known label that cloud providers such as DigitalOcean add to nodes automatically.

You can always find the labels of your nodes using kubectl get nodes --show-labels and choose the ones that fit your scheduling needs. Also, remember that you can add custom labels to your nodes using the kubectl label node command, as we saw in the SSD example.
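
For reference, here is a short sketch of those commands, reusing the node and label names from this chapter:

# List every node together with all of its labels
kubectl get nodes --show-labels

# Or show only the labels you care about as extra columns
kubectl get nodes -L kubernetes.io/hostname,topology.kubernetes.io/zone

# Add a custom label to a node, and remove it again with a trailing dash
kubectl label node backstage-f8wx9 disktype=ssd
kubectl label node backstage-f8wx9 disktype-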

Another useful feature of matchExpressions is the ability to combine multiple expressions, which are evaluated with a logical AND. For example, you can specify that a Pod should only be scheduled on nodes that meet several criteria simultaneously. Here is a snippet that restricts Nginx to nodes that have a backstage hostname and the disktype=ssd label at the same time. If backstage-f8wx9 is the only machine carrying the disktype=ssd label, the Pods will be scheduled on that node and nowhere else.

            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - backstage-f8wx9
              - key: disktype
                operator: In
                values:
                - ssd
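
Note that this AND behavior applies to expressions listed under the same matchExpressions block. Separate entries under nodeSelectorTerms, on the other hand, are treated as alternatives (logical OR), so a sketch like the following would match nodes that satisfy either term:

            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - backstage-f8wx9
            - matchExpressions:
              - key: disktype
                operator: In
                values:
                - ssd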
