VPA Modes

In the previous example, we used the Auto mode, which is set through the updateMode field under updatePolicy:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: stateless-flask-vpa
  namespace: stateless-flask
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: stateless-flask
  updatePolicy:
    # This is the mode we used
    updateMode: "Auto"

The Vertical Pod Autoscaler (VPA) offers four update modes. Each mode defines whether and when the VPA applies its recommendations to the resource requests and limits of your Pods.
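
As a quick orientation, the mode is a single string under updatePolicy. The fragment below is a sketch that reuses the names from the manifest above. The four values in the comment are the ones documented by the upstream VPA project, and "Off" is chosen here only as an illustration of a mode that computes recommendations without touching running Pods.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: stateless-flask-vpa
  namespace: stateless-flask
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stateless-flask
  updatePolicy:
    # Accepted values: "Off", "Initial", "Recreate", "Auto"
    # "Off" only records recommendations; it never changes Pods.
    updateMode: "Off"
```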

Auto

In Auto mode, the VPA automatically updates the requests and limits of Pods during both scale-up and scale-down events.

The VPA adjusts resources in both directions:

Scale-up: When the VPA detects that a Pod is consistently using more CPU or memory than requested, it increases the Pod’s resource requests and limits.
Scale-down: When a Pod is underusing its allocated resources, the VPA can reduce its requests and limits. This frees up cluster resources for other workloads.
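
To keep scale-up and scale-down within a known range, the VPA spec accepts an optional resourcePolicy. The following is a minimal sketch, not the configuration used in the earlier example: the minAllowed and maxAllowed numbers are placeholder assumptions and should be chosen from your own workload's observed usage.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: stateless-flask-vpa
  namespace: stateless-flask
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stateless-flask
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      # "*" applies the policy to every container in the Pod
      - containerName: "*"
        controlledResources: ["cpu", "memory"]
        # Floor for scale-down (illustrative values)
        minAllowed:
          cpu: 100m
          memory: 128Mi
        # Ceiling for scale-up (illustrative values)
        maxAllowed:
          cpu: "1"
          memory: 1Gi
```

With bounds like these, the VPA still raises or lowers requests as described above, but its recommendations are capped to the stated range.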

It’s useful for scenarios where you want resource requirements adjusted dynamically, without manual intervention. In current VPA releases, Auto mode applies new values by evicting Pods so that their controller recreates them with the updated requests, so workloads that require high availability should run several replicas to absorb these restarts. It’s particularly suitable for stateless applications that can gracefully handle updates.
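
Because the VPA updater applies changes through the Kubernetes eviction API, a PodDisruptionBudget is one way to limit how many replicas are restarted at the same time. The sketch below is an assumption-laden example: the name stateless-flask-pdb, the label app: stateless-flask, and the minAvailable value are hypothetical and must match your own Deployment.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: stateless-flask-pdb        # hypothetical name
  namespace: stateless-flask
spec:
  # Keep at least one Pod serving while others are evicted
  minAvailable: 1
  selector:
    matchLabels:
      app: stateless-flask         # assumed Pod label; adjust to your Deployment
```

This only helps when the Deployment runs more than one replica. With a single replica and minAvailable: 1, evictions are blocked and the VPA cannot apply its updates to the running Pod.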

Initial

In Initial mode, the VPA updates the requests
