Cloud-Native Microservices With Kubernetes - 2nd Edition

A Comprehensive Guide to Building, Scaling, Deploying, Observing, and Managing Highly-Available Microservices in Kubernetes

Autoscaling Microservices in Kubernetes: Vertical Scaling

VPA Limitations

The VPA has several limitations to consider before relying on it to scale your workloads vertically. The main ones are:

  • Recreating Pods when updating resources: Whenever the VPA updates Pod resources, the affected Pods are deleted and recreated, which restarts every container in those Pods.

  • Pod recreation guarantee limitations: When running in Auto or Recreate mode, the VPA cannot guarantee that the Pods it evicts or deletes to apply recommendations will be successfully recreated. Recreation can fail for various reasons, such as insufficient cluster resources or a user or application misconfiguration that prevents containers from starting.

  • Limitations for non-controller Pods: The VPA does not evict Pods that are not managed by a controller (standalone Pods). For such Pods, Auto mode behaves the same as Initial.

  • Incompatibility with HPA on CPU or memory: The VPA should not be used together with the HPA when the HPA is configured to scale on CPU or memory. If the HPA uses only custom or external metrics, the two can be combined.

  • Incomplete handling of OOM events: The VPA reacts to most OOM events, but not all of them, so Pods may still be OOM-killed even when the VPA is in use.

  • Performance limitations in large clusters: VPA performance has not been thoroughly tested in very large clusters, and there is no defined threshold for what counts as a "large cluster." In general, the more Pods and nodes your cluster has, the more likely you are to run into performance limitations, so testing against your own workloads is recommended.

  • Recommendations exceeding available resources: VPA recommendations might exceed available resources (for example, node size, available capacity, or quotas), which can cause Pods to remain in a Pending state. This limitation can be partially mitigated by using the VPA together with the Cluster Autoscaler.

  • Undefined behavior with multiple VPA resources for the same Pod: Creating multiple VPA resources that target the same Pod leads to undefined behavior. It is strongly recommended that each Pod be targeted by only one VPA resource.

  • Number of replicas requirement: By default, the VPA applies changes only when there are at least two replicas of a Pod. You can modify this behavior by setting the minReplicas field in the updatePolicy section of the VPA spec.

Example:
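As a minimal sketch of the minReplicas setting described in the last item above, the manifest below lets the VPA apply changes to a workload that runs a single replica. The names my-app and my-app-vpa are hypothetical, and the manifest assumes that the VPA custom resource definitions (API group autoscaling.k8s.io/v1) are installed in the cluster and that a Deployment named my-app exists in the same namespace.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa            # hypothetical name, for illustration only
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app              # hypothetical Deployment managed by this VPA
  updatePolicy:
    updateMode: "Recreate"    # evict Pods so they are recreated with new requests
    minReplicas: 1            # allow updates even when only one replica is running
```

With minReplicas set to 1, the only running replica can be evicted, so the workload is briefly unavailable while its Pod is recreated. Whether that trade-off is acceptable depends on the workload.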
