Adopt Karpenter Consolidation without Disrupting Critical Workloads

Introduction

Autoscaling in Kubernetes, particularly in cloud-hosted Kubernetes like Amazon EKS, comes in two flavors:

  • Autoscale the app/workload within the cluster, using Horizontal / Vertical Pod Autoscalers (HPA / VPA)
  • Autoscale the cluster itself, by adding / removing worker nodes automatically as needed

The Kubernetes Cluster Autoscaler is the go-to solution for the second kind of autoscaling. Karpenter is a more capable alternative to Cluster Autoscaler. Both watch for pods that are pending due to a lack of resources & provision nodes to meet those pods' requirements.

Karpenter can also ensure your cluster runs at max efficiency & min cost at all times by:

  • Auto-deleting empty nodes
  • Deleting nodes whose workloads can run on other nodes
  • Replacing nodes with lower-priced variants when possible

This is called “consolidation”.
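
As a rough sketch, these behaviors map onto the consolidationPolicy field of a NodePool. The fragment below assumes the karpenter.sh/v1beta1 API used elsewhere in this article. The NodePool name is hypothetical & spec.template is omitted, so it will not apply as-is.

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: empty-only              # hypothetical name
spec:
  # spec.template omitted for brevity
  disruption:
    # WhenEmpty: only remove nodes that run no workload pods
    consolidationPolicy: WhenEmpty
    consolidateAfter: 30s       # how long a node must stay empty before removal
    # WhenUnderutilized (instead of WhenEmpty) would also allow deleting or
    # replacing nodes whose pods fit elsewhere or on cheaper instances
```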

Although great in theory, Karpenter consolidation should not be enabled unless your workloads can tolerate ad hoc disruptions. To consolidate the cluster, Karpenter can remove a node, and evict the pods running on it, at any time. Karpenter does, however, provide several mechanisms to control this disruption behavior.

Time-Bound Consolidation

Karpenter disruption budgets can be used to rate limit Karpenter’s disruption of nodes in 3 ways:

  • Define a max percentage of nodes that can be disrupted at a time
  • Define a max count of nodes that can be disrupted at a time
  • Define a max % or count of nodes that can be disrupted in a time window
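
A minimal sketch of the first two kinds of budget, assuming the same budgets field of the v1beta1 NodePool used in this article. The values 20% & 5 are illustrative, not recommendations:

```yaml
budgets:
- nodes: "20%"   # at most 20% of this NodePool's nodes disrupted at once
- nodes: "5"     # and never more than 5 nodes at once
```

With both entries present, the stricter limit applies at any given moment.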

Disruption budgets are defined in the Karpenter NodePool manifest. For example, this budget pauses all disruption from 3 to 6 AM UTC on Saturdays:

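apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  # spec.template (node requirements, nodeClassRef) omitted for brevity
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: Never
    budgets:
    - nodes: "0"               # allow zero nodes to be disrupted
      schedule: "0 3 * * sat"  # window opens at 3 AM UTC every Saturday
      duration: 3h             # and stays open for 3 hours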

schedule is a cron expression that defines the start of the time window. duration defines how long the budget stays active after each scheduled start.

When multiple budgets are defined, the most restrictive one takes effect. Because a budget defines when the cluster should not consolidate, allowing consolidation only in a specific window means blocking every hour outside it. So if you prefer to consolidate your cluster only once a week, between 3 and 6 AM UTC on Saturdays, your budget should define a time window covering all hours of the week except 3 to 6 AM on Saturdays:

budgets:                     # of the 168 hours in a week
- nodes: "0"                 # pause consolidation
  duration: 165h             # for 165 hours
  schedule: "0 6 * * sat"    # starting at 6 AM UTC every Saturday
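
The same pattern can be adapted to other windows. As an illustrative sketch, assuming business hours run from 9 AM to 5 PM UTC on weekdays (adjust to your own hours), this blocks disruption during the day & allows up to 10% of nodes to be disrupted the rest of the time:

```yaml
budgets:
- nodes: "10%"                  # limit that applies outside the window below
- nodes: "0"                    # no disruption at all
  schedule: "0 9 * * mon-fri"   # starting at 9 AM UTC, Monday to Friday
  duration: 8h                  # until 5 PM UTC
```

During the weekday window both budgets are active, so the more restrictive "0" wins. Outside it, only the 10% budget applies.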

Protect Critical Workloads

Depending on your workload characteristics & availability requirements, you can let the cluster consolidate nightly, weekly, and so on. For pods that must not be disrupted even when consolidation is allowed, such as nightly batch jobs, annotate them with:

karpenter.sh/do-not-disrupt: "true"

The same annotation can be applied to a node that should not be disrupted.
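
For pods created by a controller, the annotation belongs on the pod template, not on the controller object itself. A minimal sketch for a batch Job follows. The Job name, image & command are placeholders:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report                        # hypothetical name
spec:
  template:
    metadata:
      annotations:
        karpenter.sh/do-not-disrupt: "true"   # set on the pod, not on the Job
    spec:
      restartPolicy: Never
      containers:
      - name: report
        image: busybox:1.36                   # placeholder image
        command: ["sh", "-c", "echo running batch job && sleep 3600"]
```

For a node that already exists, kubectl annotate node <node-name> karpenter.sh/do-not-disrupt=true applies the same annotation.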

Conclusion

This article was an introduction to Karpenter & its consolidation feature. Since not all workloads are “built for the cloud”, disrupting them ad hoc may cause availability issues. Here we learned how to reap the cost & efficiency benefits of consolidation, while still avoiding end-user interruptions during business hours.

About the Author ✍🏻

Harish KM is a Principal DevOps Engineer at QloudX & a top-ranked AWS Ambassador since 2020. 👨🏻‍💻

With over a decade of industry experience as everything from a full-stack engineer to a cloud architect, Harish has built many world-class solutions for clients around the world! 👷🏻‍♂️

With over 20 certifications in cloud (AWS, Azure, GCP), containers (Kubernetes, Docker) & DevOps (Terraform, Ansible, Jenkins), Harish is an expert in a multitude of technologies. 📚

These days, his focus is on the fascinating world of DevOps & how it can transform the way we do things! 🚀
