Terraform Module for a Ready-to-Use Amazon EKS Cluster with EKS Fargate, AWS IRSA, & Karpenter with Spot Nodes & ABS

Introduction

I recently spent a few days writing the “perfect” Terraform module for a complete, end-to-end, ready-to-use EKS cluster, with a number of best practices & optimizations built in. This is, of course, a very subjective topic, since your needs will clearly vary from ours. This module, for example, uses AWS IRSA for all service accounts, includes Karpenter for autoscaling, & configures Karpenter to use spot nodes that provide a specific range of resources (vCPU & memory) that suit our (flexible) workloads well.

Although writing a piece of code like this should be as simple as copying your source Terraform module’s examples & tweaking them, that certainly wasn’t the case here. With all the assembly, fine-tuning, & trial & error that went into this, I thought it best to document it for everyone else in a similar situation. The complete code for this setup is hosted on our GitHub.

EKS Cluster

Let’s start with the EKS cluster. Here is some minimal configuration to get started:

module "eks_cluster" {
  source = "terraform-aws-modules/eks/aws"

  cluster_name    = var.eks_cluster_name
  cluster_version = var.eks_cluster_version

  vpc_id     = var.vpc_id
  subnet_ids = var.private_subnets

  control_plane_subnet_ids       = var.public_subnets
  cluster_endpoint_public_access = true

  cluster_enabled_log_types = [] # Disable logging
  cluster_encryption_config = {} # Disable secrets encryption
}

Cluster logging & encryption are disabled here. Logging can be enabled by adding the desired log types to the cluster_enabled_log_types list. Encryption can be enabled by simply leaving out the cluster_encryption_config line; the module will then create a new KMS CMK & use it to encrypt all EKS secrets.
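For instance, a minimal sketch of what re-enabling both might look like (the log type names shown are the standard control plane log types; pick the ones you need):

module "eks_cluster" {
  ...

  # Enable the control plane log types you care about
  cluster_enabled_log_types = ["api", "audit", "authenticator"]

  # No cluster_encryption_config here: the module's default kicks in,
  # creating a new KMS CMK & encrypting all EKS secrets with it
}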

EKS Fargate

The plan here is to run all workloads on Karpenter-managed spot nodes. But since Karpenter itself is a deployment, it needs somewhere to run before it can start provisioning nodes for other workloads. So we create 2 Fargate profiles: 1 for Karpenter & another for the foundational cluster addon CoreDNS in kube-system:

module "eks_cluster" {
  ...

  # Fargate profiles use the cluster's primary security group
  # ...so these are never utilized:
  create_cluster_security_group = false
  create_node_security_group    = false

  fargate_profiles = {
    kube-system = {
      selectors = [
        { namespace = "kube-system" }
      ]
    }
    karpenter = {
      selectors = [
        { namespace = "karpenter" }
      ]
    }
  }
}

AWS IRSA

Let us now create IAM roles for service accounts (IRSAs) for all the cluster addons we plan to use. First, the VPC CNI:

module "vpc_cni_irsa" {
  source    = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  role_name = "${var.eks_cluster_name}-EKS-IRSA-VPC-CNI"
  tags      = var.tags

  attach_vpc_cni_policy = true
  vpc_cni_enable_ipv4   = true

  oidc_providers = {
    cluster-oidc-provider = {
      provider_arn               = module.eks_cluster.oidc_provider_arn
      namespace_service_accounts = ["kube-system:aws-node"]
    }
  }
}

The other 2 foundational addons, kube-proxy & CoreDNS, don’t need any AWS access, so we’ll create an IRSA for them that denies all permissions:

module "deny_all_irsa" {
  source    = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  role_name = "${var.eks_cluster_name}-EKS-IRSA-DenyAll"
  tags      = var.tags

  role_policy_arns = {
    policy = "arn:aws:iam::aws:policy/AWSDenyAll"
  }

  oidc_providers = {
    cluster-oidc-provider = {
      provider_arn               = module.eks_cluster.oidc_provider_arn
      namespace_service_accounts = []
    }
  }
}

There’s no need to create a Karpenter IRSA here; the Karpenter Terraform module we use later will do that for us.

Cluster Addons

With the required IRSAs ready, we can now add the addons to the cluster:

module "eks_cluster" {
  ...

  cluster_addons = {
    kube-proxy = {
      most_recent = true

      resolve_conflicts_on_create = "OVERWRITE"
      resolve_conflicts_on_update = "OVERWRITE"
      service_account_role_arn    = module.deny_all_irsa.iam_role_arn
    }

    vpc-cni = {
      most_recent = true

      resolve_conflicts_on_create = "OVERWRITE"
      resolve_conflicts_on_update = "OVERWRITE"
      service_account_role_arn    = module.vpc_cni_irsa.iam_role_arn
    }

    coredns = {
      most_recent = true

      resolve_conflicts_on_create = "OVERWRITE"
      resolve_conflicts_on_update = "OVERWRITE"
      service_account_role_arn    = module.deny_all_irsa.iam_role_arn

      configuration_values = jsonencode({
        computeType = "Fargate"
      })
    }
  }
}

Note that we have configured CoreDNS to run on Fargate.
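If you’d rather pin addons to an explicit version instead of most_recent = true, a sketch using the aws_eks_addon_version data source could look like this (CoreDNS shown; the same pattern applies to the other addons):

data "aws_eks_addon_version" "coredns" {
  addon_name         = "coredns"
  kubernetes_version = var.eks_cluster_version
  most_recent        = true # set to false to get the default version instead
}

module "eks_cluster" {
  ...

  cluster_addons = {
    coredns = {
      addon_version = data.aws_eks_addon_version.coredns.version
      ...
    }
  }
}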

Kubernetes Providers

To install Karpenter, modify the aws-auth ConfigMap, & create Karpenter provisioners & node templates, you’ll need the Kubernetes, Helm & kubectl Terraform providers, so here they are:

data "aws_eks_cluster_auth" "my_cluster" {
  name = module.eks_cluster.cluster_name
}

provider "kubernetes" {
  host  = module.eks_cluster.cluster_endpoint
  token = data.aws_eks_cluster_auth.my_cluster.token

  cluster_ca_certificate = base64decode(module.eks_cluster.cluster_certificate_authority_data)
}

provider "helm" {
  kubernetes {
    host  = module.eks_cluster.cluster_endpoint
    token = data.aws_eks_cluster_auth.my_cluster.token

    cluster_ca_certificate = base64decode(module.eks_cluster.cluster_certificate_authority_data)
  }
}

provider "kubectl" {
  host  = module.eks_cluster.cluster_endpoint
  token = data.aws_eks_cluster_auth.my_cluster.token

  load_config_file       = false
  cluster_ca_certificate = base64decode(module.eks_cluster.cluster_certificate_authority_data)
}
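Note that the kubectl provider isn’t an official HashiCorp provider, so it needs an explicit source in required_providers. A sketch, assuming the commonly used gavinbunney/kubectl provider (the version constraints are illustrative):

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 4.47"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = ">= 2.17"
    }
    helm = {
      source  = "hashicorp/helm"
      version = ">= 2.8"
    }
    kubectl = {
      source  = "gavinbunney/kubectl"
      version = ">= 1.14"
    }
  }
}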

Install Karpenter

First, we create all Karpenter-related resources using the Karpenter module:

module "karpenter" {
  source       = "terraform-aws-modules/eks/aws//modules/karpenter"
  tags         = var.tags
  cluster_name = module.eks_cluster.cluster_name

  irsa_oidc_provider_arn       = module.eks_cluster.oidc_provider_arn
  iam_role_additional_policies = ["arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"]

  policies = {
    AmazonSSMManagedInstanceCore = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
  }
}

Then we actually install the Karpenter Helm chart:

data "aws_ecrpublic_authorization_token" "ecr_auth_token" {}

resource "helm_release" "karpenter" {
  namespace        = "karpenter"
  create_namespace = true

  name       = "karpenter"
  repository = "oci://public.ecr.aws/karpenter"
  chart      = "karpenter"
  version    = var.karpenter_version

  repository_username = data.aws_ecrpublic_authorization_token.ecr_auth_token.user_name
  repository_password = data.aws_ecrpublic_authorization_token.ecr_auth_token.password

  lifecycle {
    ignore_changes = [repository_password]
  }

  set {
    name  = "settings.aws.clusterName"
    value = module.eks_cluster.cluster_name
  }

  set {
    name  = "settings.aws.clusterEndpoint"
    value = module.eks_cluster.cluster_endpoint
  }

  set {
    name  = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
    value = module.karpenter.irsa_arn
  }

  set {
    name  = "settings.aws.defaultInstanceProfile"
    value = module.karpenter.instance_profile_name
  }

  set {
    name  = "settings.aws.interruptionQueueName"
    value = module.karpenter.queue_name
  }
}

Also remember to update the EKS module to:

  • Tag the security groups so Karpenter can discover them
  • Add Karpenter’s node role to aws-auth so the nodes it launches can join the cluster

module "eks_cluster" {
  ...

  tags = merge(var.tags, {
    "karpenter.sh/discovery" = var.eks_cluster_name
  })

  manage_aws_auth_configmap = true

  aws_auth_roles = [
    {
      rolearn  = module.karpenter.role_arn
      username = "system:node:{{EC2PrivateDNSName}}"
      groups   = ["system:nodes", "system:bootstrappers"]
    },

    # Add your org roles here to allow them cluster access
  ]
}
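For instance, a hypothetical admin role mapping (the account ID & role name below are placeholders, not part of this setup) could be appended to the list like so:

module "eks_cluster" {
  ...

  aws_auth_roles = [
    {
      rolearn  = module.karpenter.role_arn
      username = "system:node:{{EC2PrivateDNSName}}"
      groups   = ["system:nodes", "system:bootstrappers"]
    },

    # Hypothetical org role: swap in your own admin role ARN
    {
      rolearn  = "arn:aws:iam::111122223333:role/PlatformAdmins"
      username = "platform-admins"
      groups   = ["system:masters"]
    }
  ]
}

Anyone assuming that IAM role can then run kubectl against the cluster as a full admin; scope the groups down for less privileged access.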

Configure Karpenter

This Karpenter provisioner only requests spot capacity. It also implements attribute-based instance type selection (ABS): it only picks instance types within a certain range of vCPUs & memory:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  providerRef:
    name: default
  consolidation:
    enabled: true
  requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot"]
  # Only pick nodes with 4-16 vCPUs
  - key: karpenter.k8s.aws/instance-cpu
    operator: Gt
    values: ['3']
  - key: karpenter.k8s.aws/instance-cpu
    operator: Lt
    values: ['17']
  # Only pick nodes with 8-32G memory
  - key: karpenter.k8s.aws/instance-memory
    operator: Gt
    values: ['7168'] # 7G
  - key: karpenter.k8s.aws/instance-memory
    operator: Lt
    values: ['33792'] # 33G

To learn more about ABS in Karpenter, see the Karpenter documentation on provisioner requirements.

You’ll also need an AWS node template so Karpenter knows how to place & configure the nodes it launches:

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    karpenter.sh/discovery: ${eks_cluster_name}
  securityGroupSelector:
    karpenter.sh/discovery: ${eks_cluster_name}
  tags:
    %{ for key, val in tags ~}
    ${key}: ${val}
    %{ endfor ~}
    karpenter.sh/discovery: ${eks_cluster_name}
  blockDeviceMappings:
  - deviceName: /dev/xvda
    ebs:
      encrypted: true
      volumeType: gp3
      volumeSize: 100Gi # Change to suit your app's needs
      deleteOnTermination: true

And finally, we create the provisioner & node template in the cluster:

resource "kubectl_manifest" "karpenter_provisioner" {
  depends_on = [helm_release.karpenter]
  yaml_body  = file("${path.module}/karpenter-provisioner.yaml")
}

resource "kubectl_manifest" "karpenter_node_template" {
  depends_on = [helm_release.karpenter]

  yaml_body = templatefile("${path.module}/karpenter-node-template.yaml", {
    tags             = var.tags
    eks_cluster_name = module.eks_cluster.cluster_name
  })
}
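To verify the whole chain, you can deploy a small test workload & watch Karpenter bring up a spot node for it. A minimal sketch (the deployment name, image, & resource requests are arbitrary test values):

resource "kubectl_manifest" "karpenter_test_deployment" {
  depends_on = [
    kubectl_manifest.karpenter_provisioner,
    kubectl_manifest.karpenter_node_template,
  ]

  yaml_body = <<-YAML
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: inflate
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: inflate
      template:
        metadata:
          labels:
            app: inflate
        spec:
          containers:
          - name: inflate
            image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
            resources:
              requests:
                cpu: "1"
                memory: 1Gi
  YAML
}

Once the pods are pending, kubectl get nodes should show a new spot node within a couple of minutes; scale the deployment back to zero & Karpenter’s consolidation will remove the node again.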

Conclusion

In this post, you learnt how to provision an EKS cluster with Terraform, complete with properly configured cluster addons running on Fargate with IRSA, Karpenter for autoscaling, & cost-optimized spot nodes pre-configured with ABS.

About the Author ✍🏻

Harish KM is a Principal DevOps Engineer at QloudX & a top-ranked AWS Ambassador since 2020. 👨🏻‍💻

With over a decade of industry experience as everything from a full-stack engineer to a cloud architect, Harish has built many world-class solutions for clients around the world! 👷🏻‍♂️

With over 20 certifications in cloud (AWS, Azure, GCP), containers (Kubernetes, Docker) & DevOps (Terraform, Ansible, Jenkins), Harish is an expert in a multitude of technologies. 📚

These days, his focus is on the fascinating world of DevOps & how it can transform the way we do things! 🚀
