AKS cost optimization: the control plane is free, the nodes are not
The AKS control plane is free (or ~$73/month for the SLA tier), but worker node VMs are the real bill. Here's how to cut AKS cost with the autoscaler, Spot node pools, right-sizing, reservations, and scale-to-zero.
Quick answer
The AKS control plane is free (or ~$73/month for the Standard uptime-SLA tier), but that's not your bill — the worker node VMs are. Three Standard_D4s_v5 nodes run ~$500/month. Cut AKS cost by enabling the cluster autoscaler, adding a Spot node pool for fault-tolerant pods (up to 90% off), right-sizing the VM family, reserving the steady baseline, and scaling non-prod to zero off-hours.
AKS has a reputation for being "basically free" because the control plane has a free tier. That framing hides the actual cost: every AKS cluster bills as a set of Virtual Machine Scale Set nodes, and those nodes follow the same pricing as any other Azure VM. The cluster fee is at most ~$73/month; the nodes can run to thousands of dollars a month. Optimizing AKS is really optimizing the node pools.
Where the money goes
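- Control plane: Free tier ($0, no SLA) or Standard tier (~$0.10/hour, ~$73/month, with a financially backed uptime SLA of 99.95% for clusters that use availability zones and 99.9% for those that don't) per cluster.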
- Node pools (the bill): node count × the VM size's hourly rate. A Standard_D4s_v5 (4 vCPU, 16 GB) is ~$0.23/hour, so three of them is ~$500/month before disks.
- Managed disks: each node's OS disk, plus any persistent volumes, billed per tier.
- Networking: the cluster load balancer, public IPs, and egress bandwidth.
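The node-pool arithmetic above can be sketched in a few lines. This is a rough model, not a pricing tool: it uses the ~$0.23/hour and ~$0.10/hour figures quoted in this post and assumes a 730-hour month, and the function name is illustrative. Disks and networking are left out.

```python
HOURS_PER_MONTH = 730  # common billing approximation: 8,760 hours / 12


def monthly_cluster_cost(node_count, node_hourly_rate, control_plane_hourly=0.0):
    """Return (node_cost, control_plane_cost) in dollars per month."""
    node_cost = node_count * node_hourly_rate * HOURS_PER_MONTH
    control_plane_cost = control_plane_hourly * HOURS_PER_MONTH
    return node_cost, control_plane_cost


# Three Standard_D4s_v5 nodes on the Standard control-plane tier
nodes, plane = monthly_cluster_cost(3, 0.23, control_plane_hourly=0.10)
print(f"nodes: ${nodes:,.0f}/month, control plane: ${plane:,.0f}/month")
# prints: nodes: $504/month, control plane: $73/month
```

Even with the paid tier, the nodes are roughly seven times the control-plane fee in this small example, and the gap widens with every node added.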
1. Turn on the cluster autoscaler
The most common failure mode is a node pool sized for peak with autoscaling off, or with min and max set equal, which pays for peak capacity 24/7. The cluster autoscaler adds nodes when pods are pending for lack of capacity and removes nodes that stay underutilized, typically cutting node spend 30-50% on workloads that aren't flat. Set the min to what system pods and steady traffic need, and the max high enough to absorb bursts.
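To see where a figure in that range can come from, here is an idealized sketch that compares a pool pinned at peak with one that tracks demand hour by hour. The demand profile is hypothetical, and the model assumes the node count follows demand instantly, which the real autoscaler does not (it reacts to pending pods and waits before scaling down).

```python
import math


def node_hours(demand_by_hour, min_nodes, max_nodes):
    """Node-hours consumed when node count tracks demand within [min, max]."""
    return sum(
        min(max(math.ceil(d), min_nodes), max_nodes) for d in demand_by_hour
    )


# Hypothetical day: 2 nodes of demand overnight, 6 at the daytime peak, 3 in the evening
demand = [2] * 8 + [6] * 8 + [3] * 8

fixed = 6 * len(demand)  # pool pinned at peak size all day
scaled = node_hours(demand, min_nodes=2, max_nodes=8)
savings = 1 - scaled / fixed

print(f"fixed: {fixed} node-hours, autoscaled: {scaled} node-hours, saved: {savings:.0%}")
# prints: fixed: 144 node-hours, autoscaled: 88 node-hours, saved: 39%
```

A flatter profile saves less and a spikier one saves more, which is why the benefit depends so heavily on how uneven the workload is.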
2. Run a Spot node pool for interruptible work
A second user node pool with the Spot priority runs evictable VMs at up to ~90% off pay-as-you-go. Batch jobs, CI runners, and stateless services that tolerate eviction belong there; keep system and stateful pods on a regular pool. AKS taints Spot nodes with kubernetes.azure.com/scalesetpriority=spot:NoSchedule, so only pods that carry a matching toleration are scheduled onto them. This mirrors the spot-instance trade-off on AWS: huge discount, must tolerate interruption.
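A quick way to reason about the trade-off is a blended hourly rate across regular and Spot nodes. The sketch below assumes a 70% Spot discount purely for illustration; actual Spot prices vary by region, VM size, and time, and the ~90% figure is an upper bound. The function name is illustrative.

```python
def blended_hourly_cost(total_nodes, spot_fraction, on_demand_rate, spot_discount):
    """Hourly cost of a fleet where a fraction of nodes run at a Spot discount."""
    spot_nodes = total_nodes * spot_fraction
    regular_nodes = total_nodes - spot_nodes
    spot_rate = on_demand_rate * (1 - spot_discount)
    return regular_nodes * on_demand_rate + spot_nodes * spot_rate


all_regular = blended_hourly_cost(10, 0.0, 0.23, 0.70)
half_spot = blended_hourly_cost(10, 0.5, 0.23, 0.70)  # 0.70 is an assumed discount

print(f"all regular: ${all_regular:.2f}/hour, half Spot: ${half_spot:.3f}/hour")
print(f"saved: {1 - half_spot / all_regular:.0%}")
# saved: 35%
```

Moving half the fleet to Spot at that assumed discount trims about a third of the compute bill, without putting system or stateful pods at risk of eviction.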
3. Right-size the VM family
Match the node SKU to whether pods are CPU-, memory-, or burst-bound. A memory-heavy workload on compute-optimized nodes fills each node's RAM long before its CPU, so you pay for cores that sit idle; a CPU-heavy workload on memory-optimized nodes strands RAM the same way. B-series burstable nodes suit dev clusters; D-series for general production; E-series for memory-bound services. See Azure VM types compared for the family breakdown.
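The effect is easy to model: the number of nodes you need is set by whichever resource runs out first. In the sketch below the vCPU and memory shapes are the published 4-vCPU sizes, the D4s_v5 rate is the ~$0.23/hour quoted in this post, and the other two hourly rates are placeholders chosen for illustration. It also ignores the CPU and memory that Kubernetes reserves on each node, so real node counts will be somewhat higher.

```python
import math

# sku: (vCPU, GiB RAM, $/hour). F4s_v2 and E4s_v5 prices are placeholders.
SKUS = {
    "Standard_F4s_v2": (4, 8, 0.17),   # compute-optimized
    "Standard_D4s_v5": (4, 16, 0.23),  # general purpose
    "Standard_E4s_v5": (4, 32, 0.30),  # memory-optimized
}


def nodes_needed(total_cpu, total_mem_gib, sku):
    """Nodes required to fit the summed pod requests; the scarcer resource wins."""
    vcpu, mem, _ = SKUS[sku]
    return max(math.ceil(total_cpu / vcpu), math.ceil(total_mem_gib / mem))


def cheapest_sku(total_cpu, total_mem_gib):
    """SKU with the lowest total hourly cost for the given requests."""
    def hourly(sku):
        return nodes_needed(total_cpu, total_mem_gib, sku) * SKUS[sku][2]
    return min(SKUS, key=hourly)


print(cheapest_sku(16, 200))  # memory-heavy: 16 vCPU, 200 GiB requested
print(cheapest_sku(64, 96))   # CPU-heavy: 64 vCPU, 96 GiB requested
```

With these assumed prices the memory-heavy workload lands on the E-series (7 nodes instead of 25 on the F-series) and the CPU-heavy one on the F-series, even though the per-node price ordering is the same in both cases.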
4. Commit the baseline, scale-to-zero the rest
For the nodes that genuinely run 24/7, a 1- or 3-year reservation or savings plan typically takes 40-60% off the pay-as-you-go rate. For everything else, user node pools can scale to zero: a batch pool that runs for two hours each night costs nothing for the other 22. Non-production clusters can be stopped entirely off-hours on a schedule.
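The two halves of this step can be put side by side with a small calculation. The node counts, the 40% discount (the low end of the range above), the two-hour nightly window, and the 30-day month are all assumptions for illustration; the ~$0.23/hour rate is the one quoted earlier in this post.

```python
HOURS_PER_MONTH = 730  # common billing approximation


def baseline_monthly(nodes, hourly_rate, reservation_discount=0.0):
    """Monthly cost of always-on nodes, optionally at a committed-use discount."""
    return nodes * hourly_rate * (1 - reservation_discount) * HOURS_PER_MONTH


def batch_monthly(nodes, hourly_rate, hours_per_day, days=30):
    """Monthly cost of a pool that only exists for part of each day."""
    return nodes * hourly_rate * hours_per_day * days


payg = baseline_monthly(3, 0.23)
reserved = baseline_monthly(3, 0.23, reservation_discount=0.40)
batch_always_on = batch_monthly(4, 0.23, hours_per_day=24)
batch_scaled = batch_monthly(4, 0.23, hours_per_day=2)

print(f"baseline: ${payg:,.0f} pay-as-you-go vs ${reserved:,.0f} reserved")
print(f"batch pool: ${batch_always_on:,.0f} always on vs ${batch_scaled:,.0f} scaled to zero")
# baseline: $504 pay-as-you-go vs $302 reserved
# batch pool: $662 always on vs $55 scaled to zero
```

The commitment only pays off for nodes that really do run all month, which is why it is worth separating the steady baseline from the bursty remainder before buying anything.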
5. Consider Azure Hybrid Benefit on Windows node pools
If you run Windows containers, Windows nodes carry a per-core license surcharge. Azure Hybrid Benefit lets you apply existing licenses to those nodes — see Azure Hybrid Benefit explained. Linux node pools avoid the surcharge entirely.
FAQ
Does the AKS control plane cost anything?
On the Free tier the control plane is no-charge, but it carries no uptime SLA. The Standard tier adds a financially backed uptime SLA (99.95% for clusters that use availability zones) for about $0.10/hour (~$73/month) per cluster. Either way, the control plane is a rounding error next to the worker node VMs, which are where the AKS bill actually lives.
Why is my AKS cluster so expensive?
Because AKS bills as the underlying Virtual Machine Scale Set nodes, and node pools are usually sized for peak and left there. Three Standard_D4s_v5 nodes running 24/7 are roughly $500/month before disks, load balancers, and egress. Idle node headroom — pools that never scale down — is the most common AKS overspend.
How do I reduce AKS costs?
Enable the cluster autoscaler so node pools scale to real demand, run a Spot node pool for fault-tolerant workloads (up to ~90% off), right-size the VM family, buy reservations or savings plans for the steady baseline, and scale non-production clusters to zero off-hours. The autoscaler and Spot pools together are usually the biggest win.
Can AKS node pools scale to zero?
Yes. User node pools can scale to zero nodes when no pods need them, which is ideal for batch or burst workloads. The system node pool must keep at least one node for cluster services, so it can't reach zero.
Is AKS cheaper than running VMs directly?
AKS itself adds only the optional control-plane SLA fee; the compute is the same VM pricing you'd pay anyway. The savings come from bin-packing many workloads onto shared nodes and autoscaling them, which is hard to replicate with standalone VMs.
How does C3X estimate AKS cost?
C3X prices the azurerm_kubernetes_cluster default node pool and any azurerm_kubernetes_cluster_node_pool from their VM size and node count, plus the control-plane tier — so you see the node bill, not just the cluster fee, before deploying.
What to do next
Before you tune autoscaling or buy reservations, see the node bill for what it is. C3X prices the azurerm_kubernetes_cluster and its node pools from your Terraform — VM size, node count, and control-plane tier — so the "basically free" cluster shows its real monthly cost up front. The quickstart gets you there in minutes.