AKS VMSS node pools: 3 config models and real-world lessons with deallocated nodes


Intro

I recently ran into capacity issues in Azure and saw firsthand how AKS node pools backed by Virtual Machine Scale Sets (VMSS) behave when you use deallocated nodes to speed up start times. I’ll walk through the three node pool zone models documented by Microsoft, then share what really happened in my setup and how I worked around it.

  • The three AKS VMSS node-pool deployment methods
  • How deallocated nodes restart under each method
  • A practical, under-the-radar pattern to improve reliability

Zone-spanning node pools (single VMSS across all zones)

Azure lets you spread a single node pool across multiple availability zones by specifying all your desired zones with --zones. AKS automatically balances the number of nodes in each zone.

Information

Nodes are deployed and balanced across every zone you list in the --zones parameter.
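
As a rough sketch, creating a zone-spanning pool with the Azure CLI could look like the following. The resource group, cluster, and pool names are placeholders I made up, and the small `run` helper is only there so the snippet prints the command by default instead of touching a real subscription.

```shell
# Hypothetical names; replace with your own resource group and cluster.
RG="my-rg"
CLUSTER="my-aks"

# Dry run by default: print the command instead of calling Azure.
# Set DRY_RUN=0 to execute it for real.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

# One node pool (a single VMSS) spread across zones 1, 2 and 3.
run az aks nodepool add \
  --resource-group "$RG" \
  --cluster-name "$CLUSTER" \
  --name spanpool \
  --node-count 3 \
  --zones 1 2 3
```

Which zone numbers are available depends on the region and VM size, so treat `1 2 3` as an example rather than a given.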

Warning

If a zonal outage occurs, nodes within the affected zone might be impacted even though nodes in other zones stay healthy. And when you use deallocate mode, deallocated nodes restart only in their original zone, so they can stay offline during a zone’s capacity shortage.

Real story

I ran a zone-spanning pool with --scale-down-mode Deallocate. When Azure was capacity-constrained in one zone, the deallocated nodes in that zone never came back up, and AKS kept retrying them in the same constrained zone instead of starting capacity elsewhere. My jobs queued until capacity finally returned.

Zone-aligned node pools (VMSS pinned to specific zone[s])

You can also create one node pool per zone, pinning each pool to a single zone by passing only that zone to --zones.

Information

Each node pool handles only its assigned zone, giving you precise control over placement and latency.
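
A minimal sketch of the one-pool-per-zone setup, assuming three zones and made-up names (`my-rg`, `my-aks`, `zone1pool` and so on). The `add_zone_pool` function is my own helper, not something the Azure CLI provides, and the snippet only prints the commands unless you set `DRY_RUN=0`.

```shell
# Hypothetical names; replace with your own resource group and cluster.
RG="my-rg"
CLUSTER="my-aks"

# Dry run by default: print the command instead of calling Azure.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

# Create one pool pinned to the single zone passed as $1.
add_zone_pool() {
  run az aks nodepool add \
    --resource-group "$RG" \
    --cluster-name "$CLUSTER" \
    --name "zone${1}pool" \
    --node-count 1 \
    --zones "$1"
}

# One pool per zone: zone1pool, zone2pool, zone3pool.
for ZONE in 1 2 3; do
  add_zone_pool "$ZONE"
done
```

Each pool ends up with its own VMSS, so each one scales (and fails to scale) independently of the others.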

Warning

Deallocated nodes still restart only in their pinned zone. If that zone hits capacity or suffers an outage, your pool can’t recover until the zone heals.

Real story

We switched to three zone-aligned pools, thinking AKS would pick a healthy zone to spin up deallocated nodes. It didn’t. Each pool stayed in its own zone, and scaling failed when any one zone ran out of capacity.

Regional node pools (no availability zones)

When you omit the --zones parameter (or set it to null or an empty list), AKS creates a regional VMSS. Instances show up with a zone label of 0.

Information

Instances are regional and can be implicitly placed in any zone within the region, though there’s no guarantee of even spread.
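
For comparison, a regional pool is simply the same command without --zones. The sketch below also shows two ways I would check the result: the pool's `availabilityZones` property and the zone label on the nodes. Names are placeholders, and the snippet prints the commands unless `DRY_RUN=0` is set.

```shell
# Hypothetical names; replace with your own resource group and cluster.
RG="my-rg"
CLUSTER="my-aks"

# Dry run by default: print the command instead of calling Azure.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

# No --zones flag: AKS creates a regional (non-zonal) pool.
run az aks nodepool add \
  --resource-group "$RG" \
  --cluster-name "$CLUSTER" \
  --name regpool \
  --node-count 3

# Inspect the zones recorded on the pool; I would expect this to be
# empty for a pool created without --zones.
run az aks nodepool show \
  --resource-group "$RG" \
  --cluster-name "$CLUSTER" \
  --name regpool \
  --query availabilityZones

# Show the zone label for every node in the cluster.
run kubectl get nodes -L topology.kubernetes.io/zone
```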

Warning

In a full zonal outage, any or all instances might be affected because they aren’t tied to a specific zone.

Real story

My jobs were stateless, single-replica workloads. I removed zone assignments so the pool became regional. When deallocated nodes restarted, Azure placed them in whichever zone had capacity. Job reliability immediately improved.
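
Here is roughly what that pattern looks like as a command. As far as I know, zones are fixed when a node pool is created, so in practice this means adding a new pool without --zones and retiring the old one. The names are placeholders, `create_job_pool` is my own wrapper, and `--node-osdisk-type Managed` reflects my understanding that Deallocate mode does not work with ephemeral OS disks; treat that as an assumption to verify for your setup.

```shell
# Hypothetical names; replace with your own resource group and cluster.
RG="my-rg"
CLUSTER="my-aks"

# Dry run by default: print the command instead of calling Azure.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

# Regional pool (no --zones) whose scaled-down nodes are deallocated, not deleted.
create_job_pool() {
  run az aks nodepool add \
    --resource-group "$RG" \
    --cluster-name "$CLUSTER" \
    --name jobpool \
    --node-count 3 \
    --scale-down-mode Deallocate \
    --node-osdisk-type Managed
}

create_job_pool
```

The trade-off is the one described in the warning above: without zone pinning there is no guarantee about where instances land.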

Summary

| Model         | Zone resilience       | Deallocated node restart behavior              | Good for                           |
|---------------|-----------------------|------------------------------------------------|------------------------------------|
| Zone-spanning | Yes (auto-spread)     | Restarts in same zone – can stall if zone full | Stateless multi-zone workloads     |
| Zone-aligned  | Yes (fixed zone pins) | Restarts only in that zone – brittle if busy   | Strict zone isolation needs        |
| Regional      | Regional (no pinning) | Restarts anywhere region-wide (best odds)      | Stateless jobs and burst workloads |

Why deallocate mode matters

When you set --scale-down-mode Deallocate, scaled-down nodes are stopped (deallocated) but not deleted. That preserves their disks, including container images that were already pulled, so you avoid repeated image pulls and get much faster boots. On scale-up, the existing VMSS instances are started again instead of being rebuilt, cutting cold-start times dramatically.
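
If you already have a pool, a sketch of switching it to Deallocate mode and then exercising a scale-down and scale-up could look like this. The pool name `jobpool` and the node counts are made up for illustration, and the commands are only printed unless `DRY_RUN=0` is set.

```shell
# Hypothetical names; replace with your own resource group, cluster, and pool.
RG="my-rg"
CLUSTER="my-aks"
POOL="jobpool"

# Dry run by default: print the command instead of calling Azure.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

# Switch the existing pool from the default Delete mode to Deallocate.
run az aks nodepool update \
  --resource-group "$RG" --cluster-name "$CLUSTER" --name "$POOL" \
  --scale-down-mode Deallocate

# Scale down: the surplus nodes are stopped and kept, not deleted.
run az aks nodepool scale \
  --resource-group "$RG" --cluster-name "$CLUSTER" --name "$POOL" \
  --node-count 1

# Scale back up: the deallocated nodes are started again.
run az aks nodepool scale \
  --resource-group "$RG" --cluster-name "$CLUSTER" --name "$POOL" \
  --node-count 3
```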

The catch is zone capacity. If Azure can’t allocate in a node’s home zone, the deallocated node sits offline until that zone frees up. That’s what tripped me up until I switched to a regional pool.

Final take-away

  • Zone-spanning + Deallocate = risky when any zone hits capacity limits
  • Zone-aligned = predictable but brittle if your chosen zone is busy
  • Regional + Deallocate = unofficial but highly reliable for stateless job workloads



