Azure Architecture Blog·May 20, 2026

Achieving System-Level Performance in Azure IaaS Workloads

This article describes a system-level approach to performance in Azure IaaS: optimal performance results from compute, storage, and networking working together, not from optimizing individual resources in isolation. It highlights how Azure engineers the platform to deliver consistent, scalable performance for diverse workloads, including AI, cloud-native, and business-critical systems, by coordinating infrastructure capabilities.

Traditional approaches to cloud performance often involve simply provisioning more resources, like larger VMs or faster disks. However, modern workloads exhibit dynamic bottlenecks, where a system might be constrained by storage at one moment and network bandwidth the next. This necessitates a shift from resource-level optimization to a system-level approach, where performance is an outcome of how compute, storage, and networking interact and are coordinated.
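As a rough illustration of a dynamic bottleneck, the sketch below labels each sampling interval with whichever resource is closest to saturation. The utilization figures, the resource names, and the helpers `bottleneck` and `bottleneck_history` are assumptions made for this example; they do not come from any Azure API or from the original article.

```python
from collections import Counter

# Hypothetical utilization samples: fraction of provisioned capacity in use,
# one dict per sampling interval. The numbers are invented for illustration.
samples = [
    {"compute": 0.45, "storage": 0.92, "network": 0.30},
    {"compute": 0.50, "storage": 0.60, "network": 0.95},
    {"compute": 0.88, "storage": 0.40, "network": 0.35},
    {"compute": 0.40, "storage": 0.97, "network": 0.55},
]


def bottleneck(sample):
    """Return the resource with the highest utilization in one interval."""
    return max(sample, key=sample.get)


def bottleneck_history(samples):
    """Return the constraining resource for each interval, in order."""
    return [bottleneck(s) for s in samples]


history = bottleneck_history(samples)
counts = Counter(history)

print(history)               # which resource constrained each interval
print(counts.most_common())  # how often each resource was the constraint
```

In this made-up trace the constraint moves from storage to network to compute and back to storage, which is the pattern that makes single-resource upgrades unreliable.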

Rethinking Performance in the Cloud

Cloud performance today extends beyond peak speed to encompass consistency, scalability, and responsiveness under real-world conditions. Key dimensions include low tail latency (P99/P99.9) for user experience, high throughput, the ability to maintain performance as demand increases (scalability), and predictable performance under load (consistency). Equally important is "time-to-performance": how quickly infrastructure can be provisioned, scaled, or recovered, which determines how responsive a system is to change.
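To make the tail-latency dimension concrete, here is a minimal sketch that computes P50, P99, and P99.9 with the nearest-rank method and compares them with the mean. The `percentile` helper and the latency samples are assumptions for illustration; real measurements would come from your own telemetry, and other percentile definitions (for example, interpolating ones) give slightly different values.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of
    all samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # round() guards against floating-point noise such as 990.0000000000001,
    # which would otherwise push the rank up by one.
    rank = max(1, math.ceil(round(p / 100 * len(ordered), 9)))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds for 1000 requests:
# most are fast, a few are slow, and two are very slow.
latencies_ms = [2.0] * 985 + [40.0] * 13 + [900.0] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
p999 = percentile(latencies_ms, 99.9)

print(f"mean={mean_ms:.2f} ms  P50={p50} ms  P99={p99} ms  P99.9={p999} ms")
```

With these invented numbers the mean stays under 5 ms while P99 and P99.9 are far higher, which is why an average alone can hide the experience of the slowest requests.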

Holistic Performance View

When designing systems, consider performance as a multi-dimensional challenge, not just a raw speed metric. Focus on consistency, scalability, latency (especially tail latency), and the time it takes for your infrastructure to adapt to changes.

Optimizing for Diverse Workloads

  • AI Workloads: Require massive parallel compute, high-throughput data access, and low-latency communication. Azure uses platform acceleration (e.g., Azure Boost offloading I/O to dedicated hardware) and high-throughput storage (Blob Storage, ADLS) with optimized parallel access. Low-latency, high-bandwidth networking (ExpressRoute) ensures rapid data movement between distributed nodes.
  • Cloud-Native Applications (e.g., Kubernetes): Demand dynamic scaling. Azure Container Storage enables Kubernetes workloads to use local NVMe disks for sub-millisecond latency. Advanced Container Networking Services (e.g., eBPF host routing in Cilium) improve datapath efficiency for microservices communication.
  • Business-Critical Systems: Prioritize predictability and reliability. Azure provides consistent compute via purpose-built VMs and intelligent placement (Virtual Machine Scale Sets, VMSS), tunable storage performance (Ultra Disk and Premium SSD v2, which allow capacity, IOPS, and throughput to be configured independently), and reliable low-latency networking (Accelerated Networking, proximity placement groups). Fast recovery is supported by Instant Access Snapshots and Azure Site Recovery.
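The idea of configuring capacity, IOPS, and throughput independently can be sketched as a small sizing check. Everything here is hypothetical: `DiskSettings`, `shortfalls`, and the numbers are illustrative only, are not Azure limits or prices, and do not correspond to any Azure SDK type.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DiskSettings:
    """Hypothetical model of a disk whose three dimensions are set separately."""
    capacity_gib: int
    iops: int
    throughput_mb_per_s: int


def shortfalls(provisioned, required):
    """Return {dimension: missing amount} for each under-provisioned dimension."""
    gaps = {}
    for name in ("capacity_gib", "iops", "throughput_mb_per_s"):
        missing = getattr(required, name) - getattr(provisioned, name)
        if missing > 0:
            gaps[name] = missing
    return gaps


# Illustrative values, not real service limits.
provisioned = DiskSettings(capacity_gib=512, iops=8000, throughput_mb_per_s=200)
required = DiskSettings(capacity_gib=400, iops=12000, throughput_mb_per_s=180)

gaps = shortfalls(provisioned, required)
print(gaps)  # only the IOPS dimension falls short

# Raise only the dimension that is short; capacity and throughput stay as they were.
tuned = replace(provisioned, iops=required.iops)
print(shortfalls(tuned, required))
```

The design point is that a shortfall in one dimension is fixed by changing that dimension alone, rather than by moving to a larger disk that bundles all three.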

The Coordinated System Approach

The core message is that performance is not achieved by optimizing isolated components. It relies on how compute, storage, and networking are tailored in tandem for specific workload needs. This coordination helps reduce bottlenecks, ensures improvements in one area are reinforced by others, and simplifies operations by allowing teams to focus on workload design rather than low-level infrastructure tuning. Practical guidance emphasizes balancing throughput for AI, horizontal scaling with Kubernetes-native storage for cloud-native apps, and consistency/predictability for business-critical systems.
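One simplified way to see why isolated upgrades fall short is to model a serial data path whose end-to-end throughput is capped by its slowest stage. This is a deliberately coarse model, not a description of how Azure schedules resources; the stage names, the `effective_throughput` helper, and the capacities (in GB/s) are assumptions for illustration.

```python
def effective_throughput(stage_capacity):
    """Return (limiting stage, throughput) for a serial data path.

    Simplifying assumption: data passes through every stage, so the path
    can move data no faster than its slowest stage.
    """
    stage = min(stage_capacity, key=stage_capacity.get)
    return stage, stage_capacity[stage]


# Hypothetical per-stage capacities in GB/s.
baseline = {"compute": 6.0, "storage": 2.0, "network": 4.0}

# Upgrade one component at a time and compare the outcome.
faster_compute = {**baseline, "compute": 12.0}
faster_storage = {**baseline, "storage": 5.0}

for label, config in [
    ("baseline", baseline),
    ("faster compute", faster_compute),
    ("faster storage", faster_storage),
]:
    stage, rate = effective_throughput(config)
    print(f"{label}: {rate} GB/s, limited by {stage}")
```

Under this model, doubling compute changes nothing because storage is still the limit, while upgrading storage helps only until the network becomes the new constraint. Sizing the three together is what moves the overall number.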
