Immutable Infrastructure
Never modify running servers: build new images, deploy, destroy old ones. Golden images, container images, and infrastructure-as-code integration.
The Problem with Mutable Servers
In traditional operations, servers are mutated in place: you SSH in, install packages, update config files, patch the OS. Over time, servers accumulate changes, and two servers that started identically diverge. This phenomenon is called configuration drift, and the servers it produces are sometimes called snowflake servers. It is the root cause of many operational problems: deployments that work on one server but fail on another, production incidents that cannot be reproduced in staging, and manual changes that are never committed to version control.
The Snowflake Server Anti-Pattern
A 'snowflake server' is one that has been manually configured over years to the point where no one knows exactly what is on it, and no one dares replace it. These servers become single points of failure — not because they are unreliable, but because they are irreplaceable. Teams treat them like pets, giving them names and nursing them back to health when they fail.
Immutable Infrastructure: The Core Principle
Immutable infrastructure means: once a server (or container) is deployed, it is never modified. To deploy a new version or change configuration, you build a new image and deploy it alongside or in place of the old one. The old instances are terminated. There is no patching in place, no SSH-ing in to fix things, no `apt-get upgrade` on running servers.
This is already how most teams work with Docker containers — you build a new container image, push it, deploy it, delete the old container. Immutable infrastructure extends this discipline to the full infrastructure layer, including VM images (AMIs on AWS), Kubernetes nodes, and all supporting infrastructure.
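The replace-rather-than-modify cycle can be sketched as a small in-memory model. This is an illustration only: `Instance`, `Fleet`, and the image names are hypothetical, and a real rollout would go through a cloud API or an orchestrator rather than a Python list.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)  # hypothetical instance-id generator


@dataclass(frozen=True)
class Instance:
    """A running server; frozen so its image cannot be changed in place."""
    instance_id: int
    image: str


@dataclass
class Fleet:
    instances: list[Instance] = field(default_factory=list)

    def deploy(self, image: str, size: int) -> list[Instance]:
        """Launch `size` instances from `image` and return the retired ones."""
        new = [Instance(next(_ids), image) for _ in range(size)]
        retired = self.instances
        self.instances = new  # old instances are replaced, never patched
        return retired


fleet = Fleet()
fleet.deploy("my-app-1.0.0", size=3)
retired = fleet.deploy("my-app-1.1.0", size=3)
print({i.image for i in fleet.instances})  # images now serving traffic
print({i.image for i in retired})          # images that were terminated
```

Because `Instance` is frozen, an attempt to assign a new `image` to a running instance raises an error, which mirrors the rule that the only way to change a server is to replace it.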
Golden Images
A golden image (or golden AMI) is a pre-baked machine image that contains the OS, runtime dependencies, application code, and base configuration: everything needed to boot a server that is immediately ready to serve traffic. Because provisioning happens at build time, golden images remove most of the bootstrap time of traditional deployment (no need to run Ansible playbooks or Chef recipes after boot). Golden images are typically built in layers:
- Layer 1 — Base OS image: Hardened OS (Amazon Linux 2023, Ubuntu 22.04), security patches applied at build time
- Layer 2 — Runtime image: Add language runtime (JDK 21, Node 22), system dependencies
- Layer 3 — Application image: Add application JAR/binary, startup scripts
- Layer 4 — Configuration: Either baked into the image (rare) or pulled from an external config store at boot
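The "pulled at boot" option for Layer 4 might look like the sketch below. It uses environment variables as a stand-in for an external config store, and the variable names `DATABASE_URL` and `LOG_LEVEL` are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    database_url: str
    log_level: str


def load_config(env=os.environ) -> AppConfig:
    """Read settings at boot from the environment instead of from the image."""
    try:
        database_url = env["DATABASE_URL"]  # hypothetical variable name
    except KeyError:
        # Fail fast: an instance without its config should not start serving.
        raise RuntimeError("DATABASE_URL must be set before the app starts") from None
    return AppConfig(database_url=database_url, log_level=env.get("LOG_LEVEL", "INFO"))


# The same image can run in staging and production; only the injected values differ.
staging = load_config({"DATABASE_URL": "postgres://staging-db/app"})
production = load_config({"DATABASE_URL": "postgres://prod-db/app", "LOG_LEVEL": "WARNING"})
print(staging.log_level, production.log_level)
```

Keeping environment-specific values out of the image is what lets one image be promoted unchanged from staging to production.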
HashiCorp Packer is a widely used tool for building golden AMIs. It launches a temporary EC2 instance from a source AMI, runs provisioners on it (shell scripts or Ansible), and creates a new AMI from the result. The AMI is then stored in AWS with version metadata and referenced by the launch templates that Auto Scaling groups use.
A Packer template for a golden AMI, in Packer's JSON template format (the source filter below selects an Amazon Linux 2 base image, and `ssh_username` is that image's default login user):
```json
{
  "variables": {
    "app_version": "{{env `APP_VERSION`}}"
  },
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami_filter": {
      "filters": { "name": "amzn2-ami-hvm-*-x86_64-gp2" },
      "owners": ["amazon"],
      "most_recent": true
    },
    "instance_type": "t3.medium",
    "ssh_username": "ec2-user",
    "ami_name": "my-app-{{user `app_version`}}-{{timestamp}}"
  }],
  "provisioners": [
    {
      "type": "shell",
      "scripts": ["scripts/install-deps.sh", "scripts/install-app.sh"]
    }
  ]
}
```
Immutable Infrastructure with Containers
Containers are the most natural expression of immutable infrastructure. The contents of a Docker image cannot change once it is built and pushed: image layers are content-addressed by SHA256 digest, so different contents always produce a different digest. A tag, by contrast, is a movable pointer that can later be reassigned to another image. Kubernetes fits this model: you change the image reference in the Deployment spec, and Kubernetes rolls out new pods from the new image.
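Content addressing can be illustrated in a few lines. This is a simplified sketch: the byte strings stand in for real layer archives, and the `tags` dictionary is a hypothetical stand-in for a registry's tag table.

```python
import hashlib


def digest(content: bytes) -> str:
    """Return a content address in the `sha256:<hex>` form used for image digests."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


layer_v1 = b"app build 1.0.0"  # stand-in bytes; real layers are tar archives
layer_v2 = b"app build 1.0.1"

# The same bytes always give the same address; any change gives a different one.
assert digest(layer_v1) == digest(layer_v1)
assert digest(layer_v1) != digest(layer_v2)

# A tag is only a pointer to a digest, and pointers can be moved.
tags = {"my-app:1.0": digest(layer_v1)}
tags["my-app:1.0"] = digest(layer_v2)  # the tag moved; the old digest still names the old bytes
print(tags["my-app:1.0"] == digest(layer_v2))
```

This is why a reference pinned by digest is reproducible in a way that a reference pinned by tag is not.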
Key practices for truly immutable container infrastructure:
- Never use the `latest` tag in production. Pin to a specific digest or version tag so deployments are reproducible; a digest is the stricter choice because a tag can be repointed.
- Make containers read-only where possible. Set `readOnlyRootFilesystem: true` in the Kubernetes security context to prevent runtime modification.
- Externalize all mutable state. Logs go to stdout/stderr (collected by a sidecar or node agent). Config comes from ConfigMaps or external config stores. Data goes to persistent volumes or external databases.
- Scan images in the CI pipeline. Use Trivy, Snyk, or AWS ECR image scanning to catch vulnerabilities at build time, not at runtime.
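The first two practices lend themselves to an automated check. The sketch below is one possible shape for such a check, not a standard tool: `is_pinned` and `container_spec` are hypothetical helpers, and the registry name is a placeholder. The `securityContext.readOnlyRootFilesystem` field is the Kubernetes setting named above.

```python
import json
import re

# Matches references pinned by digest, e.g. registry/app@sha256:<64 hex chars>.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned(image: str) -> bool:
    """True if the reference uses a digest or an explicit tag other than `latest`."""
    if "@" in image:
        return DIGEST_RE.search(image) is not None
    last_segment = image.rsplit("/", 1)[-1]  # ignore any registry host:port
    _name, sep, tag = last_segment.partition(":")
    return bool(sep) and tag not in ("", "latest")


def container_spec(name: str, image: str) -> dict:
    """Build a container entry with a read-only root filesystem."""
    if not is_pinned(image):
        raise ValueError(f"image reference is not pinned: {image}")
    return {
        "name": name,
        "image": image,
        "securityContext": {"readOnlyRootFilesystem": True},
    }


print(json.dumps(container_spec("my-app", "registry.example.com/my-app:1.4.2"), indent=2))
```

A check like this could run in CI against rendered manifests so that an unpinned image fails the build instead of reaching production.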
Infrastructure as Code: The Enabler
Immutable infrastructure requires infrastructure as code (IaC) to be practical. If you cannot reproduce your infrastructure from code, you cannot safely destroy the old instances. Terraform, AWS CloudFormation, and Pulumi define infrastructure declaratively. Combined with immutable images, this means your entire system — from VPC subnets to application servers — can be rebuilt from source control.
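The declarative idea can be shown with a toy planner that compares desired state with actual state. This is a conceptual sketch, not how Terraform, CloudFormation, or Pulumi are implemented; the `plan` function and the resource names are hypothetical.

```python
def plan(desired: dict[str, str], actual: dict[str, str]) -> list[tuple[str, str]]:
    """Compare desired vs. actual state and return (action, resource) steps.

    Both dicts map a resource name to the image it should run or is running.
    A changed image yields "replace" rather than an in-place update.
    """
    steps = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            steps.append(("create", name))
        elif name not in desired:
            steps.append(("destroy", name))
        elif desired[name] != actual[name]:
            steps.append(("replace", name))
    return steps


desired = {"web-1": "my-app-1.1.0", "web-2": "my-app-1.1.0"}
actual = {"web-1": "my-app-1.0.0", "worker-1": "my-app-1.0.0"}
print(plan(desired, actual))
```

The desired state lives in source control, so running the planner against an empty `actual` describes how to rebuild everything from scratch.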
| Mutable Infrastructure | Immutable Infrastructure |
|---|---|
| SSH in to fix problems | Build new image, deploy, destroy old |
| Configuration drift over time | Every instance from same image — no drift |
| 'Works on my (prod) server' | Reproducible — same image, everywhere |
| Patching is risky and ad-hoc | Patching = rebuild image + rolling deploy |
| Rollback is complex | Rollback = deploy previous image version |
Interview Tip
If asked about immutable infrastructure, connect it to broader operational benefits: reduced MTTR (you replace rather than repair), reproducible environments (staging matches production exactly), and security (patching is a first-class CI/CD operation, not a manual afterthought). Also mention the constraint: stateful data must be externalized — you cannot bake application state into an immutable image. This shows you understand the full picture.