
Kubernetes Orchestration

Kubernetes architecture: pods, services, deployments, StatefulSets, ConfigMaps, autoscaling, and the control plane. What problems K8s solves.

20 min read · High interview weight

Why Kubernetes?

Running containers in production means answering: Which node has enough memory? How do I restart a crashed container? How do I roll out a new version without downtime? How do I scale under load? Kubernetes (K8s) is an open-source container orchestrator that answers all of these. It treats your cluster of machines as a single pool of compute and automatically places, heals, scales, and connects your containers.

Control Plane vs Data Plane

[Diagram: Kubernetes cluster architecture. The control plane manages state; worker nodes run workloads.]
| Component | Location | Role |
| --- | --- | --- |
| kube-apiserver | Control plane | The REST API gateway; all kubectl and internal calls go through it |
| etcd | Control plane | Distributed key-value store that holds all cluster state |
| kube-scheduler | Control plane | Assigns unscheduled Pods to nodes based on resources and constraints |
| kube-controller-manager | Control plane | Runs reconciliation loops (ReplicaSet, Node, Endpoint controllers) |
| kubelet | Each worker node | Ensures containers declared for a node are running and healthy |
| kube-proxy | Each worker node | Maintains iptables/IPVS rules for Service virtual IP routing |

Core Workload Resources

A Pod is the atomic unit of scheduling: one or more containers that share a network namespace and storage. You almost never create Pods directly. Instead you use higher-level controllers:

  • Deployment — Manages stateless Pods. Handles rolling updates, rollbacks, and replica scaling. Use for web servers, APIs, workers.
  • StatefulSet — Provides stable network identities (`pod-0`, `pod-1`) and per-Pod persistent volumes. Use for databases (Cassandra, Kafka, Redis).
  • DaemonSet — Ensures exactly one Pod runs on every (or selected) node. Use for node-level agents: log shippers, metric collectors, CNI plugins.
  • Job / CronJob — Run-to-completion workloads. CronJob schedules them on a cron expression.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # one extra Pod during rollout
      maxUnavailable: 0  # never drop below desired count
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api
          image: myregistry/api:v2.1.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:
              cpu: "1000m"
              memory: "512Mi"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            periodSeconds: 5
```
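For comparison, a StatefulSet manifest has the same overall shape but adds a headless Service name and per-Pod volume claims. This is an illustrative sketch only; the names (`web-db`, `db-headless`), the Postgres image, and the storage size are assumptions, not part of the lesson:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web-db
spec:
  serviceName: db-headless   # headless Service that gives each Pod a stable DNS name
  replicas: 3
  selector:
    matchLabels:
      app: web-db
  template:
    metadata:
      labels:
        app: web-db
    spec:
      containers:
        - name: db
          image: postgres:16   # illustrative; a real Postgres also needs env such as POSTGRES_PASSWORD
          ports:
            - containerPort: 5432
  volumeClaimTemplates:        # one PersistentVolumeClaim per Pod (web-db-0, web-db-1, ...)
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

Unlike a Deployment, deleting and recreating Pod `web-db-1` reattaches the same volume and DNS identity, which is what databases rely on.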

Services and Ingress

A Service provides a stable virtual IP (ClusterIP) and DNS name in front of a dynamic set of Pods selected by labels. Kubernetes automatically updates the endpoint list as Pods come and go. Service types:

| Service type | Accessibility | Use case |
| --- | --- | --- |
| ClusterIP (default) | Internal cluster only | Service-to-service communication |
| NodePort | External via node IP + port | Development, bare-metal clusters |
| LoadBalancer | External via cloud load balancer | Production external traffic |
| ExternalName | DNS alias to an external service | Integrating external databases |
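As a sketch, a ClusterIP Service fronting the `api-server` Deployment from earlier could look like this; the Service port of 80 is an illustrative choice:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-server
  namespace: production
spec:
  type: ClusterIP          # the default; omitting `type` has the same effect
  selector:
    app: api-server        # matches the Deployment's Pod labels
  ports:
    - port: 80             # stable Service port behind the virtual IP and DNS name
      targetPort: 8080     # containerPort on the selected Pods
```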

An Ingress resource provides HTTP/HTTPS routing (path-based, host-based) in front of multiple Services via an Ingress controller (NGINX, AWS ALB, Traefik). It consolidates external access instead of provisioning one cloud load balancer per Service.
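A minimal Ingress sketch for host-based routing might look like the following. It assumes an NGINX Ingress controller is installed and that a Service named `api-server` exposes port 80 in the namespace; the hostname `api.example.com` is purely illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: production
spec:
  ingressClassName: nginx        # selects which Ingress controller handles this resource
  rules:
    - host: api.example.com      # illustrative hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-server # assumed Service in the same namespace
                port:
                  number: 80
```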

Configuration and Secrets

ConfigMaps hold non-sensitive configuration (feature flags, service URLs). Secrets hold sensitive data (passwords, API keys) — stored in etcd, base64-encoded (not encrypted by default; use etcd encryption at rest or an external vault like HashiCorp Vault). Both are injected into Pods as environment variables or mounted as files.
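To make the injection mechanics concrete, here is a sketch (the names `api-config` and `config-demo` and the data keys are illustrative) showing one ConfigMap consumed both as environment variables and as mounted files:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
data:
  FEATURE_FLAGS: "new-checkout=true"
  BILLING_URL: "http://billing.production.svc.cluster.local"
---
apiVersion: v1
kind: Pod
metadata:
  name: config-demo
spec:
  containers:
    - name: api
      image: myregistry/api:v2.1.0
      envFrom:
        - configMapRef:
            name: api-config       # every key becomes an environment variable
      volumeMounts:
        - name: config-files
          mountPath: /etc/config   # keys appear as files under this path
  volumes:
    - name: config-files
      configMap:
        name: api-config
```

A Secret is consumed the same way, via `secretRef` / `secretKeyRef` or a `secret` volume.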

⚠️

Secrets are not truly secret without extra steps

Base64 encoding is not encryption. Enable etcd encryption at rest, use RBAC to restrict Secret access, and consider tools like sealed-secrets or HashiCorp Vault with the vault-agent-injector for production-grade secret management.

Autoscaling

Kubernetes provides three autoscaling dimensions. Horizontal Pod Autoscaler (HPA) scales the replica count of a Deployment based on CPU utilization or custom metrics from Prometheus. Vertical Pod Autoscaler (VPA) adjusts resource requests/limits based on actual usage. Cluster Autoscaler adds or removes worker nodes based on pending Pods and idle nodes — it works with cloud provider APIs (AWS, GCP, Azure).
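An HPA targeting the earlier `api-server` Deployment can be sketched with the `autoscaling/v2` API; the 70% CPU target and the replica bounds are illustrative choices:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU usage exceeds 70% of requests
```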

💡

Interview Tip

Interviewers often ask how you'd handle a traffic spike. Walk through the autoscaling chain: HPA detects high CPU → scales Pods → if no node capacity, Cluster Autoscaler provisions a new node → new Pods schedule and become ready → traffic is balanced. Mention that HPA lag (scrape interval + scale cooldown) means pre-warming or KEDA (event-driven scaling) may be needed for sudden spikes.
