Sidecar for Observability
Attach observability sidecars for logging, metrics, and distributed tracing — covering Fluent Bit, the OpenTelemetry Collector, Envoy, and service mesh integration.
The Sidecar Pattern
The sidecar pattern deploys a co-located helper process alongside the primary application container in the same Kubernetes pod. The sidecar shares the same network namespace, filesystem volumes, and lifecycle as the main container, but runs as a separate process. This allows you to augment application behavior — especially cross-cutting concerns like observability — without modifying application code.
The name comes from the motorcycle sidecar: a separate attachment that rides with the main vehicle, sharing its journey but with its own distinct purpose. In container terms, a pod is the motorcycle, and the sidecar is the helper container attached to it.
The Three Pillars of Observability
Sidecars can implement all three pillars of observability without every service carrying its own collection and shipping logic:
| Pillar | What It Tells You | Sidecar Tool | Backend |
|---|---|---|---|
| Logs | What happened, with full context | Fluentd, Fluent Bit, Logstash | Elasticsearch, Loki, Splunk |
| Metrics | How the system is behaving over time (rates, counts, durations) | Prometheus exporter, OTel Collector, StatsD | Prometheus, Datadog, CloudWatch |
| Traces | How a request flows across services (distributed context) | OTel Collector, Jaeger agent, Zipkin | Jaeger, Tempo, AWS X-Ray, Datadog APM |
Log Collection with Fluent Bit
Fluent Bit is a lightweight log processor commonly used as a sidecar. In the sidecar topology, the application writes structured JSON logs to a file on a volume shared with the Fluent Bit container. In the node-level topology, the application writes to stdout, the kubelet captures stdout into log files on the node, and a Fluent Bit DaemonSet (one instance per node) tails those files. Either way, Fluent Bit applies transformations (parsing, enriching with pod metadata) and ships the result to a central log store.
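A minimal Fluent Bit pipeline for the sidecar case might look like the following sketch. The paths, tag names, and the `ELASTICSEARCH_HOST` variable are illustrative, not prescriptive:

```ini
# fluent-bit.conf — illustrative sidecar pipeline
[SERVICE]
    Flush        1
    Log_Level    info

[INPUT]
    Name         tail
    Path         /var/log/app/*.log
    Parser       json
    Tag          app.*

[FILTER]
    # Enrich each record with the pod name (HOSTNAME is set by Kubernetes)
    Name         record_modifier
    Match        app.*
    Record       pod ${HOSTNAME}

[OUTPUT]
    Name         es
    Match        *
    Host         ${ELASTICSEARCH_HOST}
    Port         9200
    Logstash_Format On
```

The pipeline mirrors the tail → transform → ship flow described above: `tail` reads the shared log file, a filter enriches records, and the `es` output forwards them to Elasticsearch.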
```yaml
# Kubernetes pod with Fluent Bit log sidecar
apiVersion: v1
kind: Pod
metadata:
  name: my-service
spec:
  volumes:
  - name: app-logs
    emptyDir: {}
  containers:
  - name: app
    image: my-service:v2.0
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
    # App writes structured JSON to /var/log/app/service.log
  - name: fluent-bit
    image: fluent/fluent-bit:2.2
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
      readOnly: true
    env:
    - name: ELASTICSEARCH_HOST
      value: "https://logs.internal.example.com"
    resources:
      requests:
        cpu: "50m"
        memory: "64Mi"
      limits:
        cpu: "100m"
        memory: "128Mi"
```
OpenTelemetry Collector as Sidecar
The OpenTelemetry Collector is rapidly becoming the de facto observability sidecar. Applications instrument themselves using the OpenTelemetry SDK and export telemetry (traces, metrics, logs) to the local sidecar OTel Collector via the OTLP protocol. The collector handles all the complexity of fan-out routing, buffering, retry, and format translation to multiple backends.
The OTel Collector pipeline has three stages:
- Receivers: Accept telemetry in various formats (OTLP, Jaeger, Zipkin, Prometheus scrape, StatsD)
- Processors: Transform, sample, filter, and enrich telemetry (e.g., tail-based sampling, attribute enrichment, PII scrubbing)
- Exporters: Send processed telemetry to backends (Jaeger, Prometheus, Datadog, AWS X-Ray, Splunk)
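On the deployment side, the application typically needs nothing beyond the standard OTLP environment variable pointing at the sidecar over localhost. A sketch, with illustrative image and ConfigMap names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-service
spec:
  containers:
  - name: app
    image: my-service:v2.0
    env:
    # Standard OTel SDK env var: export telemetry to the local sidecar
    - name: OTEL_EXPORTER_OTLP_ENDPOINT
      value: "http://localhost:4317"
  - name: otel-collector
    image: otel/opentelemetry-collector-contrib:0.91.0
    args: ["--config=/etc/otel/config.yaml"]
    volumeMounts:
    - name: otel-config
      mountPath: /etc/otel
  volumes:
  - name: otel-config
    configMap:
      name: otel-collector-config
```

This is what decouples development from operations: swapping backends means editing the collector's ConfigMap, never the application.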
```yaml
# OpenTelemetry Collector sidecar config (otel-collector-config.yaml)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
  memory_limiter:
    check_interval: 1s   # required; the processor refuses to start without it
    limit_mib: 100

exporters:
  jaeger:
    endpoint: jaeger-collector.tracing.svc.cluster.local:14250
    tls:
      insecure: true     # plaintext gRPC inside the cluster
  prometheusremotewrite:
    endpoint: http://prometheus.monitoring.svc.cluster.local/api/v1/write
  logging:
    verbosity: normal

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [jaeger]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [prometheusremotewrite]
```
Service Mesh Sidecars: Envoy
Envoy proxy is the sidecar used by Istio and other service meshes. Unlike logging/metrics sidecars, Envoy sits in the network path — all inbound and outbound traffic for the pod flows through Envoy. This gives the service mesh deep visibility and control without any application code changes:
- Distributed tracing: Envoy automatically generates spans for every request and propagates trace context headers (B3, W3C TraceContext) across services
- Metrics: Envoy exports detailed per-service, per-route L7 metrics (request count, success rate, duration histograms) to Prometheus
- Access logs: Structured access logs for every request with upstream service metadata
- mTLS: Mutual TLS between all services — the application never handles certificates
- Traffic management: Retries, timeouts, circuit breaking configured at the mesh level
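One caveat worth naming: Envoy generates spans automatically, but it cannot stitch a request's inbound and outbound spans together unless the application copies the trace context headers from the incoming request onto its outgoing calls. A minimal sketch of that forwarding step (the header list follows the B3 and W3C conventions; the function name is ours):

```python
# Trace context headers that Envoy/Istio expect applications to forward.
TRACE_HEADERS = [
    "x-request-id",
    "traceparent", "tracestate",              # W3C Trace Context
    "x-b3-traceid", "x-b3-spanid",            # B3 (Zipkin-style)
    "x-b3-parentspanid", "x-b3-sampled", "x-b3-flags",
]

def propagate_trace_headers(inbound_headers: dict) -> dict:
    """Return the subset of inbound headers to attach to outbound requests."""
    lowered = {k.lower(): v for k, v in inbound_headers.items()}
    return {h: lowered[h] for h in TRACE_HEADERS if h in lowered}
```

This is the one piece of app-level cooperation a mesh still needs; most HTTP frameworks have middleware that does it for you.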
DaemonSet vs Sidecar for Log Collection
For log collection, you have two topologies: a Fluent Bit sidecar per pod (fine-grained control, more resource overhead) or a Fluent Bit DaemonSet with one collector per node (more efficient, less flexible). The DaemonSet pattern is preferred for homogeneous log formats. Use sidecar injection when pods have different log destinations or when you need log processing isolation between tenants.
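For comparison, the node-level topology can be sketched as a DaemonSet that tails the kubelet's log directory on each node (names and namespace are illustrative):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:2.2
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log   # node's container log files live under here
```

One collector per node replaces one sidecar per pod, which is where the resource savings come from.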
Resource Overhead and Sidecar Injection
Sidecars are not free. Each sidecar container consumes CPU and memory resources. In a cluster with thousands of pods, the aggregate sidecar overhead can be significant. Design guidelines:
- Set resource requests and limits on sidecars. A Fluent Bit sidecar typically needs 50m CPU and 64Mi memory.
- Use sidecar injection via Kubernetes admission webhooks (as Istio does) so teams don't manually add sidecar specs to every pod manifest.
- Consider consolidating at the node level (DaemonSet) for high-density workloads.
- Kubernetes 1.29+ supports native sidecar containers (init containers with `restartPolicy: Always`) which have better lifecycle semantics than regular containers — they start before main containers and terminate after them.
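The native sidecar shape looks like this, reusing the Fluent Bit example from earlier:

```yaml
spec:
  initContainers:
  - name: fluent-bit
    image: fluent/fluent-bit:2.2
    restartPolicy: Always   # marks this init container as a native sidecar
  containers:
  - name: app
    image: my-service:v2.0
```

Because the sidecar starts before `app` and terminates after it, logs emitted during startup and shutdown are no longer lost — a long-standing gap with the regular-container approach.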
Interview Tip
When discussing observability in a system design interview, mention the sidecar/OTel Collector pattern as the modern approach to telemetry collection. Key points: (1) application code only talks to the local OTLP endpoint — it doesn't need to know about backends; (2) the collector handles fan-out, sampling, and format translation; (3) this separates operational concerns (where do traces go?) from development concerns (how do I instrument my code?). Also mention that service meshes like Istio give you L7 metrics and traces for free without any app instrumentation.