Sidecar for Observability
Attach observability sidecars for logging, metrics, and distributed tracing — covering Fluent Bit, the OpenTelemetry Collector, Envoy, and service mesh integration.
The Sidecar Pattern
The sidecar pattern deploys a co-located helper process alongside the primary application container in the same Kubernetes pod. The sidecar shares the same network namespace, filesystem volumes, and lifecycle as the main container, but runs as a separate process. This allows you to augment application behavior — especially cross-cutting concerns like observability — without modifying application code.
The name comes from the motorcycle sidecar: a separate attachment that rides with the main vehicle, sharing its journey but with its own distinct purpose. In container terms, a pod is the motorcycle, and the sidecar is the helper container attached to it.
The Three Pillars of Observability
Sidecars can implement all three pillars of observability without every service carrying its own collection and shipping logic:
| Pillar | What It Tells You | Sidecar Tool | Backend |
|---|---|---|---|
| Logs | What happened, with full context | Fluentd, Fluent Bit, Logstash | Elasticsearch, Loki, Splunk |
| Metrics | How the system is behaving over time (rates, counts, durations) | Prometheus exporter, OTel Collector, StatsD | Prometheus, Datadog, CloudWatch |
| Traces | How a request flows across services (distributed context) | OTel Collector, Jaeger agent, Zipkin | Jaeger, Tempo, AWS X-Ray, Datadog APM |
Log Collection with Fluent Bit
Fluent Bit is a lightweight log processor commonly used as a sidecar. In the sidecar topology, the application writes structured JSON logs to a file on a volume shared with the Fluent Bit container. In the node-level topology, the application writes to stdout, the kubelet captures stdout into log files on the node, and a Fluent Bit DaemonSet (one instance per node) tails those files. Either way, Fluent Bit applies transformations (parsing, enriching with pod metadata) and ships the result to a central log store.
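A minimal Fluent Bit pipeline for the sidecar case might look like the following sketch. The paths, tag names, and the `ELASTICSEARCH_HOST` variable are illustrative, not prescriptive:

```ini
# fluent-bit.conf — illustrative sidecar pipeline
[SERVICE]
    Flush        1
    Log_Level    info

[INPUT]
    Name         tail
    Path         /var/log/app/*.log
    Parser       json
    Tag          app.*

[FILTER]
    # Enrich each record with the pod name (HOSTNAME is set by Kubernetes)
    Name         record_modifier
    Match        app.*
    Record       pod ${HOSTNAME}

[OUTPUT]
    Name         es
    Match        *
    Host         ${ELASTICSEARCH_HOST}
    Port         9200
    Logstash_Format On
```

The pipeline mirrors the tail → transform → ship flow described above: `tail` reads the shared log file, a filter enriches records, and the `es` output forwards them to Elasticsearch.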
```yaml
# Kubernetes pod with Fluent Bit log sidecar
apiVersion: v1
kind: Pod
metadata:
  name: my-service
spec:
  volumes:
  - name: app-logs
    emptyDir: {}
  containers:
  - name: app
    image: my-service:v2.0
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
    # App writes structured JSON to /var/log/app/service.log
  - name: fluent-bit
    image: fluent/fluent-bit:2.2
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
      readOnly: true
    env:
    - name: ELASTICSEARCH_HOST
      value: "https://logs.internal.example.com"
    resources:
      requests:
        cpu: "50m"
        memory: "64Mi"
      limits:
        cpu: "100m"
        memory: "128Mi"
```
OpenTelemetry Collector as Sidecar
The OpenTelemetry Collector is rapidly becoming the de facto observability sidecar. Applications instrument themselves using the OpenTelemetry SDK and export telemetry (traces, metrics, logs) to the local sidecar OTel Collector via the OTLP protocol. The collector handles all the complexity of fan-out routing, buffering, retry, and format translation to multiple backends.
The OTel Collector pipeline has three stages:
- Receivers: Accept telemetry in various formats (OTLP, Jaeger, Zipkin, Prometheus scrape, StatsD)
- Processors: Transform, sample, filter, and enrich telemetry (e.g., tail-based sampling, attribute enrichment, PII scrubbing)
- Exporters: Send processed telemetry to backends (Jaeger, Prometheus, Datadog, AWS X-Ray, Splunk)
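On the deployment side, the application typically needs nothing beyond the standard OTLP environment variable pointing at the sidecar over localhost. A sketch, with illustrative image and ConfigMap names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-service
spec:
  containers:
  - name: app
    image: my-service:v2.0
    env:
    # Standard OTel SDK env var: export telemetry to the local sidecar
    - name: OTEL_EXPORTER_OTLP_ENDPOINT
      value: "http://localhost:4317"
  - name: otel-collector
    image: otel/opentelemetry-collector-contrib:0.91.0
    args: ["--config=/etc/otel/config.yaml"]
    volumeMounts:
    - name: otel-config
      mountPath: /etc/otel
  volumes:
  - name: otel-config
    configMap:
      name: otel-collector-config
```

This is what decouples development from operations: swapping backends means editing the collector's ConfigMap, never the application.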
```yaml
# OpenTelemetry Collector sidecar config (otel-collector-config.yaml)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
  memory_limiter:
    check_interval: 1s   # required; the processor refuses to start without it
    limit_mib: 100

exporters:
  jaeger:
    endpoint: jaeger-collector.tracing.svc.cluster.local:14250
    tls:
      insecure: true     # plaintext gRPC inside the cluster
  prometheusremotewrite:
    endpoint: http://prometheus.monitoring.svc.cluster.local/api/v1/write
  logging:
    verbosity: normal

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [jaeger]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [prometheusremotewrite]
```
Service Mesh Sidecars: Envoy
Envoy proxy is the sidecar used by Istio and other service meshes. Unlike logging/metrics sidecars, Envoy sits in the network path — all inbound and outbound traffic for the pod flows through Envoy. This gives the service mesh deep visibility and control without any application code changes:
- Distributed tracing: Envoy automatically generates spans for every request and propagates trace context headers (B3, W3C TraceContext) across services
- Metrics: Envoy exports detailed per-service, per-route L7 metrics (request count, success rate, duration histograms) to Prometheus
- Access logs: Structured access logs for every request with upstream service metadata
- mTLS: Mutual TLS between all services — the application never handles certificates
- Traffic management: Retries, timeouts, circuit breaking configured at the mesh level
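One caveat worth naming: Envoy generates spans automatically, but it cannot stitch a request's inbound and outbound spans together unless the application copies the trace context headers from the incoming request onto its outgoing calls. A minimal sketch of that forwarding step (the header list follows the B3 and W3C conventions; the function name is ours):

```python
# Trace context headers that Envoy/Istio expect applications to forward.
TRACE_HEADERS = [
    "x-request-id",
    "traceparent", "tracestate",              # W3C Trace Context
    "x-b3-traceid", "x-b3-spanid",            # B3 (Zipkin-style)
    "x-b3-parentspanid", "x-b3-sampled", "x-b3-flags",
]

def propagate_trace_headers(inbound_headers: dict) -> dict:
    """Return the subset of inbound headers to attach to outbound requests."""
    lowered = {k.lower(): v for k, v in inbound_headers.items()}
    return {h: lowered[h] for h in TRACE_HEADERS if h in lowered}
```

This is the one piece of app-level cooperation a mesh still needs; most HTTP frameworks have middleware that does it for you.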
DaemonSet vs Sidecar for Log Collection
For log collection, you have two topologies: a Fluent Bit sidecar per pod (fine-grained control, more resource overhead) or a Fluent Bit DaemonSet with one collector per node (more efficient, less flexible). The DaemonSet pattern is preferred for homogeneous log formats. Use sidecar injection when pods have different log destinations or when you need log processing isolation between tenants.
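For comparison, the node-level topology can be sketched as a DaemonSet that tails the kubelet's log directory on each node (names and namespace are illustrative):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:2.2
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log   # node's container log files live under here
```

One collector per node replaces one sidecar per pod, which is where the resource savings come from.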
Resource Overhead and Sidecar Injection
Sidecars are not free. Each sidecar container consumes CPU and memory resources. In a cluster with thousands of pods, the aggregate sidecar overhead can be significant. Design guidelines:
- Set resource requests and limits on sidecars. A Fluent Bit sidecar typically needs 50m CPU and 64Mi memory.
- Use sidecar injection via Kubernetes admission webhooks (as Istio does) so teams don't manually add sidecar specs to every pod manifest.
- Consider consolidating at the node level (DaemonSet) for high-density workloads.
- Kubernetes 1.29+ supports native sidecar containers (init containers with `restartPolicy: Always`) which have better lifecycle semantics than regular containers — they start before main containers and terminate after them.
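The native sidecar shape looks like this, reusing the Fluent Bit example from earlier:

```yaml
spec:
  initContainers:
  - name: fluent-bit
    image: fluent/fluent-bit:2.2
    restartPolicy: Always   # marks this init container as a native sidecar
  containers:
  - name: app
    image: my-service:v2.0
```

Because the sidecar starts before `app` and terminates after it, logs emitted during startup and shutdown are no longer lost — a long-standing gap with the regular-container approach.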
Interview Tip
When discussing observability in a system design interview, mention the sidecar/OTel Collector pattern as the modern approach to telemetry collection. Key points: (1) application code only talks to the local OTLP endpoint — it doesn't need to know about backends; (2) the collector handles fan-out, sampling, and format translation; (3) this separates operational concerns (where do traces go?) from development concerns (how do I instrument my code?). Also mention that service meshes like Istio give you L7 metrics and traces for free without any app instrumentation.