Dev.to #architecture·April 3, 2026

Building Robust Observability with OpenTelemetry and ADOT Collector

This article details a system's evolution from a lack of observability in v1 to a robust, integrated solution in v2. It highlights the architectural decision to treat observability as core infrastructure from day one, using OpenTelemetry for traces, metrics, and logs, and the AWS Distro for OpenTelemetry (ADOT) collector for vendor-agnostic export to CloudWatch. Key takeaways include the importance of proper SDK initialization and selective instrumentation for effective noise reduction.


The article recounts a common anti-pattern: neglecting observability until production incidents make its absence painfully clear. In v1, the system suffered from unstructured logs, a lack of correlation IDs, inconsistent log levels, and manual metric checks, making debugging a guessing game. This illustrates the critical importance of integrating observability early in the system design process.

Observability as Core Infrastructure

A pivotal architectural shift in v2 was establishing observability as a first-class concern, integrated *before* core features. This ensures that debugging and monitoring capabilities are baked into the system's foundation. The chosen stack leverages OpenTelemetry (OTEL) for its vendor neutrality and comprehensive support for traces, metrics, and logs.

Key Observability Pillars

For a resilient system, ensure you have these three pillars covered:

- Traces: To follow requests across distributed services.
- Metrics: For numerical data on system performance and health.
- Logs: For detailed event records within services.
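One way the pillars reinforce each other is to stamp every log line with the identifiers of the active trace, so a log entry can be joined to the request that produced it. The sketch below is illustrative only: `formatLog`, the `LogRecord` shape, and the field names are assumptions, not taken from the article, and the IDs are passed in by hand where a real setup would read them from the active span.

```typescript
type Level = "debug" | "info" | "warn" | "error";
type Attributes = Record<string, string | number | boolean>;

// Identifiers of the span that was active when the log was written.
interface TraceContext {
  traceId: string; // 32 hex characters in W3C Trace Context
  spanId: string; // 16 hex characters in W3C Trace Context
}

// Hypothetical structured log shape; field names are illustrative.
interface LogRecord {
  timestamp: string;
  level: Level;
  message: string;
  traceId: string;
  spanId: string;
  attributes: Attributes;
}

// Serialises one log event as a single JSON line that carries the trace IDs.
function formatLog(
  level: Level,
  message: string,
  ctx: TraceContext,
  attributes: Attributes = {},
  now: Date = new Date(),
): string {
  const record: LogRecord = {
    timestamp: now.toISOString(),
    level,
    message,
    traceId: ctx.traceId,
    spanId: ctx.spanId,
    attributes,
  };
  return JSON.stringify(record);
}

const exampleLine = formatLog(
  "error",
  "payment failed",
  { traceId: "4bf92f3577b34da6a3ce929d0e0e4736", spanId: "00f067aa0ba902b7" },
  { orderId: 42 },
);
console.log(exampleLine);
```

Because each line is valid JSON with a fixed set of keys, a log backend can filter on `traceId` to pull up every log written during one request.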

OpenTelemetry Architecture with ADOT Collector

In this architecture, applications instrumented with OpenTelemetry SDKs export data via OTLP (OpenTelemetry Protocol, here over gRPC) to an ADOT (AWS Distro for OpenTelemetry) Collector. The collector, often run as a sidecar, acts as an intermediary that processes the data and exports it to AWS backends such as CloudWatch Logs, CloudWatch Metrics, and X-Ray. This design decouples the application from vendor-specific APIs, which improves portability.

    App (OTEL SDK)
          ↓ OTLP (gRPC)
    ADOT Collector (sidecar)
          ↓
    CloudWatch (Logs, Metrics, X-Ray)
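The decoupling shows up in how little the application needs to know: only where the collector listens. A minimal sketch, assuming the endpoint is supplied through the standard `OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_SERVICE_NAME` environment variables. The `resolveExporterSettings` helper and the `example-service` fallback are hypothetical; `http://localhost:4317` is the conventional OTLP/gRPC address a sidecar collector would expose.

```typescript
interface ExporterSettings {
  endpoint: string; // where the OTLP exporter sends data
  serviceName: string; // how this app identifies itself in telemetry
}

// Hypothetical helper: derives exporter settings from environment variables.
// The environment is passed in as a plain object so the function stays pure.
function resolveExporterSettings(
  env: Record<string, string | undefined>,
): ExporterSettings {
  return {
    // Fall back to the conventional local OTLP/gRPC port when nothing is set.
    endpoint: env["OTEL_EXPORTER_OTLP_ENDPOINT"] || "http://localhost:4317",
    // "example-service" is an illustrative placeholder, not a real default.
    serviceName: env["OTEL_SERVICE_NAME"] || "example-service",
  };
}

// Sidecar case: no endpoint configured, so the local collector is assumed.
console.log(resolveExporterSettings({ OTEL_SERVICE_NAME: "orders-api" }));

// Remote collector case: only configuration changes, not application code.
console.log(
  resolveExporterSettings({
    OTEL_EXPORTER_OTLP_ENDPOINT: "http://collector.internal:4317",
    OTEL_SERVICE_NAME: "orders-api",
  }),
);
```

Nothing in this code refers to CloudWatch or X-Ray. Under this design, switching backends would be a change to the collector's configuration rather than to the application.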

Critical Initialization Order

A common pitfall with auto-instrumentation is the order of operations. OpenTelemetry SDKs must be initialized *before* the application's main code loads to correctly patch modules and capture telemetry. The article demonstrates this with Node.js, using a `--require` flag to load an instrumentation file first, ensuring the SDK is active before frameworks like NestJS bootstrap.
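Why the order matters can be seen in a toy model of module patching. This is not OpenTelemetry's actual implementation: `httpModule`, `instrument`, and `recordedSpans` are hypothetical stand-ins that show what happens when code takes a reference to a function before it is wrapped.

```typescript
type RequestFn = (url: string) => string;

// Stand-in for a library module that instrumentation wants to wrap.
const httpModule: { request: RequestFn } = {
  request: (url) => `response from ${url}`,
};

const recordedSpans: string[] = [];

// Stand-in for auto-instrumentation: replaces the function with a wrapper
// that records a span and then calls the original.
function instrument(mod: { request: RequestFn }): void {
  const original = mod.request;
  mod.request = (url) => {
    recordedSpans.push(`span: GET ${url}`);
    return original(url);
  };
}

// Application code that loaded BEFORE instrumentation holds the raw function.
const earlyRequest = httpModule.request;

instrument(httpModule);

// Application code that loaded AFTER instrumentation holds the wrapper.
const lateRequest = httpModule.request;

earlyRequest("/orders"); // bypasses the wrapper, so no span is recorded
lateRequest("/users"); // goes through the wrapper, so a span is recorded

console.log(recordedSpans); // [ 'span: GET /users' ]
```

Loading the instrumentation file through Node's `--require` flag, for example `node --require ./instrumentation.js main.js` (file names here are placeholders), puts every application import in the "late" position.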

SDK Configuration and Noise Reduction

Effective observability also involves intelligent configuration. The article highlights decisions made within the `createOtelSDK` factory, such as using `BatchSpanProcessor` for efficiency and selectively disabling noisy instrumentations (e.g., file system I/O) or ignoring specific requests (e.g., health checks) to prevent data overload and focus on relevant telemetry.
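The noise-reduction choices can be sketched as a plain configuration object. The keys below follow the shape accepted by `getNodeAutoInstrumentations` from `@opentelemetry/auto-instrumentations-node`, and `ignoreIncomingRequestHook` is an option of the HTTP instrumentation. Whether the article's `createOtelSDK` factory is wired exactly this way is an assumption, and the ignored paths and the `IncomingRequestLike` type are hypothetical.

```typescript
// Minimal stand-in for Node's http.IncomingMessage; only `url` is needed here.
interface IncomingRequestLike {
  url?: string;
}

// Hypothetical list of endpoints that should never produce spans.
const IGNORED_PATHS: string[] = ["/health", "/ready"];

// Returns true when the request targets a path whose traces are pure noise.
function shouldIgnoreRequest(req: IncomingRequestLike): boolean {
  // Strip the query string so "/health?probe=1" still matches "/health".
  const [path = ""] = (req.url ?? "").split("?");
  return IGNORED_PATHS.some((ignored) => ignored === path);
}

// Per-instrumentation overrides, keyed by instrumentation package name.
const instrumentationOverrides = {
  // File system spans tend to be high-volume and low-value.
  "@opentelemetry/instrumentation-fs": { enabled: false },
  // Skip tracing for health checks and similar probes.
  "@opentelemetry/instrumentation-http": {
    ignoreIncomingRequestHook: shouldIgnoreRequest,
  },
};

console.log(shouldIgnoreRequest({ url: "/health?probe=1" })); // true
console.log(shouldIgnoreRequest({ url: "/orders" })); // false
console.log(Object.keys(instrumentationOverrides));
```

Keeping the predicate as a standalone function makes the filtering rule easy to unit test without starting the SDK.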

observability, OpenTelemetry, ADOT Collector, monitoring, tracing, logging, metrics, AWS CloudWatch
