AWS Architecture Blog·May 13, 2026

Push-Based Observability for CloudWatch Metrics to VPC-based OpenTelemetry Collectors

This article outlines a robust push-based observability architecture using AWS services to stream CloudWatch metrics to self-hosted OpenTelemetry collectors within a private VPC. It addresses common challenges of traditional pull-based monitoring at scale, such as API throttling and vendor lock-in, by leveraging CloudWatch Metric Streams, Amazon Kinesis Data Firehose, and AWS Lambda for real-time, cost-efficient data delivery.


Organizations are increasingly adopting open-source observability frameworks like OpenTelemetry to reduce licensing costs and avoid vendor lock-in. This approach offers significant benefits for enterprises seeking to achieve sub-minute latency for real-time alerting and consolidate observability data from various sources. While CloudWatch Metric Streams natively support OpenTelemetry endpoints, self-hosting collectors within a Virtual Private Cloud (VPC) presents a connectivity challenge that requires an intermediary solution.

Push vs. Pull Monitoring Architectures

The article highlights the drawbacks that traditional pull-based monitoring, exemplified by Prometheus, runs into at scale. Frequent API polling can drive up costs and trigger API throttling, which leads to metric loss and gaps in observability data, so real-time alerting requirements are not met. A push-based architecture, in which metrics are actively sent to collectors, offers substantial advantages, especially for event-driven systems that need near real-time data.

  • Event-driven architecture: Data transmission is triggered by events, enabling near real-time collection.
  • Cost efficiency: Reduces computational overhead and data transfer by processing and transmitting data only when relevant events occur.
  • Scalability: OpenTelemetry collectors can scale horizontally to handle varying traffic volumes, providing at-least-once delivery guarantees.
  • No licensing costs and vendor neutrality: OpenTelemetry is open source under the Apache 2.0 license, which eliminates licensing fees, avoids vendor lock-in, and leaves the choice of observability backend open.
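To make the polling pressure of the pull model concrete, here is a back-of-the-envelope sketch. The function name and the example numbers are illustrative, not from the article, and the default batch size of 500 metrics per request is an assumption based on the commonly cited `GetMetricData` limit; adjust it to whatever API and quota apply in your account.

```python
import math


def polling_calls_per_day(metric_count: int, interval_seconds: int,
                          metrics_per_call: int = 500) -> int:
    """Estimate how many API calls a poller makes per day.

    metrics_per_call is an assumed batch limit (hypothetical default).
    """
    # Each polling cycle needs enough calls to cover every metric once.
    calls_per_poll = math.ceil(metric_count / metrics_per_call)
    # Number of polling cycles in 24 hours at the given interval.
    polls_per_day = 86_400 // interval_seconds
    return calls_per_poll * polls_per_day


if __name__ == "__main__":
    # 20,000 metrics polled once a minute: 40 calls per cycle, 1,440 cycles.
    print(polling_calls_per_day(20_000, 60))  # prints 57600
```

Halving the interval doubles the call count, which is why sub-minute polling runs into throttling long before a push pipeline does.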

Solution Architecture for VPC-based Collectors

The proposed solution addresses the challenge of streaming CloudWatch metrics to OpenTelemetry collectors that run in a private VPC. Amazon Kinesis Data Firehose supports HTTP endpoint destinations, but only public ones, so an intermediary AWS Lambda function bridges the gap between Firehose and the private collector. This architecture meets strict data privacy requirements by keeping the metric data and the OpenTelemetry collector within the customer's VPC.
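A minimal sketch of what the bridging Lambda could look like, assuming the standard Firehose data-transformation contract (base64-encoded `data` per record, and a response that echoes each `recordId` with a `result`) and newline-delimited JSON metrics inside each record. The names `make_handler`, `parse_metric_lines`, and `forward` are hypothetical; the article does not show its function code.

```python
import base64
import json
from typing import Callable, Dict, List


def parse_metric_lines(data_b64: str) -> List[dict]:
    """Decode one Firehose record into a list of metric dicts.

    Assumes the record holds newline-delimited JSON objects.
    """
    text = base64.b64decode(data_b64).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def make_handler(forward: Callable[[List[dict]], None]):
    """Build a Lambda handler that pushes each record's metrics via `forward`.

    `forward` is a hypothetical callable that delivers metrics to the
    collector; injecting it keeps the handler testable without a network.
    """
    def handler(event: dict, context: object = None) -> Dict[str, list]:
        results = []
        for record in event["records"]:
            try:
                forward(parse_metric_lines(record["data"]))
                result = "Ok"
            except Exception:
                # Report the failure to Firehose instead of raising, so one
                # bad record does not fail the whole batch.
                result = "ProcessingFailed"
            results.append({
                "recordId": record["recordId"],
                "result": result,
                "data": record["data"],  # returned unchanged
            })
        return {"records": results}

    return handler
```

Returning every `recordId` exactly once matters: Firehose matches the response to the batch by ID, so the handler records a per-record result rather than stopping at the first error.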


Key Components and Their Roles

The architecture comprises:

  1. CloudWatch Metric Streams: Streams metrics in near real-time, configured to output in JSON format to Firehose.
  2. Amazon Kinesis Data Firehose: A fully managed service for reliable real-time data capture, transformation, and delivery.
  3. AWS Lambda Transform Function: Invoked synchronously by Firehose to push metrics securely through an internal Network Load Balancer (NLB) to the VPC-based collector. This function preprocesses and filters data as needed.
  4. OpenTelemetry Collector (on EC2): Runs as a container on EC2 instances within a private subnet. It acts as a central hub that receives, processes, and forwards telemetry data (via its receivers, processors, and exporters) to various backends, such as Amazon Managed Service for Prometheus, AWS X-Ray, and Amazon CloudWatch.
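As an illustration of the preprocessing step, the sketch below maps metric records in the Metric Streams JSON layout (fields such as `namespace`, `metric_name`, `dimensions`, a millisecond `timestamp`, and a `value` object with `min`, `max`, `sum`, `count`) onto an OTLP/JSON metrics payload. This is one possible mapping, not the article's: representing each datapoint as an OTLP summary with min and max as the 0.0 and 1.0 quantiles, the scope name, and the `aws.namespace` attribute key are all assumptions made for the example.

```python
from typing import Dict, List


def _attr(key: str, value: str) -> dict:
    """Build one OTLP/JSON string attribute."""
    return {"key": key, "value": {"stringValue": str(value)}}


def to_otlp_payload(metrics: List[dict]) -> Dict[str, list]:
    """Convert CloudWatch-style metric dicts into an OTLP/JSON payload."""
    otlp_metrics = []
    for m in metrics:
        stats = m["value"]
        attributes = [
            _attr("cloud.account.id", m["account_id"]),
            _attr("cloud.region", m["region"]),
            _attr("aws.namespace", m["namespace"]),  # hypothetical key name
        ]
        attributes += [_attr(k, v) for k, v in m.get("dimensions", {}).items()]
        point = {
            # Source timestamp is in milliseconds; OTLP wants nanoseconds.
            # 64-bit integers are written as strings in OTLP/JSON.
            "timeUnixNano": str(int(m["timestamp"]) * 1_000_000),
            "count": str(int(stats["count"])),
            "sum": stats["sum"],
            "quantileValues": [
                {"quantile": 0.0, "value": stats["min"]},
                {"quantile": 1.0, "value": stats["max"]},
            ],
            "attributes": attributes,
        }
        otlp_metrics.append({
            "name": m["metric_name"],
            "unit": m.get("unit", ""),
            "summary": {"dataPoints": [point]},
        })
    return {
        "resourceMetrics": [{
            "resource": {"attributes": []},
            "scopeMetrics": [{
                "scope": {"name": "cloudwatch-metric-stream-bridge"},
                "metrics": otlp_metrics,
            }],
        }]
    }
```

Filtering (for example, dropping namespaces the backend does not need) would fit naturally as a list comprehension over `metrics` before the loop.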

An internal Network Load Balancer (NLB) is crucial for distributing TCP traffic to the OpenTelemetry collectors running on EC2 instances. This setup provides a scalable and secure way to ingest metrics into a customer's private observability infrastructure, allowing aggregation of metrics from diverse sources into a single pane of glass.
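The hop from the Lambda function to the collector through the NLB can be sketched as a plain HTTP POST. This assumes the collector exposes an OTLP/HTTP receiver on its conventional port 4318 and path `/v1/metrics`, that the NLB listener forwards that port, and that the function is attached to the VPC so it can resolve and reach the internal NLB. The DNS name in the usage note is a placeholder, and plain HTTP is used only for brevity; the article does not specify the listener protocol.

```python
import json
import urllib.request


def build_collector_request(nlb_dns_name: str, payload: dict,
                            port: int = 4318) -> urllib.request.Request:
    """Build an OTLP/HTTP JSON POST aimed at the collector behind the NLB."""
    url = f"http://{nlb_dns_name}:{port}/v1/metrics"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_to_collector(request: urllib.request.Request,
                      timeout_seconds: float = 5.0) -> int:
    """Send the request and return the HTTP status code.

    urlopen raises on connection errors and on 4xx/5xx responses, so a
    caller can treat any exception as a failed delivery.
    """
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.status
```

A caller would do something like `send_to_collector(build_collector_request("my-internal-nlb.example.internal", payload))`. Keeping the timeout short is a deliberate choice here, since Firehose invokes the transform synchronously and waits on the result.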
