👩‍💻Dev.to #systemdesign·February 26, 2026

Achieving Resilience: REST vs. Kafka for Database Failures

This article empirically compares how synchronous REST APIs and an event-driven architecture (EDA) built on Kafka behave during a database outage. In a chaos engineering experiment simulating real-time location tracking, Kafka's asynchronous decoupling and message buffering, combined with a circuit breaker in the consumer, allowed graceful degradation with zero client-facing errors, whereas the tightly coupled REST setup failed immediately. The core takeaway is that an EDA can significantly improve resilience to backend service failures.


The Challenge of Database Downtime

Database failures are inevitable in complex systems. A traditional synchronous REST API is tightly coupled to its database, so an outage translates directly into failed API calls and a degraded user experience. This article compares how two architectural styles, synchronous REST and asynchronous event-driven processing with Kafka, handle such failures, and highlights the role of decoupling in system resilience.

Experiment Design: Simulating Uber's Location Tracking

To quantify resilience, a chaos engineering experiment was set up, modeled on Uber's real-time driver location tracking. Both architectures received a sustained load of 50 requests/second while PostgreSQL was made unavailable for 120 seconds to simulate a crash. The metrics recorded were total requests, successful requests, error rate, and latency, so the comparison rests on measurements and not on architectural reasoning alone.
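The metrics listed above can be derived from raw request samples. The following is a minimal sketch, not the experiment's actual tooling: the sample shape `{ ok, latencyMs }` and the function name `summarize` are assumptions made for illustration.

```javascript
// Summarize load-test samples into total, successful, error rate, and latency.
// Each sample is a hypothetical record: { ok: boolean, latencyMs: number }.
function summarize(samples) {
  const total = samples.length;
  const successful = samples.filter((s) => s.ok).length;
  const latencies = samples.map((s) => s.latencyMs).sort((a, b) => a - b);

  // Nearest-rank percentile: the smallest latency with at least p% of
  // samples at or below it. Returns null when there are no samples.
  const percentile = (p) =>
    total === 0 ? null : latencies[Math.max(0, Math.ceil((p / 100) * total) - 1)];

  return {
    total,
    successful,
    errorRate: total === 0 ? 0 : (total - successful) / total,
    p50Ms: percentile(50),
    p95Ms: percentile(95),
  };
}
```

Reporting a percentile alongside the error rate matters here because a failing database call can be slow (waiting on a connection timeout) as well as unsuccessful.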

Architecture A: Synchronous REST

  • Direct HTTP POST to REST API
  • Immediate database INSERT
  • Response after database confirmation
  • Result: 50% error rate, immediate client errors, no request buffering.
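The request path above can be sketched as a single handler whose response depends on the database write. This is an illustration under stated assumptions: `db.insertLocation` is a hypothetical interface, not a real driver API, and the 201/503 status codes are chosen for the example and are not taken from the experiment.

```javascript
// Hypothetical synchronous-style handler: the client's response is only
// produced after the database INSERT has succeeded or failed.
async function handleRestPost(db, location) {
  try {
    await db.insertLocation(location); // the client waits on the database
    return { status: 201, body: { stored: true } };
  } catch (err) {
    // There is no buffer in this path, so a database error
    // becomes a client-facing error immediately.
    return { status: 503, body: { error: 'database unavailable' } };
  }
}
```

Retries could be added in the handler, but they would only extend how long the client waits. Without a buffer between the API and the database, the request still fails if the outage outlasts the retries.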

Architecture B: Asynchronous Kafka

  • HTTP POST to Producer API
  • Event published to Kafka topic
  • Consumer processes events with circuit breaker pattern
  • Asynchronous database INSERT
  • Result: 0% error rate, all requests buffered and processed, graceful recovery.
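The flow above can be illustrated without a broker. In the sketch below, `InMemoryTopic` is a deliberately simplified stand-in for a Kafka topic (an assumption for illustration only: a real topic is durable, partitioned, and accessed through a client library), and `handleProducerPost`, `consumeAvailable`, and the 202 status code are likewise hypothetical.

```javascript
// In-memory stand-in for a Kafka topic with a single consumer offset.
class InMemoryTopic {
  constructor() {
    this.log = [];
    this.committedOffset = 0;
  }
  publish(event) {
    this.log.push(event);
    return this.log.length - 1; // offset of the appended event
  }
  peek() {
    return this.log[this.committedOffset]; // next unprocessed event
  }
  commit() {
    this.committedOffset++;
  }
  lag() {
    return this.log.length - this.committedOffset; // events still buffered
  }
}

// Producer API: acknowledges once the event is in the topic.
// It never touches the database, so a database outage cannot fail it.
function handleProducerPost(topic, location) {
  const offset = topic.publish(location);
  return { status: 202, body: { offset } };
}

// Consumer: writes events in order and stops at the first failed write,
// leaving the offset uncommitted so the same event is retried later.
async function consumeAvailable(topic, writeToDb) {
  let processed = 0;
  while (topic.lag() > 0) {
    try {
      await writeToDb(topic.peek());
    } catch (err) {
      break; // database down: events stay buffered in the topic
    }
    topic.commit();
    processed++;
  }
  return processed;
}
```

Committing the offset only after a successful write gives at-least-once processing, so the database write should be idempotent if duplicates would be a problem.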

Why Event-Driven Architectures Excel

In this experiment, the event-driven architecture with Kafka was markedly more resilient to the database failure. Because producers are decoupled from consumers, Kafka acts as a buffer: the Producer API keeps accepting requests while the database is down, and the consumer works through the backlog once the database is restored. The trade-off is that writes become eventually consistent instead of failing immediately, which is often acceptable for high-throughput, fault-tolerant systems.
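A short calculation gives a sense of how much Kafka has to buffer and how long recovery takes. The 50 requests/second and 120-second outage come from the experiment; the consumer write rate of 200 events/second is a made-up figure for illustration, and both function names are hypothetical.

```javascript
// Events accumulated while the consumer cannot write to the database.
function backlogAfterOutage(arrivalRatePerSec, outageSec) {
  return arrivalRatePerSec * outageSec;
}

// Seconds needed to clear the backlog while new events keep arriving.
function drainSeconds(backlog, arrivalRatePerSec, consumerRatePerSec) {
  if (consumerRatePerSec <= arrivalRatePerSec) {
    return Infinity; // the consumer never catches up
  }
  return backlog / (consumerRatePerSec - arrivalRatePerSec);
}

const backlog = backlogAfterOutage(50, 120);    // experiment's load and outage
const seconds = drainSeconds(backlog, 50, 200); // 200/s is an assumed rate
console.log(backlog, seconds); // prints: 6000 40
```

The `Infinity` branch shows the limit of buffering: it only helps if the consumer can write faster than events arrive once the database is back.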

The Role of the Circuit Breaker Pattern

A crucial component of the Kafka architecture's success was the Circuit Breaker pattern. When the consumer detected repeated database failures, the circuit breaker transitioned to the OPEN state, which stopped further writes to the unavailable database and let events queue up in Kafka. After the reset timeout elapsed, it moved to HALF_OPEN and cautiously tested connectivity, resuming full event processing only once the database had recovered. This pattern is essential for preventing cascading failures in distributed systems. A simplified JavaScript implementation was provided:

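```javascript
// Simplified circuit breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED.
class CircuitBreaker {
  constructor(threshold = 5, timeout = 30000, resetTimeout = 60000) {
    this.state = 'CLOSED';
    this.failureThreshold = threshold;
    this.timeout = timeout; // stored but not enforced in this simplified version
    this.resetTimeout = resetTimeout;
    this.failureCount = 0;
    this.successCount = 0; // successes seen while HALF_OPEN
    this.nextAttempt = Date.now();
  }

  async execute(fn) {
    if (this.state === 'OPEN') {
      // Fail fast until the reset timeout has elapsed.
      if (Date.now() < this.nextAttempt) {
        throw new Error('Circuit breaker OPEN - database unavailable');
      }
      this.state = 'HALF_OPEN'; // let trial calls through
    }
    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (err) {
      this.onFailure();
      throw err;
    }
  }

  onSuccess() {
    this.failureCount = 0;
    if (this.state === 'HALF_OPEN') {
      this.successCount++;
      // Close again only after 3 consecutive trial successes.
      if (this.successCount >= 3) {
        this.state = 'CLOSED';
        this.successCount = 0;
      }
    }
  }

  onFailure() {
    this.failureCount++;
    // Any failure while HALF_OPEN re-opens the circuit immediately.
    if (this.state === 'HALF_OPEN' || this.failureCount >= this.failureThreshold) {
      this.state = 'OPEN';
      this.successCount = 0;
      this.nextAttempt = Date.now() + this.resetTimeout;
    }
  }
}
```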
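How a consumer might use such a breaker is not shown in the article, so the following is only a sketch under assumptions: `drainThroughBreaker` is a hypothetical helper, `queue` is a plain array standing in for events not yet committed in Kafka, and `breaker` is any object exposing the `execute(fn)` method shown above.

```javascript
// Drain buffered events through a circuit breaker.
// An event is removed from the queue only after its write is confirmed.
async function drainThroughBreaker(queue, breaker, writeToDb) {
  let written = 0;
  while (queue.length > 0) {
    const event = queue[0];
    try {
      await breaker.execute(() => writeToDb(event));
    } catch (err) {
      // Breaker open or write failed: keep the event queued and stop for now.
      return { written, remaining: queue.length, stoppedBy: err.message };
    }
    queue.shift();
    written++;
  }
  return { written, remaining: 0, stoppedBy: null };
}
```

A caller would typically invoke this on a timer or after each poll. While the breaker is OPEN the call returns almost immediately, which is what keeps the consumer from hammering a database that is already down.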
Tags: event-driven architecture, Kafka, REST API, resilience, chaos engineering, circuit breaker, database failure, decoupling
