This article empirically compares the resilience of synchronous REST APIs and event-driven architectures (EDA) built on Kafka during database outages. Through a chaos engineering experiment simulating real-time location tracking, it demonstrates how Kafka's asynchronous decoupling and message buffering, combined with a circuit breaker pattern, enable graceful degradation and zero client-facing errors, whereas a tightly coupled REST setup fails immediately. The core takeaway is that EDAs can significantly enhance system resilience against backend service failures.
Database failures are an inevitable reality in complex systems. Traditional synchronous REST APIs often suffer from tight coupling, meaning a database outage translates directly into API failures and a degraded user experience. This article explores a practical comparison of how two architectural styles (synchronous REST and asynchronous event-driven with Kafka) handle such failures, highlighting the critical role of decoupling for system resilience.
To quantify resilience, a chaos engineering experiment was set up, mimicking Uber's real-time driver location tracking system. Both architectures were subjected to a sustained load of 50 requests/second and a simulated PostgreSQL crash for 120 seconds. Key metrics included total requests, successful requests, error rate, and latency. This approach provides empirical evidence of architectural robustness under stress.
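The load and outage figures above imply a backlog that the event-driven setup has to absorb and later drain. The following is a back-of-the-envelope sketch of that arithmetic. The request rate and outage duration come from the experiment, while the consumer drain rate of 200 events/second and both function names are assumptions made purely for illustration.

```javascript
// Figures from the experiment: sustained load and simulated outage length.
const REQUESTS_PER_SECOND = 50;
const OUTAGE_SECONDS = 120;

// Number of events that arrive while the database is down.
function eventsDuringOutage(ratePerSec, outageSec) {
  return ratePerSec * outageSec;
}

// Seconds needed to clear the backlog after recovery while new events keep arriving.
// Returns Infinity when the consumer is no faster than the incoming load.
function drainSeconds(backlog, consumeRatePerSec, arrivalRatePerSec) {
  const spare = consumeRatePerSec - arrivalRatePerSec;
  if (spare <= 0) return Infinity;
  return backlog / spare;
}

const backlog = eventsDuringOutage(REQUESTS_PER_SECOND, OUTAGE_SECONDS);
console.log(backlog); // 6000

// Assumed (not measured) consumer throughput of 200 events/second.
console.log(drainSeconds(backlog, 200, REQUESTS_PER_SECOND)); // 40
```

Under these assumptions, about 6,000 events arrive during the outage, and a consumer with 150 events/second of spare capacity would catch up in roughly 40 seconds.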
Why Event-Driven Architectures Excel
The experiment demonstrates that event-driven architectures with message queues like Kafka provide superior resilience during database failures. Because producers are decoupled from consumers, Kafka acts as a buffer: clients keep receiving acknowledgements while events accumulate, and the system recovers gracefully once the database is restored. The trade-off is that writes become eventually consistent instead of failing immediately, which is often acceptable for high-throughput, fault-tolerant systems.
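The difference between the two styles can be illustrated with a small in-memory simulation. This is a sketch, not the article's code: `db` stands in for PostgreSQL, the `log` array stands in for a Kafka topic, and every function name here is hypothetical. No real Kafka client is used.

```javascript
// In-memory stand-in for the database (hypothetical).
const db = { up: true, rows: [] };
function writeToDb(event) {
  if (!db.up) throw new Error('database unavailable');
  db.rows.push(event);
}

// Synchronous style: the request succeeds only if the database write succeeds.
function handleRestRequest(event) {
  try {
    writeToDb(event);
    return 200;
  } catch (err) {
    return 503;
  }
}

// Event-driven style: the request only appends to the log and is acknowledged.
const log = []; // stand-in for a Kafka topic
let committedOffset = 0; // index of the next event the consumer still has to write

function handleEventRequest(event) {
  log.push(event);
  return 202;
}

// Consumer: writes events in order and advances the offset only after a successful write.
function consumeAvailable() {
  while (committedOffset < log.length) {
    try {
      writeToDb(log[committedOffset]);
    } catch (err) {
      return false; // stop and retry later; the event stays in the log
    }
    committedOffset++;
  }
  return true;
}

// Simulated outage followed by recovery.
db.up = false;
console.log(handleRestRequest({ driver: 1 }));  // 503
console.log(handleEventRequest({ driver: 1 })); // 202
consumeAvailable();                             // fails, offset stays at 0
db.up = true;
consumeAvailable();                             // catches up
console.log(db.rows.length);                    // 1
```

The key design choice is that the offset moves forward only after a successful write, so an outage delays events without dropping them.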
A crucial component in the Kafka architecture's success was the implementation of the Circuit Breaker pattern. When the consumer detected database failures, the circuit breaker transitioned to an OPEN state, preventing further attempts to write to the unavailable database and allowing events to queue up in Kafka. Upon database recovery, it transitioned to HALF_OPEN, cautiously testing connectivity before resuming full event processing. This pattern is essential for preventing cascading failures in distributed systems. A simplified JavaScript implementation was provided:
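class CircuitBreaker {
  // threshold: failures before the circuit opens.
  // timeout: accepted but not used in this simplified version.
  // resetTimeout: milliseconds to stay OPEN before allowing trial calls.
  constructor(threshold = 5, timeout = 30000, resetTimeout = 60000) {
    this.state = 'CLOSED';
    this.failureThreshold = threshold;
    this.resetTimeout = resetTimeout;
    this.failureCount = 0;
    this.successCount = 0;
    this.nextAttempt = Date.now();
  }

  async execute(fn) {
    if (this.state === 'OPEN') {
      if (Date.now() < this.nextAttempt) {
        throw new Error('Circuit breaker OPEN - database unavailable');
      }
      // The reset window has elapsed: let trial calls through.
      this.state = 'HALF_OPEN';
      this.successCount = 0;
    }
    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (err) {
      this.onFailure();
      throw err;
    }
  }

  onSuccess() {
    this.failureCount = 0;
    if (this.state === 'HALF_OPEN') {
      this.successCount++;
      // Three consecutive trial successes close the circuit again.
      if (this.successCount >= 3) {
        this.state = 'CLOSED';
        this.successCount = 0;
      }
    }
  }

  onFailure() {
    this.failureCount++;
    // A failed trial call in HALF_OPEN reopens the circuit immediately.
    if (this.state === 'HALF_OPEN' || this.failureCount >= this.failureThreshold) {
      this.state = 'OPEN';
      this.successCount = 0;
      this.nextAttempt = Date.now() + this.resetTimeout;
    }
  }
}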
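To show how a consumer might use such a breaker, the sketch below drains a list of events through it and pauses while the circuit is OPEN, so events stay queued instead of being dropped. The function `drainWithBreaker`, the `writeEvent` callback, and the `pauseMs` parameter are hypothetical names introduced for illustration. The only assumption about the breaker is that it exposes `state` and `execute(fn)` as in the class above.

```javascript
// Drains `events` through `breaker.execute`, pausing while the circuit is OPEN.
// `breaker` is any object exposing `state` and `execute(fn)`.
// `writeEvent` is a hypothetical async function that persists one event.
async function drainWithBreaker(breaker, events, writeEvent, pauseMs = 1000) {
  let offset = 0;  // index of the next event to write
  let retries = 0; // number of failed attempts, for observability

  while (offset < events.length) {
    try {
      await breaker.execute(() => writeEvent(events[offset]));
      offset++; // advance only after a successful write
    } catch (err) {
      retries++;
      // While OPEN, wait instead of hammering the database; the event stays queued.
      const wait = breaker.state === 'OPEN' ? pauseMs : 0;
      await new Promise((resolve) => setTimeout(resolve, wait));
    }
  }
  return { processed: offset, retries };
}
```

A call might look like `drainWithBreaker(new CircuitBreaker(), events, writeLocation)`, where `writeLocation` is a placeholder for the real database write. Note that this loop retries indefinitely, which matches the buffering behaviour described above but would need a shutdown condition in a real consumer.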