This article explores the synergy between Apache Kafka and Temporal for building resilient distributed systems. It emphasizes that Kafka excels at durable event streaming, while Temporal manages long-running workflow state, retries, and recovery. The combination allows for robust business process orchestration where Kafka provides the event backbone and Temporal acts as the control plane.
Read original on DZone MicroservicesThe core thesis is that Kafka and Temporal address different failure boundaries and are complementary, not substitutes. Kafka is optimized for moving ordered, replayable event streams across many consumers and machines, ideal for integration boundaries and decoupling services. Temporal, on the other hand, provides durable orchestration for long-running application logic, ensuring workflows recover from crashes, outages, and restarts by replaying persisted event history. This combination is powerful: Kafka carries "facts" (events) and Temporal remembers "intent" (business process state, timers, retries, compensations).
A critical design rule is to avoid embedding Kafka client calls directly within Temporal Workflow code. Temporal workflows require deterministic logic on replay. Non-deterministic operations, like API calls or database queries, should be encapsulated within Activities. Workflows should function as lean state machines deciding what actions to take, while Activities perform the actual side effects, which can be retried or fail. This separation ensures Kafka remains an external event fabric without compromising Temporal's replay semantics.
private boolean paymentReceived;
private final OrderActivities activities = Workflow.newActivityStub(
OrderActivities.class,
ActivityOptions.newBuilder()
.setStartToCloseTimeout(Duration.ofSeconds(30))
.setRetryOptions(
RetryOptions.newBuilder()
.setInitialInterval(Duration.ofSeconds(1))
.setMaximumInterval(Duration.ofSeconds(30))
.build())
.build());
@WorkflowMethod
public void process(String orderId) {
activities.reserveInventory(orderId);
boolean paid = Workflow.await(Duration.ofHours(2), () -> paymentReceived);
if (!paid) {
activities.releaseInventory(orderId);
activities.publishTimedOut(orderId);
return;
}
activities.publishConfirmed(orderId);
}
@SignalMethod
public void paymentCaptured(String paymentId) {
paymentReceived = true;
}This example shows a simple, robust workflow. Inventory reservation and event publication are delegated to Activities. The workflow merely manages state and waits for events. The two-hour wait is durable, meaning Temporal persists timers, allowing execution to resume even after worker or service interruptions. Kafka would supply the external `paymentCaptured` event, which a thin bridge translates into a Temporal signal.
The handoff from Kafka to Temporal should be designed for duplicate tolerance. While Kafka allows manual offset control, a crash between Temporal accepting a signal and the offset commit can lead to redelivery. Making Temporal Workflow IDs stable for a given business entity and ensuring Activities are idempotent are crucial, as Temporal may retry Activity executions. The article clarifies that neither Kafka's transactional features nor Temporal's effectively-once scheduling inherently provide end-to-end exactly-once semantics for external side effects; explicit idempotency keys or transactional guarantees across all involved systems are required.