Cloudflare Blog·June 25, 2026

Implementing Saga Rollbacks for Durable Workflows

This article discusses Cloudflare Workflows' new saga rollback feature, which addresses the challenge of maintaining data consistency in multi-step, distributed transactions. It explains how to define compensation logic for each step directly within the workflow, adhering to the saga pattern to semantically reverse operations rather than physically undoing them. The piece delves into the design decisions behind the API, emphasizing durability, idempotency, and predictable rollback ordering for complex asynchronous processes.


Cloudflare Workflows introduces built-in saga rollbacks to manage atomicity and consistency in long-running, multi-step distributed transactions. Traditional ACID transactions are not feasible across disparate systems (e.g., two different banks). The saga pattern provides a mechanism to handle failures by executing a series of compensating transactions when a step fails, ensuring eventual consistency.

The Challenge of Distributed Transaction Rollbacks

In distributed systems, operations on external services often commit immediately and cannot simply be undone. Consider debiting an account at Bank A and then crediting an account at Bank B: if the credit fails, the debit is already committed. Reverting it requires a *compensation* action (crediting Bank A back) rather than an atomic rollback, and this compensation must itself be a durable operation.

Saga Pattern Example: Funds Transfer

Consider a workflow with three steps: (1) debit Bank A, (2) credit Bank B, (3) send notifications. If step 2 fails, the system must run a compensation for step 1 (crediting Bank A back) to return the funds, because the debit is already final.
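
The funds-transfer scenario can be sketched in plain JavaScript. This is a toy, in-memory illustration of the saga idea, not Cloudflare's API: `bankA`, `bankB`, `transfer`, and the helper functions are all hypothetical names, and the Bank B outage is simulated with a flag.

```javascript
// Hypothetical in-memory stand-ins for two independent banks.
const bankA = { balance: 100 };
const bankB = { balance: 0, online: false }; // simulate an outage at Bank B
const log = [];

async function debitBankA(amount) {
  if (bankA.balance < amount) throw new Error("insufficient funds");
  bankA.balance -= amount; // commits immediately; there is no "undo"
}

async function creditBankA(amount) {
  bankA.balance += amount; // compensation: a new transaction in the opposite direction
}

async function creditBankB(amount) {
  if (!bankB.online) throw new Error("Bank B unavailable");
  bankB.balance += amount;
}

// Runs the steps in order. On failure, runs the compensations registered
// by the steps that already completed, most recent first.
async function transfer(amount) {
  const compensations = [];
  try {
    await debitBankA(amount); // step 1
    compensations.push(async () => {
      await creditBankA(amount);
      log.push("compensated: debit-bank-a");
    });

    await creditBankB(amount); // step 2 (fails in this demo)
    log.push("notifications sent"); // step 3
    return "completed";
  } catch (err) {
    for (const compensate of compensations.reverse()) {
      await compensate();
    }
    log.push(`failed: ${err.message}`);
    return "rolled-back";
  }
}

const demo = transfer(40);
demo.then((outcome) => console.log(outcome, bankA.balance, bankB.balance, log));
```

Note that the compensation is registered only after the debit has succeeded, so a failure inside step 1 itself does not trigger a refund for money that was never taken.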

API Design Decisions for Rollbacks

Cloudflare explored several API designs for integrating rollback logic into their `step.do()` primitive, ultimately choosing to embed rollback as metadata within the step definition. This approach ensures that the forward action and its compensation logic are co-located, maintaining the clear execution model of `step.do()` without introducing new chaining behaviors or complex builders that could interfere with promise pipelining or step timing.

```javascript
await step.do(
  "debit-account-a",
  async () => { /* Debit logic */ },
  {
    rollback: async ({ output }) => { /* Compensation logic for debit */ },
    rollbackConfig: {
      retries: { limit: 10, delay: '30 seconds', backoff: 'exponential' },
      timeout: '2 minutes'
    }
  }
);
```
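
Because a rollback handler may be retried, its body should be safe to run more than once. The following is a minimal sketch of one way to do that, assuming the forward step returned an `output` object carrying a `transferId` and an `amount`. Those fields, the `ledger` object, and `refundDebit` are hypothetical and only illustrate the deduplication idea.

```javascript
// Hypothetical ledger; `refunds` remembers which transfers were already compensated.
const ledger = { balance: 60, refunds: new Set() };

// Idempotent compensation: keyed by transfer ID, so a retry after a lost
// acknowledgement does not issue a second refund.
function refundDebit({ output }) {
  if (ledger.refunds.has(output.transferId)) {
    return "already-refunded";
  }
  ledger.balance += output.amount;
  ledger.refunds.add(output.transferId);
  return "refunded";
}

// Assumed shape of what the forward (debit) step returned.
const output = { transferId: "tx-123", amount: 40 };

const firstAttempt = refundDebit({ output });
const retryAttempt = refundDebit({ output }); // simulated retry of the same rollback

console.log(firstAttempt, retryAttempt, ledger.balance);
```

In a real system the deduplication record would need to live in durable storage and be written atomically with the refund; the in-memory `Set` here only stands in for that.
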
  • Idempotency: Rollback functions, like regular workflow steps, must be idempotent to handle retries safely without adverse side effects (e.g., double refunds).
  • Rollback Triggering: Rollback only starts when the workflow is about to fail terminally, not for every caught error. If user code handles an error and continues, rollback is deferred.
  • Ordering: Rollback handlers execute in reverse *step-start order*, not reverse completion order, ensuring predictable compensation even with concurrent steps.
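
The ordering rule can be illustrated with a toy runner. This is not Cloudflare's implementation: `createRunner`, `step`, and `rollbackAll` are hypothetical, and the sketch assumes that only steps which completed are compensated. Two steps run concurrently, where `slow` starts first but finishes last, so reverse start order and reverse completion order give different results.

```javascript
// Toy runner: remembers steps in the order they START and, on failure,
// runs rollbacks in the reverse of that order.
function createRunner() {
  const started = []; // entries in step-start order
  const log = [];

  async function step(name, run, rollback) {
    const entry = { name, rollback, done: false, output: undefined };
    started.push(entry); // start order is fixed here, before any await
    entry.output = await run();
    entry.done = true;
    return entry.output;
  }

  async function rollbackAll() {
    for (const entry of [...started].reverse()) {
      // Assumption: a step that never completed has nothing to compensate.
      if (entry.done && entry.rollback) {
        await entry.rollback({ output: entry.output });
        log.push(`rollback:${entry.name}`);
      }
    }
  }

  return { step, rollbackAll, log };
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function demoOrdering() {
  const { step, rollbackAll, log } = createRunner();
  try {
    // "slow" starts first and finishes last; "fast" starts second and finishes first.
    await Promise.all([
      step("slow", () => sleep(30), async () => {}),
      step("fast", () => sleep(5), async () => {}),
    ]);
    await step("boom", async () => {
      throw new Error("terminal failure");
    });
  } catch {
    await rollbackAll();
  }
  return log;
}

const orderingDemo = demoOrdering();
orderingDemo.then((log) => console.log(log));
```

Here the rollbacks run as `fast` then `slow`. Reverse completion order would have produced `slow` then `fast`, which depends on timing and could change from run to run.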