This article discusses Cloudflare Workflows' new saga rollback feature, which addresses the challenge of maintaining data consistency in multi-step, distributed transactions. It explains how to define compensation logic for each step directly within the workflow, adhering to the saga pattern to semantically reverse operations rather than physically undoing them. The piece delves into the design decisions behind the API, emphasizing durability, idempotency, and predictable rollback ordering for complex asynchronous processes.
Read original on Cloudflare Blog
Cloudflare Workflows introduces built-in saga rollbacks to manage atomicity and consistency in long-running, multi-step distributed transactions. Traditional ACID transactions are not feasible across disparate systems (e.g., two different banks). The saga pattern provides a mechanism to handle failures by executing a series of compensating transactions when a step fails, ensuring eventual consistency.
In distributed systems, operations on external services often commit immediately and cannot simply be undone. For example, a transfer debits an account at Bank A and then credits an account at Bank B; if the credit fails, the debit is already committed. Reverting it requires a *compensation* action (crediting Bank A back) rather than an atomic rollback, and that compensation must itself be a durable operation.
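The difference between compensating and undoing can be made concrete with a small in-memory sketch. It is not Cloudflare's code: the `Ledger` class, the `transferWithCompensation` function, and the `creditBankB` callback are hypothetical names chosen for illustration. The point it tries to show is that the failed transfer leaves two entries in Bank A's history (the debit and a compensating credit) instead of erasing the debit.

```typescript
// Hypothetical append-only ledger; every name here is illustrative.
type Entry = { kind: "debit" | "credit"; amount: number; reason: string };

class Ledger {
  readonly entries: Entry[] = [];

  debit(amount: number, reason: string): void {
    this.entries.push({ kind: "debit", amount, reason });
  }

  credit(amount: number, reason: string): void {
    this.entries.push({ kind: "credit", amount, reason });
  }

  // The balance is derived from history; entries are never deleted.
  balance(opening: number): number {
    return this.entries.reduce(
      (sum, e) => (e.kind === "credit" ? sum + e.amount : sum - e.amount),
      opening,
    );
  }
}

// Debits Bank A, then attempts the credit at Bank B.
// Returns true on success, false if the transfer had to be compensated.
function transferWithCompensation(
  bankA: Ledger,
  creditBankB: () => void, // assumed to throw when the credit fails
  amount: number,
): boolean {
  bankA.debit(amount, "transfer-out"); // committed immediately
  try {
    creditBankB();
    return true;
  } catch {
    // Compensation: a new forward entry that semantically reverses the debit.
    bankA.credit(amount, "compensate transfer-out");
    return false;
  }
}
```

With an opening balance of 500 and a failing `creditBankB`, the balance ends at 500 again, but the ledger records both the debit and its compensation.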
Saga Pattern Example: Funds Transfer
Consider a workflow with three steps: (1) debit Bank A, (2) credit Bank B, (3) send notifications. If step 2 fails, the system must run the compensation for step 1 (crediting Bank A back) to return the funds, because the debit is already final.
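The three-step workflow above can be sketched as a tiny saga runner. This is a minimal illustration under stated assumptions, not the Workflows engine: `SagaStep`, `runSaga`, and `fundsTransfer` are hypothetical names, the step bodies are empty stand-ins, and compensations run in reverse order of completion, which is the conventional saga choice and may differ in detail from what Cloudflare does.

```typescript
// Hypothetical saga step: a forward action plus an optional compensation.
interface SagaStep {
  name: string;
  action: () => Promise<void>;
  compensate?: () => Promise<void>;
}

// Runs steps in order. If one fails, compensates the steps that already
// completed, most recent first, then rethrows the original error.
async function runSaga(steps: SagaStep[], log: string[]): Promise<void> {
  const completed: SagaStep[] = [];
  for (const step of steps) {
    try {
      await step.action();
      log.push(`done:${step.name}`);
      completed.push(step);
    } catch (err) {
      log.push(`failed:${step.name}`);
      for (const done of completed.reverse()) {
        if (done.compensate) {
          await done.compensate();
          log.push(`compensated:${done.name}`);
        }
      }
      throw err;
    }
  }
}

// The funds-transfer example with stand-in step bodies.
function fundsTransfer(creditFails: boolean): SagaStep[] {
  return [
    {
      name: "debit-bank-a",
      action: async () => { /* debit Bank A */ },
      compensate: async () => { /* credit Bank A back */ },
    },
    {
      name: "credit-bank-b",
      action: async () => {
        if (creditFails) throw new Error("Bank B rejected the credit");
      },
      compensate: async () => { /* debit Bank B back */ },
    },
    {
      name: "send-notifications",
      action: async () => { /* notify both parties */ },
    },
  ];
}
```

When the credit fails, only the debit has completed, so only its compensation runs; the notification step is never reached.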
Cloudflare explored several API designs for integrating rollback logic into their `step.do()` primitive, ultimately choosing to embed rollback as metadata within the step definition. This approach ensures that the forward action and its compensation logic are co-located, maintaining the clear execution model of `step.do()` without introducing new chaining behaviors or complex builders that could interfere with promise pipelining or step timing.
await step.do(
  "debit-account-a",
  async () => { /* Debit logic */ },
  {
    rollback: async ({ output }) => { /* Compensation logic for debit */ },
    rollbackConfig: {
      retries: { limit: 10, delay: '30 seconds', backoff: 'exponential' },
      timeout: '2 minutes'
    }
  }
);
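Because the rollback is configured with retries, the compensation handler may run more than once, which is why idempotency matters. The sketch below shows one way to get it, assuming the forward step's `output` carries a transaction ID that can serve as an idempotency key. `Account`, `refundOnce`, `DebitOutput`, and `makeRollback` are hypothetical names, and the in-memory `Set` stands in for whatever durable deduplication the real bank API would provide.

```typescript
// Hypothetical in-memory account; `applied` records refund keys already processed.
class Account {
  balance: number;
  private applied = new Set<string>();

  constructor(opening: number) {
    this.balance = opening;
  }

  // Credits the account at most once per key; repeat calls are no-ops.
  refundOnce(key: string, amount: number): boolean {
    if (this.applied.has(key)) return false;
    this.applied.add(key);
    this.balance += amount;
    return true;
  }
}

// Assumed shape of the debit step's output; not taken from the source.
type DebitOutput = { transactionId: string; amount: number };

// Builds a handler shaped like `rollback: async ({ output }) => ...`.
// The forward step's transaction ID is reused as the idempotency key.
function makeRollback(account: Account) {
  return async ({ output }: { output: DebitOutput }): Promise<void> => {
    account.refundOnce(`refund:${output.transactionId}`, output.amount);
  };
}
```

Calling the handler twice with the same `output`, as a retry would, credits the account only once.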