Menu
Course/Distributed Systems/Distributed Transactions: 2PC & 3PC

Distributed Transactions: 2PC & 3PC

Coordinating transactions across services: two-phase commit, three-phase commit, blocking problems, and why modern systems often avoid them.

15 min readHigh interview weight

Why Distributed Transactions Are Hard

A distributed transaction spans multiple databases, services, or nodes. The fundamental challenge: each participant must either all commit or all abort, yet any participant (or the network) can fail at any point. Without a coordination protocol, you risk one service committing while another aborts — leaving data permanently inconsistent.

⚠️

ACID across services

ACID guarantees (Atomicity, Consistency, Isolation, Durability) are straightforward within a single database engine. Achieving them across multiple independent services requires explicit coordination protocols, and those protocols come with significant performance and availability costs.

Two-Phase Commit (2PC)

Two-Phase Commit is the classic distributed transaction protocol. A coordinator node orchestrates two phases: in Phase 1 (Prepare), the coordinator asks all participants whether they can commit. In Phase 2 (Commit/Abort), based on all responses, the coordinator sends the final decision.

Loading diagram...
Two-Phase Commit (2PC) — coordinator-driven atomic commit across three databases

The Blocking Problem

2PC is a blocking protocol. If the coordinator crashes after sending PREPARE but before sending the commit decision, participants are stuck holding locks indefinitely. They cannot unilaterally decide to commit or abort — aborting might violate atomicity if other participants already committed. This is called the in-doubt window: participants must wait for the coordinator to recover.

⚠️

2PC failure scenarios

Coordinator failure after prepare: all participants block until coordinator recovers. Participant failure after voting YES: coordinator must wait or abort. Network partition: participants on the wrong side of the partition cannot make progress. In all these cases, locks are held, reducing throughput significantly.

Three-Phase Commit (3PC)

Three-Phase Commit adds a `pre-commit` phase to eliminate the blocking problem in 2PC. The three phases are: CanCommit (equivalent to Prepare), PreCommit (coordinator and participants acknowledge readiness), and DoCommit (final commit). The key insight is that after all participants acknowledge PreCommit, a participant that loses contact with the coordinator can safely commit on its own — it knows all others are also in the PreCommit state.

Property2PC3PC
PhasesPrepare, Commit/AbortCanCommit, PreCommit, DoCommit
Blocking on coordinator failureYes — indefiniteNo — participants can proceed
Message complexity2N messages (prepare + commit)3N messages
Network partition safeNoNo (can cause split-brain)
LatencyLower (fewer messages)Higher (extra round trip)
Used in practicePostgreSQL FDW, XA transactionsRare — mostly theoretical
⚠️

3PC is not partition-tolerant

Despite solving the blocking problem, 3PC can cause split-brain under network partitions. If the partition occurs between PreCommit and DoCommit, the two partitions can independently decide differently (one aborts due to timeout, the other commits). For this reason, 3PC is rarely used in real distributed systems.

Modern Alternatives

Most modern microservice architectures avoid distributed transactions entirely. The alternatives include:

  • Saga pattern: A sequence of local transactions, each publishing events that trigger the next. Compensating transactions handle rollback. Used by Uber, Amazon, and most event-driven architectures.
  • Outbox pattern: Write to local DB and an outbox table atomically, then a relay process publishes events. Guarantees at-least-once delivery without distributed transactions.
  • Eventual consistency: Accept that different services will be temporarily inconsistent, and design the UX to handle it (e.g., show 'pending' state).
  • Google Spanner: Uses TrueTime-based timestamps to achieve external consistency with Paxos, avoiding traditional 2PC coordinator bottlenecks at scale.
💡

Interview Tip

Interviewers often ask 'how do you handle a transaction that spans two microservices?' The correct answer is NOT '2PC'. Explain that 2PC introduces tight coupling and a blocking coordinator. Instead, describe the Saga pattern with compensating transactions, and mention the Outbox pattern for reliable event publishing. This shows you understand the operational realities of distributed systems.

📝

Knowledge Check

4 questions

Test your understanding of this lesson. Score 70% or higher to complete.

Ask about this lesson

Ask anything about Distributed Transactions: 2PC & 3PC