This article explains vector clocks, a fundamental concept in distributed systems for detecting causal relationships and resolving conflicts in eventually consistent databases. It details their structure, operational logic with examples, and how they differentiate between ordered and concurrent updates. While powerful for conflict detection, vector clocks introduce client-side complexity and potential overhead from growing vector sizes.
In distributed databases, data replication across multiple servers is crucial for availability and scalability. However, this architecture inherently introduces challenges with concurrent writes, where different replicas might process updates independently. This can lead to conflicting versions of data, necessitating mechanisms to detect and reconcile these inconsistencies. Vector clocks provide a robust solution by allowing systems to understand the causal order of events.
A vector clock is a data structure, typically a list of `[server ID, version counter]` pairs, associated with a specific data item. It tracks which server has modified the data and how many updates each server has performed on that data item. The core purpose of a vector clock is to determine the relationship between two versions of data: whether one happened after another (ordered) or if they are concurrent (a conflict exists).
A data item D together with its vector clock is written as:
D([S1, v1], [S2, v2], …, [Sn, vn])
When a server Si writes a data item, its version counter vi in the vector clock is incremented. If the server is new to the vector, a new entry `[Si, 1]` is created. The original article walks through this with a step-by-step example.
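The write rule above can be sketched in a few lines of Python. This is a minimal illustration, not the article's own example: the `record_write` helper, the dictionary representation, and the server names `Sx`, `Sy`, `Sz` are all assumptions chosen for readability.

```python
from typing import Dict

# Assumed representation: server ID -> version counter.
VectorClock = Dict[str, int]


def record_write(clock: VectorClock, server_id: str) -> VectorClock:
    """Return a copy of the clock with server_id's counter incremented.

    A server that is not yet in the vector gets a new entry with counter 1.
    """
    updated = dict(clock)
    updated[server_id] = updated.get(server_id, 0) + 1
    return updated


# Hypothetical write sequence on a single data item.
d1 = record_write({}, "Sx")   # first write, handled by Sx
d2 = record_write(d1, "Sx")   # second write, again handled by Sx
d3 = record_write(d2, "Sy")   # a client updates d2 through Sy
d4 = record_write(d2, "Sz")   # another client updates d2 through Sz

print(d3)  # {'Sx': 2, 'Sy': 1}
print(d4)  # {'Sx': 2, 'Sz': 1}
```

Here `d3` and `d4` both descend from `d2` but were written through different servers, which is the kind of divergence a vector clock is meant to expose.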
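Vector clocks determine relationships based on counter comparisons. Version X is an ancestor of version Y (ordered, no conflict) if every counter in X is less than or equal to the corresponding counter in Y, with a missing entry treated as zero. If each version has at least one counter that is greater than the other's, neither descends from the other: the versions are concurrent and a conflict exists.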
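The counter comparison can be sketched as follows. The function names `dominates` and `compare`, the string labels they return, and the convention that a missing server entry counts as zero are assumptions made for this illustration.

```python
from typing import Dict

# Assumed representation: server ID -> version counter.
VectorClock = Dict[str, int]


def dominates(a: VectorClock, b: VectorClock) -> bool:
    """True if every counter in b is <= the matching counter in a.

    Servers missing from a are treated as having counter 0.
    """
    return all(a.get(server, 0) >= count for server, count in b.items())


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify the relationship of version a relative to version b."""
    a_covers_b = dominates(a, b)
    b_covers_a = dominates(b, a)
    if a_covers_b and b_covers_a:
        return "equal"
    if a_covers_b:
        return "after"       # a descends from b: ordered, no conflict
    if b_covers_a:
        return "before"      # b descends from a: ordered, no conflict
    return "concurrent"      # neither descends from the other: conflict


print(compare({"Sx": 2, "Sy": 1}, {"Sx": 2}))           # after
print(compare({"Sx": 2, "Sy": 1}, {"Sx": 2, "Sz": 1}))  # concurrent
```

Returning a label rather than a boolean keeps the ordered cases and the conflicting case visibly distinct, so a caller can decide whether to discard the older version or hand both versions to a resolver.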
Use Case: Amazon Dynamo
Amazon's Dynamo, the internal key-value store described in Amazon's 2007 Dynamo paper and a precursor of DynamoDB, uses vector clocks to maintain eventual consistency while ensuring high availability. When a read request returns multiple conflicting versions, the client is responsible for resolving the conflict and writing back the reconciled version.
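Client-side reconciliation of two conflicting versions might look like the sketch below. It is an illustration under stated assumptions, not the API of any Amazon system: the `reconcile` and `merge_clocks` helpers are hypothetical, the values are modeled as sets of cart items merged by union, and the merged clock takes the element-wise maximum before the coordinating server increments its own counter.

```python
from typing import Dict, Set, Tuple

# Assumed representation: server ID -> version counter.
VectorClock = Dict[str, int]
# Assumed version shape: (value, clock), where the value is a set of items.
Version = Tuple[Set[str], VectorClock]


def merge_clocks(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum of two clocks (missing entries count as 0)."""
    return {s: max(a.get(s, 0), b.get(s, 0)) for s in a.keys() | b.keys()}


def reconcile(left: Version, right: Version, coordinator: str) -> Version:
    """Merge two conflicting versions and record the write-back.

    The value merge (set union) is application-specific and only an example.
    """
    merged_value = left[0] | right[0]
    merged_clock = merge_clocks(left[1], right[1])
    # The server handling the write-back increments its own counter.
    merged_clock[coordinator] = merged_clock.get(coordinator, 0) + 1
    return merged_value, merged_clock


# Two concurrent versions returned by a read (hypothetical data).
d3 = ({"book"}, {"Sx": 2, "Sy": 1})
d4 = ({"pen"}, {"Sx": 2, "Sz": 1})

value, clock = reconcile(d3, d4, coordinator="Sx")
print(sorted(value))  # ['book', 'pen']
print(clock == {"Sx": 3, "Sy": 1, "Sz": 1})  # True
```

Because the reconciled clock is greater than or equal to both inputs in every position, the written-back version supersedes both conflicting versions instead of appearing as a third concurrent one.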
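While effective, vector clocks come with architectural trade-offs. Conflict resolution moves to the client, which adds complexity to application code, and the vector itself can grow as more servers write to the same data item, which adds storage and transfer overhead.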