InfoQ Architecture · March 6, 2026

Read-Copy-Update (RCU): Achieving Lock-Free Performance in Read-Heavy Systems

This article explores Read-Copy-Update (RCU), a concurrency mechanism that dramatically improves read performance in read-heavy, eventually consistent systems by eliminating lock overhead from the read path. It details RCU's three-phase pattern (read, copy, update) and contrasts it with traditional reader-writer locks, highlighting how RCU mitigates cache coherency issues and contention at scale. The discussion provides insights into applying RCU in real-world distributed system architectures.

Read-Copy-Update (RCU) is a synchronization mechanism designed for highly concurrent, read-heavy workloads. Unlike traditional lock-based mechanisms, RCU lets readers access shared data without acquiring any locks, which avoids the contention and cache coherency overhead that plague reader-writer locks at scale. The trade-off is eventual consistency (a reader may briefly see the previous version) and higher memory usage (old versions stay allocated until no reader can still reference them).

The Core RCU Pattern: Read, Copy, Update

Read (Lock-Free Access): Readers access the shared data structure directly by grabbing a pointer to the current version. No locks are acquired, eliminating overhead and contention. Readers see a consistent snapshot of the data that existed when they started their read operation.
Copy (Writer Modification): When a writer needs to modify data, it does not alter the shared data in place. Instead, it creates a private copy of the current data, applies modifications to this copy, and prepares a new version.
Update (Atomic Swap and Grace Period): Once the modified copy is ready, the writer atomically swaps a pointer to point to the new version. New readers will then see the updated data. Crucially, RCU defers the reclamation of the old data's memory until a "grace period" has elapsed, ensuring that all existing readers, who might still be using the old pointer, have completed their operations. This prevents use-after-free issues.
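
The three phases can be sketched in portable C11 with <stdatomic.h>. This is a minimal illustration, not the article's or the Linux kernel's implementation: config_t, config_read, and config_update are hypothetical names, a single writer is assumed, and grace-period detection is left out, so the old version is simply handed back to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical configuration record; the fields are illustrative only. */
typedef struct {
    int version;
    char upstream[32];
} config_t;

/* Shared pointer to the current version (NULL until the first update). */
static _Atomic(config_t *) current_config;

/* Read: one atomic load, no lock and no write to shared memory. */
const config_t *config_read(void) {
    return atomic_load_explicit(&current_config, memory_order_acquire);
}

/* Copy + update: build a private copy, modify it, then publish it with a
 * single atomic exchange. Returns the previous version. The caller must not
 * free it until every reader that might still hold it has finished (the
 * grace period, which this sketch does not implement). */
config_t *config_update(const char *new_upstream) {
    config_t *old = atomic_load_explicit(&current_config, memory_order_acquire);
    config_t *copy = malloc(sizeof *copy);
    assert(copy != NULL);
    if (old != NULL) {
        *copy = *old;                      /* copy the current version */
    } else {
        memset(copy, 0, sizeof *copy);     /* no previous version yet */
    }
    copy->version += 1;                    /* modify only the private copy */
    strncpy(copy->upstream, new_upstream, sizeof copy->upstream - 1);
    copy->upstream[sizeof copy->upstream - 1] = '\0';
    /* Publish: readers that load the pointer from now on see the new version. */
    return atomic_exchange_explicit(&current_config, copy, memory_order_acq_rel);
}
```

A reader that called config_read() before an update keeps seeing its original snapshot unchanged. The writer may free the pointer returned by config_update() only once it knows that no such reader remains.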

Why Traditional Locks Fail at Scale

Traditional reader-writer locks allow concurrent reads but still add significant overhead in high-concurrency environments. Even acquiring a shared read lock requires an atomic write to the lock's own state, which invalidates that cache line on every other CPU core. This "cache bouncing" becomes a major bottleneck as core counts increase. Writers also require exclusive access, so all readers wait while a write is in progress, which can lead to thundering herd problems, priority inversion, and convoying if the lock-holding thread is preempted. RCU sidesteps these issues by removing locks from the read path entirely.
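
To make the read-side cost concrete, here is a deliberately incomplete toy read lock in C11. The names counting_rwlock_t, counting_read_lock, and counting_read_unlock are hypothetical, and writer exclusion is omitted. Production reader-writer locks are more elaborate, but they likewise update shared lock state whenever a reader enters or leaves.

```c
#include <assert.h>
#include <stdatomic.h>

/* Toy reader-count lock, for illustration only: no writer exclusion. */
typedef struct {
    atomic_int readers;   /* one shared counter, one contended cache line */
} counting_rwlock_t;

/* Every reader performs an atomic read-modify-write on the same counter.
 * The core doing so must own the cache line exclusively, so the line
 * migrates between cores even though no reader modifies the protected data.
 * Returns the reader count after this reader has entered. */
int counting_read_lock(counting_rwlock_t *lock) {
    return atomic_fetch_add_explicit(&lock->readers, 1, memory_order_acquire) + 1;
}

/* Leaving is a second write to the same shared line.
 * Returns the reader count after this reader has left. */
int counting_read_unlock(counting_rwlock_t *lock) {
    int before = atomic_fetch_sub_explicit(&lock->readers, 1, memory_order_release);
    assert(before > 0);   /* unlock without a matching lock is a bug */
    return before - 1;
}
```

Each read therefore costs two writes to shared memory. An RCU-style read of a published pointer performs none, which is why it keeps scaling as cores are added.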

When to Apply RCU

RCU is best suited to workloads where reads far outnumber writes (e.g., 10:1 or 100:1) and where readers can tolerate briefly seeing the previous version. Examples include system configuration updates (Kubernetes, Envoy), DNS servers, and PostgreSQL's MVCC implementation, which applies a similar multi-version idea. It is inappropriate for systems that require strong consistency or immediate access to the latest data, because reads can be stale.

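#include <pthread.h>

// config_t, route_t, and lookup_route() are application-defined.
pthread_rwlock_t config_lock = PTHREAD_RWLOCK_INITIALIZER;
config_t *global_config;

// Reader with traditional lock
void handle_request(const char *request_path) {
    pthread_rwlock_rdlock(&config_lock); // Acquire read lock (writes to shared lock state)
    route_t *route = lookup_route(global_config, request_path);
    pthread_rwlock_unlock(&config_lock); // Release read lock
    // ... forward request ...
}

// Writer with traditional lock
void update_config(config_t *new_config) {
    pthread_rwlock_wrlock(&config_lock); // Acquire write lock; blocks until all readers leave
    global_config = new_config;
    pthread_rwlock_unlock(&config_lock); // Release write lock
}

// RCU reader (conceptual, Linux-kernel-style API)
void handle_request_rcu(const char *request_path) {
    rcu_read_lock(); // Mark start of RCU read-side critical section
    config_t *config = rcu_dereference(global_config); // Snapshot the current version
    route_t *route = lookup_route(config, request_path);
    // ... use route here, inside the critical section ...
    rcu_read_unlock(); // Exit RCU critical section
    // config and route *must not* be used after this point
}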
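
The snippet above shows only the read side. In the Linux kernel API the writer would publish with rcu_assign_pointer() and wait with synchronize_rcu() before freeing the old version. The sketch below imitates those two steps in portable C11 so the grace period is visible. It is a toy under stated assumptions, not how the kernel or any RCU library detects grace periods: a fixed set of reader threads each owns one slot, there is a single writer, and slot_read_lock, slot_read_unlock, toy_synchronize, and update_config_toy are hypothetical names. global_config is redeclared as an atomic pointer and config_t is a stand-in type.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_READERS 8   /* assumed fixed number of reader threads */

typedef struct { int version; } config_t;   /* stand-in for the real config */

/* One counter per reader thread, aligned so that readers never share a
 * cache line. Odd value: the reader is inside a critical section. */
typedef struct {
    _Alignas(64) atomic_ulong seq;
} reader_slot_t;

static reader_slot_t slots[MAX_READERS];
static _Atomic(config_t *) global_config;

/* Enter a read-side critical section and return the current version. */
const config_t *slot_read_lock(int slot) {
    assert(slot >= 0 && slot < MAX_READERS);
    atomic_fetch_add(&slots[slot].seq, 1);   /* seq becomes odd */
    return atomic_load(&global_config);
}

/* Leave the critical section; the snapshot must not be used afterwards. */
void slot_read_unlock(int slot) {
    atomic_fetch_add(&slots[slot].seq, 1);   /* seq becomes even */
}

/* Grace period: wait until every reader that was inside a critical section
 * when we looked at its slot has left that critical section. */
void toy_synchronize(void) {
    for (int i = 0; i < MAX_READERS; i++) {
        unsigned long seen = atomic_load(&slots[i].seq);
        if (seen % 2 == 1) {
            while (atomic_load(&slots[i].seq) == seen) {
                /* spin until this reader exits */
            }
        }
    }
}

/* Writer: publish the new version, wait out the grace period, then free
 * the old version. Returns 1 if an old version was reclaimed, else 0. */
int update_config_toy(config_t *new_config) {
    config_t *old = atomic_exchange(&global_config, new_config);
    toy_synchronize();
    int had_old = (old != NULL);
    free(old);
    return had_old;
}
```

Readers here write only to their own slot, so they do not contend with one another, but they still pay for sequentially consistent atomics on entry and exit. Real RCU implementations aim to make the read side cheaper than this, which is where most of their complexity lies.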

Architecture Design

Design this yourself

Design a high-throughput API gateway or a distributed configuration service that uses the Read-Copy-Update (RCU) pattern for managing its routing rules or configuration settings, ensuring lock-free read access for millions of requests per second while allowing infrequent, non-blocking updates. Detail how RCU helps mitigate cache coherency issues and contention in a multi-core environment, and discuss the trade-offs regarding consistency and memory management.

Other design angles

· Design a real-time analytics system's in-memory data store where RCU is used to manage frequently read, but occasionally updated, aggregation tables.
· Architect a DNS server that leverages RCU to serve millions of queries per second for zone data, accommodating infrequent updates from administrators.
· Design a feature flag management system where configuration updates are propagated and consumed by client applications using RCU principles to ensure low-latency reads.
