Dev.to #systemdesign · March 21, 2026

Strategies for Efficiently Transferring Large Datasets from Backend to Frontend

This article explores common pitfalls and effective architectural strategies for transferring large datasets (e.g., millions of rows) from a backend service to a client. It details various techniques, including pagination, server-side streaming, batching, and advanced serialization formats like Protocol Buffers and Parquet, alongside compression methods, to optimize performance, memory usage, and network efficiency. The discussion highlights crucial trade-offs and considerations across database, network, and client layers.


Sending large volumes of data from a backend to a client is a common challenge in system design. A naive approach, such as fetching all data and serializing it into a single JSON response, can lead to severe performance bottlenecks and outright failures: high server memory consumption (out-of-memory errors), long serialization times that block the event loop, significant network latency, and client-side crashes caused by the memory required to parse the response.

The Perils of Single-Blob Data Transfer

⚠️

Impact of a Single Large JSON Response

Attempting to send a million rows as a single JSON object can consume 1-2 GB of server RAM per request, block the event loop for tens of seconds, transfer 50-500 MB over the network (even compressed), and force client browsers to allocate 1-2 GB, leading to crashes or unresponsiveness. This approach simply does not scale.
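To make the contrast concrete, here is a minimal, framework-free Python sketch. The `fetch_rows` generator is a hypothetical stand-in for a database cursor; the comparison shows why materializing everything into one JSON blob is costly, while line-delimited JSON (NDJSON) keeps memory flat:

```python
import json

def fetch_rows(n):
    """Stand-in for a database cursor that yields rows lazily."""
    for i in range(n):
        yield {"id": i, "name": f"user-{i}"}

def single_blob(n):
    # Anti-pattern: materializes every row in memory, then builds one
    # giant JSON string. Peak memory and serialization time grow with n.
    return json.dumps(list(fetch_rows(n)))

def ndjson_stream(n):
    # Streaming alternative: emit one JSON object per line (NDJSON).
    # Peak memory stays roughly constant no matter how large n is,
    # and the client can start parsing before the transfer finishes.
    for row in fetch_rows(n):
        yield json.dumps(row) + "\n"
```

Most web frameworks can wrap a generator like `ndjson_stream` in a chunked HTTP response, so neither the server nor the client ever holds the full dataset at once.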

Technique 1: Pagination for Controlled Data Delivery

Pagination is the most fundamental strategy to avoid sending all data at once. It breaks down a large dataset into smaller, manageable chunks. Two primary types are:

  1. Offset Pagination: Uses `LIMIT` and `OFFSET` SQL clauses. While simple to implement, it becomes inefficient at deep pages because the database still has to scan and discard rows up to the offset, leading to degraded performance for users accessing later pages.
  2. Cursor-Based (Keyset) Pagination: Utilizes an indexed column (e.g., a primary key or timestamp) as a 'cursor' to fetch the next set of rows directly after the last seen item. This is significantly more efficient for large datasets as it avoids full table scans and leverages index lookups, making it ideal for infinite scrolling or high-performance data retrieval.
```sql
-- Cursor-based pagination: fetch the next 50 rows after the last-seen id
SELECT * FROM users WHERE id > 12345 ORDER BY id LIMIT 50;
```
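The backend side of keyset pagination can be sketched in a few lines of Python. This example uses a hypothetical in-memory `USERS` list in place of an indexed table, and returns each page together with the cursor for the next one (`None` when the data is exhausted):

```python
from typing import Optional

# Hypothetical in-memory table standing in for an indexed `users` table.
USERS = [{"id": i, "name": f"user-{i}"} for i in range(1, 196)]

def fetch_page(after_id: Optional[int], limit: int = 50):
    """Keyset pagination: return rows with id > after_id plus the next cursor.

    With a real database this is the `WHERE id > ? ORDER BY id LIMIT ?`
    query above; the cursor is simply the last id of the previous page.
    """
    start = after_id or 0
    rows = [u for u in USERS if u["id"] > start][:limit]
    # A short page means we reached the end, so there is no next cursor.
    next_cursor = rows[-1]["id"] if len(rows) == limit else None
    return rows, next_cursor

# Walk the whole table page by page, as an infinite-scroll client would.
cursor, pages = None, 0
while True:
    rows, cursor = fetch_page(cursor, limit=50)
    pages += 1
    if cursor is None:
        break
```

In a real API the cursor is usually opaque (e.g. a base64-encoded id or timestamp) so clients cannot depend on its internal format, but the lookup pattern is the same.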
Tags: data transfer · pagination · streaming · protocol buffers · compression · backend performance · frontend performance · scalability
