Software Architecture and System Design News

Latest curated articles from top engineering blogs

Netflix, Uber, Meta, LinkedIn, Spotify, GitHub, Airbnb, Pinterest, Slack, Dropbox, Cloudflare, Stripe, Datadog, Figma, Shopify, AWS, Google Cloud, Azure, Werner Vogels & 15+ more

126 articles

Medium #system-design·11d ago

Architectural Strategy for Migrating Legacy Database-Centric Systems with Event Sourcing

This article outlines an architectural strategy for migrating legacy database-centric systems using events and progressive ownership transfer. It focuses on how to incrementally modernize monolithic applications by extracting functionalities and data, leveraging event-driven patterns to decouple services and manage data consistency during the transition.

Microservices, Distributed Systems

DZone Microservices·11d ago

Optimizing Hadoop Big Data Workloads on Arm-based AmpereOne Processors

This article explores the setup, tuning, and performance evaluation of Hadoop on AmpereOne Arm-based processors, highlighting their power efficiency and cost advantages for big data workloads. It delves into the architectural benefits of AmpereOne processors, Hadoop's compatibility with Arm, and provides practical guidance for deploying and optimizing Hadoop clusters on this infrastructure. The focus is on leveraging modern hardware for scalable and cost-effective big data processing.

Cloud & Infrastructure, Databases & Storage

InfoQ Cloud·12d ago

Replacing Database Sequences at Scale: A Distributed ID Generation System

This article details Coupang's journey to replace legacy database sequences with a highly available, low-latency distributed ID generation system without breaking over 100 existing services. The solution leverages local application caching, server-side caching, and DynamoDB as the source of truth, optimizing for performance and availability over strict global ordering and gap-free IDs. It highlights practical design principles for large-scale migrations, emphasizing simplicity and backward compatibility.

Distributed Systems, Databases & Storage

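The caching idea in this summary can be illustrated with a block-allocation sketch: each application instance reserves a range of IDs from the source of truth and serves them locally, so most calls never touch the network. This is a minimal sketch under assumptions, not Coupang's implementation. `InMemoryCounterStore`, `CachedIdGenerator`, `reserve_block`, and the block size are hypothetical names, and the in-memory store stands in for DynamoDB.

```python
import threading


class InMemoryCounterStore:
    """Hypothetical stand-in for the durable source of truth (DynamoDB in the article)."""

    def __init__(self) -> None:
        self._next = {}
        self._lock = threading.Lock()

    def reserve_block(self, sequence_name: str, block_size: int) -> int:
        """Atomically reserve block_size IDs and return the first one."""
        with self._lock:
            start = self._next.get(sequence_name, 1)
            self._next[sequence_name] = start + block_size
            return start


class CachedIdGenerator:
    """Serves IDs from a locally cached block; one store call per block."""

    def __init__(self, store, sequence_name: str, block_size: int = 100) -> None:
        self._store = store
        self._name = sequence_name
        self._block_size = block_size
        self._next_id = 0
        self._block_end = 0  # exclusive upper bound of the cached block
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            if self._next_id >= self._block_end:
                # Local block exhausted: reserve a fresh range from the store.
                start = self._store.reserve_block(self._name, self._block_size)
                self._next_id = start
                self._block_end = start + self._block_size
            value = self._next_id
            self._next_id += 1
            return value
```

Two generators sharing one store never hand out the same ID, but their IDs interleave out of order, and any IDs left in a block when an instance restarts are lost. That is the trade the summary describes: uniqueness and low latency in exchange for strict global ordering and gap-free sequences.
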
InfoQ Architecture·12d ago

Replacing Database Sequences at Scale: A Cached, Distributed ID Generation System

This article details Coupang's journey to replace traditional database sequences with a highly scalable, available, and low-latency distributed ID generation system. It highlights critical design decisions, such as prioritizing eventual consistency and local caching over strict global ordering and network calls, to support over 100 services and facilitate a seamless migration from relational databases to NoSQL.

Distributed Systems, Databases & Storage

Dev.to #systemdesign·12d ago

Scaling Challenges with Misused Vector Databases

This article highlights a common architectural pitfall: a system that broke during scaling not because of a performance bottleneck, but because the wrong database had been chosen. The author had used a vector database for both similarity search and general data storage, which led to poor performance and scalability issues. The fix was a hybrid architecture that uses a vector database for what it does well (semantic search) and a traditional database for what it does well (exact-match queries and structured data storage).

Databases & Storage, Distributed Systems

The New Stack·12d ago

Hybrid Search Architectures for Production RAG Pipelines

This article discusses an architectural problem in RAG (Retrieval-Augmented Generation) pipelines where pure vector similarity falls short in production environments, leading to issues like stale information, security leaks, and incorrect answers. It proposes "hybrid search," which combines vector similarity with structured SQL predicates within a single database query, as a solution. The article highlights how this approach improves retrieval accuracy, enhances security through relational joins, and simplifies operational complexity compared to a "vector sidecar" anti-pattern.

Databases & Storage, AI & ML Infrastructure

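The hybrid search idea above (structured predicates applied together with vector ranking) can be sketched in plain Python. This is an in-memory illustration only: the article describes doing this inside a single database query, and the `Doc` fields, the `tenant` and `is_current` predicates, and the function names here are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    tenant: str          # hypothetical access-control attribute
    is_current: bool     # hypothetical freshness flag
    embedding: list


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(docs, query_embedding, tenant, top_k=3):
    """Apply the structured predicates first, then rank survivors by similarity."""
    allowed = [d for d in docs if d.tenant == tenant and d.is_current]
    ranked = sorted(
        allowed,
        key=lambda d: cosine(d.embedding, query_embedding),
        reverse=True,
    )
    return [d.doc_id for d in ranked[:top_k]]
```

Because the predicates run before ranking, a stale document or one that belongs to another tenant can never appear in the results, however similar its embedding is to the query.
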
ByteByteGo·13d ago

Database Performance Optimization Trade-offs

This article explores various strategies for optimizing database performance, emphasizing the inherent trade-offs associated with each. It highlights that while optimizations like indexing and caching improve specific aspects such as read speed, they can negatively impact others like write performance or data consistency. The core message is to understand the costs and benefits of each strategy to make informed architectural decisions based on application requirements.

Databases & Storage, Performance & Scaling

Dropbox Tech·13d ago

Optimizing Storage Efficiency in Immutable Blob Stores: Dropbox's Magic Pocket Compaction Strategies

This article details Dropbox's approach to improving storage efficiency in Magic Pocket, their exabyte-scale immutable blob store. It highlights the challenges of fragmentation in immutable systems and introduces a multi-strategy compaction system (L1, L2, L3) designed to efficiently reclaim fragmented space and reduce storage overhead, especially in scenarios with severely under-filled volumes.

Databases & Storage, Distributed Systems

Dev.to #architecture·13d ago

Domain-Driven Design with Functional Deciders for Flexible Persistence

This article introduces a functional approach to Domain-Driven Design (DDD) using the Decider pattern in TypeScript, aiming to decouple domain logic from persistence concerns. It proposes a framework called noDDDe that leverages pure functions to manage domain state and events, enabling the same business logic to work seamlessly with different storage mechanisms like SQL databases and Event Stores. This approach tackles common challenges in DDD implementations, offering a more pragmatic alternative to full Event Sourcing or heavy OOP boilerplate.

Microservices, Databases & Storage

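The Decider pattern mentioned in this summary is usually described as an initial state plus two pure functions: `decide(command, state)` returns events, and `evolve(state, event)` returns the next state. The sketch below shows the general shape and how the same functions can sit behind either a state store or an event store. It is not the noDDDe API; the article uses TypeScript, and the bank-account domain and all names here are hypothetical.

```python
from dataclasses import dataclass, replace
from functools import reduce


@dataclass(frozen=True)
class Account:
    balance: int = 0


@dataclass(frozen=True)
class Deposit:        # command
    amount: int


@dataclass(frozen=True)
class Withdraw:       # command
    amount: int


@dataclass(frozen=True)
class Deposited:      # event
    amount: int


@dataclass(frozen=True)
class Withdrawn:      # event
    amount: int


INITIAL_STATE = Account()


def decide(command, state):
    """Pure: validate a command against the current state and return events."""
    if isinstance(command, Deposit):
        if command.amount <= 0:
            raise ValueError("deposit must be positive")
        return [Deposited(command.amount)]
    if isinstance(command, Withdraw):
        if command.amount <= 0 or command.amount > state.balance:
            raise ValueError("invalid withdrawal")
        return [Withdrawn(command.amount)]
    raise TypeError(f"unknown command: {command!r}")


def evolve(state, event):
    """Pure: apply one event to the state and return the new state."""
    if isinstance(event, Deposited):
        return replace(state, balance=state.balance + event.amount)
    if isinstance(event, Withdrawn):
        return replace(state, balance=state.balance - event.amount)
    raise TypeError(f"unknown event: {event!r}")


def handle_with_state_store(command, stored_state):
    """State-stored variant: persist only the resulting state (e.g. a SQL row)."""
    return reduce(evolve, decide(command, stored_state), stored_state)


def handle_with_event_store(command, history):
    """Event-sourced variant: rebuild state from history, append new events."""
    state = reduce(evolve, history, INITIAL_STATE)
    return history + decide(command, state)
```

Both handlers call the same `decide` and `evolve`, which is the decoupling the summary points to: the business rules do not know whether a row or an event stream sits underneath them.
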
MongoDB Blog·13d ago

Enhancing MongoDB Development with AI Agent Skills and Plugins

This article introduces MongoDB Agent Skills and plugins, which provide structured guidance and best practices for AI coding agents to generate more reliable and architecturally sound code for MongoDB applications. It addresses common pitfalls in agent-generated code by embedding expert knowledge directly into development workflows, improving schema design, indexing strategies, and overall performance at scale. The integration with the MongoDB MCP Server also ensures secure and controlled agent access, mitigating architectural risks.

Databases & Storage, Tools & Frameworks

DZone Microservices·13d ago

DocumentDB High Availability on Kubernetes with an Operator

This article explores deploying DocumentDB, a MongoDB-compatible database built on PostgreSQL, with high availability on Kubernetes using a dedicated operator. It details the architecture for local HA, leveraging CloudNativePG for WAL replication and failover, and demonstrates automatic primary failover to ensure minimal downtime for applications.

Databases & Storage, Distributed Systems

ByteByteGo·14d ago

Datadog's Data Replication Platform: From OLTP to Real-time Search

This article details how Datadog tackled performance bottlenecks in their shared PostgreSQL database by implementing a robust, asynchronous data replication platform. They shifted from using Postgres for real-time search workloads to a dedicated search platform, leveraging Change Data Capture (CDC) with Debezium, Kafka, and a Schema Registry. The solution highlights critical distributed system trade-offs, particularly between consistency and availability, and the importance of automation for managing complex data pipelines at scale.

Distributed Systems, Databases & Storage