Latest curated articles from top engineering blogs
42 articles
This article dissects the architecture of X's (formerly Twitter) 'For You' feed recommendation system, highlighting how it leverages a Grok-based transformer model to personalize content. It details the system's four core components: Home Mixer for orchestration, Thunder for real-time in-network post storage, Phoenix for ML-driven retrieval and ranking of out-of-network content, and the Candidate Pipeline framework for modularity. The piece emphasizes architectural choices that enable scalability, real-time performance, and a nuanced understanding of user engagement.
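The modular "Candidate Pipeline" shape the summary describes can be sketched as a toy in Python; the sources, filter, and ranker below are purely illustrative, not X's actual components:

```python
def run_pipeline(sources, filters, rank):
    """Toy candidate pipeline: fetch candidates from each source,
    apply filters in order, then rank the survivors -- the modular
    shape the summary attributes to the Candidate Pipeline framework."""
    candidates = [c for source in sources for c in source()]
    for keep in filters:
        candidates = [c for c in candidates if keep(c)]
    return rank(candidates)

# Hypothetical stand-ins for in-network (Thunder) and
# out-of-network (Phoenix) candidate sources.
in_network = lambda: [{"id": 1, "score": 0.4, "seen": False},
                      {"id": 2, "score": 0.9, "seen": True}]
out_of_network = lambda: [{"id": 3, "score": 0.7, "seen": False}]
unseen_only = lambda c: not c["seen"]
by_score = lambda cs: sorted(cs, key=lambda c: c["score"], reverse=True)

feed = run_pipeline([in_network, out_of_network], [unseen_only], by_score)
```

Because each stage is just a callable, sources and filters can be added or swapped without touching the orchestration logic.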
This article discusses Vinext, Cloudflare's re-implementation of the Next.js API surface directly on Vite, aimed at easing deployment to serverless platforms like Cloudflare Workers. It highlights the architectural challenges of running traditional Next.js deployments in serverless environments and proposes a new approach that leverages Vite's ecosystem and AI-assisted development for rapid iteration and optimized performance.
This article explores Dependency Injection (DI) as a crucial technique for building scalable and maintainable large-scale applications, directly addressing the Dependency Inversion Principle (DIP). It highlights how proper DI goes beyond object creation, significantly impacting performance through careful lifecycle management and reducing memory pressure from excessive object instantiations. Understanding DI is key for designing modular and testable software architectures.
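The core idea can be shown in a few lines of Python: the service depends on an abstraction (DIP), the concrete store is injected, and a toy container with singleton lifetime avoids repeated instantiation. All class names here are hypothetical:

```python
from abc import ABC, abstractmethod

# High-level policy depends on this abstraction, not a concrete store (DIP).
class UserStore(ABC):
    @abstractmethod
    def get(self, user_id: int) -> str: ...

class InMemoryUserStore(UserStore):
    """Concrete detail; swappable in tests for a fake or DB-backed store."""
    def __init__(self):
        self._data = {1: "ada"}
    def get(self, user_id: int) -> str:
        return self._data[user_id]

class UserService:
    def __init__(self, store: UserStore):  # dependency injected, not built here
        self._store = store
    def username(self, user_id: int) -> str:
        return self._store.get(user_id)

class Container:
    """Toy container with singleton lifetime: each registration is built
    once and shared, illustrating the lifecycle management and reduced
    memory pressure the article emphasizes."""
    def __init__(self):
        self._instances = {}
    def register(self, key, factory):
        self._instances[key] = factory()
    def resolve(self, key):
        return self._instances[key]

container = Container()
container.register(UserStore, InMemoryUserStore)
service = UserService(container.resolve(UserStore))
```

Real containers (e.g. in .NET or Spring) add scoped and transient lifetimes on top of this same pattern.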
This article from Dropbox Tech explores low-bit inference techniques, specifically quantization, as a critical strategy for making large AI models more efficient, faster, and cheaper to run in production. It delves into how reducing numerical precision impacts memory, compute, and energy, and the architectural considerations for deploying these optimized models on modern hardware like GPUs, addressing latency and throughput constraints for real-world AI applications such as Dropbox Dash.
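As a minimal sketch of what quantization means (not Dropbox's implementation), symmetric per-tensor int8 quantization maps floats onto [-127, 127] with a single scale factor, trading a bounded rounding error for 4x smaller weights than fp32:

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: scale so the largest
    magnitude maps to 127, then round each weight to an integer."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats; error is at most half a step (scale/2)."""
    return [v * scale for v in q]

w = [0.82, -1.27, 0.03, 0.51]      # toy weight tensor
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
```

Production schemes add per-channel scales, zero points for asymmetric ranges, and calibration, but the memory/compute trade-off is the same.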
This article delves into the foundational mathematical concepts underpinning Large Language Models (LLMs), focusing on how they learn and generate text. It explains loss functions, gradient descent, and next-token prediction, providing insights into the inherent capabilities and limitations that architects should consider when designing and deploying LLM-powered applications.
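Those three concepts fit in a few lines: next-token prediction scores a vocabulary with logits, cross-entropy loss is the negative log-probability of the true token, and gradient descent nudges the logits to reduce it. This toy 3-token vocabulary is illustrative only:

```python
import math

def softmax(logits):
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Next-token prediction loss: -log p(correct token)."""
    return -math.log(softmax(logits)[target])

def sgd_step(logits, target, lr=0.5):
    """One gradient-descent step directly on the logits, using
    d(loss)/d(logit_i) = p_i - 1[i == target]."""
    p = softmax(logits)
    return [z - lr * (p[i] - (1.0 if i == target else 0.0))
            for i, z in enumerate(logits)]

logits = [1.0, 0.0, -1.0]               # hypothetical scores over 3 tokens
before = cross_entropy(logits, target=2)
after = cross_entropy(sgd_step(logits, target=2), target=2)
```

In a real LLM the same loss and update are applied through billions of parameters rather than to the logits directly, but the mechanics are identical.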
Pinterest engineered a significant upgrade to its ads lightweight ranking system by migrating its two-tower models to GPU serving. This shift enabled the adoption of a more complex MMoE-DCN architecture, improving prediction accuracy and efficiency. The article details the architectural evolution, the optimizations required for GPU training, and the performance gains observed in both offline and online metrics.
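The two-tower idea the article builds on is simple to sketch: one tower embeds the user, another embeds each ad, and relevance is their dot product, so ad embeddings can be precomputed and the user side batched efficiently. Names and vectors below are hypothetical:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rank_candidates(user_emb, ad_embs):
    """Two-tower scoring: relevance = dot(user tower, ad tower).
    Because the towers are independent, ad embeddings can be cached
    and scoring reduces to a batched matrix-vector product on GPU."""
    return sorted(ad_embs, key=lambda item: dot(user_emb, item[1]),
                  reverse=True)

user = [0.2, 0.9, -0.1]                   # toy user-tower output
ads = [("ad_a", [0.1, 0.8, 0.0]),
       ("ad_b", [0.9, -0.2, 0.3]),
       ("ad_c", [0.0, 1.0, -0.5])]
ranked = [name for name, _ in rank_candidates(user, ads)]
```

The MMoE-DCN upgrade replaces this single dot product with expert networks and feature-crossing layers, which is what made GPU serving worthwhile.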
Meta's RCCLX project, an open-source enhancement of RCCL, focuses on optimizing GPU communication for AI models on AMD platforms. It introduces features like Direct Data Access (DDA) and Low Precision Collectives to reduce latency and increase throughput, addressing critical bottlenecks in large language model inference and training. The article details architectural innovations for efficient inter-GPU communication.
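To make the bottleneck concrete, here is a toy all-reduce (the collective NCCL/RCCL-style libraries implement across GPUs) plus a back-of-envelope bandwidth formula showing why low-precision collectives help; the numbers are illustrative, not RCCLX measurements:

```python
def allreduce_sum(shards):
    """Toy all-reduce: after the collective, every rank holds the
    elementwise sum of all ranks' buffers (the semantics of AllReduce,
    minus the actual inter-GPU transport)."""
    total = [sum(vals) for vals in zip(*shards)]
    return [list(total) for _ in shards]

def ring_allreduce_bytes(num_elems, bytes_per_elem, ranks):
    """Ring all-reduce moves about 2*(n-1)/n of the buffer per rank,
    so halving element width (fp32 -> bf16) halves bytes on the wire --
    the motivation for low-precision collectives."""
    return int(2 * (ranks - 1) / ranks * num_elems * bytes_per_elem)

grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # per-"GPU" gradient shards
reduced = allreduce_sum(grads)
fp32_bytes = ring_allreduce_bytes(1_000_000, 4, ranks=8)
bf16_bytes = ring_allreduce_bytes(1_000_000, 2, ranks=8)
```

Features like DDA attack the latency side of the same cost, while low precision attacks the bandwidth side.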
This article discusses the architectural considerations for choosing between ORMs like EF Core and micro-ORMs like Dapper in modern .NET applications. It argues that the performance gap has narrowed, making developer velocity and architectural clarity more critical factors than raw ORM speed in most production systems. The author highlights that network latency and I/O often dominate request costs, diminishing the impact of micro-optimizations within the ORM layer.
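The latency-dominance argument is easy to check with back-of-envelope numbers; all figures below are assumptions chosen for illustration, not benchmarks from the article:

```python
# Assumed per-request costs (milliseconds). If the database round trip
# costs milliseconds, shaving fractions of a millisecond off ORM
# materialization barely moves total request latency.
network_io_ms = 3.0            # assumed DB round trip + I/O
full_orm_overhead_ms = 0.20    # assumed EF Core-style materialization cost
micro_orm_overhead_ms = 0.05   # assumed Dapper-style cost

full = network_io_ms + full_orm_overhead_ms
micro = network_io_ms + micro_orm_overhead_ms
savings_pct = 100 * (full - micro) / full   # end-to-end improvement
```

Under these assumptions the micro-ORM saves under 5% of request time, which is the article's point: developer velocity and architectural clarity usually matter more.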
This article discusses the limitations of Kubernetes Horizontal Pod Autoscaler (HPA) for dynamic, latency-sensitive edge workloads and proposes a custom autoscaler (CPA) solution. It highlights how HPA's reactive nature and rigid algorithm lead to inefficiencies at the edge, advocating for a more proactive, multi-signal approach incorporating CPU headroom, latency SLOs, and pod startup compensation to ensure stable performance and efficient resource utilization in constrained edge environments.
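A minimal sketch of the multi-signal idea, assuming HPA's standard `ceil(current * metric / target)` rule as the baseline; the function and its inputs are hypothetical, not the article's actual CPA:

```python
import math

def desired_replicas(current, cpu_util, cpu_target, p99_ms, slo_ms,
                     pending_startups=0):
    """Hypothetical multi-signal scaler: take the more aggressive of the
    CPU-based and latency-SLO-based recommendations, then subtract pods
    that are already starting so slow cold starts don't cause a second,
    redundant scale-up (startup compensation)."""
    by_cpu = math.ceil(current * cpu_util / cpu_target)   # HPA-style rule
    by_latency = math.ceil(current * p99_ms / slo_ms)     # latency SLO signal
    want = max(by_cpu, by_latency)
    return max(1, want - pending_startups)

# CPU looks fine (55% vs 60% target) but p99 latency is over SLO,
# so the latency signal drives the scale-up -- exactly the case a
# CPU-only HPA would miss.
n = desired_replicas(current=4, cpu_util=0.55, cpu_target=0.60,
                     p99_ms=240, slo_ms=200)
```

A production version would also smooth the signals and rate-limit scale-downs, but the shape of the decision is the same.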
This article discusses the importance of monitoring mobile application startup performance using Real User Monitoring (RUM) tools. It highlights key metrics and context for identifying performance bottlenecks in iOS and Android app launches, which is crucial for maintaining a good user experience and ensuring the system's overall responsiveness.
Stripe's acquisition of Metronome aims to enhance its billing platform, particularly for complex usage-based models, by integrating Metronome's capabilities. This move highlights the architectural challenges in designing flexible monetization infrastructure that can support diverse business models, from simple subscriptions to multi-dimensional metering and sales-led contracts at global scale. The integration focuses on creating a unified platform for payments, analytics, revenue recognition, and tax, emphasizing system consolidation and extensibility.
This article details the architectural decisions and implementation strategies behind a high-scale IP geolocation service. It focuses on leveraging Redis's partitioned sorted sets and pipelining capabilities to achieve sub-millisecond enrichment for millions of events, addressing challenges like data freshness, query performance, and operational efficiency.
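The sorted-set trick can be sketched without a Redis server: store each IP range's start as the score, then look up the highest start at or below the query IP and check the range end. The stand-in below uses `bisect` to mirror a `ZREVRANGEBYSCORE key <ip> -inf LIMIT 0 1` lookup; the range table is hypothetical:

```python
import bisect
import ipaddress

# Hypothetical range table: (range_start, range_end, country), sorted by
# start. In Redis this would be a sorted set keyed by range_start as the
# score; in production, many lookups would be batched in one pipeline.
RANGES = [
    (int(ipaddress.ip_address("1.0.0.0")),
     int(ipaddress.ip_address("1.0.0.255")), "AU"),
    (int(ipaddress.ip_address("8.8.8.0")),
     int(ipaddress.ip_address("8.8.8.255")), "US"),
]
STARTS = [r[0] for r in RANGES]

def lookup(ip: str):
    """Find the country for an IP, or None if no range covers it."""
    n = int(ipaddress.ip_address(ip))
    i = bisect.bisect_right(STARTS, n) - 1   # highest range start <= ip
    if i >= 0 and n <= RANGES[i][1]:
        return RANGES[i][2]
    return None
```

Partitioning the sorted set (e.g. by the first octet) and pipelining the per-event lookups is what pushes this pattern to millions of sub-millisecond enrichments.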