Menu

Software Architecture and System Design News

Latest curated articles from top engineering blogs

NetflixUberMetaLinkedInSpotifyGitHubAirbnbPinterestSlackDropboxCloudflareStripeDatadogFigmaShopifyAWSGoogle CloudAzureWerner Vogels& 15+ more

242 articles

Dev.to #architecture·8d ago

Designing RAG Pipelines: Ingestion and Query Shifts

This article provides a detailed breakdown of the two distinct operational shifts in a Retrieval Augmented Generation (RAG) pipeline: ingestion (offline) and query time (live). It emphasizes the architectural decisions and potential failure points within each shift, focusing on critical steps like document parsing, chunking, embedding, and retrieval to ensure accurate and contextually relevant AI responses. Understanding these shifts is crucial for building robust and debuggable RAG systems.

AI & ML InfrastructureDistributed Systems
82755144
The New Stack·8d ago

Vultr's AI-Powered Infrastructure Automation for Developer Portals

Vultr leverages Nvidia GPUs and AI agents to offer a cost-effective infrastructure automation platform, aiming to simplify infrastructure provisioning for developers through internal developer portals (IDPs). This approach shifts the platform engineering role from manual scripting to high-level architectural design, abstracting complex infrastructure details away from application developers. The system uses 'skill files' trained on organizational policies to automate deployments via API-driven AI agents.

Cloud & InfrastructureDevOps & SRE
58236845
Dev.to #systemdesign·8d ago

Architectural Deep Dive into Claude Code's LLM Agent Loop

This article dissects the core `while(true)` loop powering Claude Code's AI coding agent, revealing its state machine architecture for managing complex interactions with large language models and tools. It highlights critical design decisions like avoiding recursion for stack overflow prevention and implementing streaming tool execution for significant performance gains, showcasing a robust approach to building interactive AI agents.

AI & ML InfrastructureDistributed Systems
44827870
The New Stack·9d ago

Hybrid Search Architectures for Production RAG Pipelines

This article discusses an architectural problem in RAG (Retrieval-Augmented Generation) pipelines where pure vector similarity falls short in production environments, leading to issues like stale information, security leaks, and incorrect answers. It proposes "hybrid search," which combines vector similarity with structured SQL predicates within a single database query, as a solution. The article highlights how this approach improves retrieval accuracy, enhances security through relational joins, and simplifies operational complexity compared to a "vector sidecar" anti-pattern.

Databases & StorageAI & ML Infrastructure
30724085
Meta Engineering·9d ago

KernelEvolve: Optimizing AI Infrastructure through Autonomous Kernel Generation

This article introduces KernelEvolve, Meta's agentic kernel authoring system that autonomously generates and optimizes low-level hardware kernels for diverse AI models and heterogeneous hardware. It addresses the scalability bottleneck of manual kernel tuning by leveraging AI agents, search algorithms, and a feedback loop to significantly improve inference and training throughput.

AI & ML InfrastructurePerformance & Scaling
30719263
Martin Fowler·9d ago

Harness Engineering for Effective AI Agent Development

This article introduces the concept of Harness Engineering, a mental model for effectively guiding and utilizing coding agents. It explores the architectural implications of integrating AI agents into software development workflows, focusing on how to structure interactions and provide the necessary context and feedback loops for agents to perform complex tasks reliably. Understanding harness engineering is crucial for designing robust systems that leverage AI for code generation and development.

AI & ML InfrastructureDevOps & SRE
24515486
Dev.to #architecture·9d ago

Claude Code's Multi-Agent Architecture for LLM Orchestration

This article dissects the hidden multi-agent architecture of Anthropic's Claude Code, revealing how LLMs are orchestrated to perform complex tasks. It highlights the use of a recursive `AgentTool` for spawning sub-agents, explicit model selection for cost-quality tradeoffs, and a surprisingly simple filesystem-based mailbox for inter-agent communication. The architecture prioritizes simplicity and debuggability for local multi-agent systems.

AI & ML InfrastructureDistributed Systems
24816717
The New Stack·9d ago

Security Posture and Supply Chain Risks in AI System Development

This article highlights critical security lapses at Anthropic, including a leaked AI model and exposed source code due to a misconfigured npm package source map. It emphasizes the importance of a holistic security approach that extends beyond just model behavior to encompass release pipelines, infrastructure, and governance to prevent supply chain attacks and intellectual property exposure.

SecurityDevOps & SRE
25115952
InfoQ Architecture·9d ago

Automated AI-Powered Accessibility Feedback Workflow at GitHub

GitHub implemented an automated, AI-powered workflow to centralize and manage accessibility feedback across product teams. This system, built with GitHub Actions, Copilot, and Models APIs, automates the intake, classification, and initial triage of accessibility issues, significantly improving resolution times and efficiency. It showcases a practical application of AI in operational workflows for large-scale engineering organizations.

AI & ML InfrastructureDevOps & SRE
20111992
Martin Fowler·9d ago

Managing Software Debt and AI in System Development

This article discusses various forms of 'debt' in software systems—technical, cognitive, and intent debt—and introduces a 'Tri-System theory of cognition' involving humans and AI. It highlights how AI's increasing role in coding shifts the focus from writing code to verification, emphasizing the need for robust testing and a re-organization around validation to ensure system correctness and quality.

Distributed SystemsDevOps & SRE
18513099
Dev.to #architecture·9d ago

Architecting a Retrieval-Augmented Generation (RAG) Chatbot for Resume Screening

This article details the architecture and development of AskRich, a retrieval-backed chatbot designed to enhance technical screening by providing citation-backed answers from a candidate's portfolio. It explores the system's design, including a Cloudflare Worker at the edge, a LangGraph orchestrator, and a crucial feedback loop for continuous improvement of answer quality and retrieval effectiveness. The discussion also covers the implementation of a resilient rate limiting mechanism.

AI & ML InfrastructureDistributed Systems
18911360
Azure Architecture Blog·9d ago

Designing Sovereign AI at the Edge with Azure Local and Modular Datacenters

This article details Microsoft's collaboration with Armada to deliver sovereign AI capabilities at the edge using Azure Local on Galleon modular datacenters. It addresses the critical need for secure, compliant, and resilient cloud services in disconnected or highly regulated environments, enabling mission-critical AI workloads to run closer to data origin. The solution emphasizes data sovereignty, low-latency processing, and operational control in challenging operational settings.

Cloud & InfrastructureDistributed Systems
18211621