Software Architecture and System Design News

Latest curated articles from top engineering blogs

Netflix · Uber · Meta · LinkedIn · Spotify · GitHub · Airbnb · Pinterest · Slack · Dropbox · Cloudflare · Stripe · Datadog · Figma · Shopify · AWS · Google Cloud · Azure · Werner Vogels · & 15+ more

9 articles

Meta Engineering·3h ago

Large-Scale Data Ingestion System Migration at Meta

This article details Meta's strategy and solutions for migrating its petabyte-scale data ingestion system, which powers social graph analytics and ML. It highlights the architectural shift from customer-owned pipelines to a self-managed data warehouse service, emphasizing the rigorous migration lifecycle, data quality validation, and robust rollback mechanisms needed to keep such a massive transition reliable.

Databases & Storage · Distributed Systems
Dev.to #systemdesign·3h ago

Lessons in System Reliability from a Stock Market Crash

This article draws parallels between the behavior of the Nigerian stock market during a crash and recovery, and fundamental concepts in distributed system design. It highlights single points of failure, cascading failures, eventual consistency, redundancy, and decentralized architecture using real-world market dynamics as examples. The author emphasizes that understanding system behavior is crucial for designing resilient software.

Distributed Systems · Case Studies & Postmortems
Dev.to #systemdesign·1mo ago

Scaling WebSockets: From 100k to 1M Users and Tackling Backpressure

This article provides a post-mortem on the challenges faced when scaling a WebSocket-based live commentary platform from 100,000 to 1 million concurrent users. It details how an initially simple fan-out architecture led to out-of-memory (OOM) kills because slow consumers caused unbounded buffering, and outlines the architectural changes implemented to achieve resilience and scalability, including ruthless message dropping and coalescing.
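As a rough illustration of the dropping and coalescing idea in this summary, here is a minimal sketch of a bounded per-client send buffer. It is not taken from the article: the `ClientOutbox` class, its `offer`/`drain` methods, the `max_pending` limit, and the message keys are all hypothetical names chosen for this example, and a real server would pair something like this with its own WebSocket send loop.

```python
from collections import OrderedDict
from typing import List


class ClientOutbox:
    """Bounded per-client buffer (hypothetical sketch, not the article's code)."""

    def __init__(self, max_pending: int = 3) -> None:
        self.max_pending = max_pending
        # Maps a message key (e.g. "score") to its latest payload,
        # in the order the keys were first queued.
        self.pending: "OrderedDict[str, str]" = OrderedDict()
        self.dropped = 0

    def offer(self, key: str, payload: str) -> None:
        if key in self.pending:
            # Coalesce: overwrite the stale payload, keeping its queue position.
            self.pending[key] = payload
            return
        if len(self.pending) >= self.max_pending:
            # Buffer full: drop the oldest pending message instead of growing.
            self.pending.popitem(last=False)
            self.dropped += 1
        self.pending[key] = payload

    def drain(self) -> List[str]:
        # Called when the client's socket is writable again.
        batch = list(self.pending.values())
        self.pending.clear()
        return batch


outbox = ClientOutbox(max_pending=2)
outbox.offer("score", "1-0")
outbox.offer("score", "2-0")       # coalesced with the earlier score update
outbox.offer("clock", "45:00")
outbox.offer("comment", "Goal!")   # buffer full, so the oldest entry is dropped
print(outbox.drain(), outbox.dropped)
```

Because the buffer never holds more than `max_pending` entries per client, a slow consumer costs a fixed amount of memory rather than an ever-growing queue, at the price of losing some intermediate updates.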

Distributed Systems · Performance & Scaling
Azure Architecture Blog·1mo ago

Migrating Legacy Oracle Databases to PostgreSQL on Azure: A Case Study in Enterprise Transformation

This article details Apollo Hospitals' successful migration from a legacy Oracle database system to Azure Database for PostgreSQL. It highlights the architectural and operational benefits of moving to a cloud-native, open-source database platform, including significant improvements in performance, scalability, and cost efficiency. The piece also introduces AI-assisted tooling designed to streamline complex Oracle-to-PostgreSQL migrations.

Databases & Storage · Cloud & Infrastructure
InfoQ Architecture·1mo ago

Netflix's Evolving Commerce Architecture: From DVDs to Global Streaming

This article details the pragmatic evolution of Netflix's billing and payment systems, showcasing how architectural assumptions shifted from a simple, US-centric DVD rental model to a complex global streaming platform. It highlights key challenges and architectural decisions made to adapt to asynchronous payments, international regulatory differences, and fluctuating demand patterns.

Distributed Systems · Performance & Scaling
ByteByteGo·2mo ago

Evolving Stripe's Payments API: A Decade of Design Challenges and Abstractions

This article explores the 10-year evolution of Stripe's Payments API, detailing the architectural challenges faced in unifying diverse payment methods globally. It highlights the progression from simple synchronous credit card processing to more complex asynchronous methods like ACH and Bitcoin, culminating in the design of the flexible PaymentIntents and PaymentMethods abstractions. The narrative provides valuable insights into API design, state management, and handling distributed transaction complexities in a rapidly expanding fintech platform.

API Design · Distributed Systems
Dev.to #architecture·2mo ago

Architectural Red Flags in Deceptive Financial Platforms: A Forensic Analysis

This article conducts a forensic architectural analysis of BTDUex, a fake cryptocurrency exchange, highlighting critical system design red flags that expose its fraudulent nature. It deconstructs the backend architecture, examining state management, wallet topology, and withdrawal logic, to demonstrate how a seemingly legitimate frontend can mask a deceptive, ingress-only system designed for asset extraction rather than secure financial operations. The analysis offers valuable insights for builders on identifying scam architectures through deep inspection of data flow and backend integrity.

Security · Distributed Systems
Dev.to #architecture·2mo ago

Migrating a GenAI Game Engine from Web to Native Mobile for Performance and UGC

This article details the architectural migration of Pixelsurf, a web-based generative AI game engine, to a native mobile application, Plutusgg. The shift was driven by performance bottlenecks on mobile browsers, clunky user-generated content (UGC) distribution, and poor 'cold start' times. The re-architecture leveraged native asset caching and a unified database schema to improve latency, UGC management, and overall user experience.

Performance & Scaling · Distributed Systems
High Scalability·2mo ago

Hotstar's Real-time Emoji and Voting System Architecture

This article details Hotstar's journey in building an in-house, scalable system for real-time emoji reactions and live voting, moving away from a third-party service. It highlights architectural decisions around asynchronous processing, message queuing with Kafka, and stream processing with Spark to handle billions of user interactions during live events.

Distributed Systems · Performance & Scaling