
Rate limiting at the edge vs application layer: defense in depth

Chiara Taylor
·105 views
We're currently scaling our public API to around 10,000 requests per second, and rate limiting is a big part of that. We're trying to decide where to enforce it: at the edge via a CDN or API gateway, or deeper in the application layer. Right now we do a bit of both, and I'm wondering what the best defense-in-depth approach looks like.

We're using a token bucket algorithm, but keeping it consistent across multiple API gateway instances is proving tricky. If a user hits instance A and then instance B, their rate limit isn't accurately tracked unless the counters live in a distributed store like Redis.

Is it generally better to push rate limiting as far to the edge as possible, so we shed load before it ever hits our services? Or is there value in application-level rate limiting for specific, more expensive endpoints, even if the edge handles the majority? I want to keep malicious traffic from even touching our application servers, while still providing fair usage for legitimate clients.
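For context on the algorithm I mean, here's a minimal single-process token bucket sketch in Python. The names (`TokenBucket`, `allow`) are my own, not from any particular library, and this is exactly the version that breaks across multiple gateway instances: the token count lives in process memory, so a distributed setup would move this state and the refill arithmetic into a shared store like Redis, typically behind an atomic script.

```python
import time


class TokenBucket:
    """Single-process token bucket (sketch only).

    In a multi-instance deployment the `tokens` / `last` state would
    need to live in a shared store (e.g. Redis) and be updated
    atomically, otherwise each gateway tracks its own partial view.
    """

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity        # max burst size, in tokens
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(capacity)   # start full
        self.clock = clock              # injectable for testing
        self.last = clock()             # time of last refill

    def allow(self, cost=1.0):
        """Return True if the request may proceed, consuming `cost` tokens."""
        now = self.clock()
        # Lazy refill: add tokens for the elapsed time, capped at capacity.
        self.tokens = min(
            self.capacity,
            self.tokens + (now - self.last) * self.refill_rate,
        )
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage with a fake clock, so the behavior is deterministic:
fake = [0.0]
bucket = TokenBucket(capacity=5, refill_rate=1.0, clock=lambda: fake[0])
burst = [bucket.allow() for _ in range(6)]  # 5 allowed, 6th denied
fake[0] = 2.0                               # 2 seconds pass -> 2 tokens refill
```

The lazy-refill trick (computing tokens from elapsed time instead of running a background refill loop) is what makes the per-key state small enough to keep in a fast store, which matters once you're sharing it across gateway instances.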
13 comments
