Feature Flags
Decouple deployment from release: flag types (release, experiment, ops, permission), flag lifecycle management, and technical debt considerations.
The Core Idea: Deployment Is Not Release
Feature flags (also called feature toggles or feature switches) decouple deployment from release. A deployment is when code ships to production. A release is when users can see and use the feature. With feature flags, you can deploy code to 100% of servers but activate it for 0% of users — or activate it for internal employees only, or activate it for 10% of users in a specific region.
This is a fundamental shift in how teams ship software. Instead of coordinating big-bang releases, teams merge small incremental code changes behind flags continuously. The business, not the deploy pipeline, decides when a feature 'goes live'.
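As a minimal sketch of this separation, the snippet below keeps both checkout paths in the deployed build and lets a runtime flag decide which one a request reaches. The in-memory `releaseFlags` store, the flag key `new-checkout-flow`, and both handler functions are hypothetical names chosen for illustration; a real system would read the flag from a flag service.

```typescript
// Hypothetical in-memory flag store (assumption: stands in for a real flag service).
const releaseFlags: Record<string, boolean> = { "new-checkout-flow": false };

function isEnabled(key: string): boolean {
  // Unknown flags evaluate to off, so unreleased code stays dark by default.
  return releaseFlags[key] === true;
}

function legacyCheckout(cartTotal: number): string {
  return `legacy:${cartTotal.toFixed(2)}`;
}

function newCheckout(cartTotal: number): string {
  return `new:${cartTotal.toFixed(2)}`;
}

// Both code paths are deployed; the flag alone decides which one is released.
function checkout(cartTotal: number): string {
  return isEnabled("new-checkout-flow")
    ? newCheckout(cartTotal)
    : legacyCheckout(cartTotal);
}
```

Flipping `releaseFlags["new-checkout-flow"]` to `true` releases the new flow without another deployment, and flipping it back is the rollback.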
The Four Types of Feature Flags
Pete Hodgson's widely cited taxonomy (from his "Feature Toggles" article on martinfowler.com) identifies four distinct flag types, each with a different lifespan and audience:
| Type | Purpose | Owner | Lifespan | Example |
|---|---|---|---|---|
| Release toggle | Hide incomplete features during development | Engineering | Short (days to weeks) | New checkout flow — off until feature-complete |
| Experiment toggle | A/B test feature variants to measure impact | Product / Data Science | Medium (weeks) | Test two recommendation algorithms |
| Ops toggle | Operational circuit breakers for risky subsystems | SRE / Ops | Medium to permanent | Disable ML inference and fall back to rule-based scoring under load |
| Permission toggle | Enable features for specific user segments | Product / Billing | Long (months to permanent) | Enterprise tier gets advanced analytics dashboard |
Flag Granularity
Flags can target at multiple granularities: individual users (by user ID), percentage of users, user segments (by attribute), organizations or tenants, geographic regions, or device types. Modern flag platforms like LaunchDarkly allow complex boolean and multi-variate rules combining any of these dimensions.
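The targeting dimensions above can be sketched as a small rule evaluator. This is an illustrative model, not any vendor's rule engine: the `EvalContext` and `TargetingRule` shapes are assumptions, and the string hash is a simple multiplicative hash used only to give each user a stable bucket (production SDKs typically use a stronger hash).

```typescript
// Hypothetical context and rule shapes for illustration.
interface EvalContext {
  userId: string;
  orgId?: string;
  region?: string;
  isEmployee?: boolean;
}

interface TargetingRule {
  userIds?: string[];      // individual targeting
  orgIds?: string[];       // organization / tenant targeting
  regions?: string[];      // geographic targeting
  employeesOnly?: boolean; // segment by attribute
  rolloutPercent?: number; // 0-100, applied after the filters above
}

// Simple deterministic string hash kept within unsigned 32-bit range.
function hashString(input: string): number {
  let hash = 0;
  for (let i = 0; i < input.length; i++) {
    hash = (hash * 31 + input.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Map (flag, user) to a stable bucket in [0, 100) so a user's assignment
// does not change between requests.
function bucket(flagKey: string, userId: string): number {
  return hashString(`${flagKey}:${userId}`) % 100;
}

function matches(flagKey: string, rule: TargetingRule, ctx: EvalContext): boolean {
  // Explicit user targeting wins over every other condition.
  if (rule.userIds && rule.userIds.indexOf(ctx.userId) !== -1) return true;
  if (rule.employeesOnly && !ctx.isEmployee) return false;
  if (rule.orgIds && (ctx.orgId === undefined || rule.orgIds.indexOf(ctx.orgId) === -1)) return false;
  if (rule.regions && (ctx.region === undefined || rule.regions.indexOf(ctx.region) === -1)) return false;
  const percent = rule.rolloutPercent === undefined ? 100 : rule.rolloutPercent;
  return bucket(flagKey, ctx.userId) < percent;
}
```

Including the flag key in the hash input means a user who lands in the first 10% for one flag is not automatically in the first 10% for every other flag.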
Architecture of a Feature Flag System
A production-grade feature flag system has three components:
- Flag management service: A dashboard where product managers, engineers, and SREs define flag rules, target audiences, and rollout percentages. Examples: LaunchDarkly, Unleash, Split.io, AWS AppConfig, GrowthBook (open source).
- SDK (in-process evaluation): The SDK runs inside the application process. It caches flag rules locally (updated via streaming or polling) and evaluates flags locally with no network round-trip on the hot path. This keeps latency impact near zero.
- Analytics / audit log: Every flag evaluation is streamed back for debugging, auditability, and experiment analysis. This is how you prove that users in the treatment group converted at a higher rate.
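A rough sketch of how the SDK side of this architecture fits together is shown below. `LocalFlagClient` and its method names are hypothetical and do not correspond to any vendor SDK; the point is only that evaluation reads from an in-memory cache, while rule updates and analytics events travel off the request path.

```typescript
type FlagValue = boolean | string | number;

interface EvaluationEvent {
  flagKey: string;
  userId: string;
  value: FlagValue;
  usedDefault: boolean;
  timestamp: number;
}

// Hypothetical client for illustration only.
class LocalFlagClient {
  private rules: Record<string, FlagValue> = {};
  private events: EvaluationEvent[] = [];

  // Called by a streaming or polling loop when the management service
  // publishes a new set of flag values.
  applyUpdate(snapshot: Record<string, FlagValue>): void {
    this.rules = { ...snapshot };
  }

  // Pure in-memory lookup: no network call on the hot path.
  variation(flagKey: string, userId: string, defaultValue: FlagValue): FlagValue {
    const cached: FlagValue | undefined = this.rules[flagKey];
    const value = cached !== undefined ? cached : defaultValue;
    this.events.push({
      flagKey,
      userId,
      value,
      usedDefault: cached === undefined,
      timestamp: Date.now(),
    });
    return value;
  }

  // Drain buffered events so a background task can ship them to the
  // analytics / audit pipeline in batches.
  flushEvents(): EvaluationEvent[] {
    const batch = this.events;
    this.events = [];
    return batch;
  }
}
```

Because the default value is returned whenever the cache has no entry, the application keeps working even if the client has never received an update from the management service.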
Code Example: Evaluating a Flag
```typescript
// LaunchDarkly SDK: server-side evaluation.
// Top-level await requires an ES module. Recommendation and the two
// recommendation helpers are assumed to be defined elsewhere in the codebase.
import * as ld from '@launchdarkly/node-server-sdk';

const sdkKey = process.env.LD_SDK_KEY;
if (!sdkKey) throw new Error('LD_SDK_KEY is not set');

const client = ld.init(sdkKey);
await client.waitForInitialization();

async function getRecommendations(userId: string): Promise<Recommendation[]> {
  const context = { kind: 'user', key: userId };

  // Evaluate a multi-variate flag against the locally cached rules
  const algorithm = await client.variation(
    'recommendation-algorithm', // flag key
    context,
    'collaborative-filtering'   // default value if the SDK cannot evaluate the flag
  );

  if (algorithm === 'llm-rerank') {
    return await llmRerankedRecommendations(userId);
  } else {
    return await collaborativeFilteringRecommendations(userId);
  }
}
```
Flag Lifecycle Management
Flags accumulate over time. A codebase with hundreds of stale flags becomes a maintenance nightmare — every engineer has to understand which branches are dead. This is called flag debt, and it is one of the most common pitfalls of feature flag adoption.
- Set a removal date when creating the flag. Release toggles should be short-lived (days to weeks); pick a hard maximum, for example two weeks, and create a cleanup ticket at the same time you create the flag.
- Treat flag cleanup as a first-class task. After a feature is fully rolled out, remove the flag and the dead code path within the same sprint.
- Tag flags by type. Your flag management platform should categorize flags so you can identify stale ones.
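- Add automated alerts for aged flags. Platforms such as LaunchDarkly and Unleash surface indicators for flags that look stale (for example, flags past their expected lifetime or no longer being evaluated); wire these into your team's notifications.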
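The practices above can be combined into a simple stale-flag check that runs in CI or on a schedule. This is a sketch under stated assumptions: the `FlagRecord` shape is hypothetical, the 14-day release threshold follows the two-week guideline above, and the 42-day experiment threshold is an illustrative choice rather than a recommendation from any platform.

```typescript
type FlagType = "release" | "experiment" | "ops" | "permission";

// Hypothetical inventory record, e.g. exported from a flag platform.
interface FlagRecord {
  key: string;
  type: FlagType;
  createdAt: Date;
}

// Illustrative review thresholds in days; null means the type is expected
// to be long-lived and is not flagged by age alone.
const maxAgeDays: Record<FlagType, number | null> = {
  release: 14,
  experiment: 42,
  ops: null,
  permission: null,
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Return the flags whose age exceeds the threshold for their type.
function findStaleFlags(flags: FlagRecord[], now: Date): FlagRecord[] {
  return flags.filter((flag) => {
    const limit = maxAgeDays[flag.type];
    if (limit === null) return false;
    const ageDays = (now.getTime() - flag.createdAt.getTime()) / MS_PER_DAY;
    return ageDays > limit;
  });
}
```

Tagging flags by type is what makes this check possible: without the type, a two-year-old ops kill switch and a two-year-old release toggle look identical.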
The Technical Debt of Flag Sprawl
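Knight Capital Group's roughly $440 million trading loss in 2012 is the best-known cautionary tale. New code reused a flag that had previously activated a long-retired function, and the deployment missed one of eight servers; when the flag was enabled, that server ran the old, dead code. Flag sprawl is a genuine operational risk: every flag that is not cleaned up adds code complexity and is a potential incident waiting to happen.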
Ops Toggles: The Operational Circuit Breaker
Ops toggles deserve special attention because the long-lived ones are permanent infrastructure, not temporary scaffolding. An ops toggle lets SREs disable a risky subsystem in seconds without a code deploy. Examples:
- Disable the ML inference endpoint and fall back to rule-based scoring when GPU inference latency spikes
- Turn off expensive third-party enrichment API calls during a provider outage
- Disable real-time personalization and serve cached recommendations during a database incident
- Kill a background job that is causing database contention
These are your graceful degradation levers. In a well-operated system, SREs can pull these toggles from an ops dashboard in under 30 seconds to contain an incident — no deploy, no rollback, no on-call engineer woken up to merge a PR.
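The first example in the list, falling back from ML inference to rule-based scoring, might look like the sketch below. The toggle key `ml-inference-enabled`, the in-memory `opsToggles` store, and both scoring functions are hypothetical stand-ins; in practice the toggle would be read from the flag SDK's local cache and `mlScore` would call a real inference endpoint.

```typescript
// Hypothetical ops-toggle store (assumption: stands in for the flag SDK's cache).
const opsToggles: Record<string, boolean> = { "ml-inference-enabled": true };

function ruleBasedScore(amount: number): number {
  // Cheap deterministic fallback: treat large amounts as higher risk.
  return amount > 1000 ? 0.9 : 0.1;
}

function mlScore(amount: number): number {
  // Stand-in for a call to an ML inference endpoint.
  return Math.min(1, amount / 5000);
}

function score(amount: number): number {
  // The toggle is checked first, so switching it off bypasses the ML path entirely.
  if (opsToggles["ml-inference-enabled"] !== true) {
    return ruleBasedScore(amount);
  }
  try {
    return mlScore(amount);
  } catch {
    // Degrade gracefully if inference fails even while the toggle is on.
    return ruleBasedScore(amount);
  }
}
```

The design choice worth noting is that the fallback path is always present and exercised by the same function, so pulling the toggle changes behavior without changing what callers see.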
Interview Tip
Feature flags are a Swiss Army knife — use them for the right job. In interviews, distinguish flag types: a release toggle is temporary scaffolding; an ops toggle is permanent infrastructure. When discussing dark launches (deploying code that exercises production systems without exposing UI to users), mention that feature flags are the mechanism. And always bring up flag debt — it shows you understand the long-term operational cost, not just the immediate benefit.