Async Request-Reply Pattern
Long-running operations over messaging: polling endpoints, webhooks, status callbacks, and the async HTTP pattern with 202 Accepted.
The Problem: Long-Running Operations
Some operations take seconds, minutes, or even hours to complete: video transcoding, report generation, ML inference on large datasets, batch ETL jobs. Holding an HTTP connection open for that duration is impractical — clients time out, load balancers kill idle connections, and your server threads are tied up doing nothing but waiting.
The Async Request-Reply pattern solves this by immediately returning a job ID (or status URL) to the client and processing the work in the background. The client can then check on progress later — either by polling a status endpoint or by registering a webhook to be called when the work is complete.
HTTP 202 Accepted: The Entry Point
The canonical async HTTP entry point uses the `202 Accepted` status code. Unlike `200 OK` (which means 'done'), `202` means 'I received your request and am working on it.' The response body should include a status URL or job ID the client can use to check progress.
```http
POST /api/reports/generate
Content-Type: application/json

{ "reportType": "annual", "year": 2025 }
```

Response:

```http
HTTP/1.1 202 Accepted
Location: /api/reports/status/job_7f3c2a
Retry-After: 5

{
  "jobId": "job_7f3c2a",
  "status": "queued",
  "statusUrl": "https://api.example.com/api/reports/status/job_7f3c2a",
  "estimatedSeconds": 30
}
```

Full Async Request-Reply Flow
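The end-to-end flow — submit, return immediately, process in the background, poll for status — can be sketched with an in-memory job map standing in for a real queue and persistent store. All function and field names here are illustrative, not part of any framework:

```typescript
// Minimal in-memory sketch of the async request-reply flow.
// A real system would back this with a message queue and a durable job store.

type JobState = "queued" | "processing" | "complete" | "failed";

interface Job {
  jobId: string;
  status: JobState;
  resultUrl?: string;
}

const jobs = new Map<string, Job>();
let seq = 0;

// 1. Submit: create the job record, return a 202-style response immediately.
function submitJob(): { httpStatus: number; jobId: string; statusUrl: string } {
  const jobId = `job_${(++seq).toString(16)}`;
  jobs.set(jobId, { jobId, status: "queued" });
  return { httpStatus: 202, jobId, statusUrl: `/api/jobs/${jobId}` };
}

// 2. Worker: picks up the job and eventually marks it complete.
function runWorker(jobId: string): void {
  const job = jobs.get(jobId);
  if (!job) return;
  job.status = "processing";
  // ... the actual long-running work happens here ...
  job.status = "complete";
  job.resultUrl = `/results/${jobId}.pdf`;
}

// 3. Status endpoint: the client polls this until complete or failed.
function getStatus(jobId: string): Job | undefined {
  return jobs.get(jobId);
}
```

The key property: `submitJob` never blocks on the work itself, so the HTTP response returns in milliseconds regardless of how long the worker takes.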
Polling vs Webhooks
Once the job is submitted, the client needs to learn when it's done. There are two primary approaches:
| Approach | How It Works | Client Requirements | Best For |
|---|---|---|---|
| Polling | Client repeatedly calls the status endpoint until status is `complete` or `failed` | Any HTTP client; no inbound connectivity needed | Browser clients, mobile apps, simple scripts |
| Webhook (callback) | Server POSTs the result to a pre-registered callback URL when done | Client must expose a publicly reachable HTTPS endpoint | Server-to-server integrations, CI/CD pipelines, payment processors (Stripe, PayPal) |
| WebSocket / SSE | Server pushes status updates over a persistent connection | Client must maintain an open connection | Real-time progress bars, dashboard UIs |
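For the webhook row above, the completion callback is typically a signed POST to the pre-registered URL, so the receiver can verify it really came from the job server. A sketch of constructing that request — the payload shape and `X-Webhook-Signature` header name are illustrative assumptions, not a standard; real providers such as Stripe define their own schemes:

```typescript
import { createHmac } from "node:crypto";

// Builds the webhook request a server would POST to the client's callback
// URL when a job finishes. HMAC-signing the body lets the receiver verify
// the sender without inbound authentication.
function buildWebhook(
  callbackUrl: string,
  payload: { jobId: string; status: string; resultUrl?: string },
  secret: string,
): { url: string; headers: Record<string, string>; body: string } {
  const body = JSON.stringify(payload);
  const signature = createHmac("sha256", secret).update(body).digest("hex");
  return {
    url: callbackUrl,
    headers: {
      "Content-Type": "application/json",
      "X-Webhook-Signature": signature, // illustrative header name
    },
    body,
  };
}
```

The receiver recomputes the HMAC over the raw body with the shared secret and compares it to the header before trusting the payload.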
Polling Interval Strategy
Use exponential backoff for polling: start at 1s, double each attempt up to a cap (e.g., 30s). This avoids hammering the status endpoint if the job takes longer than expected. Include a `Retry-After` header in your 202 response as a hint to clients.
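The backoff schedule described above can be computed with a small helper. This is a sketch using the example values from this section (1 s start, 30 s cap) as defaults:

```typescript
// Exponential backoff for status polling: start small, double on each
// attempt, never exceed the cap.
function pollDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// Full schedule for n attempts, e.g. for a client polling loop.
function backoffSchedule(n: number, baseMs = 1000, capMs = 30000): number[] {
  return Array.from({ length: n }, (_, i) => pollDelayMs(i, baseMs, capMs));
}
```

When the server includes a `Retry-After` header, a well-behaved client should let that value override the computed delay for that attempt.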
Status Endpoint Design
The status endpoint should return a consistent structure that the client can machine-parse. Include at minimum: `status` (queued, processing, complete, failed), `progress` (optional 0–100), `resultUrl` (when complete), and `error` (when failed).
```typescript
// GET /api/jobs/:jobId
interface JobStatus {
  jobId: string;
  status: "queued" | "processing" | "complete" | "failed";
  progress?: number;    // 0-100 when processing
  resultUrl?: string;   // set when complete
  error?: string;       // set when failed
  createdAt: string;    // ISO 8601
  completedAt?: string; // ISO 8601, set when done
}

// Example responses:
// 200 { status: "processing", progress: 42 } — during active work
// 200 { status: "complete", resultUrl: "/results/job_7f3c2a.pdf" }
// 200 { status: "failed", error: "Report data not found for year 2025" }
```

Note that the status endpoint returns `200 OK` even for a failed job — the status *lookup* succeeded; the failure is expressed in the body. Reserve 4xx/5xx for problems with the lookup itself (e.g., `404` for an unknown job ID).

Job State Storage
Job state must be persisted somewhere the API server can read and workers can update. Common choices:
- Redis with TTL: fast reads, automatic expiry for completed jobs. Ideal for short-lived jobs (minutes to hours).
- Relational DB table (`jobs` table): durable, queryable, supports rich filtering. Use for jobs that must survive restarts or require audit history.
- DynamoDB / Firestore: serverless, no ops overhead. Good for high-volume jobs at scale.
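To illustrate the Redis-with-TTL option, here is a sketch that emulates expiring job records with an in-memory map. A real deployment would use Redis (`SET key value EX ttl`); the injected clock is purely for testability:

```typescript
// Emulates Redis-style TTL expiry for job-state records.
// now() is injected so expiry can be tested without real waiting.
class TtlJobStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();
  constructor(private now: () => number = Date.now) {}

  set(jobId: string, value: string, ttlSeconds: number): void {
    this.entries.set(jobId, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(jobId: string): string | undefined {
    const e = this.entries.get(jobId);
    if (!e) return undefined;
    if (this.now() >= e.expiresAt) {
      this.entries.delete(jobId); // lazy expiry on read
      return undefined;
    }
    return e.value;
  }
}
```

The TTL is what keeps the store from growing without bound: completed jobs evaporate after the window in which clients might still poll for them.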
Idempotency Keys for Retried Submissions
Clients may retry the initial POST if they don't receive a 202 (network failure, timeout). Without protection, you'd create duplicate jobs. The solution is an idempotency key — the client generates a UUID and sends it as a header (`Idempotency-Key: <uuid>`). The server stores the key and returns the existing job ID if the same key is submitted again within a TTL window.
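A sketch of the server-side check (in-memory for illustration; production would keep the key mapping in the same Redis or database as the job state, with the TTL window mentioned above):

```typescript
// Maps Idempotency-Key header values to the job they created, so a
// retried POST returns the existing job instead of creating a duplicate.
const keyToJob = new Map<string, string>();
let jobCounter = 0;

function handleSubmit(idempotencyKey: string): { jobId: string; created: boolean } {
  const existing = keyToJob.get(idempotencyKey);
  if (existing !== undefined) {
    return { jobId: existing, created: false }; // duplicate retry: same job
  }
  const jobId = `job_${++jobCounter}`;
  keyToJob.set(idempotencyKey, jobId); // TTL eviction omitted in this sketch
  return { jobId, created: true };
}
```

On a duplicate, the server should still respond `202` with the original job's status URL, so the retrying client ends up in exactly the same place as if its first request had succeeded.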
Interview Tip
This pattern appears in virtually every large-scale design interview: ML prediction APIs, video processing, batch reporting, payment processing. Lead with the 202 Accepted response, explain the status polling endpoint, mention webhooks as an alternative, and don't forget idempotency keys for retried submissions. These four points demonstrate real production experience.