API Design — System Design Interviews

01 / 06

API Paradigms

Choosing the right API style is the first signal your interviewer looks for. Each paradigm has a distinct sweet spot — knowing the tradeoffs cold is non-negotiable.

REST

Resource-oriented architecture over HTTP. Leverages standard methods and status codes. Stateless by design.

Universally understood

HTTP caching built-in

Easy to debug & test

Over/under-fetching

Multiple round trips

Use when: Public APIs, CRUD services, browser clients, broad ecosystem compatibility

GraphQL

Query language for your API. Clients declare exactly what data they need in a single request.

No over/under-fetching

Strongly typed schema

Single endpoint

Complex caching

Query depth attacks

Use when: Complex UIs, mobile (bandwidth), rapid frontend iteration

gRPC

High-performance RPC using Protocol Buffers over HTTP/2. First-class streaming support.

Often much faster than JSON over REST (compact binary payloads, multiplexed HTTP/2)

Bidirectional streaming

Strict contract

Not browser-native

Binary (harder to debug)

Use when: Internal microservices, low-latency, streaming data pipelines

GET    /v1/users/{id}           # Get a user
POST   /v1/users                # Create user
PUT    /v1/users/{id}           # Full replace
PATCH  /v1/users/{id}           # Partial update
DELETE /v1/users/{id}           # Delete

# Nested resources — keep depth ≤ 2
GET    /v1/users/{id}/orders    # OK
GET    /v1/orders?userId={id}   # Better for deep nesting

02 / 06

API Versioning

Breaking changes are inevitable. How you version determines how much pain you cause your consumers and yourself.

Strategy	Example	Pros	Cons	Best for
URL Path	`/v1/users`	Explicit, cacheable, easy routing	URL proliferation	Public REST APIs
Header	`API-Version: 2024-01`	Clean URLs, date-based	Harder to test in browser	Stripe-style APIs
Query Param	`?version=2`	Easy to test	Cache pollution, easily missed	Internal / beta APIs
Content Type	`Accept: application/vnd.api.v2+json`	Semantically correct	Very verbose, low adoption	Rarely used in practice

⚑ Interview Signal

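Stripe uses date-based header versioning (the Stripe-Version header) and pins each account to a default version that changes only when the user upgrades explicitly. A single request can override the default through the header. This is widely considered best-in-class, so mention it when discussing public APIs.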
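
A minimal sketch of how a server might resolve the version for one request under header-based, pinned versioning. The header name `API-Version` comes from the table above; the version list and the `resolve_version` helper are hypothetical and are not Stripe's implementation.

```python
from typing import Mapping, Optional

# Hypothetical list of released versions, oldest first.
SUPPORTED_VERSIONS = ["2023-06-01", "2024-01-15", "2024-09-30"]


def resolve_version(headers: Mapping[str, str], pinned: Optional[str]) -> str:
    """Pick the API version for one request.

    Precedence: explicit header, then the version pinned to the caller,
    then the newest release (for callers with no pin yet).
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    requested = lowered.get("api-version") or pinned or SUPPORTED_VERSIONS[-1]
    if requested not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported API version: {requested}")
    return requested


print(resolve_version({}, "2023-06-01"))                             # 2023-06-01
print(resolve_version({"API-Version": "2024-01-15"}, "2023-06-01"))  # 2024-01-15
```

Because the pin is only a fallback, existing callers keep their behavior until they opt in, while a single request can still try a newer version.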

Breaking vs Non-Breaking Changes

Safe (Non-Breaking)

Adding new optional fields to responses
Adding new endpoints
Adding new optional request parameters
Expanding enum values (with care)
Relaxing validation rules

Breaking Changes

Removing or renaming fields
Changing field types (string→int)
Changing HTTP methods or status codes
Adding required request fields
Changing auth mechanisms
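
The field-level rules above can be checked mechanically. This is a rough sketch that assumes a response schema is represented as a plain `{field_name: type_name}` dictionary; the `breaking_changes` helper is hypothetical and covers only removed fields and changed types.

```python
from typing import Dict, List


def breaking_changes(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Compare two response schemas given as {field_name: type_name}.

    Returns one description per breaking change; an empty list means the
    new schema only adds fields.
    """
    problems = []
    for name, old_type in old.items():
        if name not in new:
            problems.append(f"removed field: {name}")
        elif new[name] != old_type:
            problems.append(f"type changed: {name} {old_type} -> {new[name]}")
    return problems


v1 = {"id": "string", "email": "string"}
v2_additive = {"id": "string", "email": "string", "nickname": "string"}
v2_breaking = {"id": "int", "nickname": "string"}

print(breaking_changes(v1, v2_additive))  # []
print(breaking_changes(v1, v2_breaking))
# ['type changed: id string -> int', 'removed field: email']
```

A rename shows up here as a removal plus an addition, which matches how existing clients experience it.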

03 / 06

Pagination Patterns

Every list endpoint needs pagination. The strategy you pick has deep implications for performance, consistency, and client complexity.

Strategy	Mechanism	Scale	Consistency	When to use
Offset	`?offset=20&limit=10`	Poor	Unstable	Admin dashboards, small datasets
Cursor	`?after=eyJpZCI6MTIzfQ`	Excellent	Stable	Feeds, infinite scroll, large datasets
Page Token	`?pageToken=Abc123XY`	Good	Stable	Google APIs pattern, opaque tokens
Time-based	`?since=2024-01-01T00:00Z`	Excellent	Depends	Event logs, audit trails, webhooks

{
  "data": [...],
  "pagination": {
    "cursor": "eyJpZCI6MTAwfQ==",   // base64({"id":100})
    "has_more": true,
    "total": null            // omit for perf — avoid COUNT(*)
  }
}

// Next page request
GET /v1/messages?after=eyJpZCI6MTAwfQ==&limit=20

⚑ Interview Signal

Offset pagination costs O(offset) in most databases: OFFSET 10000 still reads and discards 10,000 rows before returning anything. For high-traffic endpoints or large datasets, recommend cursor-based pagination. The cursor is typically an encoded primary key or a (timestamp, id) composite.
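
A minimal sketch of cursor (keyset) pagination over an in-memory list, assuming rows are sorted by a positive integer `id`. The helper names are illustrative; in a database the list filter would be a `WHERE id > :last_id ORDER BY id LIMIT :n` query.

```python
import base64
import json
from typing import Any, Dict, List, Optional


def encode_cursor(last_id: int) -> str:
    raw = json.dumps({"id": last_id}, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor))["id"]


def fetch_page(rows: List[Dict[str, Any]], after: Optional[str], limit: int = 20) -> Dict[str, Any]:
    """Return one page of `rows` (sorted by id ascending) after the cursor."""
    last_id = decode_cursor(after) if after else 0
    # Fetch one extra row to learn whether another page exists
    # without running a COUNT(*).
    matching = [row for row in rows if row["id"] > last_id][: limit + 1]
    page = matching[:limit]
    has_more = len(matching) > limit
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"data": page, "pagination": {"cursor": next_cursor, "has_more": has_more}}


print(encode_cursor(100))  # eyJpZCI6MTAwfQ==
```

Clients should treat the cursor as opaque. Encoding it only keeps them from depending on its contents; it does not protect it from tampering.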

04 / 06

Auth & Rate Limiting

Authentication Patterns

API Keys

Simple, long-lived tokens
Best for server-to-server
Store hashed, never plain
Support rotation & scoping

JWT / OAuth 2.0

Short-lived access tokens (15min)
Refresh tokens for renewal
Stateless — no DB lookup
Include iss, sub, exp, scope

// Header
{ "alg": "RS256", "typ": "JWT" }

// Payload — keep small (network overhead)
{
  "sub": "user_abc123",
  "iss": "auth.example.com",
  "aud": "api.example.com",
  "exp": 1735689600,      // 15 min from now
  "scope": "read:users write:orders"
}

// Never put PII or secrets in a JWT payload: it is signed, not encrypted, and base64url decodes trivially
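
To show why the payload is not confidential, here is a small sketch that reads the claims of a token without any key. The token is built locally with a placeholder signature, and `peek_payload` is a hypothetical helper. A real service must verify the signature with a JWT library before trusting any claim.

```python
import base64
import json


def b64url(data: bytes) -> str:
    # JWT segments are base64url without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def peek_payload(token: str) -> dict:
    """Decode the payload segment WITHOUT verifying the signature."""
    payload_segment = token.split(".")[1]
    padded = payload_segment + "=" * (-len(payload_segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


header = b64url(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"sub": "user_abc123", "scope": "read:users"}).encode())
token = f"{header}.{payload}.signature-placeholder"

print(peek_payload(token)["sub"])  # user_abc123
```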

Rate Limiting Algorithms

Algorithm	How it Works	Burst?	Complexity
Fixed Window	Count resets each window (e.g., per minute)	Edge bursts possible	O(1) Redis INCR
Sliding Window	Log timestamps in a sorted set, count within window	Smooth	O(N) memory
Token Bucket	Tokens refill at constant rate, consumed per request	Allows bursts	O(1) state
Leaky Bucket	Requests queue and drain at fixed rate	No bursts	O(1) state
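
As an illustration of the token bucket row, here is a single-process sketch. The class name and the injectable `clock` parameter are assumptions made for testability; a production limiter would typically keep this state in a shared store such as Redis and update it atomically.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `refill_per_sec`."""

    def __init__(self, capacity: int, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = float(capacity)  # start full
        self.updated_at = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        elapsed = now - self.updated_at
        # Refill lazily on each call; no background timer is needed.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Only two numbers are stored per client (token count and last update time), which is the O(1) state the table refers to.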

# Always return these headers so clients can back off gracefully
X-RateLimit-Limit: 1000          # requests per window
X-RateLimit-Remaining: 847       # remaining this window
X-RateLimit-Reset: 1735689600    # epoch when window resets
Retry-After: 60                  # seconds (on 429)

# When limit exceeded:
HTTP/1.1 429 Too Many Requests

05 / 06

Error Handling & Idempotency

Error Response Design

{
  "error": {
    "code": "VALIDATION_ERROR",          // machine-readable
    "message": "Invalid email format",   // human-readable
    "field": "email",                  // optional: field context
    "request_id": "req_7Xk3mN9pL",     // for tracing/support
    "docs_url": "https://api.example.com/errors/VALIDATION_ERROR"
  }
}

// HTTP Status Codes — use them correctly
200 OK                | 201 Created               | 204 No Content
400 Bad Request       | 401 Unauthorized          | 403 Forbidden
404 Not Found         | 409 Conflict              | 422 Unprocessable Content
429 Too Many Requests | 500 Internal Server Error | 503 Service Unavailable
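
One way to keep the envelope consistent is to build every error through a single helper. This sketch assumes the envelope fields shown above; `error_response`, `DOCS_BASE`, and the generated `req_` ID format are illustrative.

```python
import json
import uuid
from typing import Optional, Tuple

DOCS_BASE = "https://api.example.com/errors"  # same example host as above


def error_response(status: int, code: str, message: str,
                   field: Optional[str] = None,
                   request_id: Optional[str] = None) -> Tuple[int, str]:
    """Return (HTTP status, JSON body) using the shared error envelope."""
    error = {
        "code": code,        # machine-readable
        "message": message,  # human-readable
        "request_id": request_id or f"req_{uuid.uuid4().hex[:12]}",
        "docs_url": f"{DOCS_BASE}/{code}",
    }
    if field is not None:
        error["field"] = field  # only present for field-level errors
    return status, json.dumps({"error": error})


status, body = error_response(422, "VALIDATION_ERROR", "Invalid email format",
                              field="email", request_id="req_7Xk3mN9pL")
print(status, body)
```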

401 vs 403

401 = not authenticated (who are you?)
403 = not authorized (I know you, but no)
401 must include WWW-Authenticate header
Where a 403 would confirm that a resource exists, return 404 instead to prevent existence probing
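
The decision order above can be sketched as a small function. The parameter names and the `hide_existence` switch are assumptions for illustration; real checks would come from the auth layer and the data store.

```python
from typing import Optional


def access_status(user: Optional[str], resource_exists: bool, can_read: bool,
                  hide_existence: bool = True) -> int:
    """Choose the status code for a read request."""
    if user is None:
        return 401  # not authenticated; the response also needs WWW-Authenticate
    if not resource_exists:
        return 404
    if not can_read:
        # Answering 404 keeps callers from learning which IDs exist.
        return 404 if hide_existence else 403
    return 200


print(access_status(None, True, False))     # 401
print(access_status("alice", True, False))  # 404
print(access_status("alice", True, False, hide_existence=False))  # 403
```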

500 Best Practices

Always return request_id for debugging
Never expose stack traces in production
Log the full error server-side
Include retry guidance (Retry-After) when the failure is transient, e.g. on a 503

Idempotency

Network failures are inevitable. Clients retry. Your API must handle duplicate requests safely — especially for writes and financial operations.

// Client generates a unique key per "logical operation"
POST /v1/payments
Idempotency-Key: k1_2Xm9pL8nQ7rT4vF   // client-generated unique key (e.g., a UUID)

{
  "amount": 5000,
  "currency": "usd"
}

// Server behavior:
// 1. Hash the key, check cache/DB
// 2. If found → return stored response (no re-execution)
// 3. If not found → process, store result with key (TTL: 24h)
// 4. Return same response for all retries

// GET, HEAD, OPTIONS = naturally idempotent
// PUT, DELETE = idempotent by spec
// POST, PATCH = require explicit handling
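
The server behavior listed above can be sketched with an in-memory store. `IdempotencyStore` and `run_once` are hypothetical names, and the sketch is not safe under concurrency: a real implementation would insert the key atomically (for example with a unique constraint or Redis `SET NX`) and would usually also check that a retried request carries the same body.

```python
import hashlib
import time
from typing import Any, Callable, Dict, Tuple

TTL_SECONDS = 24 * 60 * 60  # 24h, as in the steps above


class IdempotencyStore:
    def __init__(self, clock: Callable[[], float] = time.time):
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # digest -> (expiry, response)

    def run_once(self, key: str, operation: Callable[[], Any]) -> Any:
        digest = hashlib.sha256(key.encode()).hexdigest()  # 1. hash the key
        now = self._clock()
        entry = self._entries.get(digest)
        if entry is not None and entry[0] > now:
            return entry[1]                                # 2. replay stored response
        result = operation()                               # 3. process once
        self._entries[digest] = (now + TTL_SECONDS, result)
        return result                                      # 4. same response on retries
```

Usage: wrap the write in a callable, e.g. `store.run_once(request_key, lambda: create_payment(5000, "usd"))`, where `create_payment` stands for whatever performs the side effect.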

06 / 06

Interview Playbook

Signals that separate mid-level answers from staff-level answers in system design.

01

Clarify Before Designing

Ask who the consumers are (browser, mobile, microservices?), expected scale (QPS), and SLA requirements. Never jump to REST vs gRPC without this.

02

Name Resources, Not Verbs

Use /payments, not /createPayment. Actions are HTTP methods. Exception: state transitions that don't map cleanly onto CRUD, such as POST /payments/{id}/cancel, are acceptable.

03

Design for Failure

Proactively mention idempotency keys, retry headers, exponential backoff guidance, and circuit breaker patterns. Shows operational maturity.
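
For the backoff guidance, a client-side sketch of capped exponential backoff with full jitter. The function name, defaults, and the optional `retry_after` argument (seconds taken from a Retry-After header, if the server sent one) are assumptions for illustration.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  retry_after: Optional[float] = None,
                  rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        return float(retry_after)  # explicit server guidance wins
    rng = rng or random.Random()
    ceiling = min(cap, base * (2 ** attempt))  # 0.5, 1, 2, 4, ... capped at 30
    # Full jitter: pick uniformly in [0, ceiling] so clients do not retry in lockstep.
    return rng.uniform(0, ceiling)
```

Pair this with idempotency keys on writes; otherwise a retry after a timeout may repeat the side effect.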

04

Think in Contracts

An API is a contract. Distinguish breaking vs additive changes. Propose a deprecation strategy (sunset headers, migration guides, dual-running versions).
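
A brief sketch of the sunset-header part of a deprecation strategy, using the `Sunset` response header and the `sunset` link relation from RFC 8594. The helper name and the migration guide URL are hypothetical, and `sunset_at` is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict


def deprecation_headers(sunset_at: datetime, migration_guide_url: str) -> Dict[str, str]:
    """Headers to attach to every response from a deprecated version."""
    return {
        # Sunset takes an HTTP-date in GMT.
        "Sunset": format_datetime(sunset_at.astimezone(timezone.utc), usegmt=True),
        "Link": f'<{migration_guide_url}>; rel="sunset"',
    }


headers = deprecation_headers(
    datetime(2025, 12, 31, 23, 59, 59, tzinfo=timezone.utc),
    "https://api.example.com/docs/migrate-v1-to-v2",  # hypothetical URL
)
print(headers["Sunset"])  # Wed, 31 Dec 2025 23:59:59 GMT
```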

05

Mention Observability

Always include request IDs in responses. Discuss distributed tracing (trace-id propagation), structured logs, and rate limit telemetry dashboards.

06

Security Isn't Optional

HTTPS everywhere, input validation at the edge, short-lived JWTs, CORS policies, and API key rotation. Name these unprompted — most candidates don't.

Quick Reference Cheat Sheet

REST idempotent methods: GET, HEAD, PUT, DELETE, OPTIONS. Retrying leaves the same end state as a single call

JWT expiry best practice: Access token: 15min · Refresh token: 7–30 days · Rotate refresh on use

Pagination default: Cursor over offset for scale · Default limit 20 · Max limit 100

Rate limit algorithm: Token bucket for APIs · Leaky bucket for queuing · Sliding window for precision

gRPC over REST when: Latency <10ms required · Streaming needed · Internal service mesh

Idempotency key TTL: 24 hours is standard (Stripe) · Store key → response in Redis/DB

URL nesting depth: At most 2 levels, e.g. /users/{id}/orders · deeper → use query params

Error envelope keys: code (machine) · message (human) · request_id (tracing) · docs_url

GraphQL vs REST: GraphQL for complex UIs / mobile bandwidth · REST for public APIs / caching

Breaking change strategy: Sunset header → deprecation notice → parallel run → removal window (6–12mo)
