Kafka Deep Dive

Kafka is a distributed, partitioned, replicated append-only commit log: both a durable message queue and a real-time event stream. It is the default answer whenever a design needs async processing, ordered processing, decoupling, or stream processing. Almost every "deep dive" reduces to questions about that log: how it scales, survives failure, handles retries, performs, and gets cleaned up.

When to use Kafka

Two umbrella reasons. If your problem matches one, name Kafka and move on.

| Need a queue | Need a stream |
| --- | --- |
| Async processing (YouTube transcoding) | Process lots of data in real time (Ad Click Aggregator) |
| In-order processing (Ticketmaster waiting queue) | Many consumers read the same stream (Messenger, FB Live comments) |
| Decouple producer/consumer to scale independently (LeetCode judge) | Replayable continuous history |

When NOT to: tiny scale (a DB-backed queue is simpler), strict request/response RPC, per-message priorities or TTL/delays (use RabbitMQ/SQS), or millions of distinct queues (Kafka scales with partitions, not topics).

Core concepts

| Term | Definition | Nuance |
| --- | --- | --- |
| Broker | Servers that hold the log | Each broker is leader for some partitions, follower for others |
| Partition | An ordered, immutable, append-only log | The unit of parallelism, ordering & replication; order only within a partition |
| Topic | A logical grouping of partitions | Name + partition count + config; repartitioning later is disruptive |
| Producer | Writes records to topics | Picks partition by key hash; controls durability via acks |
| Consumer | Reads records from topics | Pulls; tracks an offset per partition; joins a group |

A record = key, value, timestamp, and headers, plus the partition it was written to and a broker-assigned offset within that partition. The key drives partition assignment and therefore ordering and compaction.

Architecture at a glance

Producers append records into a topic's partitions; a consumer group spreads those partitions across its consumers so each partition is owned by exactly one consumer in the group.

```mermaid
flowchart LR
    P1["Producer 1"] --> T
    P2["Producer 2"] --> T
    P3["Producer 3"] --> T
    subgraph T["Topic"]
        PA["Partition 0"]
        PB["Partition 1"]
    end
    subgraph CG["Consumer Group"]
        C3["Consumer A"]
        C4["Consumer B"]
    end
    PA --> C3
    PB --> C4
```

Partitioning & ordering

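Partitions are simultaneously the unit of parallelism, ordering, and replication. A producer picks a partition in this order: an explicit partition if one is set; otherwise hash(key) mod n (same key ⇒ same partition ⇒ ordered); otherwise, for keyless records, round-robin (newer Java clients instead stick to one partition per batch and rotate between batches).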
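The selection order above can be sketched in a few lines. This is a minimal illustration, not the client's actual partitioner: `choose_partition` is a hypothetical helper, and `zlib.crc32` stands in for the murmur2 hash that Kafka's default partitioner uses.

```python
import zlib
from itertools import count
from typing import Optional

_round_robin = count()  # shared counter used only for keyless records


def choose_partition(num_partitions: int,
                     key: Optional[bytes] = None,
                     explicit: Optional[int] = None) -> int:
    """Pick a partition: explicit, else hash(key) mod n, else round-robin."""
    if explicit is not None:
        if not 0 <= explicit < num_partitions:
            raise ValueError("explicit partition out of range")
        return explicit
    if key is not None:
        # Assumption: crc32 is a stand-in for Kafka's murmur2 hash.
        return zlib.crc32(key) % num_partitions
    # Simplification: plain round-robin for keyless records.
    return next(_round_robin) % num_partitions


# The same key always maps to the same partition, which is what gives per-key order.
first = choose_partition(6, key=b"game-42")
second = choose_partition(6, key=b"game-42")
print(first == second)  # True
```

Because the hash is deterministic, every record keyed `game-42` lands in one partition and is read back in the order it was written.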

Ordering guarantee: total order within a partition, none across partitions. To preserve per-entity order, key by that entity (e.g. by gameId). You can increase but not decrease a topic's partition count, and increasing it changes hash(key) mod n, so new records for an existing key can land on a different partition than its older records (which breaks per-key order). Right-size early. More partitions also mean longer rebalances and more leader elections.
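To see why growing the partition count disturbs per-key order, the sketch below counts how many keys map to a different partition when n goes from 6 to 8. The key names and partition counts are made up for illustration, and `zlib.crc32` again stands in for Kafka's murmur2 hash.

```python
import zlib
from typing import List


def partition_for(key: str, num_partitions: int) -> int:
    # Assumption: crc32 is a stand-in for Kafka's murmur2 hash.
    return zlib.crc32(key.encode()) % num_partitions


def moved_keys(keys: List[str], old_n: int, new_n: int) -> List[str]:
    """Keys whose target partition changes when the partition count changes."""
    return [k for k in keys
            if partition_for(k, old_n) != partition_for(k, new_n)]


keys = [f"game-{i}" for i in range(1000)]
moved = moved_keys(keys, old_n=6, new_n=8)
print(f"{len(moved)} of {len(keys)} keys now map to a different partition")
```

Existing records stay where they are; only new records follow the new mapping, so a moved key ends up with its history split across two partitions.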

Consumer groups & rebalancing

The consumer group is both the unit of scaling and the queue-vs-pubsub switch: consumers sharing a group.id divide partitions (competing consumers); each extra group gets its own copy of the stream (fan-out).
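A rough sketch of how partitions are divided inside one group, and how that division shifts when membership changes. The `assign` function is a hypothetical round-robin assignor written for illustration; real clients choose among several assignment strategies.

```python
from typing import Dict, List


def assign(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions round-robin over the sorted members of one group."""
    if not consumers:
        raise ValueError("a group needs at least one consumer")
    members = sorted(consumers)
    assignment: Dict[str, List[int]] = {m: [] for m in members}
    for i, p in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(p)
    return assignment


partitions = [0, 1, 2, 3]
# One group: members split the partitions (competing consumers).
print(assign(partitions, ["a", "b"]))       # {'a': [0, 2], 'b': [1, 3]}
# A member joins: the group rebalances and ownership moves.
print(assign(partitions, ["a", "b", "c"]))  # {'a': [0, 3], 'b': [1], 'c': [2]}
# More members than partitions: the extra member sits idle.
print(assign([0, 1], ["a", "b", "c"]))      # {'a': [0], 'b': [1], 'c': []}
```

Fan-out corresponds to running `assign` once per group: each group gets its own full assignment over the same partitions and tracks its own offsets.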

Offsets & delivery semantics

An offset is the consumer's bookmark. The order of {process, commit} determines semantics:

| Semantic | How | Failure outcome |
| --- | --- | --- |
| At-most-once | Commit before processing | Crash after commit → message lost |
| At-least-once (default) | Process then commit | Crash before commit → redelivered (dup) |
| Exactly-once | Idempotent producer + transactions | No loss, no dupes |

Most systems run at-least-once + idempotent consumers (dedupe by id / upsert by key). True exactly-once is only "free" Kafka→Kafka (Streams/transactions); external sinks still need idempotent writes.
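The at-least-once plus idempotent-consumer pattern can be simulated without a broker. In this sketch the log is a plain list, `event_id` is a hypothetical unique id carried by each record, and `seen` plays the role of a dedupe store (which in a real system would have to be durable, ideally updated atomically with the sink).

```python
from typing import Dict, List, Optional, Set, Tuple

Record = Tuple[str, int]  # (event_id, amount)


def consume(log: List[Record], committed: int, balances: Dict[str, int],
            seen: Set[str], crash_after: Optional[int] = None) -> int:
    """Process from the committed offset, committing after each record.

    Returns the committed offset at the point the consumer stopped.
    """
    for offset in range(committed, len(log)):
        event_id, amount = log[offset]
        if event_id not in seen:  # idempotency check: skip redeliveries
            balances["acct"] = balances.get("acct", 0) + amount
            seen.add(event_id)
        if crash_after == offset:
            return committed      # simulated crash before the commit
        committed = offset + 1    # commit only after processing
    return committed


log = [("e1", 10), ("e2", 20), ("e3", 30)]
balances: Dict[str, int] = {}
seen: Set[str] = set()
committed = consume(log, 0, balances, seen, crash_after=1)  # dies after processing e2
committed = consume(log, committed, balances, seen)         # restart: e2 is redelivered
print(committed, balances)  # 3 {'acct': 60}
```

Without the `seen` check, the redelivered `e2` would be applied twice and the balance would end at 80 instead of 60.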

```mermaid
sequenceDiagram
    participant P as Producer
    participant B as Broker Leader
    participant R as Follower ISR
    participant C as Consumer
    P->>B: Send record PID seq n
    B->>R: Replicate
    R-->>B: Ack
    B-->>P: Ack committed
    Note over B,R: Dedup by PID and seq
    C->>B: Fetch from offset
    B-->>C: Records
    C->>C: Process
    C->>B: Commit offset
```

Replication, leader/follower & ISR

Each partition has R replicas (typically 3) on different brokers. One leader handles reads + writes; followers fetch from it.

```mermaid
flowchart TD
    Prod["Producer"] --> L["Leader Broker 1"]
    subgraph ISR["In-Sync Replicas"]
        L --> F1["Follower Broker 2"]
        L --> F2["Follower Broker 3"]
    end
    L --> Cons["Consumer"]
```

Durable-by-default config to recite

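Topic side (set at topic creation or through broker defaults): replication.factor=3, min.insync.replicas=2, unclean.leader.election.enable=false. Producer side: acks=all, enable.idempotence=true. With these settings, acknowledged writes survive one broker failure with zero data loss. Modern Kafka replaces ZooKeeper with KRaft for cluster metadata.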
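The "survives one broker failure" claim follows from simple arithmetic on the replication factor and min.insync.replicas. The helpers below are hypothetical names used for illustration, and they assume the producer writes with acks=all.

```python
def tolerated_failures(replication_factor: int, min_insync_replicas: int) -> int:
    """Replicas that can be lost while acks=all writes are still accepted."""
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("need 1 <= min.insync.replicas <= replication.factor")
    return replication_factor - min_insync_replicas


def write_accepted(in_sync: int, min_insync_replicas: int) -> bool:
    """With acks=all, the leader rejects writes once the ISR is too small."""
    return in_sync >= min_insync_replicas


print(tolerated_failures(3, 2))                          # 1
print(write_accepted(in_sync=2, min_insync_replicas=2))  # True
print(write_accepted(in_sync=1, min_insync_replicas=2))  # False
```

With 3 replicas and a minimum of 2 in sync, one broker can fail and writes continue; a second failure makes the partition refuse acks=all writes rather than accept data that only one replica holds.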

Retention vs log compaction

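Kafka decouples consumption from deletion via the topic's cleanup policy (cleanup.policy): delete drops old log segments once they pass the retention time or size limit, while compact keeps at least the latest record for each key.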

Use delete when you care about the event-history window; compact when you care about the current value per key.
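The difference between the two policies is easiest to see on a tiny log. This is a simplified model under stated assumptions: real Kafka applies retention per log segment rather than per record, and keeps tombstones for a while before removing them. `apply_delete` and `apply_compact` are hypothetical helpers.

```python
from typing import List, Optional, Tuple

# (timestamp_ms, key, value); a value of None is a tombstone (delete marker).
Record = Tuple[int, str, Optional[str]]


def apply_delete(log: List[Record], now_ms: int, retention_ms: int) -> List[Record]:
    """delete policy, simplified: drop records older than the retention window."""
    return [r for r in log if now_ms - r[0] <= retention_ms]


def apply_compact(log: List[Record]) -> List[Record]:
    """compact policy, simplified: keep the latest record per key, in log order."""
    latest = {key: i for i, (_, key, _) in enumerate(log)}
    return [r for i, r in enumerate(log)
            if latest[r[1]] == i and r[2] is not None]


log = [(1000, "u1", "a"), (2000, "u2", "b"), (3000, "u1", "c"),
       (4000, "u2", None), (5000, "u3", "d")]
print(apply_delete(log, now_ms=6000, retention_ms=3000))
# [(3000, 'u1', 'c'), (4000, 'u2', None), (5000, 'u3', 'd')]
print(apply_compact(log))
# [(3000, 'u1', 'c'), (5000, 'u3', 'd')]
```

The delete result is a time window of history; the compact result is a snapshot of current state, with `u2` gone because its latest record is a tombstone.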

Worked use-cases

Async queue — YouTube transcoding

Upload returns fast; raw video lands in S3; Kafka buffers transcode jobs; a worker fleet consumes in parallel and writes renditions back. Kafka decouples upload from slow, bursty transcoding.

```mermaid
flowchart LR
    Client["Client"] --> Upload["Upload Server"]
    Upload --> S3in["S3 raw video"]
    Upload --> K["Kafka topic"]
    K --> T1["Transcoder 1"]
    K --> T2["Transcoder 2"]
    T1 --> S3out["S3 transcoded"]
    T2 --> S3out
```

Common pitfalls