Messaging
WhatsApp — Chat & Messaging
Global real-time 1:1 and group chat with media and offline delivery for billions of users. The core challenge: hold hundreds of millions of persistent connections and route each message to the right recipient's socket (or queue) in < 500 ms, guaranteeing delivery while not storing messages longer than necessary.
Requirements
Functional
- Group chats (1:1 = group of 2); send/receive messages & media; access messages after being offline. (Audio/video calling out of scope.)
Non-functional
- Low latency (< 500 ms); guaranteed delivery (at-least-once + client dedupe).
- Billions of users; messages not stored unnecessarily (delete after delivery); fault-tolerant.
- Per-chat ordering required; global ordering not.
Scale & back-of-the-envelope
- 1B users × 100 msgs/day = 100B msgs/day (~1.16M/s avg, ~5–10M/s peak); ~1 KB each ≈ 100 TB/day written.
- ~100M–300M live WebSockets; a tuned server holds ~100k–1M → hundreds to thousands of Chat Servers.
- Key insight: since messages are deleted after delivery, durable storage is only the undelivered backlog + media in S3 — not 100 TB/day forever.
API design
REST for the control plane + media; WebSocket for the data-plane hot path.
# Control plane (REST)
POST /chats | POST /chats/{id}/participants | GET /chats/{id}/messages
POST /media/presign -> presigned S3 PUT + blobRef # media goes direct to S3
# Data plane (WebSocket)
C->S: send_message {clientMsgId, chatId, contents} # idempotent
C->S: ack_delivered {messageId} | ack_read {chatId, upToMessageId} | pull_backlog {sinceSeq}
S->C: message {...} | ack {SENT} | receipt {DELIVERED|READ} | presence {...}
High-level design
A Chat Registry (over ZooKeeper) tells a client which Chat Server to connect to; servers route to each other via Redis pub/sub; messages persist to DynamoDB; an Inbox queues messages for offline devices; media flows through S3; a Cleanup worker enforces "not stored unnecessarily."
flowchart LR
C1["Client A (phone)"]
C2["Client B (laptop)"]
LB["Load Balancer"]
REG["Chat Registry"]
ZK["Zookeeper"]
WS1["Chat Server 1 (WebSocket)"]
WS2["Chat Server 2 (WebSocket)"]
PS["Redis Pub/Sub"]
DB[("DynamoDB: chats, messages, inbox")]
BLOB[("Blob Store S3")]
CLEAN["Cleanup Worker"]
PUSH["Push APNs / FCM"]
C1 -->|"1 which server?"| REG
REG --> ZK
C1 -->|"2 WSS connect"| LB
LB --> WS1
C2 --> LB --> WS2
WS1 --> PS --> WS2
WS1 --> DB
WS2 --> DB
C1 -->|"media upload"| BLOB
BLOB --> C2
WS1 --> PUSH
CLEAN --> DB
ZK --> WS1
ZooKeeper stores ephemeral
clientId → chatServer assignments — when a server dies
its entries vanish, so routing self-heals; Redis pub/sub is the
cross-server hop.
Deep dive · delivery to an online user (receipts)
A message is persisted before it's acked to the
sender (durability first), then routed to the recipient's server and
pushed down their socket; receipts flow back to drive the tick states
SENT (✓) → DELIVERED (✓✓) → READ (blue).
sequenceDiagram
participant A as Client A
participant SA as Chat Server A
participant DB as DynamoDB
participant SB as Chat Server B
participant B as Client B
A->>SA: send_message(clientMsgId, chatId, text)
SA->>DB: persist message + inbox row (B)
SA-->>A: ack SENT (1 tick)
SA->>SB: route via Redis Pub/Sub
SB->>B: deliver message over WSS
B-->>SB: ack_delivered
SB->>DB: mark delivered, clear inbox row
SB->>SA: DELIVERED
SA-->>A: DELIVERED (2 ticks)
B->>SB: ack_read
SA-->>A: READ (blue ticks)
Persist-then-ack costs one hot-path DB write but is the only way to guarantee delivery across crashes. Redis pub/sub is at-most-once on the wire — safe because the persisted Inbox is the source of truth and is re-drained on reconnect.
Deep dive · delivery to an offline user
With no live socket, the message stays in the recipient's
Inbox (QUEUED); a push notification
wakes the app; on reconnect the client drains the inbox in order, acks
each, and the server deletes delivered rows.
sequenceDiagram
participant A as Client A
participant SA as Chat Server A
participant DB as Inbox / DynamoDB
participant PUSH as APNs / FCM
participant B as Client B (offline)
A->>SA: send_message(chatId, text)
SA->>DB: write to B inbox (QUEUED)
SA->>PUSH: trigger push notification
PUSH->>B: notification (wake app)
B->>SA: WSS connect + pull_backlog(sinceSeq)
SA->>DB: read QUEUED for B (ordered)
SA->>B: deliver backlog
B-->>SA: ack_delivered (each)
SA->>DB: delete delivered rows
DynamoDB TTL is a safety-net GC for delivered rows whose ack-driven delete was missed.
Deep dive · group fan-out & ordering
On send, look up ChatParticipant, write one inbox row per
recipient-client, and route to those online. A group
message is deleted only once every participant has
acked.
flowchart TD
A["Sender Client"] --> WS["Chat Server"]
WS --> DB[("Lookup ChatParticipant")]
DB --> FAN{"Fan-out per participant device"}
FAN --> I1["Inbox: member 1"]
FAN --> I2["Inbox: member 2"]
FAN --> I3["Inbox: member 3"]
I1 --> O1["Online: push via WSS"]
I2 --> O2["Offline: keep QUEUED"]
I3 --> O3["Online: push via WSS"]
O1 --> ACK["Collect ACKs"]
O3 --> ACK
ACK --> DEL["Delete when ALL members ack"]
Fan-out on write gives fast reads + simple offline
delivery but is expensive for large groups (cap ~1,024 members).
Ordering: assign a time-sortable ID (Snowflake/KSUID)
at the first server; clients re-sequence by ID and dedupe by
clientMsgId. A strict total order would need a per-chat
sequencer bottleneck — per-chat ordering localizes the guarantee.
Deep dive · multi-device & E2E encryption
-
Multi-device: the Inbox is keyed by
recipientClientId— each device drains and acks its own copy; a per-devicelastSeenSeqpulls only what it missed. - E2EE (Signal): devices publish pre-key bundles; senders establish an X3DH session and ratchet with the Double Ratchet. Servers route ciphertext only — they can't read messages, which reinforces "not stored unnecessarily." Groups use Sender Keys (encrypt once, fan out).
Data model
Chat id PK; name, isGroup, metadata
ChatParticipant chatId PK, participantId SK
Client userId PK, clientId SK; pushToken, lastSeenSeq # multi-device
Messages id PK; chatId GSI, contents|blobRef, creatorId, timestamp
Inbox recipientClientId PK, messageId SK; status QUEUED|DELIVERED # mailbox
Availability userId PK; status, connectedServer # presence
Design through-line
The Inbox/mailbox is the source of truth for delivery, the WebSocket is just a fast path, and deletion-after-ack simultaneously delivers guaranteed delivery, low latency, and "messages not stored unnecessarily." DynamoDB (partitioned by chatId / recipientClientId) + S3 for media + ZooKeeper for routing + Redis pub/sub for the hop.