System Design Notes All designs

Social & Feeds

Facebook News Feed

2B daily users publish posts, follow others, and read a personalized feed. The hard part isn't storing posts — it's assembling each feed cheaply at massive read volume, while one post from a celebrity may need to reach tens of millions of feeds.

Requirements

Functional

Non-functional

Scale & back-of-the-envelope

Quantity Estimate
DAU 2B
Feed reads ~20B/day (~230k/s avg, ~1M/s peak)
Posts ~200M/day (~2.3k/s avg)
Read : write ~100 : 1 (extremely read-heavy)
Avg followers ~300
Celebrity fan-out up to 100M feed writes for one post

The 100:1 ratio justifies fan-out on write (pay once at write so each read is one cheap query). The celebrity case breaks naive fan-out — the problem the hybrid design solves. Feed storage is bounded by capping each user's feed (~1,000 newest) ≈ 100 TB with TTL.

API design

POST /posts                  { content }                 -> 201 { postId }
PUT  /users/{id}/followers   { }   # idempotent follow   -> 200
GET  /feed?pageSize&cursor    -> { posts[], nextCursor }   # cursor-based

creatorId/userId derive from the auth token, never from the body.

High-level design

Stateless services behind a gateway, backed by DynamoDB, plus an async fan-out pipeline. The write path (async push) returns 201 immediately; the read path (≤500 ms) reads the precomputed feed merged with a live pull of celebrity posts.

flowchart LR
    Client["Client App"] --> GW["API Gateway"]
    GW --> PS["Post Service"]
    GW --> FS["Feed Service"]
    GW --> FoS["Follow Service"]
    PS --> Posts[("Posts Table")]
    PS -. "publish event" .-> Q(["Post Create Queue"])
    Q -. consume .-> FW["Feed Workers"]
    FW -. "read followers" .-> FGSI[("Follow GSI")]
    FW -. "write entries" .-> PF[("PrecomputedFeed")]
    FS --> PF
    FS --> PostGSI[("Post GSI")]
    FS --> Posts
    FoS --> Follow[("Follow Table")]
      

Deep dive · fan-out on write vs read

Fan-out on read (pull): at read time, query who you follow, then their recent posts via the Post GSI (creatorId PK, createdAt SK), merge by time. Trivial writes, but a 1,000-followee user triggers a 1,000-way scatter-gather on every open — impossible under 500 ms at 230k+ reads/s.

Fan-out on write (push): an async pipeline precomputes each follower's feed via the Follow GSI (reverse index "who follows this creator?").

sequenceDiagram
    actor U as User
    participant PS as Post Service
    participant Q as Post Create Queue
    participant FW as Feed Workers
    participant FG as Follow GSI
    participant PF as PrecomputedFeed
    U->>PS: POST /posts
    PS->>Q: publish postId creatorId
    PS-->>U: 201 postId
    Q->>FW: deliver event
    FW->>FG: query followers of creatorId
    FG-->>FW: follower list
    loop each follower
        FW->>PF: put userId createdAt postId
    end
      
Pull (read) Push (write)
Write cost O(1) O(followers) — amplified
Read cost O(followees) scatter-gather O(1) single query
Celebrity post Cheap Write storm (≤100M writes)
Best for Sparse graphs / low reads Read-heavy (our case)

Deep dive · the celebrity problem + hybrid

A 100M-follower post fans out to 100M writes and its Follow GSI row is a giant hot partition. Solution — a precomputed flag on each follow edge: normal creators are pushed; celebrities are precomputed=false (skip fan-out) and pulled at read time. The Feed Service merges both.

flowchart TD
    A["New post by creator"] --> B{"Follower count high?"}
    B -- "No normal user" --> C["Push to PrecomputedFeed"]
    B -- "Yes celebrity" --> D["Skip fan-out, edge precomputed=false"]
    C --> E["Read = single PrecomputedFeed query"]
    D --> F["Pull at read via Post GSI"]
    E --> G["Feed Service merges both sources"]
    F --> G
      

Celebrities are few but huge (pull a handful, cache it); normal creators are many but modest (push, keep reads O(1)). The flag flips when a creator crosses a follower threshold. Fan-out is idempotent (keyed by (userId, createdAt, postId)) so queue redelivery is safe.

Deep dive · ranking, caching & pagination

Data model (DynamoDB)

Posts        PK id; content, creatorId, createdAt
Post GSI     PK creatorId, SK createdAt        # recent posts by a creator (pull)
Follow       PK userFollowing, SK userFollowed; createdAt, precomputed   # hybrid flag
Follow GSI   PK userFollowed, SK userFollowing # "who follows this creator?" (fan-out)
PrecomputedFeed  PK userId, SK createdAt; postId   # cursor key; trim/TTL to ~1000

Evolution recap

(1) services + Posts/Follow/Follow GSI (read foundation) → (2) add Post GSI for efficient pull → (3) add Post Create Queue + Feed Workers + PrecomputedFeed + the precomputed flag → async fan-out on write and the hybrid feed.