Location & Matching
Uber — Ride Hailing
A rider enters pickup + destination, gets a fare estimate, and requests a ride; the system matches a nearby driver who accepts. Hard problems: absorbing a firehose of driver location writes, low-latency proximity search, and consistent 1:1 matching so a driver is never double-booked.
Requirements
Functional
- Rider inputs start + destination and gets a fare estimate; requests a ride.
- Driver can accept/deny and navigate to pickup then drop-off.
Non-functional
- Low-latency matching (< 1 min or fail); 1:1 matching consistency (no double-booking).
- High availability outside the matching path; surge handling (100k+ requests/region).
CAP stance
The matching path chooses CP (fail-and-retry rather than double-assign). Fare-estimate, location-ingest, and notifications choose AP.
Scale & back-of-the-envelope
| Quantity | Value | Notes |
|---|---|---|
| Total / active drivers | 6M / 3M | concurrent online |
| Location interval | every 5 s | each active driver pings |
| Location writes | ~600k/s | 3M ÷ 5 — the dominant write load |
| Location DB throughput | 100k–1M TPS | proximity reads + writes |
The 600k writes/s must never touch the transactional DB — it lands in an in-memory geospatial store (Redis, ~300 MB hot state). Reads push toward 1M TPS → shard by geography. Matching is regionally bursty → shard/scale per region and queue requests.
API design
POST /ride/fare-estimate { source, destination } -> { rideId, fare, ETA } # creates Ride
PATCH /ride/request { rideId } # enqueues for matching
POST /location/update { lat, long } -> 202 # async, ~every 5s
PATCH /ride/driver/accept { rideId, accept }
PATCH /ride/driver/update { rideId, status } -> next nav target | null
/location/update returns 202 immediately and
is processed asynchronously — it must never block on the transactional
DB.
High-level design
flowchart LR
Rider["Rider App"]
Driver["Driver App"]
GW["API Gateway: auth, SSL, rate limit"]
RS["Ride Service: fare estimation"]
MAP["3rd Party Mapping"]
Q["Ride Request Queue"]
RMS["Ride Matching Service"]
LS["Location Service"]
LDB[("Location DB - Redis geohash")]
PDB[("Primary DB - DynamoDB")]
LOCK[("Driver Lock - TTL")]
NOTIF["Notification - APN/FCM"]
Rider -->|fare est / request| GW
Driver -->|location / accept| GW
GW -->|fare, accept| RS
GW -->|requestRide| Q
GW -->|updateLocation| LS
RS -->|ETA, route| MAP
RS -->|read/write ride| PDB
Q -->|dequeue| RMS
RMS -->|getDriverLocations| LDB
RMS -->|lock driverId| LOCK
RMS -->|notify driver| NOTIF
LS -->|write location| LDB
NOTIF -->|push| Driver
sequenceDiagram
participant R as Rider
participant RS as Ride Service
participant Q as Request Queue
participant RMS as Matching Service
participant LDB as Location DB
participant LK as Driver Lock
participant D as Driver
R->>RS: POST /ride/fare-estimate
RS-->>R: fare, ETA, rideId
R->>Q: PATCH /ride/request
Q->>RMS: dequeue ride
RMS->>LDB: getDriverLocations(geo)
LDB-->>RMS: nearby drivers ranked
loop until match or 1 min timeout
RMS->>LK: lock(driverId, TTL)
RMS->>D: notify ride offer
D->>RMS: accept
end
RMS-->>R: driver assigned, ETA
Deep dive · geospatial indexing
Scanning all 3M drivers per query is O(n) and impossible
at 1M TPS. A spatial index turns proximity search into a key lookup
over a few cells.
| Criterion | Geohash (chosen) | Quadtree | H3 |
|---|---|---|---|
| Implementation | Simplest (Redis built-in) | Complex | Moderate (library) |
| Density adaptivity | Fixed grid | Adaptive | Multi-resolution |
| Write cost (600k/s) | Cheapest | Expensive (rebalance) | Cheap |
| Neighbor accuracy | Corner anomalies | Good | Best (6 equal) |
Decision:
Geohash on Redis
(GEOADD/GEOSEARCH) — meets 1M TPS with
simple ops and easy geographic sharding. Query the rider's cell
+ its 8 neighbors and re-rank by true distance/ETA to
remove boundary effects. Production Uber uses
H3 (hexagons → 6 equidistant neighbors) — the
strongest "evolve the design" answer.
Deep dive · high-frequency location writes
~600k writes/s is the system's heartbeat:
- Separate the write path — a dedicated Location Service writes to Redis, never DynamoDB.
-
Async / fire-and-forget (
202) — losing one ping is harmless; another arrives in 5 s. - Last-write-wins per driver (one Redis key, overwrite in place); shard by geography.
- Adaptive ping frequency (slower when parked, faster in-ride); TTL eviction so a crashed driver drops out of search.
Trade-off: in-memory + async = blistering throughput but weaker durability — acceptable because live position is ephemeral and self-healing; only committed ride state needs DynamoDB durability.
Deep dive · matching & avoiding double-booking
A region-partitioned queue absorbs surge bursts and decouples request latency from matching. The offer loop reserves one driver at a time with a TTL lock:
while (noMatch) {
driver = nextDriver // best-ranked candidate
lock(driver) // conditional PutItem, with TTL
sendNotification(driver) // push the offer
wait(10s) // accept -> break; decline/timeout -> next
} // bounded by the < 1 min match-or-fail SLA
flowchart TD
A["Dequeue ride request"] --> B["getDriverLocations from Redis"]
B --> C{"Drivers found?"}
C -->|No| F["Mark ride Failed"]
C -->|Yes| D["driver = nextDriver"]
D --> E{"Acquire driver lock - TTL?"}
E -->|Lock held| L
E -->|Lock acquired| G["sendNotification(driver)"]
G --> H["wait 10s"]
H --> I{"Accepted?"}
I -->|Yes| J["Assign ride, status = matched"]
I -->|No or timeout| K["Lock auto-expires via TTL"]
K --> L{"More drivers and within 1 min?"}
L -->|Yes| D
L -->|No| F
J --> M["Driver status = in_ride"]
Two invariants: never send more than one request per ride, and never
send a driver more than one request at a time. A
Driver Lock keyed by driverId with
TTL in DynamoDB acquired via conditional write
(attribute_not_exists) gives atomic compare-and-set; the
TTL auto-releases if the matcher crashes or the window lapses (no
stranded driver). On accept, a TransactWriteItems commits
ride + driver status together.
Sequential + lock trades a little latency for the 1:1
correctness the NFR demands (vs broadcast, which reintroduces the
double-accept race).
Data model & ride lifecycle
stateDiagram-v2
[*] --> FareEstimated
FareEstimated --> Requested: rider confirms
Requested --> Matching: enqueued
Matching --> Matched: driver accepts
Matching --> Failed: timeout or no driver
Matched --> EnRoute: heading to pickup
EnRoute --> InRide: pickedup
InRide --> Completed: droppedoff
Completed --> [*]
Failed --> [*]
Primary DB (DynamoDB)
Ride id PK; riderId, driverId?, fare, ETA, source, destination, status
Driver id PK; status: available | in_ride | offline
DriverLock driverId PK; rideId, ttl (epoch -> DynamoDB TTL auto-delete)
Location DB (Redis)
GEOADD region driverId long lat # per-key TTL; GEOSEARCH BYRADIUS
Surge pricing (extension)
A pricing component watches per-cell demand:supply (open requests vs available drivers in Redis) and applies a multiplier on the fare-estimate path, so the rider sees the surged price before requesting — and it naturally throttles demand during peaks.