
ADR-003: Real-Time (Superseded)

Source: ADR-003-realtime-websocket.md

ADR-003: Real-Time Strategy — Native WebSocket with Redis Pub/Sub

Date: 2026-04-09 | Amended: 2026-04-10

Status: Superseded by ADR-007 (Durable Objects over Redis Pub/Sub)

Deciders: Platform Engineering Lead, Infrastructure Lead

CODITECT Classification: Architecture Decision Record · A6

v2.0 note: This ADR described the v1.0 Node.js + Redis approach. It was fully superseded by ADR-007 in `ADR-006-007-cloudflare-d1-durable-objects.md`, which replaces `ws` npm + Redis with Cloudflare Durable Objects using the WebSocket hibernation API. The v1.0 content below is retained for historical context.

Context

The organizer results page and the participant response page both need to show live availability updates as other participants submit responses. Without real-time updates, users must manually refresh to see current state — a poor experience that increases the chance of the organizer reading stale results.

Real-time strategies evaluated:

| Option | Mechanism | Complexity | Cost | Notes |
| --- | --- | --- | --- | --- |
| Manual refresh | None | None | None | Poor UX; no live heatmap possible |
| HTTP polling (every 3s) | `setInterval` + GET | Low | Medium (DB reads) | Simple but inefficient at scale |
| Server-Sent Events (SSE) | HTTP/1.1 streaming | Low-Medium | Low | One-directional; can't reuse for future bidirectional needs |
| Native WebSocket (`ws` library) | Persistent TCP | Medium | Low | Full control; requires custom Next.js server |
| Managed WebSocket (Pusher/Ably) | Third-party SaaS | Low | Medium-High (per-message pricing) | Vendor dependency; data leaves infrastructure |
| Socket.IO | WS + HTTP fallback | Medium | Low | Heavier than needed; polling fallback unnecessary given modern browser support |

Horizontal scaling concern: A single WebSocket server holds connections in memory. If we run multiple instances, a connection to instance A cannot receive a broadcast published by instance B. This requires a pub/sub layer to fan out across instances.


Decision

Use the native `ws` (npm) WebSocket library with a Redis pub/sub fan-out for horizontal scaling.

  • A custom Next.js server (`server.ts`) upgrades HTTP connections to WebSocket connections using the Node.js `http.Server` `upgrade` event — the standard pattern for adding WS to Next.js without a separate process
  • Each poll gets a channel identified by `poll:{slug}`
  • When a response is submitted (via Route Handler), `ResponseService` publishes a `results-updated` event to Redis on the `poll:{slug}` channel
  • The WS server subscribes to Redis channels and fans out to all connected WebSocket clients for that poll
  • Client-side: a minimal `RealtimeProvider` React context holds the WS connection and triggers a results refetch on `results-updated`
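The subscribe-and-fan-out step can be sketched as follows. This is a minimal illustration, not the actual server: the `Client` type stands in for a `ws` WebSocket, and a direct function call stands in for the Redis subscriber callback (which the real `server.ts` would wire up through a Redis client). All names here are hypothetical.

```typescript
// Minimal per-poll fan-out sketch. Each server instance keeps only its
// own connections in memory; Redis pub/sub delivers each published
// message to every instance, and each instance fans out locally.

type Client = { send: (data: string) => void };

// poll:{slug} channel naming used throughout this ADR
const pollChannel = (slug: string): string => `poll:${slug}`;

// Connections held by THIS instance, keyed by channel
const connections = new Map<string, Set<Client>>();

function register(slug: string, client: Client): void {
  const channel = pollChannel(slug);
  if (!connections.has(channel)) connections.set(channel, new Set());
  connections.get(channel)!.add(client);
}

// Called from the pub/sub subscriber callback: fan out to local clients
// only; other instances receive the same message and do the same.
function fanOut(channel: string, message: string): number {
  const clients = connections.get(channel) ?? new Set<Client>();
  for (const c of clients) c.send(message);
  return clients.size;
}

// Demo: two clients watching one poll, one client on another poll
const received: string[] = [];
register("team-offsite", { send: (d) => received.push(`a:${d}`) });
register("team-offsite", { send: (d) => received.push(`b:${d}`) });
register("other-poll", { send: (d) => received.push(`c:${d}`) });

const delivered = fanOut(
  pollChannel("team-offsite"),
  JSON.stringify({ type: "results-updated" })
);
console.log(delivered); // 2
console.log(received.length); // 2
```

Because the fan-out only ever touches the local `connections` map, adding a second instance requires no code changes, which is the scalability property the Decision relies on.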

For v1.0 single-node deployment: The Redis pub/sub layer is implemented from day one even though it is not strictly necessary on a single node — this avoids a painful refactor when horizontal scaling is added.
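The publish side of the flow might look like the sketch below. Only the `poll:{slug}` channel naming and the `results-updated` event name come from this ADR; the payload fields and function names are illustrative assumptions.

```typescript
// Hypothetical event payload published by ResponseService on poll:{slug}.
// Field names are assumptions, not the actual schema.
interface ResultsUpdatedEvent {
  type: "results-updated";
  slug: string;
  at: string; // ISO-8601 timestamp of the triggering response
}

function channelFor(slug: string): string {
  return `poll:${slug}`;
}

function buildResultsUpdated(slug: string, now: Date): string {
  const event: ResultsUpdatedEvent = {
    type: "results-updated",
    slug,
    at: now.toISOString(),
  };
  return JSON.stringify(event);
}

// In the Route Handler, after the response is persisted, the real code
// would hand this string to the Redis client, conceptually:
//   redis.publish(channelFor(slug), payload)
const payload = buildResultsUpdated(
  "team-offsite",
  new Date("2026-04-09T12:00:00Z")
);
console.log(channelFor("team-offsite")); // poll:team-offsite
```

Keeping the payload small (an invalidation signal rather than the full results) matches the client behavior described above, where `results-updated` merely triggers a refetch.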


Consequences

Positive:

  • No third-party SaaS dependency — data never leaves our infrastructure
  • Native ws is a minimal, well-maintained library (no Socket.IO overhead)
  • Redis pub/sub fan-out means the WS layer is horizontally scalable without code changes — just add nodes
  • Real-time heatmap gives immediate visual feedback as participants respond
  • No per-message pricing — operational cost is flat Redis cost already in the stack

Negative:

  • Custom Next.js server (`server.ts`) is required — cannot use `next start` directly; must wrap it. This means Vercel's serverless deployment model is incompatible (Vercel does not support persistent WebSocket connections in serverless functions). Deployment constraint: use Railway, Render, Fly.io, or self-hosted Kubernetes — not Vercel serverless.
  • Custom WS server adds operational complexity vs. managed service — team must monitor WS connection counts and handle graceful shutdown
  • Redis becomes a required dependency (not optional) — health check must include Redis connectivity
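The graceful-shutdown obligation noted above could be handled along these lines. This is a hedged sketch: `Closeable` stands in for the `ws` WebSocket type, and a real handler would also stop the HTTP server from accepting new upgrades before exiting.

```typescript
// Sketch of a SIGTERM close-out for the custom WS server. WebSocket
// close code 1001 ("going away") tells clients the server is shutting
// down so they can reconnect to another instance. Names hypothetical.
type Closeable = { close: (code: number, reason: string) => void };

function closeAll(clients: Iterable<Closeable>): number {
  let closed = 0;
  for (const client of clients) {
    client.close(1001, "server shutting down");
    closed++;
  }
  return closed;
}

// In server.ts this would be wired up conceptually as:
//   process.on("SIGTERM", () => {
//     closeAll(wss.clients);
//     httpServer.close();
//   });

// Demo with stub clients
const closeLog: number[] = [];
const stubs: Closeable[] = [
  { close: (code) => closeLog.push(code) },
  { close: (code) => closeLog.push(code) },
];
const closedCount = closeAll(stubs);
console.log(closedCount); // 2
```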

Neutral:

  • SSE was rejected primarily because we anticipate future bidirectional needs (e.g., organizer pushing a "poll is closing" notification to participants). Native WS supports this without a protocol change.
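The bidirectional-readiness argument can be made concrete with a sketch of the client-side message dispatch that `RealtimeProvider` would run in `ws.onmessage`. The `poll-closing` event name is a hypothetical future event, not something this ADR specifies.

```typescript
// Client-side dispatch sketch: one handler covers today's
// results-updated event and a hypothetical future server-pushed
// notification, with no protocol change needed.
type Action = "refetch-results" | "show-closing-banner" | "ignore";

function handleMessage(raw: string): Action {
  try {
    const msg = JSON.parse(raw) as { type?: string };
    switch (msg.type) {
      case "results-updated":
        return "refetch-results"; // trigger results refetch
      case "poll-closing":
        return "show-closing-banner"; // hypothetical future event
      default:
        return "ignore"; // unknown event types are skipped
    }
  } catch {
    return "ignore"; // malformed frames must not crash the UI
  }
}

console.log(handleMessage('{"type":"results-updated"}')); // refetch-results
```

With SSE, the refetch path would work, but any future participant-to-server message would need a separate HTTP call; the single WS connection carries both directions.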

Deployment Constraint (Critical)

Next.js with a custom `server.ts` using persistent WebSocket connections cannot be deployed to Vercel's serverless infrastructure. All deployment targets for this application must support long-running Node.js processes. Approved targets for v1.0: Railway, Render, Fly.io, AWS ECS (Fargate), self-hosted Kubernetes.

This constraint must be communicated to DevOps before any infrastructure provisioning begins.


Alternatives Rejected

Managed WebSocket (Pusher/Ably): Introduces a third-party data processor for all availability data. For regulated enterprise contexts (CODITECT's target verticals), routing participant response data through an external SaaS creates unnecessary data processing agreements and potential compliance risk. Rejected on data sovereignty grounds.

HTTP polling every 3s: Creates approximately 200 unnecessary DB reads per minute for a poll with 10 active participants. Wasteful and does not provide smooth real-time heatmap animation.
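For reference, the ~200 figure is straightforward arithmetic:

```typescript
// 10 participants, each issuing one GET every 3 seconds
const participants = 10;
const requestsPerParticipantPerMinute = 60 / 3; // 20
const dbReadsPerMinute = participants * requestsPerParticipantPerMinute;
console.log(dbReadsPerMinute); // 200
```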

Socket.IO: Adds 60kB+ to the client bundle and includes a polling fallback transport we don't need. Native ws is ~10kB and sufficient.


Review Trigger

Revisit if concurrent active WebSocket connections exceed 5,000 sustained — at that point, evaluate whether a dedicated WS microservice or managed service (with appropriate DPA) is more operationally appropriate than scaling the monolithic Next.js custom server.