
ADR-017: Postgres outbox + table-poller is the event-transport spine; no running broker

  • Status: Accepted
  • Date: 2026-06-18
  • Deciders: Principal Architect, Engineering Lead
  • Relates to: ADR-001, ADR-008, ADR-011

Context

DemozPay emits domain events (EWA, lending, payroll, and equb lifecycle) through a transactional outbox: the event row is written in the same database transaction as the state change and the audit row (ADR-008). The open question has been what carries those events to consumers: Kafka, RabbitMQ, or the database itself.
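The single-transaction write described above can be sketched as follows. This is an illustration under stated assumptions, not DemozPay's actual code: the `SqlClient` interface, the `approveWithOutbox` function, and every table and column name other than `outbox_event` (for example `ewa_request`, `audit_log`, `event_type`, `payload`) are hypothetical.

```typescript
// Minimal client shape (an assumption for this sketch). node-postgres clients
// expose a compatible query(text, values) method. With a pool, all statements
// must go through ONE checked-out client so they share a transaction.
interface SqlClient {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

// Hypothetical domain change: approve an EWA request and emit its event.
async function approveWithOutbox(
  db: SqlClient,
  requestId: string,
  actorId: string,
): Promise<void> {
  await db.query("BEGIN");
  try {
    // 1. State change (hypothetical table and columns).
    await db.query(
      "UPDATE ewa_request SET status = 'APPROVED' WHERE id = $1",
      [requestId],
    );
    // 2. Audit row (hypothetical table and columns).
    await db.query(
      "INSERT INTO audit_log (entity_id, actor_id, action) VALUES ($1, $2, $3)",
      [requestId, actorId, "ewa.approved"],
    );
    // 3. Outbox row, committed or rolled back together with steps 1 and 2.
    await db.query(
      "INSERT INTO outbox_event (event_type, payload) VALUES ($1, $2)",
      ["ewa.approved", JSON.stringify({ requestId })],
    );
    await db.query("COMMIT");
  } catch (err) {
    // Any failure discards the state change, the audit row, and the event.
    await db.query("ROLLBACK");
    throw err;
  }
}
```

Because all three writes commit or roll back as a unit, there is no window in which the state has changed but the event is missing, or the reverse.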

The implementation already answers it, and an audit of the running system made the truth concrete:

  • Internal cross-domain consumers poll the outbox_event table directly (FOR UPDATE SKIP LOCKED), not a broker: payroll-deductions-poller, notification-poller, payroll-equb-fanout (via the deductions poller), and the settlement poller all read Postgres.
  • Kafka is a dormant producer only. apps/api/src/_infra/outbox/kafka-event-publisher.ts is built only when KAFKA_BROKERS is set, and the relay runs only when OUTBOX_PUBLISHER_ENABLED=true. With no consumer subscribed, enabling it today publishes to zero readers.
  • RabbitMQ is absent from the codebase entirely.
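For readers unfamiliar with the claim pattern the pollers rely on, here is one possible shape of a `FOR UPDATE SKIP LOCKED` claim query. It is a sketch only: the `buildClaimQuery` helper, the `outbox_consumer_ack` table, and all column names are assumptions for illustration; only the `outbox_event` table name and the locking clause come from this ADR.

```typescript
interface SqlQuery {
  text: string;
  values: unknown[];
}

// Builds the claim query for one poller. It is meant to run inside a
// transaction: row locks taken by FOR UPDATE are held until COMMIT or
// ROLLBACK, and SKIP LOCKED makes a concurrent poller pass over rows that
// are already locked instead of blocking on them.
function buildClaimQuery(consumer: string, batchSize: number): SqlQuery {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  // outbox_consumer_ack is a hypothetical per-consumer state table that
  // records which events each consumer has already handled.
  const text = `
    SELECT e.id, e.event_type, e.payload
    FROM outbox_event e
    WHERE NOT EXISTS (
      SELECT 1 FROM outbox_consumer_ack a
      WHERE a.event_id = e.id AND a.consumer = $1
    )
    ORDER BY e.id
    LIMIT $2
    FOR UPDATE OF e SKIP LOCKED`;
  return { text, values: [consumer, batchSize] };
}
```

A poller would run this query, handle each returned row, record its acknowledgements, and commit, all in one transaction. In this particular sketch the lock sits on the `outbox_event` row itself, so two different consumers polling at the same moment may skip each other's rows for one cycle and pick them up on the next poll.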

So the messaging spine that actually moves events between domains is Postgres. A broker is not on the critical path for any shipped feature.

Decision

Adopt Option C: the Postgres transactional outbox + table-poller IS the event-transport spine. Do not run a broker (Kafka or RabbitMQ) in production until a consumer genuinely requires one.

  • The single broker-swap seam stays: the EventPublisher interface (packages/shared/events) with KafkaEventPublisher as one impl. It remains dormant behind OUTBOX_PUBLISHER_ENABLED + KAFKA_BROKERS — kept as the future streaming on-ramp, not deleted.
  • Kafka is activated only when a real streaming/replay consumer exists — the clear future candidates are high-volume, ordered, replayable domains (Wallet, Risk/Fraud). Activating it before then feeds zero consumers and adds ops burden for no benefit.
  • RabbitMQ is not adopted. The competing-consumer and work-queue needs it would serve are met today by FOR UPDATE SKIP LOCKED table polling. If Kafka is later activated for streaming, running a second broker alongside it is rejected on operational-simplicity grounds.
  • New internal consumers default to polling the outbox table, following the existing poller pattern, until/unless they need cross-process streaming or replay.
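The two-flag gating of the swap seam can be illustrated with a small wiring function. This is a sketch, not the contents of packages/shared/events: the `DomainEvent` and `EventPublisher` shapes, the `wireRelay` function, and the factory parameter are hypothetical, and the behaviour when the flag is on but no brokers are configured is an assumption.

```typescript
// Hypothetical shapes; the real EventPublisher interface may differ.
interface DomainEvent {
  type: string;
  payload: unknown;
}

interface EventPublisher {
  publish(event: DomainEvent): Promise<void>;
}

type Env = Record<string, string | undefined>;

interface RelayWiring {
  publisher: EventPublisher | null; // null: the Kafka publisher was never built
  relayRuns: boolean;
}

function wireRelay(
  env: Env,
  makeKafkaPublisher: (brokers: string[]) => EventPublisher,
): RelayWiring {
  const brokers = (env["KAFKA_BROKERS"] ?? "")
    .split(",")
    .map((b) => b.trim())
    .filter((b) => b.length > 0);

  // Gate 1: the publisher is built only when KAFKA_BROKERS is set.
  const publisher = brokers.length > 0 ? makeKafkaPublisher(brokers) : null;

  // Gate 2: the relay runs only when OUTBOX_PUBLISHER_ENABLED=true.
  // Assumption: with the flag on but no publisher built, the relay stays off.
  const relayRuns =
    env["OUTBOX_PUBLISHER_ENABLED"] === "true" && publisher !== null;

  return { publisher, relayRuns };
}
```

With neither variable set, which is the production default this ADR describes, the function returns `{ publisher: null, relayRuns: false }` and the table pollers remain the only transport.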

Consequences

Positive

  • Zero broker infra to run, secure, monitor, or recover — fewer moving parts for a small team (operational simplicity > architectural fashion).
  • Delivery semantics are simple, auditable, and effectively exactly-once: the outbox row is the durable record, SKIP LOCKED claims keep concurrent poller instances from taking the same row, and per-consumer state makes reprocessing idempotent.
  • No dual-write problem (the outbox + state share one tx, ADR-008).
  • The swap seam is preserved, so adopting Kafka later is a config flip + a consumer, not a rewrite.
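The idempotency point above can be made concrete with an in-memory sketch of per-consumer state. The `ConsumerState` class and `OutboxEvent` type are hypothetical, and a `Set` stands in for what would be durable state in a real poller.

```typescript
type OutboxEvent = { id: number; type: string };

// Tracks which events one consumer has already handled.
class ConsumerState {
  private readonly acked = new Set<number>();

  // Runs the handler at most once per event id.
  // Returns true if the handler ran, false if the event was a duplicate.
  handleOnce(event: OutboxEvent, handler: (e: OutboxEvent) => void): boolean {
    if (this.acked.has(event.id)) {
      return false; // redelivered event: skip, side effects already applied
    }
    handler(event);
    // The ack is recorded only after the handler succeeds, so a handler that
    // throws leaves the event unacknowledged and eligible for retry.
    this.acked.add(event.id);
    return true;
  }
}

// Usage: the same event delivered twice triggers the handler once.
const state = new ConsumerState();
let notificationsSent = 0;
const evt: OutboxEvent = { id: 1, type: "payroll.deduction.created" };
const first = state.handleOnce(evt, () => { notificationsSent += 1; });
const second = state.handleOnce(evt, () => { notificationsSent += 1; });
```

In a database-backed poller the acknowledgement would presumably be written in the same transaction as the handler's own writes, so a crash between the two cannot leave them out of step.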

Negative / accepted trade-offs

  • Polling adds latency on the order of seconds compared with push delivery. This is acceptable for every current flow; revisit per consumer if a sub-second SLA appears.
  • Table polling does not scale to very high event throughput or fan-out to many independent subscribers — which is precisely the signal that will justify activating Kafka for that workload.
  • Cross-process replay/streaming is not available until Kafka is activated. No current consumer needs it.

Revisit when

  • A domain needs ordered, replayable, high-volume streaming (Wallet ledger projection, Risk feature pipeline) → activate the existing Kafka publisher + add that consumer.
  • A consumer needs delivery the table poller can't serve at volume → reassess (still likely Kafka, not a second broker).