04 — How the backend works
This document explains the runtime architecture: what happens between the moment an HTTP request lands and the moment a money operation commits.
For the full operator-level picture (every container, every port, every API surface, every database, plus three end-to-end traces: EWA disburse, daily reconciliation, and what happens at `pnpm dev:api` boot), see `docs/architecture/SYSTEM_TOPOLOGY.md`. That doc is the single most concrete reference; this one is the conceptual primer.
The architecture in one picture
┌────────────────────────────────────────────────────────────────────┐
│                 Web apps + curl / partner systems                  │
└──────────────────────────────┬─────────────────────────────────────┘
                               │ HTTPS
                               ▼
              ┌──────────────────────────────────┐
              │     NestJS modular monolith      │
              │     apps/api (TypeScript)        │
              │                                  │
              │   ┌─────────────────────────┐    │
              │   │ /api/auth/*             │    │
              │   │ better-auth handler     │    │
              │   │ (Express-mounted        │    │
              │   │  BEFORE Nest)           │    │
              │   └─────────────────────────┘    │
              │                                  │
              │   ┌─────────────────────────┐    │
              │   │ /api/* (Nest routes)    │    │
              │   │ • SessionMiddleware     │    │
              │   │ • TenantContextMW       │    │
              │   │ • AuthGuard (default-   │    │
              │   │   deny; @Public()       │    │
              │   │   opt-out)              │    │
              │   │ • Controller            │    │
              │   │ • UseCase               │    │
              │   │ • PrismaTxnRunner       │    │
              │   │     ↓ writes to:        │    │
              │   │       - domain table    │    │
              │   │       - audit_entry     │    │
              │   │       - outbox_event    │    │
              │   └─────────────────────────┘    │
              └──────┬────────────────────┬──────┘
                     │                    │
                     │ gRPC               │ Postgres
                     │ (PostTransaction,  │ (Prisma)
                     │ GetBalance, …)     │
                     ▼                    ▼
        ┌────────────────────────┐  ┌────────────────────────┐
        │ services/ledger        │  │ Postgres (single DB    │
        │ Go gRPC                │  │ for the monolith)      │
        │ Money source of truth  │  │ RLS forced on the 5    │
        │ Own Postgres DB        │  │ financial tables       │
        │ Append-only journal    │  │                        │
        └────────────────────────┘  └────────────────────────┘
                                                │
                                                │ Background drain
                                                ▼
                                    ┌────────────────────────┐
                                    │ OutboxPublisherService │
                                    │ (worker in apps/api)   │
                                    │ Uses BYPASSRLS role    │
                                    │ via OUTBOX_DATABASE_URL│
                                    └────────────────────────┘
                                                │
                                                │ Kafka producer
                                                ▼
                                    ┌────────────────────────┐
                                    │ Redpanda / Kafka       │
                                    │ Topic per bounded ctx  │
                                    └────────────────────────┘
                                                │
                                                │ Consumers
                                                ▼
                   ┌─────────────────────────────────────────────────┐
                   │ services/notifications        [Stub]            │
                   │ services/integration-gateway  [Stub]            │
                   │ Future: reconciliation engine, BI sink          │
                   └─────────────────────────────────────────────────┘
Status of each box, brutally honest:
| Box | Status |
|---|---|
| NestJS monolith (`apps/api`) | Live: boots cleanly, all routes registered, health/metrics work, default-deny auth works |
| `/api/auth/*` better-auth handler | Partial: email/password + organization plugin Live (sign-up E2E proven); phone OTP Partial (`LoggingSmsSender` is dev-only); 2FA Planned |
| Postgres (monolith DB) | Live: 9 migrations apply clean; RLS forced + verify-guarded on 5 tables |
| `services/ledger` | Partial: schema + DB invariants Live; Go server Compile-only on most hosts |
| `services/integration-gateway` | Stub |
| `services/notifications` | Stub |
| `OutboxPublisherService` | Live (gated by env; disabled by default) |
| Redpanda / Kafka | Live for local dev (docker-compose has it); not exercised in any consumer yet |
The two-language ceiling (ADR-010)
DemozPay uses only TypeScript and Go. Each has its strengths and the choice for each piece is deliberate:
| TypeScript / NestJS owns | Go owns |
|---|---|
| HTTP API surface | Money posting (the ledger) |
| Auth (better-auth) | Reconciliation engine (planned) |
| Domain orchestration | Settlement engine (planned) |
| Business workflows | High-throughput financial operations |
| Tenant context plumbing | Anything where GC pauses or runtime surprises are unacceptable |
| Prisma + frontend BFF | Long-lived background workers |
The rule isn't "Go for fast, TS for slow." It's:
- Money correctness paths → Go. Predictable runtime, no GC surprises, easy to audit.
- Domain orchestration → TS. Fast iteration, type-safe domain modeling, large ecosystem.
When you find yourself wanting to introduce a third language (Java, Python, Rust, Kotlin), stop and write an ADR. ADR-010 explicitly rejects this.
The request flow for a financial route
This is what happens when a request lands on POST /api/ewa/requests:
1. HTTPS request arrives. Express receives it.
2. Express's path matcher: does it match /api/auth/* ?
     YES → forward to better-auth's toNodeHandler. RETURN.
     NO  → pass to NestJS router.
3. NestJS router matches /api/ewa/requests to EwaController.create.
4. NestJS middleware chain (in declared order):
   a. SessionMiddleware (apps/api/src/identity/auth/session.middleware.ts)
      - Reads the better-auth.session_token cookie
      - Calls auth.api.getSession({ headers })
      - Sets req.user = {
          id, email, emailVerified,
          businessId = session.session.activeOrganizationId,
          role
        }
      - If no session: leaves req.user undefined; passes through.
   b. TenantContextMiddleware (apps/api/src/identity/tenant/...)
      - Resolves tenantId from, in order:
          req.user.businessId
          req.user.tenantId
          req.headers['x-tenant-id']
      - If found: runs the rest of the request inside
          runWithTenant({tenantId}, next)
        which establishes an AsyncLocalStorage frame.
      - If not found: passes through; downstream getTenantId()
        will return undefined.
5. NestJS guard chain:
   a. AuthGuard (registered as APP_GUARD)
      - If @Public() decorator on the controller/handler: pass.
      - Otherwise: req.user.id must be set or throw 401.
6. NestJS controller invocation:
     EwaController.create({ body, headers })
     body validated via Zod / class-validator (DTOs)
     idempotencyKey extracted from header
7. UseCase call:
     RequestEwaUseCase.execute({
       tenantId: getTenantId(),
       employeeId, payPeriodId, amount, idempotencyKey
     })
8. UseCase orchestration (all inside one PrismaTransactionRunner):
   - SELECT set_config('app.tenant_id', $tenantId, true)   [RLS]
   - Read accrued earnings via AccruedEarningsPort
   - Compute fee via EligibilityPolicy
   - Check IdempotencyStore: have we seen this key before?
       YES → return cached result; commit no-op
       NO  → continue
   - Insert ewa_request row
   - Call LedgerGrpcClient.postTransaction(...) over gRPC
     (this hits services/ledger; if it isn't running, 500)
   - Append audit_entry row
   - Append outbox_event row
   - Insert idempotency_record row with the result
   - Commit. The deferred trigger ledger_assert_balanced
     runs at COMMIT inside the ledger's own txn.
9. NestJS response handling:
   - HttpMetricsInterceptor records latency + count
   - Pino logs with trace_id/span_id from OpenTelemetry
   - Response sent
Every step is testable. Steps 1–6 are exercised by the live
sign-up demo (you can curl your way through them yourself per
02-running-locally.md). Steps 7–9 require the Go ledger to be
running.
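The tenant resolution in step 4b can be sketched as follows. This is a minimal illustration, assuming `runWithTenant` and `getTenantId` are thin wrappers over Node's `AsyncLocalStorage`; the `RequestLike` type and the `resolveTenantId` and `tenantContext` helpers are hypothetical names for this sketch, not the repo's actual code.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical request shape for illustration; the real middleware sees Express requests.
interface RequestLike {
  user?: { businessId?: string; tenantId?: string };
  headers: Record<string, string | undefined>;
}

const tenantStore = new AsyncLocalStorage<{ tenantId: string }>();

// Runs `fn` inside an AsyncLocalStorage frame that carries the tenant.
function runWithTenant<T>(ctx: { tenantId: string }, fn: () => T): T {
  return tenantStore.run(ctx, fn);
}

// Returns the tenant of the current frame, or undefined outside any frame.
function getTenantId(): string | undefined {
  return tenantStore.getStore()?.tenantId;
}

// Resolution order from step 4b: businessId, then tenantId, then the x-tenant-id header.
function resolveTenantId(req: RequestLike): string | undefined {
  return req.user?.businessId ?? req.user?.tenantId ?? req.headers["x-tenant-id"];
}

// Middleware-shaped function: establish the frame only when a tenant was found.
function tenantContext(req: RequestLike, next: () => void): void {
  const tenantId = resolveTenantId(req);
  if (tenantId) {
    runWithTenant({ tenantId }, next);
  } else {
    next(); // downstream getTenantId() returns undefined
  }
}
```

Because the frame is scoped to the callback, code that runs after the request finishes sees no tenant, which is what lets the runner treat a missing tenant as "unset" rather than inheriting a stale one.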
The transactional spine
The single most important piece of code in the platform:
apps/api/src/_infra/shared-infra/prisma-transaction-runner.ts
async runInTransaction<T>(
  work: (tx: Prisma.TransactionClient) => Promise<T>
): Promise<T> {
  const tenantId = getTenantId(); // from AsyncLocalStorage
  return this.prisma.$transaction(async (tx) => {
    // ① Set app.tenant_id for RLS (LOCAL: reverts when the txn ends, commit or rollback)
    await tx.$executeRaw`SELECT set_config('app.tenant_id', ${tenantId}, true)`;
    // ② Hand the typed tx to the use case
    return work(tx);
  });
}
What this gives you for free, inside work(tx):
- Tenant isolation — RLS sees the right tenant on every query.
- Atomicity — domain write + audit + outbox all commit or all roll back together (ADR-008).
- Idempotency — the idempotency store insert and the domain write are in the same tx; a retry cannot half-commit.
Rule: every money-touching write goes through the runner. If
you find yourself calling prisma.something.create() directly in
a domain use case, you're bypassing the runner — and you've also
bypassed RLS, audit, and the outbox. Don't.
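To make the rule concrete, here is a small in-memory model of the atomicity the runner provides. It is not the Prisma implementation: `FakeDb`, `Tx`, and `requestEwa` are hypothetical names used only to show that domain, audit, and outbox writes land together or not at all, and amounts are plain numbers for brevity.

```typescript
type Row = Record<string, unknown>;
type Table = "ewa_request" | "audit_entry" | "outbox_event";

// Writes staged through `tx` become visible only if `work` returns normally.
interface Tx {
  insert(table: Table, row: Row): void;
}

class FakeDb {
  readonly tables: Record<Table, Row[]> = { ewa_request: [], audit_entry: [], outbox_event: [] };

  runInTransaction<T>(tenantId: string, work: (tx: Tx) => T): T {
    const staged: Array<[Table, Row]> = [];
    const tx: Tx = {
      // Stamping tenantId here stands in for the app.tenant_id GUC (an assumption of this model).
      insert: (table, row) => { staged.push([table, { ...row, tenantId }]); },
    };
    const result = work(tx); // a throw here skips the commit loop below
    for (const [table, row] of staged) this.tables[table].push(row);
    return result;
  }
}

// A use case writes all three rows through the same tx, never through the db directly.
function requestEwa(db: FakeDb, tenantId: string, amountSantim: number, failAfterAudit = false): string {
  return db.runInTransaction(tenantId, (tx) => {
    const id = `ewa_${db.tables.ewa_request.length + 1}`;
    tx.insert("ewa_request", { id, amountSantim });
    tx.insert("audit_entry", { action: "ewa.requested", subjectId: id });
    if (failAfterAudit) throw new Error("simulated failure before the outbox write");
    tx.insert("outbox_event", { type: "ewa.requested", subjectId: id });
    return id;
  });
}
```

A failure after the audit write leaves all three tables unchanged, which is the property a direct `prisma.something.create()` call outside the runner would lose.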
The ledger (services/ledger)
The ledger is a separate Go service with its own Postgres database. Two reasons (ADR-006):
- Blast radius isolation. A bug in the EWA domain can't corrupt the ledger. A bug in the ledger code can't corrupt EWA data.
- Different durability story. The ledger is append-only at the database level (triggers reject UPDATE/DELETE). It can use a different backup strategy, different replicas, different access patterns.
The ledger schema (services/ledger/migrations/0001_init.up.sql)
ledger_account          Chart of accounts, per tenant
  (id, tenant_id, code, name, type, currency)
  type ∈ {ASSET, LIABILITY, EQUITY, REVENUE, EXPENSE}
ledger_transaction      Journal headers
  (id, tenant_id, idempotency_key, request_fingerprint,
   description, value_date, posted_at, reverses_transaction_id,
   metadata)
  UNIQUE (tenant_id, idempotency_key)
ledger_entry            Journal lines (debits and credits)
  (id BIGINT, transaction_id, tenant_id, account_id,
   direction (DEBIT|CREDIT), amount_santim NUMERIC(20,0),
   currency, created_at)
ledger_account_balance  Derived view, never a column
  (computes signed balance from entries using account-type sign rules)
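As an illustration of what "account-type sign rules" could mean for the `ledger_account_balance` view, here is a sketch that assumes the conventional double-entry rule (ASSET and EXPENSE grow on debit; LIABILITY, EQUITY, and REVENUE grow on credit). The view's actual rules live in the migration; `signedBalance` and `Entry` are hypothetical names, and amounts use `number` for brevity where `NUMERIC(20,0)` would call for a wider integer type.

```typescript
type AccountType = "ASSET" | "LIABILITY" | "EQUITY" | "REVENUE" | "EXPENSE";

interface Entry {
  direction: "DEBIT" | "CREDIT";
  amountSantim: number; // integer santim; simplified from NUMERIC(20,0)
}

// Assumption: conventional signs, debit-normal for ASSET/EXPENSE, credit-normal otherwise.
function signedBalance(type: AccountType, entries: Entry[]): number {
  const debitNormal = type === "ASSET" || type === "EXPENSE";
  let balance = 0;
  for (const e of entries) {
    // An entry increases the balance when its direction matches the account's normal side.
    const increases = (e.direction === "DEBIT") === debitNormal;
    balance += increases ? e.amountSantim : -e.amountSantim;
  }
  return balance;
}
```

Deriving the balance from entries on every read, rather than storing it, means there is no stored column that could drift from the journal.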
DB-level invariants — enforced regardless of service code:
- Balanced: `ledger_assert_balanced` deferred constraint trigger raises at COMMIT if debits ≠ credits per currency.
- Append-only: `ledger_block_mutation` trigger raises on any UPDATE/DELETE.
- Idempotent: `UNIQUE(tenant_id, idempotency_key)` rejects duplicate posts.
- Single reversal: partial UNIQUE index on `(tenant_id, reverses_transaction_id)` rejects double-reversal.
- Tenant isolated: RLS forced on all 3 ledger tables.
These are all runtime-proven via psql probes on this host. You
can re-run them with the verification harness at
services/ledger/test/verify.sh.
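The "Balanced" invariant is easy to mirror on the caller side as a pre-flight check. The sketch below is an assumption-laden illustration, not the trigger's code: `Leg` and `unbalancedCurrencies` are hypothetical names, and the database trigger remains the authority.

```typescript
interface Leg {
  direction: "DEBIT" | "CREDIT";
  amountSantim: number; // integer santim; simplified from NUMERIC(20,0)
  currency: string;
}

// Returns the currencies whose debits and credits differ; an empty array means balanced.
function unbalancedCurrencies(legs: Leg[]): string[] {
  const net = new Map<string, number>();
  for (const leg of legs) {
    const signed = leg.direction === "DEBIT" ? leg.amountSantim : -leg.amountSantim;
    net.set(leg.currency, (net.get(leg.currency) ?? 0) + signed);
  }
  return Array.from(net.entries())
    .filter(([, total]) => total !== 0)
    .map(([currency]) => currency)
    .sort();
}
```

Note that the check is per currency: a 100 ETB debit against a 100 USD credit is unbalanced in both currencies even though the raw numbers cancel.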
The ledger's RPCs (packages/contracts/grpc/ledger.proto)
| RPC | What it does | Status |
|---|---|---|
| `PostTransaction` | Atomic multi-leg insert; idempotent via key+fingerprint | code written |
| `GetBalance` | Read derived balance (current or as-of) | code written |
| `Reverse` | Compensating-entry txn with double-reversal lockout | code written |
| `GetEntries` | Paginated journal scan with opaque cursor | code written |
| `ReconcileAccount` | Independent Go-side sum vs view; returns drift | code written |
All five are implemented in services/ledger/internal/server/. None
have been run against a live Go server on this host (no Go
toolchain). Schema-level invariants ARE proven live.
The outbox pattern (ADR-008)
The transactional outbox is how DemozPay does cross-service messaging without losing events. The pattern:
1. Inside the runner's transaction, AS PART of the same commit:
   - Insert the domain row (e.g. ewa_request)
   - Insert an outbox_event row with the event payload
   - Insert an audit_entry row
   If the txn rolls back, ALL three roll back. If it commits, ALL
   three commit. No half-states.
2. A SEPARATE process (OutboxPublisherService in apps/api):
   - Polls outbox_event WHERE publishedAt IS NULL
   - Claims a batch with FOR UPDATE SKIP LOCKED (multi-instance safe)
   - Publishes each row to Kafka
   - Marks publishedAt = now()
   - Commits the mark (the Kafka publish itself is outside the DB txn)
   If the publisher crashes between publish and mark, the row stays
   unpublished and gets re-tried on the next tick, so delivery is
   at-least-once. Consumers must be idempotent (the event id is stable).
3. Consumers:
   - services/notifications: send SMS / email / push
   - services/integration-gateway: trigger bank/wallet calls
   - Future: reconciliation, BI
Critical: the publisher must use a separate DB role with
BYPASSRLS. With tenant RLS active, the API role cannot see other
tenants' outbox rows. The role provisioning is at
infra/sql/00_create_outbox_publisher_role.sql. See ADR-013.
If OUTBOX_DATABASE_URL is unset, the publisher falls back to the
API role and emits a loud WARN at boot. Read those warnings.
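The publish-then-mark loop and the consumer-side requirement can be modeled in a few lines. This is an in-memory sketch under stated assumptions: `drainOnce`, `makeIdempotentConsumer`, and the `crashBeforeMark` hook are hypothetical, the array filter stands in for the `FOR UPDATE SKIP LOCKED` claim, and no Kafka client is involved.

```typescript
interface OutboxRow {
  id: string;
  topic: string;
  payload: string;
  publishedAt: Date | null;
}

// One publisher tick: claim up to `batchSize` unpublished rows, publish each, then mark it.
// Returns how many rows were marked as published.
function drainOnce(
  rows: OutboxRow[],
  batchSize: number,
  publish: (row: OutboxRow) => void,
  crashBeforeMark: (row: OutboxRow) => boolean = () => false,
): number {
  const batch = rows.filter((r) => r.publishedAt === null).slice(0, batchSize);
  let marked = 0;
  for (const row of batch) {
    publish(row);
    if (crashBeforeMark(row)) return marked; // simulated crash: published but never marked
    row.publishedAt = new Date();
    marked += 1;
  }
  return marked;
}

// Consumer side: dedupe on the stable event id so a redelivery is harmless.
function makeIdempotentConsumer(handle: (row: OutboxRow) => void): (row: OutboxRow) => void {
  const seen = new Set<string>();
  return (row) => {
    if (seen.has(row.id)) return;
    seen.add(row.id);
    handle(row);
  };
}
```

If a tick crashes after publishing a row but before marking it, the next tick publishes that row again; the consumer's id check is what turns that duplicate delivery into a no-op.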
Auth (better-auth)
DemozPay uses better-auth for authentication. Why:
- Self-hosted (no SaaS lock-in, important for Ethiopia compliance)
- Phone-OTP plugin (primary auth path for our market)
- Organization plugin (maps cleanly to `tenantId == businessId`)
- Prisma adapter
- 2FA plugin (TwoFactor table exists; plugin not yet wired; status: Planned)
The wiring:
apps/api/src/identity/auth/
├── better-auth.factory.ts Constructs the better-auth instance
├── auth.module.ts NestJS DI wiring
├── auth.tokens.ts DI tokens
├── session.middleware.ts Populates req.user from the session cookie
├── auth.guard.ts Default-deny APP_GUARD; respects @Public()
├── public.decorator.ts The opt-out for health/metrics/root
└── sms-sender.ts SmsSender interface + LoggingSmsSender
The handler is mounted at /api/auth/* on the underlying Express
app, BEFORE NestJS's router takes effect:
// apps/api/src/main.ts
const expressApp = app.getHttpAdapter().getInstance();
expressApp.all('/api/auth/*splat', toNodeHandler(auth));
This is why you won't find an "auth controller" in NestJS terms.
better-auth owns the entire /api/auth/* subtree.
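For the routes that do go through NestJS, the default-deny behaviour attributed to `auth.guard.ts` and `public.decorator.ts` above can be modeled without the framework. This is only a sketch of the decision logic: `GuardInput`, `UnauthorizedError`, and this free-standing `canActivate` are hypothetical, and the real guard reads the `@Public()` metadata through NestJS rather than from a boolean.

```typescript
interface GuardInput {
  isPublic: boolean;       // true when the handler or controller carries @Public()
  user?: { id?: string };  // populated by SessionMiddleware when a session exists
}

class UnauthorizedError extends Error {
  readonly status = 401;
}

// Default-deny: anything not explicitly public needs an authenticated user id.
function canActivate(input: GuardInput): boolean {
  if (input.isPublic) return true;
  if (input.user?.id) return true;
  throw new UnauthorizedError("authentication required");
}
```

The important property is the direction of the default: a new route with no decorator is protected, and exposing it takes a deliberate opt-out.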
The Organization.id == Business.id invariant
Session.activeOrganizationId
== Organization.id
== Business.id
== app.tenant_id (the GUC value RLS uses)
This identity-equation runs all the way down. Set up in commit #3 of the better-auth integration:
- `BusinessService.create()` runs Business + Organization creation in a single `$transaction` with the same id.
- A backfill migration (`20260526050000_bootstrap_organizations`) filled in matching Organizations for existing Businesses, with a verify guard.
- `Organization.id` has a FK to `Business.id` to enforce the invariant at the DB level.
The benefit: zero transformation layer. The tenant is whatever the user's active org is, and the active org IS the Business.
Tenant isolation (ADR-013)
Read ADR-013 for the full canonical doc. Quick version:
- Every financial-tier table has a `tenantId` column and an RLS policy: `"tenantId" = current_setting('app.tenant_id', true)`.
- `FORCE ROW LEVEL SECURITY`: even the table owner is bound.
- `current_setting('app.tenant_id', true)` returns NULL when unset; any row's `tenantId` compared to NULL returns NULL → no rows match → fail-closed.
- The runner sets the GUC inside every txn via parameterized SQL.
- The 5 financial tables (`ewa_request`, `loan`, `outbox_event`, `idempotency_record`, `audit_entry`) are under RLS.
- The 8 identity tables (`User`, `Session`, `Account`, `Verification`, `TwoFactor`, `Organization`, `Member`, `Invitation`) are INTENTIONALLY excluded. They need cross-tenant reads by design.
- Legacy financial tables (`Business`, `Employee`, `Payroll`, `Wallet`, etc.) are NOT yet under RLS. They're scoped only by application-level `WHERE`. This is the largest standing isolation risk, flagged in ADR-013.
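The fail-closed behaviour rests on SQL's three-valued logic, which is easy to get wrong when reasoning in application terms. The sketch below models the policy predicate in TypeScript purely for illustration; `sqlEquals` and `visibleRows` are hypothetical helpers, and Postgres evaluates the real policy, not application code.

```typescript
type Sql3 = boolean | null; // SQL boolean: TRUE, FALSE, or NULL (unknown)

// SQL `=`: any comparison involving NULL yields NULL, never TRUE.
function sqlEquals(a: string | null, b: string | null): Sql3 {
  return a === null || b === null ? null : a === b;
}

// A row passes the policy only when the predicate is exactly TRUE.
// `appTenantId` models current_setting('app.tenant_id', true), which is NULL when unset.
function visibleRows<T extends { tenantId: string }>(rows: T[], appTenantId: string | null): T[] {
  return rows.filter((row) => sqlEquals(row.tenantId, appTenantId) === true);
}
```

With the setting unset, every comparison is NULL and the result is empty, so forgetting to set the tenant returns no data instead of all data.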
Idempotency (ADR-007)
Two layers of idempotency, both Live at the ledger:
- API gateway layer: `Idempotency-Key` header on money-moving POSTs. Stored in `idempotency_record` table with `(tenantId, scope, key)` composite PK. A duplicate request hits `INSERT ... ON CONFLICT DO NOTHING` → returns the cached result from the original. Concurrent duplicates fail loudly.
- Ledger layer: `(tenant_id, idempotency_key)` UNIQUE on `ledger_transaction`. Plus a `request_fingerprint` column for the second arm of the contract:
  - Same key + same fingerprint → return cached transaction
  - Same key + DIFFERENT fingerprint → FailedPrecondition
The two layers are independent so a retry between API and ledger is also safe.
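The ledger layer's two-arm contract can be sketched with an in-memory store. This is a model of the decision table only, under the assumption that the fingerprint is an opaque string computed elsewhere; `LedgerIdempotency`, `PostResult`, and the `txn_` id format are hypothetical names, not the Go service's API.

```typescript
type PostResult =
  | { kind: "created"; transactionId: string }
  | { kind: "replayed"; transactionId: string }
  | { kind: "failed_precondition"; reason: string };

interface StoredPost {
  fingerprint: string;
  transactionId: string;
}

class LedgerIdempotency {
  private readonly byKey = new Map<string, StoredPost>();
  private nextId = 1;

  post(tenantId: string, idempotencyKey: string, fingerprint: string): PostResult {
    // Composite key models UNIQUE (tenant_id, idempotency_key).
    const key = JSON.stringify([tenantId, idempotencyKey]);
    const existing = this.byKey.get(key);
    if (!existing) {
      const transactionId = `txn_${this.nextId++}`;
      this.byKey.set(key, { fingerprint, transactionId });
      return { kind: "created", transactionId };
    }
    if (existing.fingerprint === fingerprint) {
      // Same key + same fingerprint: hand back the original transaction.
      return { kind: "replayed", transactionId: existing.transactionId };
    }
    // Same key + different fingerprint: the caller reused a key for a different request.
    return { kind: "failed_precondition", reason: "idempotency key reused with a different request" };
  }
}
```

The fingerprint arm is what distinguishes an honest retry from a client bug that recycles keys; without it, the second request would silently receive the first request's result.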
Observability
| Concern | Status | Where |
|---|---|---|
| Structured logging | Live | apps/api/src/_infra/observability/pino-logger.ts (pino + trace correlation) |
| Distributed tracing | Live (gated) | apps/api/src/_infra/observability/tracing.ts — OTel SDK, exports if OTEL_EXPORTER_OTLP_ENDPOINT set. No OTLP backend wired in dev. |
| Metrics | Live | apps/api/src/_infra/observability/metrics/ — prom-client, custom + Node defaults, /api/metrics endpoint |
| Health probes | Live | apps/api/src/_infra/health/ — /healthz (liveness), /readyz (deps), startup checks at boot |
Custom metrics worth knowing:
- `demozpay_http_requests_total{method,route,status_code}`
- `demozpay_http_request_duration_seconds{...}` (histogram with fintech-tuned buckets)
- `demozpay_ledger_grpc_duration_seconds{rpc,status}`
- `demozpay_outbox_unpublished_total`
- `demozpay_outbox_oldest_unpublished_age_seconds` ← the SLO signal
- `demozpay_dependency_up{dependency}` (1=up, 0=down, -1=skipped)
Cardinality discipline (enforced by code review): NO userId, tenantId, idempotency_key, raw URL labels. Route TEMPLATES only, exact status codes only.
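One way to keep that discipline mechanical is to build label sets through a single helper that only accepts a route template. The sketch below is hypothetical: `httpLabels`, `HttpObservation`, and the `"unmatched"` bucket are assumptions for illustration, and it does not show how the interceptor or prom-client are actually wired.

```typescript
interface HttpObservation {
  method: string;
  routeTemplate: string | undefined; // the matched route pattern, never the raw URL
  statusCode: number;
}

// Builds the label set for demozpay_http_requests_total{method,route,status_code}.
function httpLabels(obs: HttpObservation): { method: string; route: string; status_code: string } {
  return {
    method: obs.method.toUpperCase(),
    // Assumption: unmatched requests share one bucket so raw URLs never become label values.
    route: obs.routeTemplate ?? "unmatched",
    status_code: String(obs.statusCode),
  };
}
```

Since the helper has no parameter for a user, tenant, or idempotency key, adding one of those as a label requires changing its signature, which is easy to catch in review.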
Continue reading
Next: 05-the-frontend-apps.md for a
short tour of the web apps, or skip to
06-status-matrix.md for the honest
matrix.