DemozPay — System Gap Action Plan

Companion to: SYSTEM_GAP.md
Snapshot: 2026-05-28
Purpose: sequenced, sprint-level plan to close every gap in SYSTEM_GAP.md and transition the platform from wallet-bookkeeping to bank-orchestrator reality.
No commits while standing instruction is in force. Code changes sit in the working tree.

How this document relates to SYSTEM_GAP.md

  • SYSTEM_GAP.md is the inventory — every gap, every checkbox, every architecturally-invalid line.
  • SYSTEM_GAP_ACTION_PLAN.md (this file) is the execution plan — sprint windows, who does what, what "done" looks like, what blocks what.

When a gap is closed, you tick it in both files: the entry in SYSTEM_GAP.md AND the corresponding row here.


Executive summary

The transition is a 4-sprint program (~8 calendar weeks at 1 senior backend engineer; ~4 weeks at 3 engineers in parallel).

| Sprint | Calendar | Theme | Outcome |
| --- | --- | --- | --- |
| S1 | weeks 1–2 | Ledger + ports + domain: the foundation | Ledger supports PENDING/POSTED/FAILED. Use cases pre-commit. Domain knows bank states. Nothing actually moves money yet; the gateway is still a stub. |
| S2 | weeks 3–4 | Integration-gateway real Go service + Dashen adapter (longest single job) | First real bank API call lands. End-to-end EWA disbursement to a Dashen test account works in staging. |
| S3 | weeks 5–6 | Settlement confirmation loop + webhook receiver + repayment | The settlement-state machine closes. Lending has repayment. We can claim "money moves end-to-end." |
| S4 | weeks 7–8 | Reconciliation + statement ingestion + legacy cleanup + first real soak | Ledger drift is measurable against the bank. Legacy wallet models out of the way. Production-readiness gate clears. |

Gate to go-live: end of S4. No real bank rail can be opened to real users until all of S1–S3 have closed AND S4's drift detection has run clean for at least 7 consecutive days.

The single biggest risk to this plan is the integration-gateway / Dashen adapter work (S2). It is the largest piece of work and the one item where external dependencies (Dashen API spec, sandbox access, signing keys) could delay everything. Mitigate by starting Dashen sandbox onboarding in parallel with S1 (week 1 day 1).


Sprint 1 — Ledger foundation + domain shift

Weeks 1–2. Single engineer can carry this. Cannot be parallelised meaningfully — items depend on each other.

Goal

After S1, the codebase models the bank-orchestrator reality at the type level. Use cases pre-commit ledger entries as PENDING, call the gateway, handle accepted/rejected outcomes, and emit lifecycle-accurate events. The gateway is still a stub, so nothing actually moves money — but every layer above it is correct.

Tasks in execution order

| # | Gap | Task | Days | Acceptance criteria |
| --- | --- | --- | --- | --- |
| S1.1 | GAP-01a | Write migration 0003_pending_posted.up.sql. Adds status column, transition function, replaces append-only trigger with column-immutability. | 0.5 | Migration applies clean against hermetic PG. Verify guard catches illegal transitions (test: try POSTED → PENDING, must raise). |
| S1.2 | GAP-01e | Update packages/contracts/grpc/ledger.proto. Add LedgerTransactionStatus enum, add ConfirmSettlement + MarkSettlementFailed RPCs, add status field on PostTransactionRequest. | 0.5 | buf lint passes. buf generate produces stubs (run on Go-equipped host). |
| S1.3 | GAP-01b | services/ledger/internal/server/post_transaction.go accepts status parameter (default POSTED for back-compat, PENDING allowed). | 0.5 | Integration test: post with PENDING → row visible, status='PENDING', balance trigger still fires at COMMIT. |
| S1.4 | GAP-01c | New services/ledger/internal/server/confirm_settlement.go: RPC ConfirmSettlement(tenant_id, tx_id, partner_reference). Calls the SQL transition function PENDING → POSTED. Idempotent (calling on an already-POSTED tx returns success, calling on REVERSED/FAILED returns FailedPrecondition). | 0.5 | Integration tests: confirm twice → both succeed (idempotent). Confirm a FAILED tx → FailedPrecondition. |
| S1.5 | GAP-01d | New services/ledger/internal/server/mark_failed.go: RPC MarkSettlementFailed(tenant_id, tx_id, reason). Calls SQL transition PENDING → FAILED. Idempotent in the same shape. | 0.5 | Integration tests: same idempotency contract. |
| S1.6 | GAP-02 | Extend DisbursementPort (both EWA + lending versions) with getStatus(reference) method and richer DisbursementResult (bankStatus, acceptedAt, failure fields). | 0.5 | EWA + lending unit tests still pass with updated in-memory adapters. Type-check passes. |
| S1.7 | GAP-03a / GAP-03b | Add bank-state enum values to EwaStatus and LoanStatus. Update transition tables. | 1 | Unit tests cover every new transition path. |
| S1.8 | GAP-04a / GAP-04b | Rewrite apps/api/src/products/ewa/ledger-accounts.adapter.ts and apps/api/src/products/lending/ledger-accounts.adapter.ts to use bank-orchestrator account taxonomy. Drop cashAccountId. Add payableToBusinessBankAccountId, payableToFiPartnerAccountIdByFi, payrollClearingAccountId. | 1 | Type-check passes against new use-case shape. Unit tests pass. |
| S1.9 | GAP-05 | Rewrite DisburseEwaUseCase and DisburseLoanUseCase: pre-commit ledger as PENDING → call gateway → on ACCEPTED leave PENDING + emit *_accepted.v1 + persist SUBMITTED_TO_BANK → on REJECTED MarkSettlementFailed + Reverse + persist BANK_REJECTED + emit *_failed.v1. Settlement confirmation (PENDING→POSTED) is NOT in the use case; it belongs to the poller (S3). | 1.5 | Unit tests cover: happy ACCEPTED path, REJECTED-at-submit path, gateway-throws path. In-memory adapters simulate each. |
| S1.10 | GAP-06a / GAP-06b | Rename outbox events in packages/ewa/backend/domain/events.ts + packages/lending/backend/domain/events.ts. | 0.5 | Old event names removed entirely. Tests assert the new names. |

S1 total: ~7 engineer-days. Closes 6 of 12 gap headlines (GAP-01 through GAP-06) structurally, but downstream verification waits for S2.
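
The status contract in S1.4 and S1.5 can be sketched as a small pure function. This is only an in-memory model under stated assumptions: the plan implements the real guard as a SQL transition function, and the names `transition`, `TransitionResult`, and the `changed` flag are hypothetical. It also assumes that MarkSettlementFailed on an already-POSTED transaction is rejected, which the plan implies ("idempotent in the same shape") but does not spell out.

```typescript
// Hypothetical in-memory model of the ledger status guard.
// The plan's real guard is a SQL transition function (migration 0003_pending_posted).
type LedgerStatus = "PENDING" | "POSTED" | "FAILED" | "REVERSED";

type TransitionResult =
  | { ok: true; status: LedgerStatus; changed: boolean }
  | { ok: false; code: "FAILED_PRECONDITION"; status: LedgerStatus };

function transition(current: LedgerStatus, target: "POSTED" | "FAILED"): TransitionResult {
  if (current === target) {
    // Idempotent replay: the row is already in the requested state.
    return { ok: true, status: current, changed: false };
  }
  if (current === "PENDING") {
    // The only legal moves are PENDING -> POSTED and PENDING -> FAILED.
    return { ok: true, status: target, changed: true };
  }
  // Everything else (for example POSTED -> PENDING, or FAILED -> POSTED) is illegal.
  return { ok: false, code: "FAILED_PRECONDITION", status: current };
}

const confirmSettlement = (current: LedgerStatus) => transition(current, "POSTED");
const markSettlementFailed = (current: LedgerStatus) => transition(current, "FAILED");
```

Calling `confirmSettlement` on a PENDING row and then again on the resulting POSTED row succeeds both times, while calling it on a FAILED or REVERSED row yields the FailedPrecondition-style result.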

S1 acceptance gate

  • All S1 unit tests pass (nx test ewa, nx test lending, services/ledger/test/verify.sh).
  • nx build api clean. nx lint api clean.
  • Migration 0003_pending_posted applies + verify guard runtime-proven against hermetic PG.
  • At least one integration test on the Go ledger that exercises: post PENDING → confirm → POSTED visible; post PENDING → mark_failed → FAILED visible; balance trigger still rejects unbalanced PENDING at INSERT.
  • Honest negative test: a synchronous gateway throw mid-use-case → ledger still has PENDING + EWA marked SUBMITTED_TO_BANK (NOT DISBURSED). Reconciliation will catch it later. Documented as the expected hole until S3.
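
The three use-case paths named in S1.9 and in the negative test above can be sketched with a synchronous in-memory model. Everything here is illustrative: `disburseEwa`, `DisburseState`, and the starting status `APPROVED` are hypothetical, and the event strings are the plan's wildcard placeholders rather than concrete event names.

```typescript
// Hypothetical sketch of the S1.9 flow; not the real DisburseEwaUseCase.
type GatewayOutcome =
  | { kind: "ACCEPTED"; partnerReference: string }
  | { kind: "REJECTED"; reason: string };

interface DisburseState {
  ledger: "PENDING" | "FAILED";
  reversed: boolean;
  ewa: "APPROVED" | "SUBMITTED_TO_BANK" | "BANK_REJECTED"; // APPROVED is an assumed start state
  events: string[];
}

function disburseEwa(submitToGateway: () => GatewayOutcome): DisburseState {
  // 1. Pre-commit the ledger entry as PENDING before any bank call.
  const state: DisburseState = { ledger: "PENDING", reversed: false, ewa: "APPROVED", events: [] };

  // 2. Call the gateway; a throw means the outcome is unknown.
  let outcome: GatewayOutcome | null;
  try {
    outcome = submitToGateway();
  } catch {
    outcome = null;
  }

  if (outcome === null) {
    // Unknown outcome: keep PENDING, never claim DISBURSED. Reconciliation catches it later.
    state.ewa = "SUBMITTED_TO_BANK";
    return state;
  }
  if (outcome.kind === "ACCEPTED") {
    // Accepted is not settled: the ledger stays PENDING until the S3 poller confirms.
    state.ewa = "SUBMITTED_TO_BANK";
    state.events.push("*_accepted.v1"); // placeholder for the concrete event name
    return state;
  }
  // Rejected at submit: MarkSettlementFailed + Reverse.
  state.ledger = "FAILED";
  state.reversed = true;
  state.ewa = "BANK_REJECTED";
  state.events.push("*_failed.v1"); // placeholder for the concrete event name
  return state;
}
```

The point of the sketch is that no path sets a settled state: settlement confirmation stays outside the use case.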

Sprint 2 — Real integration-gateway + Dashen

Weeks 3–4. The single longest engineering job in the program. 1 Go engineer + parallel work on Dashen API onboarding (started week 1).

Goal

After S2, services/integration-gateway/ is a real Go gRPC service with its own Postgres database, a state machine, and at least one production-shape partner adapter (Dashen). EWA + lending disbursement reaches Dashen's sandbox end-to-end.

Parallel pre-work (start week 1)

This is the longest external blocker. Begin immediately, in parallel with S1.

| Task | Owner | Notes |
| --- | --- | --- |
| Onboard with Dashen Bank API team | BizDev + Engineering lead | Sandbox credentials, signing key provisioning, webhook URL whitelisting |
| Obtain Dashen API spec document | Engineering lead | Need: auth model, signing algorithm, request shapes for transfer-initiation + status-query, webhook payload shape + signature verification |
| Set up Dashen sandbox account + test bank-to-bank transfer (manually, via Dashen's UI) | Operations | Confirms credentials + answers "is the sandbox real money or test money?" |

If Dashen blocks for more than 7 days, escalate. The whole sprint depends on it.

Tasks in execution order

| # | Gap | Task | Days | Acceptance criteria |
| --- | --- | --- | --- | --- |
| S2.1 | GAP-07a | Scaffold services/integration-gateway/internal/{config,pg,server,store,statemachine,adapters,webhook}. Replace cmd/integration-gateway/main.go with real gRPC server + HTTP listener for webhooks. | 1 | Service boots; /health works; gRPC server listening; grpcurl shows registered methods. |
| S2.2 | GAP-07b | Migration services/integration-gateway/migrations/0001_init.up.sql: disbursement + bank_event tables per SYSTEM_GAP.md §3.7. | 0.5 | Migration applies; RLS on both tables (gateway is multi-tenant too); the same tenant_isolation pattern as the API/ledger. |
| S2.3 | GAP-07c | Implement PartnerAdapter interface and adapter registry. Move proto-defined types Rail, Account, DisbursementStatus into the Go server. | 1 | Unit test: registry resolves by partner string; unknown partner → InvalidArgument. |
| S2.4 | GAP-07d | Implement the gRPC handlers InitiateDisbursement, LookupAccount, GetDisbursementStatus, GetAdapterStatus. Each routes to the right adapter and persists disbursement rows + bank_event records. | 1.5 | Integration test against a fake adapter: idempotency works (same idempotency_key → cached row); state transitions are forward-only; concurrent InitiateDisbursement for same key serializes safely. |
| S2.5 | GAP-07e | Implement state machine: INITIATED → SUBMITTED → ACCEPTED → SETTLED plus failure branches. Updates are append-only in bank_event; disbursement.status is the materialised state. | 1 | Unit tests for every legal + illegal transition. |
| S2.6 | GAP-07f | Implement Dashen adapter in internal/adapters/dashen/. HTTP client (Go std net/http) with mTLS, request signing per Dashen's spec, response normalisation. Loads credentials from env. | 2–3 | Adapter unit tests against canned Dashen response fixtures. Integration test against Dashen sandbox: real transfer initiated, partner_reference returned, status query returns the expected lifecycle. Cannot proceed without Dashen sandbox. |
| S2.7 | GAP-07g | Wire the Dashen adapter into the partner registry. Add INTEGRATION_GATEWAY_DASHEN_BASE_URL, INTEGRATION_GATEWAY_DASHEN_API_KEY, INTEGRATION_GATEWAY_DASHEN_SIGNING_KEY to env schema. | 0.5 | Boot of gateway in dev with Dashen creds succeeds; without them, Dashen adapter is unregistered (other partners still available). |
| S2.8 | | End-to-end smoke: API → use case → gateway → Dashen sandbox → ACCEPTED received. Ledger has PENDING entry. EWA row is SUBMITTED_TO_BANK. | 1 | Manual run + recorded bank_event rows. |

S2 total: ~8.5–9.5 engineer-days (the sum of the task estimates above). Critical path. Slippage here delays everything.
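
The idempotency and forward-only rules from S2.4 and S2.5 can be modelled in a few lines. This is a TypeScript sketch for discussion, not the Go service: `GatewayStore`, `Row`, and `BankEvent` are hypothetical names, and the single `FAILED` state stands in for the "failure branches" that the plan does not enumerate.

```typescript
// Hypothetical model of the gateway state machine; FAILED is an assumed failure branch.
type DisbursementStatus = "INITIATED" | "SUBMITTED" | "ACCEPTED" | "SETTLED" | "FAILED";

// Forward-only transition table: terminal states have no successors.
const LEGAL: Record<DisbursementStatus, DisbursementStatus[]> = {
  INITIATED: ["SUBMITTED", "FAILED"],
  SUBMITTED: ["ACCEPTED", "FAILED"],
  ACCEPTED: ["SETTLED", "FAILED"],
  SETTLED: [],
  FAILED: [],
};

interface Row { id: string; status: DisbursementStatus }
interface BankEvent { disbursementId: string; from: DisbursementStatus; to: DisbursementStatus }

class GatewayStore {
  private readonly byKey = new Map<string, Row>();
  readonly bankEvents: BankEvent[] = []; // append-only history

  // Same idempotency key returns the cached row instead of creating a second one.
  initiate(idempotencyKey: string): Row {
    const existing = this.byKey.get(idempotencyKey);
    if (existing) return existing;
    const row: Row = { id: `d-${this.byKey.size + 1}`, status: "INITIATED" };
    this.byKey.set(idempotencyKey, row);
    return row;
  }

  // Appends a bank_event and updates the materialised status only for legal moves.
  advance(idempotencyKey: string, to: DisbursementStatus): boolean {
    const row = this.byKey.get(idempotencyKey);
    if (!row || LEGAL[row.status].indexOf(to) === -1) return false;
    this.bankEvents.push({ disbursementId: row.id, from: row.status, to });
    row.status = to;
    return true;
  }
}
```

Keeping the history in `bankEvents` and treating `Row.status` as a derived value mirrors the plan's split between the append-only bank_event table and the materialised disbursement.status. Concurrency control (serialising two callers with the same key) is out of scope for this sketch.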

S2 acceptance gate

  • services/integration-gateway boots as a real gRPC service.
  • Migration applied cleanly. RLS on disbursement + bank_event.
  • Dashen sandbox transfer initiated end-to-end, bank_event row visible.
  • nx run integration-gateway:test passes (if a hermetic test harness mirroring the ledger's is built; recommended).
  • Ledger has a PENDING entry for the test transfer. Nothing has been settled yet (that's S3).

Sprint 3 — Settlement loop + repayment + webhooks

Weeks 5–6. Parallelisable: GAP-08 + GAP-09 + GAP-10 can run with 3 different engineers.

Goal

The settlement-state machine closes. EWA + lending know about SETTLED. Lending knows how to repay. The webhook path is hardened.

Tasks (parallelisable)

| # | Gap | Task | Days | Acceptance criteria |
| --- | --- | --- | --- | --- |
| S3.1 | GAP-08 | New NestJS service apps/api/src/money/integration/settlement-poller.service.ts. Polls every 30s. Queries ewa_request + loan in SUBMITTED_TO_BANK / ACCEPTED_BY_BANK. For each, calls getStatus via gateway. On COMPLETED → ledger.ConfirmSettlement + EWA DISBURSED + emit *_settled.v1. On FAILED → ledger.MarkSettlementFailed + ledger.Reverse + EWA BANK_REJECTED + emit *_failed.v1. Uses BYPASSRLS role (provisioned via infra/sql/01_create_settlement_poller_role.sql, new). | 2 | Integration test: seed a PENDING row, mock gateway returns COMPLETED → poller flips to DISBURSED + ledger POSTED. Same for FAILED. Stale-pending (>24h) escalates an alert metric. |
| S3.2 | GAP-09 | New NestJS controller apps/api/src/money/integration/bank-webhook.controller.ts. Path POST /api/integration/bank-callback/:partner. HMAC signature verification per adapter. @Public() (webhooks don't carry user auth). Normalised payload → same ConfirmSettlement / MarkSettlementFailed flow as poller. Stores raw payload in bank_event. Idempotent: re-receiving the same partner_reference returns 200 without re-applying. | 1.5 | Integration test: replay a webhook → only one ledger transition observed. Bad signature → 401. Unknown partner → 404. |
| S3.3 | GAP-10 | New use case packages/lending/backend/application/record-repayment.usecase.ts. Takes (loanId, installmentIndex, amountFromPayrollClearing). Ledger entries: DR payrollClearing / CR loanReceivable (principal) + CR interestRevenue (interest portion). Marks installment PAID. If last → loan CLOSED. Emits loan.installment_repaid.v1. | 1.5 | Unit tests against in-memory ports. The use case is consumed by a payroll-deduction event consumer (which doesn't exist yet; for now the consumer is a script/admin endpoint, called manually). |
| S3.4 | | Wire a NestJS endpoint POST /api/lending/loans/:id/installments/:idx/record-repayment that calls the use case. Auth-gated (AuthGuard). Temporary admin trigger until payroll domain ships. | 0.5 | E2E test: disburse a fake loan → trigger 3 repayments via the endpoint → ledger zeros the receivable → loan CLOSED. |

S3 total: ~5.5 engineer-days, parallelisable to ~2 elapsed days with 3 engineers.
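
The poller's per-row decision in S3.1 reduces to a small mapping from bank status to follow-up actions. The sketch below is an assumption-laden illustration: `planFor`, `PollerPlan`, and the non-terminal status `IN_PROGRESS` are hypothetical, the ledger calls are represented as plain strings, and the event names are the plan's wildcard placeholders.

```typescript
// Hypothetical decision function for one in-flight row; no I/O is performed here.
type BankStatus = "IN_PROGRESS" | "COMPLETED" | "FAILED"; // IN_PROGRESS is an assumed non-terminal value

interface InFlightRow { id: string; submittedAtMs: number }

interface PollerPlan {
  ledgerCalls: string[];      // RPC names to invoke, in order
  nextStatus: string | null;  // new EWA/loan status, or null to leave unchanged
  eventName: string | null;   // outbox event to emit, or null
  staleAlert: boolean;        // true when the row has been pending past the threshold
}

const STALE_AFTER_MS = 24 * 60 * 60 * 1000; // 24h stale-pending threshold from the plan

function planFor(row: InFlightRow, bankStatus: BankStatus, nowMs: number): PollerPlan {
  switch (bankStatus) {
    case "COMPLETED":
      return { ledgerCalls: ["ConfirmSettlement"], nextStatus: "DISBURSED", eventName: "*_settled.v1", staleAlert: false };
    case "FAILED":
      return { ledgerCalls: ["MarkSettlementFailed", "Reverse"], nextStatus: "BANK_REJECTED", eventName: "*_failed.v1", staleAlert: false };
    default:
      // Still in flight: change nothing, but raise the alert once the row is older than 24h.
      return { ledgerCalls: [], nextStatus: null, eventName: null, staleAlert: nowMs - row.submittedAtMs > STALE_AFTER_MS };
  }
}
```

Separating the decision from the gateway and ledger calls keeps it unit-testable without mocks; the webhook path in S3.2 could reuse the same mapping, since it is described as following the same ConfirmSettlement / MarkSettlementFailed flow.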

S3 acceptance gate

  • Settlement poller observed (in staging or hermetic harness) flipping a PENDING row to POSTED.
  • Webhook controller observed accepting + rejecting based on signature.
  • Loan repayment integration test passes against real Go ledger.
  • First true end-to-end: trigger EWA in staging → Dashen sandbox transfer → wait for settlement → API state goes SUBMITTED_TO_BANK → ACCEPTED_BY_BANK → DISBURSED. Ledger POSTED.
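
For the repayment posting described in S3.3, a minimal in-memory sketch shows how the three ledger lines stay balanced and when the loan closes. The types and the function `recordRepayment` are hypothetical, amounts are integer minor units, the `ACTIVE` loan status is an assumed name, and the sketch assumes each repayment equals the installment's principal plus interest exactly (partial repayments are not modelled).

```typescript
// Hypothetical sketch of the S3.3 posting; not the real record-repayment use case.
interface LedgerLine {
  account: "payrollClearing" | "loanReceivable" | "interestRevenue";
  side: "DR" | "CR";
  amountMinor: number; // integer minor units to avoid floating-point drift
}
interface Installment { principalMinor: number; interestMinor: number; paid: boolean }
interface Loan { status: "ACTIVE" | "CLOSED"; installments: Installment[] } // ACTIVE is an assumed name

function recordRepayment(loan: Loan, installmentIndex: number, amountFromPayrollClearing: number): LedgerLine[] {
  const installment = loan.installments[installmentIndex];
  if (!installment || installment.paid) throw new Error("installment missing or already paid");
  const due = installment.principalMinor + installment.interestMinor;
  if (amountFromPayrollClearing !== due) throw new Error("sketch only handles exact repayments");

  installment.paid = true;
  // Closing the loan when every installment is paid covers the "if last" rule.
  if (loan.installments.every((i) => i.paid)) loan.status = "CLOSED";

  // One debit, two credits; debits equal credits by construction.
  return [
    { account: "payrollClearing", side: "DR", amountMinor: due },
    { account: "loanReceivable", side: "CR", amountMinor: installment.principalMinor },
    { account: "interestRevenue", side: "CR", amountMinor: installment.interestMinor },
  ];
}
```

Summing the DR lines and the CR lines of the returned entry should always give the same total, which is the property the ledger's balance trigger enforces.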

Sprint 4 — Reconciliation + legacy cleanup + soak

Weeks 7–8. Parallelisable: GAP-11 + GAP-12 can run alongside soak testing.

Goal

The bank-vs-ledger reconciliation primitive lands. Legacy Wallet* models are out of the way. The platform soaks for at least 7 consecutive days with zero drift before being declared go-live ready.

Tasks

| # | Gap | Task | Days | Acceptance criteria |
| --- | --- | --- | --- | --- |
| S4.1 | GAP-11a | Add services/integration-gateway/internal/reconciliation/{statement-ingester,matcher,drift-reporter}.go. Define bank_statement_line table. Implement Dashen-format CSV parser as the first ingester. | 2 | Test: ingest a synthetic Dashen statement file → each line creates a bank_statement_line row. |
| S4.2 | GAP-11a | Matcher: for each unmatched bank_statement_line, find matching disbursement by (amount, partner_reference, value_date ± 1d). Mark RECONCILED. Unmatched → flagged. | 1.5 | Integration test: 100 disbursements + statement → all matched. Inject one fake line → flagged. |
| S4.3 | GAP-11b | Ledger RPC ReconcileWithBank(account_id, period, statement_total). Returns (ledger_total, statement_total, drift). Logs drift as error if non-zero. | 1 | Unit test against fake data: drift exactly 0 by construction. Inject one tampered row → drift detected. |
| S4.4 | GAP-12a | apps/api/prisma/schema.prisma: add /// @deprecated triple-slash comments to Wallet, WalletTransaction, WithdrawalRequest, LedgerAccount.balance. | 0.5 | Prisma generate clean. Schema diff shows comments only. |
| S4.5 | GAP-12b | Custom ESLint rule tooling/eslint/no-deprecated-prisma-models.ts. Blocks imports of Wallet, WalletTransaction, WithdrawalRequest from @prisma/client outside apps/api/prisma/. | 1 | Lint test: importing Wallet from anywhere in packages/ fails. |
| S4.6 | | 7-day soak in staging. Daily reconciliation report. Drift must be 0 every day. Any non-zero drift halts go-live until root-caused and fixed. | (calendar days; ~1 engineer-day per daily review) | 7 consecutive days drift=0, observed in dashboard + log. |

S4 total: ~6 engineer-days code + 7 calendar days soak.
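
The matching rule in S4.2 (amount, partner_reference, value_date ± 1d) can be sketched as follows. The plan places the matcher in Go inside the gateway; this TypeScript version is only an illustration, and `matchStatement`, the field names, and the assumption that each disbursement carries its own value date as an ISO `YYYY-MM-DD` string are all hypothetical.

```typescript
// Hypothetical matcher sketch; field names and date format are assumptions.
interface StatementLine { id: string; amountMinor: number; partnerReference: string; valueDate: string }
interface Disbursement { id: string; amountMinor: number; partnerReference: string; valueDate: string }

const DAY_MS = 24 * 60 * 60 * 1000;

// Date-only ISO strings are parsed as UTC, so the difference is a whole number of days.
const dayDiff = (a: string, b: string): number => Math.abs(Date.parse(a) - Date.parse(b)) / DAY_MS;

function matchStatement(lines: StatementLine[], disbursements: Disbursement[]) {
  const used = new Set<string>(); // a disbursement can reconcile at most one statement line
  const reconciled: Array<{ lineId: string; disbursementId: string }> = [];
  const flagged: string[] = [];

  for (const line of lines) {
    const hit = disbursements.find(
      (d) =>
        !used.has(d.id) &&
        d.amountMinor === line.amountMinor &&
        d.partnerReference === line.partnerReference &&
        dayDiff(d.valueDate, line.valueDate) <= 1,
    );
    if (hit) {
      used.add(hit.id);
      reconciled.push({ lineId: line.id, disbursementId: hit.id });
    } else {
      flagged.push(line.id); // unmatched lines are surfaced, never silently dropped
    }
  }
  return { reconciled, flagged };
}
```

The linear scan is fine for an illustration; a production matcher would more likely index disbursements by partner_reference.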

S4 acceptance gate (== Go-live gate)

  • All gaps GAP-01..GAP-12 ticked [x] in SYSTEM_GAP.md.
  • 7-day soak with zero drift, dashboarded.
  • Production runbooks exist for: webhook failure, gateway-down, drift-detected, bank-statement-parse-failed.
  • At least one full incident drill executed: simulated bank rejection → observed reversal, observed alert.
  • Sign-off from: engineering lead, security lead, finance/ops lead.

Until ALL of these are green, no real bank rail opens to real users.
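
The drift and soak conditions of this gate can be written down as a small check. This is a sketch under assumptions: `DailyReport`, `cleanStreak`, and `soakGateClear` are hypothetical names, drift is taken as ledger total minus statement total (the plan does not fix the sign), and the reports are assumed to be ordered oldest to newest with exactly one entry per calendar day.

```typescript
// Hypothetical soak-gate check; sign convention and report shape are assumptions.
interface DailyReport { date: string; ledgerTotalMinor: number; statementTotalMinor: number }

const REQUIRED_CLEAN_DAYS = 7;

// Drift is zero only when the ledger and the bank statement agree exactly.
const driftOf = (r: DailyReport): number => r.ledgerTotalMinor - r.statementTotalMinor;

// Length of the run of zero-drift days ending at the most recent report.
function cleanStreak(reports: DailyReport[]): number {
  let streak = 0;
  for (const report of reports) {
    streak = driftOf(report) === 0 ? streak + 1 : 0; // any non-zero drift resets the soak
  }
  return streak;
}

const soakGateClear = (reports: DailyReport[]): boolean => cleanStreak(reports) >= REQUIRED_CLEAN_DAYS;
```

Because a single non-zero day resets the streak, a drift on day 6 means the 7-day window starts again after the fix, which matches the rule that any non-zero drift halts go-live until it is root-caused.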


Dependency map

GAP-01 (ledger pending/posted)
│   blocks everything below
├─→ GAP-02 (port shape) ───────┐
│                              ├─→ GAP-03 (domain status) ─→ GAP-05 (use cases) ─→ GAP-06 (events) ─→ [GO-LIVE GATE]
├─→ GAP-04 (account taxonomy) ─┘
│
└─→ GAP-07 (integration-gateway) ─┬─→ GAP-08 (poller) ─────────→ [GO-LIVE GATE]
                                  ├─→ GAP-09 (webhooks) ───────→ [GO-LIVE GATE]
                                  └─→ GAP-11 (reconciliation) ─→ [GO-LIVE GATE]

GAP-10 (lending repayment): independent of bank flow once GAP-01..05 land ─→ [GO-LIVE GATE]
GAP-12 (legacy deprecation): runs whenever, blocks nothing

The critical path is GAP-01 → GAP-07 → soak. Everything else parallelises around it.


Parallel tracks (for a 3-engineer team)

| Engineer | Sprint 1 | Sprint 2 | Sprint 3 | Sprint 4 |
| --- | --- | --- | --- | --- |
| Eng-A (TS/NestJS) | GAP-01 (proto + ledger client side) → GAP-02 → GAP-03 → GAP-05 → GAP-06 | (Free for review / catch-up) | GAP-08 (poller) | GAP-12 (legacy deprecation) + soak monitoring |
| Eng-B (Go) | GAP-01 (Go server side: status column, ConfirmSettlement, MarkSettlementFailed) → GAP-04 review | GAP-07 (gateway + Dashen) ← critical path | GAP-09 (webhook receiver, in TS; pair with Eng-A) | GAP-11 (reconciliation, in Go) |
| Eng-C (TS/NestJS) | GAP-04 (account taxonomy rewrite) + S1 testing infra | Tests for GAP-07 in integration with API | GAP-10 (lending repayment) | Soak monitoring + runbooks |

Single-engineer fallback: sequential execution as listed in S1→S4. ~27–28 working days of engineering (the sum of the per-task estimates), plus the 7-calendar-day soak.


Per-gap acceptance criteria summary (quick lookup)

| Gap | "Done" means |
| --- | --- |
| GAP-01 | ledger_transaction.status column exists. Migration verify guard catches illegal transitions. PostTransaction(PENDING) returns; ConfirmSettlement flips to POSTED; MarkSettlementFailed flips to FAILED. Both idempotent. |
| GAP-02 | DisbursementPort.getStatus(ref) exists in EWA + lending ports. EWA + lending unit tests use the new DisbursementResult shape with bankStatus. |
| GAP-03 | EwaStatus + LoanStatus include the new bank states. Transition tables cover every path. Tests for every transition. |
| GAP-04 | LedgerAccountsAdapter no longer returns cashAccountId. Returns the bank-orchestrator taxonomy. Use cases compile against the new shape. |
| GAP-05 | Disburse use cases pre-commit ledger as PENDING. ACCEPTED leaves PENDING. REJECTED reverses. Unit tests cover all three paths. |
| GAP-06 | Outbox events renamed. Old names removed entirely. Notification consumer subscribes to *_settled + *_failed only. |
| GAP-07 | services/integration-gateway is a real gRPC server. Dashen adapter loaded. Integration test against Dashen sandbox proves end-to-end. |
| GAP-08 | Poller running. Stale-pending alert fires after 24h threshold. Staging E2E proves settlement closes. |
| GAP-09 | Webhook endpoint live. HMAC verification per partner. Idempotent and replay-safe. |
| GAP-10 | record-repayment.usecase.ts exists. Loan installment flows to PAID. Last installment closes loan. |
| GAP-11 | Bank statement ingestion + matcher + ReconcileWithBank RPC live. Drift dashboard exists. |
| GAP-12 | Legacy Prisma models @deprecated. ESLint blocks new imports. |

Shipping gates (the kill-switches)

Before any of the stages below can be considered, its gate must clear:

| Gate | What it permits | Required state |
| --- | --- | --- |
| Internal dev demo | Demo to leadership team | S1 + S2 + S3 closed |
| Staging E2E with real Dashen sandbox | Test with internal employees as fake employees | S3 closed + 24h drift-clean |
| First real employer pilot (1 employer, 5 employees, capped at 500 ETB EWA per employee) | First real money moves | S4 closed + 7-day drift-clean + sign-off |
| Multi-employer rollout | Up to 10 employers | After 30 days post-pilot drift-clean + first real reconciliation report |
| Second bank rail (CBE, Awash, Telebirr; pick first) | Diversify partners | At least 90 days operating cleanly on Dashen + second adapter implemented + adapter-test-harness validated |

No bank rail goes to real users without leadership sign-off AND a 7-day soak.


Risk plan

| Risk | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| Dashen API spec arrives late or with breaking ambiguity | Medium | High (S2 blocked) | Start Dashen onboarding in S1 week 1 day 1. If no spec by S1 end, escalate. Treat the integration-gateway Go scaffolding (GAP-07a..e) as still doable without it; build with a fake adapter first. |
| Dashen sandbox is unreliable / down during S2 | Medium | Medium | Build a high-fidelity mock adapter in services/integration-gateway/internal/adapters/mock-dashen/ first. Use it for all unit + integration tests. The real Dashen adapter is the LAST item of S2. |
| The team discovers payroll-deduction integration needs to exist before lending repayment can be tested | Medium | Medium | Mitigated: S3.4 wires a temporary admin endpoint that triggers repayment manually. Sufficient for testing. Real payroll integration is Q3 work (outside this program). |
| Reconciliation reveals real drift on day 1 of soak | High | Critical (go-live blocked) | This IS the point of soak. Plan 2–3 weeks of buffer past S4 for drift hunting. Do not treat go-live date as fixed until first 24h of drift-clean observed. |
| Engineering team turnover during S1–S3 | Low | High | Every gap entry in this file is self-contained. A new engineer can pick up mid-program with SYSTEM_GAP.md + this file + the per-task acceptance criteria. |
| Regulator (NBE) asks for evidence that we are not a custodian during S2–S3 | Low | High | Drop the GAP-12 legacy-deprecation work into S2, which strips the visible Wallet.balance columns. Cite ADR-014 (to be written: "DemozPay is orchestrator, not custodian"). |

What this plan deliberately does NOT include

  • Payroll domain implementation. EWA's accruedEarnings adapter stays as a placeholder. Payroll is a separate program; flagged in SYSTEM_GAP.md but not in any S1–S4 sprint.
  • BNPL. No code exists; nothing to fix. Flagged for a separate program.
  • Savings / Equb. Same as BNPL.
  • Frontend → API integration. All 5 Next.js apps still use mocks. Out of scope; tracked separately.
  • MFA, WebAuthn, AfricasTalking SMS. Auth surface improvements; separate programs.
  • ADR-014 ("DemozPay is orchestrator, not custodian"). Recommended as a parallel doc task but explicitly out of this code-execution plan.

What you should do with this file going forward

  1. Print it. Pin it. Each sprint, the team works from this document.
  2. At sprint start: team picks the sprint section and breaks tasks down further if needed.
  3. At sprint end: tick the corresponding rows in SYSTEM_GAP.md, update this file's sprint table with completion dates.
  4. At go-live gate: print the gate checklist, walk every checkbox, get every sign-off.

The two files are an audit trail. Together they answer "did we do what we said we'd do?" with file:line evidence.


Next step (in this session): start executing Sprint 1, beginning with GAP-01 (ledger pending/posted lifecycle). See the changes landing in your working tree — no commits per standing instruction.