
DemozPay — 90-Day Execution Plan

⚠️ STALE SNAPSHOT (2026-05-29). Overtaken by work since shipped — the payroll engine, banking, and polymorphic multi-org are now Live, so statements below like "none of the work has been executed" and "no payroll engine" reflect the 2026-05-29 baseline, not current state. For what's actually built, see ../audits/CURRENT_STATE_AUDIT.md and ../plans/TARGET_ARCHITECTURE_ALIGNMENT_PLAN.md. Re-baseline or archive this doc.

Snapshot: 2026-05-29 → 2026-08-27
Companion to: GO_LIVE_BLOCKERS.md, PRODUCTION_READINESS.md, REAL_SYSTEM_STATE.md.

Purpose: sequence the 22 go-live blockers + payroll + frontend-integration so that the platform is realistically pilot-ready in 90 days. Replaces "ship after S4.6" with a calibrated path.

Premises

  • Team assumed: 3 backend engineers, 1 frontend engineer, 1 operations/SRE engineer. With fewer people, stretch the timeline proportionally.
  • No commits today (standing instruction). The work below assumes commits resume once the user reviews + approves the audit.
  • Telemedhin infrastructure stays untouched throughout (standing instruction).
  • No real bank rail opens until the 19 pilot-blockers in GO_LIVE_BLOCKERS.md are green.

Phase shape

Three 30-day phases:

  • Phase A — Days 1–30 — "Stop the bleeding" (close BLACK items, ship Authz/PII/LookupAccount).
  • Phase B — Days 31–60 — "Production hardening" (observability, secrets, TLS, backups, recon cadence).
  • Phase C — Days 61–90 — "Pilot enablement" (KYC + sanctions + sign-off + the one integrated frontend + payroll-domain skeleton).

Soak + sign-off + first pilot enrolment happen at the end of Phase C.

Phase A — Days 1–30 — Stop the bleeding

Theme: every BLACK item in PRODUCTION_READINESS.md closes. Authz model becomes real. Money model gets its missing primitives. PII no longer leaks.

Sprint A1 — Week 1 (Days 1–5) — Authz emergency

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 1 | GL-01 — validate x-actor-id | Backend-A | EWA + lending controllers reject mismatched header; unit tests cover spoof. |
| 1 | GL-02 (start) — tenant-scope Business + Employee | Backend-B | Spike: design the membership-based scoping query. |
| 2 | GL-02 (continue) — implement | Backend-B | Member-table join filters all CRUD; cross-tenant E2E test proves zero rows. |
| 3 | GL-02 (test + ship) | Backend-B | All 9 employer/employee endpoints proven cross-tenant safe. |
| 3 | GL-03 — PII redaction (start) | Backend-C | Pino redaction config drafted. |
| 4 | GL-03 (test + ship) | Backend-C | Unit tests prove redaction for nationalId, phone, email, password, idempotencyKey, x-demoz-signature. |
| 5 | Hardening pass — security-review the diff | All | All three changes peer-reviewed; merge. |

Exit: the three BLACK AuthN/Z findings are LIVE. Audit logs are trustworthy. Cross-tenant enumeration impossible.
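The GL-03 acceptance criterion (redaction proven for a fixed set of keys) can be illustrated with a small stand-in. This is a sketch only: the plan names Pino as the logger, but the function below is a hypothetical plain-TypeScript redactor, not the Pino configuration itself. The key list comes from the acceptance row above; the `redact` name and the `[REDACTED]` marker are assumptions.

```typescript
// Hypothetical stand-in for the Pino redaction config named in GL-03.
// Only the key list is taken from the plan; names and marker are illustrative.
const REDACTED_KEYS: ReadonlySet<string> = new Set(
  ["nationalId", "phone", "email", "password", "idempotencyKey", "x-demoz-signature"].map(
    (k) => k.toLowerCase(),
  ),
);

const CENSOR = "[REDACTED]";

// Returns a deep copy of `value` in which every sensitive key's value is replaced.
// Keys are compared case-insensitively; the input object is never mutated.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = REDACTED_KEYS.has(key.toLowerCase()) ? CENSOR : redact(source[key]);
    }
    return out;
  }
  return value;
}

// Example log payload (fictional data).
const logLine = redact({
  msg: "ewa disbursed",
  employee: { id: "emp_1", nationalId: "ET-123", phone: "+251900000000" },
  headers: { "x-demoz-signature": "abc", "x-actor-id": "user_9" },
});
console.log(JSON.stringify(logLine));
```

A unit test of the kind GL-03 asks for would assert that each listed key is censored at any nesting depth while non-sensitive fields such as `x-actor-id` pass through unchanged.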

Sprint A2 — Week 2 (Days 6–10) — Money model gap-closures

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 6–7 | GL-04 (LookupAccount in Dashen adapter + use-case wiring) | Backend-B (Go) | Dashen LookupAccount real; EWA + lending disburse fail-fast on account mismatch; bank-sandbox covers both happy + mismatch. |
| 8 | GL-05 — EWA repayment use-case (admin endpoint variant) | Backend-A | RecordEwaRepaymentUseCase; admin endpoint; 5 unit tests; idempotent on (ewaRequestId). |
| 9–10 | GL-06 — PayrollClearing → FI remittance for repaid installments | Backend-A + Backend-B | New RemitInstallmentToFiUseCase; per-installment outbound transfer via gateway; integration test against bank-sandbox. |

Exit: the money model has no dead-ends. EWA can be repaid. Loans can be remitted. Disbursements verify the account.
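The "idempotent on (ewaRequestId)" requirement for GL-05 can be sketched as follows. The plan names RecordEwaRepaymentUseCase but does not define its shape, so the class, field names, and in-memory store here are all assumptions made for illustration. A real implementation would presumably rely on a unique constraint in the database rather than a Map.

```typescript
// Hypothetical shapes: the plan names the use case and its idempotency key
// (ewaRequestId) but does not define these types.
interface EwaRepayment {
  ewaRequestId: string;
  amountMinor: number; // amount in minor currency units (assumed representation)
  recordedAt: Date;
}

interface RecordResult {
  repayment: EwaRepayment;
  created: boolean; // false when the call was a replay of an earlier request
}

class InMemoryRecordEwaRepayment {
  private readonly byRequestId = new Map<string, EwaRepayment>();

  execute(ewaRequestId: string, amountMinor: number, now: Date = new Date()): RecordResult {
    const existing = this.byRequestId.get(ewaRequestId);
    if (existing !== undefined) {
      // Replay: return the stored record unchanged.
      // A conflicting amount is treated as an error, never as an overwrite.
      if (existing.amountMinor !== amountMinor) {
        throw new Error(`repayment for ${ewaRequestId} already recorded with a different amount`);
      }
      return { repayment: existing, created: false };
    }
    const repayment: EwaRepayment = { ewaRequestId, amountMinor, recordedAt: now };
    this.byRequestId.set(ewaRequestId, repayment);
    return { repayment, created: true };
  }

  count(): number {
    return this.byRequestId.size;
  }
}

const useCase = new InMemoryRecordEwaRepayment();
const first = useCase.execute("ewa_42", 50000);
const replay = useCase.execute("ewa_42", 50000);
console.log(first.created, replay.created, useCase.count());
```

Rejecting a replay that carries a different amount is a design choice in this sketch: it surfaces operator mistakes at the admin endpoint instead of silently keeping the first value.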

Sprint A3 — Week 3 (Days 11–15) — Operational baselines

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 11–12 | GL-11 — Prometheus on Go services | Backend-B | Counters + histograms per RPC in ledger + gateway; /metrics endpoint live. |
| 13 | GL-07 — idempotency-record TTL cleanup | Backend-C | Daily cron + counter + runbook. |
| 14–15 | Build the daily reconciliation harness — start of GL-09 | Backend-C | Cron skeleton calls reconciliation Runner end-to-end against test data; not yet wired to prod. |

Exit: the Go services are observable. The recon cron skeleton exists.
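The GL-07 cleanup (a daily job that removes expired idempotency records and reports a count) might look like the sketch below. The record shape, the function name, and the TTL value used in the example are assumptions; the plan does not state a retention period.

```typescript
// Hypothetical record shape; the plan does not define the idempotency table.
interface IdempotencyRecord {
  key: string;
  createdAt: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Splits records into those still inside the TTL window and a purge count.
// The purge count is what the daily cron would export as its counter.
function purgeExpired(
  records: IdempotencyRecord[],
  now: Date,
  ttlDays: number,
): { kept: IdempotencyRecord[]; purged: number } {
  if (ttlDays <= 0) {
    throw new Error("ttlDays must be positive");
  }
  const cutoff = now.getTime() - ttlDays * DAY_MS;
  const kept = records.filter((r) => r.createdAt.getTime() >= cutoff);
  return { kept, purged: records.length - kept.length };
}

// Example with an assumed 30-day TTL.
const now = new Date("2026-06-30T00:00:00Z");
const result = purgeExpired(
  [
    { key: "old", createdAt: new Date("2026-05-01T00:00:00Z") }, // 60 days old
    { key: "fresh", createdAt: new Date("2026-06-29T00:00:00Z") }, // 1 day old
  ],
  now,
  30,
);
console.log(result.purged, result.kept.map((r) => r.key));
```

In a database-backed version the same cutoff would become a single bounded DELETE, with the affected-row count feeding the counter.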

Sprint A4 — Week 4 (Days 16–20) — ADR-014 + sign-off draft

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 16 | GL-15 — write ADR-014 ("DemozPay is orchestrator, not custodian") | Engineering lead + legal | Draft circulated. |
| 17–18 | GL-21 — write reconciliation-daily-process.md runbook | Engineering lead + finance lead | Draft circulated. |
| 19 | GL-22 — sign-off matrix template | Engineering lead | One-page form, all 22 blockers listed. |
| 20 | Phase A retrospective | All | Lessons captured; Phase B plan locked. |

Exit of Phase A:

  • 7 of 22 go-live blockers closed (GL-01..07 + ADR-014 draft).
  • The platform is no longer dangerous to ship — but is not yet ready.
  • Burn rate: ~50% of total 90-day budget on the highest-risk items, deliberately.

Phase B — Days 31–60 — Production hardening

Theme: observe, alert, deploy, encrypt, back up. The boring stuff that turns "running code" into "operable system".

Sprint B1 — Week 5 (Days 31–35) — Alerting backbone

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 31 | GL-08 (start) — Alertmanager / paging provider chosen | SRE | Decision recorded. Probably Grafana OnCall + on-call rota in PagerDuty or similar. |
| 32–33 | Alertmanager / Grafana deployed; scrape config covers API + Go services | SRE | /metrics scraped; dashboards built. |
| 34 | First 5 alert rules authored | SRE + Backend-A | Webhook-failure, gateway-down, ledger-down, outbox-stale, poller-error. |
| 35 | Drift = non-zero alert rule (depends on GL-09 finishing first) | SRE | Drift rule, linked runbook. |

Exit: someone gets paged when something breaks.

Sprint B2 — Week 6 (Days 36–40) — Reconciliation cadence + statement pull

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 36–37 | GL-10 — Dashen statement-pull adapter (SFTP or API) | Backend-B | Daily cron pulls yesterday's file; ingests; produces an "ingest success" counter. |
| 38 | GL-09 — wire the daily-recon cron to call ReconcileWithBank per (tenant, account) | Backend-C | Cron schedules daily; output → Slack channel + drift counter. |
| 39–40 | GL-12 — adapter health surfacing | Backend-B | GetAdapterStatus real; degraded threshold; surfaced to API. |

Exit: drift is reported daily, automatically, to a channel that humans watch.
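To make "drift per (tenant, account)" concrete, here is a minimal sketch that compares closing balances from the ledger against balances taken from the bank statement. It is an assumption that drift is measured this way; the plan's ReconcileWithBank is not specified and may well match individual transactions instead. All type and function names below are hypothetical.

```typescript
// Hypothetical shapes for a balance-level comparison.
interface Balance {
  tenantId: string;
  accountId: string;
  balanceMinor: number; // minor currency units (assumed)
}

interface Drift {
  tenantId: string;
  accountId: string;
  driftMinor: number; // ledger minus bank; zero-drift pairs are omitted
}

function keyOf(b: { tenantId: string; accountId: string }): string {
  return `${b.tenantId}/${b.accountId}`;
}

// Returns one entry per (tenant, account) whose ledger and bank balances differ.
// An account missing on either side is compared against a balance of zero.
function computeDrift(ledger: Balance[], bank: Balance[]): Drift[] {
  const bankByKey = new Map<string, Balance>();
  for (const b of bank) {
    bankByKey.set(keyOf(b), b);
  }
  const seen = new Set<string>();
  const drifts: Drift[] = [];
  for (const l of ledger) {
    const k = keyOf(l);
    seen.add(k);
    const bankEntry = bankByKey.get(k);
    const driftMinor = l.balanceMinor - (bankEntry === undefined ? 0 : bankEntry.balanceMinor);
    if (driftMinor !== 0) {
      drifts.push({ tenantId: l.tenantId, accountId: l.accountId, driftMinor });
    }
  }
  for (const b of bank) {
    if (!seen.has(keyOf(b)) && b.balanceMinor !== 0) {
      drifts.push({ tenantId: b.tenantId, accountId: b.accountId, driftMinor: -b.balanceMinor });
    }
  }
  return drifts;
}

const drifts = computeDrift(
  [
    { tenantId: "t1", accountId: "clearing", balanceMinor: 100000 },
    { tenantId: "t1", accountId: "fees", balanceMinor: 2500 },
  ],
  [
    { tenantId: "t1", accountId: "clearing", balanceMinor: 100000 },
    { tenantId: "t1", accountId: "fees", balanceMinor: 2000 },
  ],
);
console.log(drifts.length === 0 ? "drift = 0" : JSON.stringify(drifts));
```

The length of the returned array is a natural candidate for the drift counter, and an empty array corresponds to the "drift = 0" condition used later during soak.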

Sprint B3 — Week 7 (Days 41–45) — Secrets + TLS

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 41 | GL-16 — secrets-manager decision | SRE + Engineering lead | Decision recorded (likely AWS SM if the cloud is AWS; Vault if self-hosted). |
| 42–44 | Secrets pipeline integrated; init-container or sidecar pattern; rotation runbook | SRE | Application reads secrets at boot; no .env in any environment after this date. |
| 45 | GL-17 — TLS at the edge | SRE | Ingress terminates TLS 1.3; HSTS; cert renewal automated. |

Exit: secrets are managed; traffic is encrypted.

Sprint B4 — Week 8 (Days 46–50) — Backups + CI hardening

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 46 | GL-18 (start) — backup strategy decision | SRE | Decision recorded. |
| 47–48 | PITR enabled per database (API, ledger, gateway) | SRE | Backups configured + monitored. |
| 49 | Restore drill | SRE + Backend-A | Test database restored from backup; recent transaction verified. RTO + RPO documented. |
| 50 | GL-19 — secret-leak + dep-vuln scanning in CI | SRE | gitleaks + Snyk merged into the GitHub Actions workflow. |

Exit: the system can survive a disk loss. CI catches accidentally-committed secrets.

Sprint B5 — Week 9 (Days 51–55) — On-call rota + drill

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 51–52 | GL-20 — on-call rota named; tooling configured | Engineering lead | 3 engineers signed up; weekly handoff documented. |
| 53 | First tabletop drill: webhook-failure.md | All on-call | Drill executed; postmortem written. |
| 54 | Second drill: drift-detected.md | All on-call | Drill executed; postmortem written. |
| 55 | Incident commander + scribe roles documented | Engineering lead | Documented; trial-run completed. |

Exit: someone knows what to do when the page fires.

Sprint B6 — Week 10 (Days 56–60) — Phase B catch-up + retrospective

Buffer. There will be slippage. Use this week to absorb it. Anything still open at end of Day 60 becomes a Phase C carry-over.

Exit of Phase B:

  • 15 of 22 go-live blockers closed (GL-01..18 + GL-19..21).
  • The platform is operationally hardened.
  • Ready for the compliance + identity work in Phase C.

Phase C — Days 61–90 — Pilot enablement

Theme: the things partners and regulators require, plus one integrated frontend, plus the payroll-domain skeleton that unblocks scale beyond the pilot.

Sprint C1 — Week 11 (Days 61–65) — KYC + sanctions

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 61–63 | GL-13 — KYC primitive | Backend-A + Frontend | packages/kyc/ skeleton; capture (nationalId, photo, document); tenant-scoped table; outbox event; manual-review workflow. |
| 64–65 | GL-14 — sanctions screening (enum-list pilot tier) | Backend-B | Pre-disburse check against OFAC + UN + Ethiopia list (CSV ingested daily); hit halts disbursement + alerts. |

Exit: identity verification + sanctions screening exist at pilot tier.
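The pilot-tier sanctions check in GL-14 (a pre-disburse lookup against ingested lists, where a hit halts the disbursement) could be sketched as below. The CSV layout (`list,name` per line, no header, no quoted commas), the exact-match rule after simple normalisation, and every function name are assumptions; the names in the sample data are fictional.

```typescript
interface SanctionsEntry {
  list: string; // e.g. "OFAC", "UN" (source list label)
  name: string;
}

type ScreeningResult =
  | { status: "clear" }
  | { status: "hit"; list: string; matchedName: string };

// Lower-cases and collapses whitespace so trivial formatting differences do not hide a match.
function normalizeName(name: string): string {
  return name.toLowerCase().replace(/\s+/g, " ").trim();
}

// Assumed CSV layout: "list,name" per line, no header row, no quoted commas.
function parseSanctionsCsv(csv: string): SanctionsEntry[] {
  return csv
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const comma = line.indexOf(",");
      if (comma < 0) {
        throw new Error(`malformed sanctions row: ${line}`);
      }
      return { list: line.slice(0, comma).trim(), name: line.slice(comma + 1).trim() };
    });
}

function screen(payeeName: string, entries: SanctionsEntry[]): ScreeningResult {
  const wanted = normalizeName(payeeName);
  for (const entry of entries) {
    if (normalizeName(entry.name) === wanted) {
      return { status: "hit", list: entry.list, matchedName: entry.name };
    }
  }
  return { status: "clear" };
}

// Pre-disburse gate: a hit throws, so no transfer is attempted.
function disburseIfClear(payeeName: string, entries: SanctionsEntry[]): string {
  const result = screen(payeeName, entries);
  if (result.status === "hit") {
    throw new Error(`disbursement halted: ${result.matchedName} is on the ${result.list} list`);
  }
  return "disbursed";
}

const entries = parseSanctionsCsv("OFAC,Example Blocked Person\nUN,Another  Fictional Name\n");
console.log(screen("example blocked person", entries).status);
console.log(disburseIfClear("Ordinary Employee", entries));
```

Exact matching keeps false positives low but misses transliteration variants, which is one reason the plan labels this tier as pilot-only.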

Sprint C2 — Week 12 (Days 66–70) — Frontend integration: employer-web

This is the strategic frontend bet. Employer-web is the customer that pays. Integrate it first; the other four can wait.

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 66 | Replace mock data with real API calls | Frontend | Login + dashboard + employee list + payroll view. |
| 67–68 | Wire the EWA + lending disburse approval flows | Frontend | Employer admin can approve / reject pending EWAs and loans from the UI. |
| 69 | Wire the admin endpoints for EWA repayment + loan repayment recording | Frontend | Manual repayment recording is operable from the UI (not psql). |
| 70 | E2E test in CI | Frontend + SRE | One full employer-web E2E test passes against a deployed staging. |

Exit: the employer experience is real, not theatre.

Sprint C3 — Week 13 (Days 71–75) — Payroll-domain skeleton

Cannot ship a full payroll engine in 5 days. Ship a skeleton that the pilot can use for one or two manual payroll runs while we build out the rest.

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 71 | Domain package packages/payroll/ created | Backend-A | Standard 4-layer scaffold. |
| 72–73 | Payroll-run aggregate: state machine, deduction calculator (consume EWA + loan-installment open balances) | Backend-A | Unit tests. |
| 74 | Bulk-disbursement endpoint: 1 transfer per employee via gateway | Backend-A + Backend-B | Idempotent; fan-out across the gateway. |
| 75 | Payroll-event consumer in EWA + Lending | Backend-C | Consumes payroll.deductions_taken.v1; routes to RecordEwaRepaymentUseCase / RecordRepaymentUseCase. |

Exit: payroll is no longer the bottleneck for repayment automation. Pilot can run one or two payroll cycles end-to-end.
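The payroll-run aggregate described for Days 72–73 combines a state machine with a deduction calculator that consumes open EWA and loan-installment balances. The sketch below shows one possible shape. The state names, the deduction order (EWA first, then the loan installment), and the rule that deductions never exceed gross pay are all assumptions, not decisions recorded in the plan.

```typescript
// Hypothetical payroll-run states and allowed transitions.
type PayrollRunState = "draft" | "calculated" | "approved" | "disbursing" | "completed" | "failed";

const TRANSITIONS: Record<PayrollRunState, PayrollRunState[]> = {
  draft: ["calculated"],
  calculated: ["approved", "draft"],
  approved: ["disbursing"],
  disbursing: ["completed", "failed"],
  completed: [],
  failed: ["disbursing"], // assumed: a failed run may be retried
};

function transition(from: PayrollRunState, to: PayrollRunState): PayrollRunState {
  if (TRANSITIONS[from].indexOf(to) < 0) {
    throw new Error(`illegal payroll-run transition: ${from} -> ${to}`);
  }
  return to;
}

interface OpenBalances {
  ewaOutstandingMinor: number;
  loanInstallmentDueMinor: number;
}

interface Deductions {
  ewaMinor: number;
  loanMinor: number;
  netPayMinor: number;
}

// Assumed policy: recover EWA first, then the loan installment,
// and never deduct more than the gross pay available.
function calculateDeductions(grossPayMinor: number, open: OpenBalances): Deductions {
  if (grossPayMinor < 0 || open.ewaOutstandingMinor < 0 || open.loanInstallmentDueMinor < 0) {
    throw new Error("amounts must be non-negative");
  }
  const ewaMinor = Math.min(open.ewaOutstandingMinor, grossPayMinor);
  const loanMinor = Math.min(open.loanInstallmentDueMinor, grossPayMinor - ewaMinor);
  return { ewaMinor, loanMinor, netPayMinor: grossPayMinor - ewaMinor - loanMinor };
}

let state: PayrollRunState = "draft";
const deductions = calculateDeductions(1000000, {
  ewaOutstandingMinor: 300000,
  loanInstallmentDueMinor: 150000,
});
state = transition(state, "calculated");
console.log(state, JSON.stringify(deductions));
```

The two deduction amounts are the values a payroll.deductions_taken.v1 event would need to carry so that the EWA and Lending consumers can record the matching repayments.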

Sprint C4 — Week 14 (Days 76–80) — Soak preparation + sign-off form

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 76–77 | End-to-end staging deploy: API, ledger, gateway, bank-sandbox (or Dashen sandbox) | SRE | Staging mirrors production topology. |
| 78 | Pilot data seeded: 1 employer, 5 fake employees, 1 FI partner | Backend-A | Seed script. |
| 79 | Soak begins — Day 1 | All | Daily-recon cron runs; drift = 0 observed. |
| 80 | GL-22 — sign-off form populated | Engineering lead | Each of 19 pilot-blocker items has verifier + date + evidence link. |

Sprint C5 — Week 15 (Days 81–85) — Soak continues

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 81–85 | Soak days 3–7 | All on-call | Daily drift = 0; no incidents. |

Sprint C6 — Week 16 (Days 86–90) — Sign-off + first pilot enrolment

| Day | Item | Owner | Acceptance |
|---|---|---|---|
| 86 | Soak complete (7 days): green-light decision | All leads | If drift was 0 all 7 days AND all 19 pilot-blockers signed off → green light. |
| 87 | First pilot employer onboarded (1 employer, ≤ 5 employees) | All | First real disbursement. |
| 88–90 | Buffer + incident-response standby | On-call | Watching closely for anomalies. |
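The green-light rule (drift of 0 on all 7 soak days AND all 19 pilot-blockers signed off) is simple enough to encode so that the decision is mechanical rather than argued. The sketch below is illustrative; the record shapes and function name are assumptions, while the thresholds of 7 days and 19 blockers and the verifier, date, and evidence-link fields come from the plan.

```typescript
interface SoakDay {
  day: number; // soak day number, 1-based
  driftMinor: number; // drift reported by the daily reconciliation run
}

interface BlockerSignOff {
  id: string; // e.g. "GL-01"
  verifier: string | null;
  signedOn: string | null; // ISO date string (assumed format)
  evidenceLink: string | null;
}

interface Decision {
  go: boolean;
  reasons: string[]; // empty when go is true
}

function greenLightDecision(
  soak: SoakDay[],
  signOffs: BlockerSignOff[],
  requiredSoakDays: number = 7,
  requiredBlockers: number = 19,
): Decision {
  const reasons: string[] = [];
  if (soak.length < requiredSoakDays) {
    reasons.push(`only ${soak.length} of ${requiredSoakDays} soak days recorded`);
  }
  for (const d of soak) {
    if (d.driftMinor !== 0) {
      reasons.push(`soak day ${d.day} reported drift of ${d.driftMinor}`);
    }
  }
  // A blocker counts as signed off only when verifier, date, and evidence are all present.
  const signed = signOffs.filter(
    (s) => s.verifier !== null && s.signedOn !== null && s.evidenceLink !== null,
  );
  if (signed.length < requiredBlockers) {
    reasons.push(`only ${signed.length} of ${requiredBlockers} pilot-blockers signed off`);
  }
  return { go: reasons.length === 0, reasons };
}

// Example: clean soak, but one blocker is missing its evidence link.
const soak: SoakDay[] = [1, 2, 3, 4, 5, 6, 7].map((day) => ({ day, driftMinor: 0 }));
const signOffs: BlockerSignOff[] = [];
for (let i = 1; i <= 19; i++) {
  signOffs.push({
    id: `blocker-${i}`,
    verifier: "eng-lead",
    signedOn: "2026-08-20",
    evidenceLink: i === 19 ? null : `evidence/${i}`,
  });
}
console.log(JSON.stringify(greenLightDecision(soak, signOffs)));
```

Returning the list of reasons alongside the boolean gives the leads a ready-made record of why a no-go was called.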

Exit of Phase C:

  • 19 pilot-blockers closed + signed off.
  • 7-day soak clean.
  • First pilot employer is on the platform with real money flowing.
  • Five items remain unfinished and are deliberately deferred: BNPL, Savings, Equb, advanced fraud, mobile apps.

Parallel tracks not on the critical path

These run alongside the sprints above. They don't block pilot but should not be ignored.

| Track | Owner | Notes |
|---|---|---|
| Delete legacy Prisma models | Backend-C | Once nothing reads Wallet*, BNPLPurchase, Payroll*, Equb*, SavingGoal, BillPayment, Expense — drop the tables in a forward migration. Do this in Phase B once payroll-skeleton has its own clean schema. (BNPLPartner already collapsed into Merchant.) |
| services/notifications Live | Backend-B (Go) | Pick Africa's Talking; wire SMS as the first channel; consume *_settled + *_failed outbox events. Phase B effort. |
| Event catalog doc | Engineering lead | docs/architecture/event-catalog.md — promised in ADR-011 follow-ups. Phase B. |
| Admin tooling MVP | Backend-A | An apps/api/src/admin/ module with: replay-webhook endpoint, force-resync disbursement, view ledger account, view outbox event. Phase B. |

Decisions to make at the start of Phase A

These are the choices that the user / leadership needs to make before the sprint starts. Each is a 1-line decision:

  1. Who's the pilot employer? (Determines KYC tier-1 data shape.)
  2. Which partner bank do we open the rail with first? (Determines GL-10 adapter implementation.)
  3. Which paging provider? (Determines GL-08 cost + integration shape.)
  4. Which secrets manager? (Determines GL-16 ops cost.)
  5. Which cloud (or self-host)? (Determines GL-18 backup strategy.)
  6. Who's the security lead? (Signs the sign-off form.)
  7. Who's the finance/ops lead? (Signs the sign-off form.)

Without these decisions on Day 1, Phase A slips into Phase B.

What this plan deliberately does NOT include

  • BNPL implementation. No code today. Out of 90-day scope. Plan as a Q4 program.
  • Savings + Equb. Same. Q1 2027 conversation.
  • Mobile apps. Web is sufficient for pilot. Quarter after pilot.
  • Advanced fraud detection (behavioural analytics, device fingerprinting). Pilot uses static velocity ceilings + sanctions enum. Q4.
  • Frontend integration of admin-web, employee-web, fi-web, merchant-web. Employer-web first; the others sequentially over Q4–Q1 2027.
  • Real-time event consumers beyond payroll. BI pipeline, fraud-event consumer, regulator-reporting consumer — all Q4.
  • Multi-currency settlement. Schema supports it; pilot is ETB-only.
  • Multi-region DR. Single-region in pilot; multi-region is Q2 2027.

Risk register for this plan

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Authz fixes (GL-01..03) reveal additional missing checks | High | Medium | Sprint A1 has buffer; treat the first finding as the leading indicator and look for siblings. |
| LookupAccount delivery requires Dashen sandbox cooperation | Medium | High (blocks GL-04) | Start Dashen-side spec conversation Day 1 of Phase A. |
| Statement-pull (GL-10) needs partner SFTP onboarding | Medium | High | Same — start the partner conversation Phase A Week 1. |
| Alerting (GL-08) takes longer than 5 days | High | Medium | Sprint B6 is the buffer; pull from there. |
| 7-day soak surfaces real drift | High | Critical (blocks pilot) | This IS the point of soak. Plan a 2-week buffer past Day 90 for drift hunting. Do NOT treat Day 90 as a fixed pilot date. |
| Team turnover during the 90 days | Low | Critical | Each task in this plan is self-contained; a new engineer can pick up mid-program with the linked architecture docs. |
| Regulator (NBE) asks for evidence before pilot | Medium | High | Phase A Day 16: write ADR-014. Phase C: KYC + sanctions. Have artefacts ready for the first NBE conversation. |

Outcomes at Day 90

If this plan executes:

  • 19 of 22 go-live blockers closed.
  • 7-day soak completed on staging.
  • 1 employer + ≤ 5 employees in production with real money flowing.
  • All 4 incident runbooks battle-tested in at least one drill.
  • A daily reconciliation cadence reporting drift = 0 in production.
  • An employer-web frontend that real customers can use.
  • A payroll-domain skeleton that enables scaling past pilot.

If this plan slips by 30 days (which is realistic for a 5-person team):

  • Pilot lands at Day 120 instead.
  • Phase C "parallel tracks" all drop off.
  • BNPL conversation moves to Q1 2027.

If this plan slips by 60 days:

  • Re-set expectations with leadership / investors / partners. Don't fake it.

What the user has asked me to do (and what they have not)

The user asked for a deep architectural audit + 8 deliverable documents. This is the eighth and final document. None of the work above has been executed. Files in working tree from prior sessions remain unstaged per the standing instruction. The audit recommends the work; the user decides which (if any) to authorise.

Cross-references