Target Platform Architecture
Status: Forward-looking blueprint. This document describes where DemozPay is going, not only what exists today. Every service and infrastructure item is tagged with a maturity label so "now" is never confused with "later."
Maturity legend:
- 🟢 Today: exists in code now
- 🔵 MVP: needed to launch
- 🟡 Post-MVP: first customers / growth
- ⚪ Long-term: scale / future products
Operating model (read first): DemozPay is a payroll-first fintech platform. Payroll builds the trust and data that unlock financial products. Partner banks custody all funds; DemozPay never holds customer money. The double-entry Ledger is DemozPay's internal record of truth, reconciled against partner-bank statements. There is no internal wallet holding value (ADR-014: orchestrator, not custodian).
1. Executive Summary
DemozPay's thesis (payroll creates trust; trust unlocks financial products; banks custody, DemozPay orchestrates) dictates the architecture: a federation of independently replaceable services around a single money-truth ledger, with payroll as the first product and more products reusing the same money, identity, and compliance rails.
We are pre-launch and deliberately launch-first: ship payroll on a small, comprehensible footprint; keep every service boundary expressed as a stable contract from day one; extract services into independent deployments only when load, team size, or language needs justify it. The contract is permanent; the deployment topology evolves.
Three decisions anchor the design:
- Contracts are the permanent architecture; deployments are not. Every boundary has a gRPC proto (sync) and a registered event schema (async) now, even while services share one deployable. This is what lets Payroll move NestJS → Go later with zero impact on consumers.
- Partner banks custody; the Ledger records truth. No internal wallet. The double-entry Ledger is DemozPay's authoritative record of what moved; it is reconciled against partner-bank statements, which are the custody ground truth.
- One money core, many products. Ledger + Bank Gateway are the shared rails. Payroll today; EWA, Lending, Equb, and later Savings/Merchant/Cards reuse the same rails without platform redesign.
What launches (🔵 MVP): a thin Gateway/BFF, Identity, Tenancy, Workforce, Payroll (NestJS), KYC/Screening, the Ledger (Go) and Bank Gateway (Go) actually deployed with a bank-sandbox adapter, in-process Notifications, and Kafka carrying real cross-context events. Everything else (Payments-Orchestrator, Reconciliation-as-a-service, Treasury, Fraud, more products) is contract-ready but added post-MVP.
2. Target Architecture Diagram
Labels show what exists when. In the drawn diagram, solid lines mean MVP and dashed lines mean later; the maturity labels below carry the same information.
Frontends (HTTPS/REST to the gateway):
- employer-web 🟢, admin-web 🟢, employee-web 🟢 (fi/merchant-web ⚪ frozen until their product launches)
API Gateway / BFF 🔵 (NestJS at MVP), thin: auth, routing, rate-limit, idempotency-key, API versioning. It reaches the services below over gRPC (sync, request path):
- Identity & Access: 🔵 NestJS → Go ⚪
- Tenancy / Org: 🔵 NestJS
- Workforce (employee): 🔵 NestJS
- Compliance (KYC + Screening): 🔵 NestJS; consulted over gRPC (KYC gate)
- Payroll 🟢🔵 (NestJS → Go ⚪): behind a gRPC proto; posts to the money core over gRPC (post)
Money core:
- Ledger 🟢🔵 (Go): double-entry, immutable; the record of truth (NOT custody)
- Bank Gateway 🟢🔵 (Go): abstraction over partner FIs; Bank Gateway → Ledger
- Payments Orchestrator 🟡: light (no Temporal at MVP); Payments Orchestrator → Bank Gateway
Bank Gateway adapters (adapter interface, same port):
- bank-sandbox 🟢🔵 (dev adapter)
- Dashen / CBE 🟡⚪ (real adapters)
- Telebirr / EthSwitch ⚪
Partner banks HOLD the funds; Bank Gateway moves money via their APIs.
Kafka / Redpanda 🔵 (events + schema registry), with topics such as:
- demoz.payroll.run.approved.v1 · demoz.ledger.entry.posted.v1 · demoz.disbursement.settled.v1 · demoz.kyc.approved.v1
Event consumers:
- 🔵 Notifications: in-process (TS), SMS → 🟡 service
- 🟡 / ⚪ Reconciliation 🟡 · Fraud ⚪ · Treasury ⚪ · Products: EWA/Lending/Equb 🟢(flag) · Savings/Cards ⚪
Platform: Postgres 🔵 · Outbox 🔵 · OTel/Prometheus/Grafana 🟡 · Vault 🟡 · K8s 🟡 · Mesh ⚪
3. Service Map (maturity + language path)
| Service | Maturity | Language path | Why it exists |
|---|---|---|---|
| API Gateway / BFF | 🔵 MVP | NestJS → (Go/Envoy ⚪) | Single front door; decouples frontends from internal topology |
| Identity & Access | 🔵 MVP | NestJS → Go optional ⚪ | Auth/sessions/MFA; the trust spine |
| Tenancy / Org | 🔵 MVP | NestJS | Orgs (business/FI/merchant), members, roles, SoD |
| Workforce (Employee) | 🔵 MVP | NestJS | Employee/contract data: the trust layer |
| Payroll | 🟢 today, 🔵 MVP | NestJS → Go ⚪ behind gRPC | First product; Go migration only after prod validation |
| Ledger | 🟢 built, 🔵 deploy at MVP | Go | Record of money truth (not custody); double-entry, immutable |
| Bank Gateway | 🟢 built, 🔵 deploy at MVP | Go | Abstraction over partner FIs; bank-sandbox now, real banks later |
| KYC / Screening | 🟢 today, 🔵 MVP | NestJS → Go ⚪ (screening) | Gate every product; AML/sanctions |
| Notifications | 🔵 in-process | TS in-proc → Go service 🟡 | SMS at launch; extract when fan-out grows |
| Payments Orchestrator | 🟡 Post-MVP | Go, lightweight → Temporal ⚪ | Saga coordination once flows span multiple services |
| Reconciliation | 🟡 Post-MVP | Go | Daily Ledger ↔ bank-statement matching as a standalone job |
| EWA / Lending / Equb | 🟢 built, flag-gated | NestJS | Launch each when the product launches |
| Treasury | ⚪ Long-term | Go | Liquidity across multiple partner banks |
| Fraud / Risk | ⚪ Long-term | Go | Real-time scoring once volume warrants |
| Wallet | ⚪ future product (not a dependency) | n/a | Only if DemozPay ever offers stored value; not the operating model today |
| Savings / Merchant / QR / Bill / Cards / FX / Remittance | ⚪ Long-term | Go/TS | New products on the stable rails |
4. Bounded Context Map (custody-correct)
- TRUST/IDENTITY 🔵: Identity · Tenancy · Workforce
- MONEY 🔵: Ledger (record of truth) · Bank Gateway (movement via partner FIs) · Payments-Orchestrator 🟡 · Treasury ⚪. Partner banks CUSTODY funds.
- COMPLIANCE 🔵: KYC · Screening · Fraud ⚪ · Audit
- PRODUCT: Payroll 🔵 · EWA/Lending/Equb 🟢(flag) · Savings/Cards ⚪
Custody statement (the rule that prevents drift): customer funds live in partner financial institutions, never in DemozPay. The Ledger is DemozPay's internal double-entry record of what should be and what moved; Reconciliation continuously proves the Ledger against partner-bank statements, which are the custody ground truth. There is no internal wallet holding value. (The wallet:member:… identifiers in the Equb code are ledger read-model keys, not stored balances; see ADR-014.)
Aggregate ownership: an aggregate lives in exactly one context. Employee lives in Workforce; Payroll references it by ID and never imports it. LedgerAccount/Entry live in Ledger; products read a projection, never the table.
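The ownership rule can be illustrated with a small TypeScript sketch. Every name here (`EmployeeId`, `EmployeeUpdated`, `EmployeeProjection`, `PayrollLine`) is hypothetical and not taken from the DemozPay codebase; the point is only that Payroll stores an ID and reads a local, event-fed copy rather than importing Workforce's model.

```typescript
// Hypothetical types for illustration only.

// Branded ID: Payroll can reference an employee without importing the Employee aggregate.
type EmployeeId = string & { readonly __brand: "EmployeeId" };

// Event published by Workforce, the owner of the Employee aggregate.
interface EmployeeUpdated {
  employeeId: EmployeeId;
  displayName: string;
  active: boolean;
}

// Payroll's own data holds only the reference.
interface PayrollLine {
  employeeId: EmployeeId;
  grossMinor: number; // integer amount in minor units
}

// Payroll-local read model fed by events. It is a copy for reads, never the source of truth.
class EmployeeProjection {
  private readonly rows = new Map<EmployeeId, { displayName: string; active: boolean }>();

  apply(event: EmployeeUpdated): void {
    this.rows.set(event.employeeId, { displayName: event.displayName, active: event.active });
  }

  isPayable(id: EmployeeId): boolean {
    const row = this.rows.get(id);
    return row !== undefined && row.active;
  }
}

const employeeProjection = new EmployeeProjection();
const sampleEmployee = "emp_123" as EmployeeId;
employeeProjection.apply({ employeeId: sampleEmployee, displayName: "A. Worker", active: true });

const sampleLine: PayrollLine = { employeeId: sampleEmployee, grossMinor: 1500000 };
console.log(employeeProjection.isPayable(sampleLine.employeeId)); // true
```

Because the projection is eventually consistent, a decision that must be exact at request time would still go through a synchronous call to the owning context.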
5. Communication Matrix
The strict rule:
- gRPC: synchronous, in the user's request path, caller blocked on the answer.
- Kafka: asynchronous, reaction to a fact that already happened; producer doesn't know consumers.
- REST: external boundary only (frontends → gateway, partner-facing public API). Never internal service-to-service.
- Webhook: inbound from external parties (partner banks); verified (HMAC now, mTLS later) and immediately converted to an internal event.
| From → To | Mechanism | Why |
|---|---|---|
| Frontend → Gateway/BFF | REST (+WS) | Browser-native; versioned public surface |
| BFF → any service | gRPC | In request path, needs the answer |
| Payroll → Ledger (post entries) | gRPC | Must confirm balanced + idempotent before proceeding |
| Payroll → Compliance (KYC ok?) | gRPC | Gate decision blocks the flow |
| Payments-Orch → Bank Gateway (initiate) | gRPC | Needs accept/reject synchronously |
| Bank → Bank Gateway (settlement) | Webhook → event | External, async; convert to disbursement.settled |
| Ledger → world ("entry posted") | Kafka | Fact; many react (recon, notify, products) |
| Run approved → EWA/Lending recover | Kafka | Reaction, not in payroll's path |
| Anything → Notifications | Kafka | Pure fan-out, must not block business tx |
| Anything → Audit | Kafka (from outbox) | Async durable record |
MVP note: the mechanisms exist at MVP, but if a "service" is still a module in the shared deployable, its inbound calls are in-process behind the same gRPC-shaped port, so extraction later is a transport swap, not a redesign.
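A minimal sketch of what "transport swap, not a redesign" could look like. The `LedgerPort` interface, its request/response shapes, and the `Transport` function type are assumptions made for illustration; the real contract lives in the proto files, and `Transport` merely stands in for a generated gRPC client call.

```typescript
// Hypothetical port; field names are illustrative, not DemozPay's actual proto.
interface PostEntriesRequest {
  tenantId: string;
  idempotencyKey: string;
  lines: { account: string; amountMinor: number }[];
}
interface PostEntriesResponse {
  ledgerTxId: string;
}
interface LedgerPort {
  postEntries(req: PostEntriesRequest): Promise<PostEntriesResponse>;
}

// MVP: the "service" is a module in the shared deployable, called in-process.
class InProcessLedger implements LedgerPort {
  private seq = 0;
  async postEntries(req: PostEntriesRequest): Promise<PostEntriesResponse> {
    this.seq += 1;
    return { ledgerTxId: `${req.tenantId}-tx-${this.seq}` };
  }
}

// After extraction: same port, remote transport.
type Transport = (req: PostEntriesRequest) => Promise<PostEntriesResponse>;

class RemoteLedger implements LedgerPort {
  private readonly send: Transport;
  constructor(send: Transport) {
    this.send = send;
  }
  postEntries(req: PostEntriesRequest): Promise<PostEntriesResponse> {
    return this.send(req);
  }
}

// The caller depends on the port only, so it is unchanged by the swap.
async function postPayrollRun(ledger: LedgerPort, tenantId: string, runId: string): Promise<string> {
  const res = await ledger.postEntries({
    tenantId,
    idempotencyKey: `run:${runId}`,
    lines: [
      { account: "employer_payable", amountMinor: 1000 },
      { account: "bank_clearing", amountMinor: -1000 },
    ],
  });
  return res.ledgerTxId;
}
```

Swapping `new InProcessLedger()` for `new RemoteLedger(transport)` at the composition root is the only change the caller's wiring sees in this sketch.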
Anti-pattern bans (these create distributed monoliths):
- ❌ No synchronous call chain deeper than 2 (A→B→C). Break with events.
- ❌ No service queries another service's DB. Ever.
- ❌ No REST between internal services.
- ❌ No shared library containing another context's domain logic.
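The webhook rule in this section (verify, then convert straight to an internal event) might look like the sketch below. It uses Node's built-in `node:crypto` module. The hex signature format, the payload fields, and the single `settled` mapping are assumptions; a real partner bank will define its own signing scheme.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical shapes: the payload fields and signature encoding are assumptions.
interface SettlementWebhook {
  disbursementId: string;
  status: "settled" | "failed";
}
interface InternalEvent {
  topic: string;
  key: string;
  payload: SettlementWebhook;
}

function signBody(secret: string, rawBody: string): string {
  return createHmac("sha256", secret).update(rawBody).digest("hex");
}

function verifySignature(secret: string, rawBody: string, signatureHex: string): boolean {
  const expected = Buffer.from(signBody(secret, rawBody), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}

// Verify the raw bytes before parsing, then map to an internal event.
function toInternalEvent(secret: string, rawBody: string, signatureHex: string): InternalEvent | null {
  if (!verifySignature(secret, rawBody, signatureHex)) return null;
  const payload = JSON.parse(rawBody) as SettlementWebhook;
  if (payload.status !== "settled") return null; // this sketch maps settlements only
  return { topic: "demoz.disbursement.settled.v1", key: payload.disbursementId, payload };
}
```

Verification runs against the raw request body rather than re-serialized JSON, since re-serializing can change byte order or whitespace and invalidate a correct signature.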
6. Event Architecture
🔵 At MVP: demoz.<context>.<aggregate>.<event>.v<n> naming; one producer per topic; transactional outbox → Kafka; real consumers wired (the current "publisher with zero consumers" is fixed); protobuf schemas in a Schema Registry with CI backward-compat gates; at-least-once + idempotent consumers; partition key = aggregate ID; per-consumer DLQ + alert.
🟡 Post-MVP: event catalog/registry site; replay tooling; CDC (Debezium) replacing the poller if throughput needs it.
⚪ Long-term: tiered retention for ledger/audit logs; cross-region topic mirroring.
Ownership: the producing context owns the topic, its schema, and its evolution. One producer per topic. Consumers never write to another context's topic.
Versioning: additive changes = same major; breaking = new .v2 topic, dual-publish during migration, retire v1 after consumers move.
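As an illustration of the naming and versioning rules, here is a small sketch of a topic-name parser and a helper that lists the topics to publish to during a breaking-change migration. The function names and the exact character rules in the regular expression are assumptions. The sketch also requires all three middle segments, which is stricter than some of the shorter topic names that appear elsewhere in this document.

```typescript
// Hypothetical helpers; the segment character rules are an assumption.
interface TopicName {
  context: string;
  aggregate: string;
  event: string;
  version: number;
}

const TOPIC_PATTERN =
  /^demoz\.([a-z][a-z0-9-]*)\.([a-z][a-z0-9-]*)\.([a-z][a-z0-9-]*)\.v([1-9][0-9]*)$/;

function parseTopic(name: string): TopicName | null {
  const m = TOPIC_PATTERN.exec(name);
  if (m === null) return null;
  return { context: m[1], aggregate: m[2], event: m[3], version: Number(m[4]) };
}

function formatTopic(t: TopicName): string {
  return `demoz.${t.context}.${t.aggregate}.${t.event}.v${t.version}`;
}

// Breaking change: publish to the current topic and the next major until consumers move.
function dualPublishTopics(current: string): string[] {
  const parsed = parseTopic(current);
  if (parsed === null) throw new Error(`not a valid topic name: ${current}`);
  return [current, formatTopic({ ...parsed, version: parsed.version + 1 })];
}

// Lists the v1 topic followed by its v2 successor.
console.log(dualPublishTopics("demoz.ledger.entry.posted.v1"));
```

Additive changes need none of this: they stay on the same topic and are policed by the schema registry's compatibility check instead.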
Delivery semantics: at-least-once + idempotent consumers (effective-once). Money consumers MUST be idempotent. Ordering: per-aggregate via partition key; cross-aggregate order is never assumed.
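A minimal sketch of "at-least-once + idempotent consumer", assuming each event carries a unique `eventId`. The `EventEnvelope` shape and the in-memory `Set` are illustrative stand-ins; in a real money consumer the dedup record would be written in the same database transaction as the side effect.

```typescript
// Hypothetical envelope; the in-memory Set stands in for a persistent dedup table.
interface EventEnvelope<T> {
  eventId: string;
  aggregateId: string; // also the partition key, which gives per-aggregate ordering
  payload: T;
}

class IdempotentConsumer<T> {
  private readonly seen = new Set<string>();
  private readonly handler: (payload: T) => void;

  constructor(handler: (payload: T) => void) {
    this.handler = handler;
  }

  // Returns true if the handler ran, false if the event was a duplicate delivery.
  handle(event: EventEnvelope<T>): boolean {
    if (this.seen.has(event.eventId)) return false;
    this.handler(event.payload);
    // Mark as seen only after the handler succeeds, so a failed attempt is retried.
    this.seen.add(event.eventId);
    return true;
  }
}

let recoveredMinor = 0;
const repaymentConsumer = new IdempotentConsumer<{ amountMinor: number }>((p) => {
  recoveredMinor += p.amountMinor;
});

const runApproved = { eventId: "evt-1", aggregateId: "run-42", payload: { amountMinor: 500 } };
repaymentConsumer.handle(runApproved);
repaymentConsumer.handle(runApproved); // broker redelivery
console.log(recoveredMinor); // 500
```

The second delivery is acknowledged without re-running the handler, which is what turns at-least-once delivery into effective-once processing.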
7. gRPC Contract Strategy
packages/contracts 🟢 is the law and the highest-governed asset in the repo. Contract-first: proto PR → buf generate (Go + TS committed) → implement. buf breaking-change gate in CI. Consumer-driven contract tests are the safety net that makes the Payroll NestJS → Go swap safe. Every RPC carries tenant + actor + idempotency metadata. This is the mechanism behind every "→ Go" in this document.
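The "tenant + actor + idempotency metadata" requirement can be sketched as a wrapper around a command handler. `RequestMeta` and `IdempotentHandler` are hypothetical names, the in-memory `Map` stands in for a durable store, and scoping the idempotency key by tenant is an assumption about how keys would be namespaced.

```typescript
// Hypothetical metadata shape; real field names come from the proto contracts.
interface RequestMeta {
  tenantId: string;
  actorId: string;
  idempotencyKey: string;
}

class IdempotentHandler<Req, Res> {
  private readonly results = new Map<string, Res>();
  private readonly execute: (meta: RequestMeta, req: Req) => Res;

  constructor(execute: (meta: RequestMeta, req: Req) => Res) {
    this.execute = execute;
  }

  handle(meta: RequestMeta, req: Req): Res {
    if (!meta.tenantId || !meta.actorId || !meta.idempotencyKey) {
      throw new Error("tenant, actor, and idempotency metadata are required");
    }
    // Scope the key by tenant so two tenants can reuse the same key safely.
    const key = `${meta.tenantId}:${meta.idempotencyKey}`;
    if (this.results.has(key)) return this.results.get(key) as Res;
    const res = this.execute(meta, req);
    this.results.set(key, res);
    return res;
  }
}

let postingCount = 0;
const postEntriesRpc = new IdempotentHandler<{ amountMinor: number }, { ledgerTxId: string }>(
  (meta) => {
    postingCount += 1;
    return { ledgerTxId: `${meta.tenantId}-tx-${postingCount}` };
  },
);

const callMeta: RequestMeta = { tenantId: "acme", actorId: "user-7", idempotencyKey: "run-42" };
const firstCall = postEntriesRpc.handle(callMeta, { amountMinor: 1000 });
const retriedCall = postEntriesRpc.handle(callMeta, { amountMinor: 1000 });
console.log(firstCall.ledgerTxId === retriedCall.ledgerTxId, postingCount); // true 1
```

A retry with the same key gets the original response back instead of a second posting, which is the property a client needs when a timeout leaves it unsure whether the first call landed.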
8. Kafka Strategy
| Concern | Decision |
|---|---|
| Broker | 🔵 Redpanda (Kafka API, light ops) → ⚪ managed Kafka at regional scale |
| Topics | demoz.<context>.<aggregate>.<event>.v<n>; one owner |
| Partitions | 6–12 per topic; key by aggregate ID; over-partition money topics |
| Schema | protobuf in registry, BACKWARD compat enforced in CI |
| Producer | Transactional outbox only; never produce directly from business code |
| Consumer | Idempotent (dedup on event ID), bounded retry → DLQ topic |
| Retention | 7d hot for ops; infinite/tiered for ledger + audit (replay + compliance) |
| DLQ | per-consumer *.dlq + replay tooling + alert |
| Exactly-once | Not pursued cross-service; idempotent consumers = effective-once |
Do not stand up a Kafka cluster you don't consume: wire the real consumers at MVP or it isn't an event architecture.
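The "transactional outbox only" rule is easier to see in code. The sketch below simulates a database transaction in memory (work on a copy, swap it in only if the callback completes); `Db`, `OutboxRow`, `inTransaction`, `approveRunWithOutbox`, and `relayOutbox` are all hypothetical names, and a real implementation would use a Postgres transaction and a Kafka producer.

```typescript
// In-memory stand-ins; a real system uses Postgres rows and a broker client.
interface OutboxRow {
  id: number;
  topic: string;
  key: string;
  payload: string;
  publishedAt: number | null;
}
interface Db {
  runs: Map<string, string>;
  outbox: OutboxRow[];
}

// Simulated transaction: mutate a copy and commit only if fn does not throw.
function inTransaction(db: Db, fn: (tx: Db) => void): void {
  const tx: Db = { runs: new Map(db.runs), outbox: db.outbox.map((r) => ({ ...r })) };
  fn(tx);
  db.runs = tx.runs;
  db.outbox = tx.outbox;
}

// Business code writes the state change and the event row together, and never calls the broker.
function approveRunWithOutbox(db: Db, runId: string): void {
  inTransaction(db, (tx) => {
    tx.runs.set(runId, "approved");
    tx.outbox.push({
      id: tx.outbox.length + 1,
      topic: "demoz.payroll.run.approved.v1",
      key: runId, // partition key = aggregate ID
      payload: JSON.stringify({ runId }),
      publishedAt: null,
    });
  });
}

// The relay is the only producer. If publish throws, the row stays unpublished and is retried.
function relayOutbox(db: Db, publish: (topic: string, key: string, payload: string) => void): number {
  let sent = 0;
  for (const row of db.outbox) {
    if (row.publishedAt !== null) continue;
    publish(row.topic, row.key, row.payload);
    row.publishedAt = Date.now();
    sent += 1;
  }
  return sent;
}
```

A crash between `publish` and marking the row would cause a re-send on the next relay pass, which is why the consumer side has to be idempotent.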
9. Database Ownership Matrix
Rule (from MVP onward): one schema owner; no cross-context FK; no cross-context join; references by ID resolved via gRPC or event-fed projections.
| Stage | Reality |
|---|---|
| 🟢 Today | One shared Prisma schema across all domains (the central debt; cross-domain FKs exist) |
| 🔵 MVP | Per-context schemas in one Postgres instance: no cross-context FK/join; references by ID. (Cheap to do pre-launch; the spine of the migration.) |
| 🟡 Post-MVP | Separate databases per extracted service (Ledger, Gateway, Payroll-Go, Notifications) |
| ⚪ Long-term | Ledger partitioned by (tenant, month), read replicas; shard by tenant at extreme scale |
| Service | Database (target) | Owned tables | Cross-context refs (by ID only) |
|---|---|---|---|
| Identity | iam_db | users, sessions, credentials | none |
| Tenancy | tenancy_db | orgs, members, roles | userId |
| Workforce | workforce_db | employees, contracts | orgId, userId |
| Ledger | ledger_db | accounts, transactions, entries | tenantId, external refs |
| Payments-Orch | payments_db | sagas, disbursement_intents | runId, accountId, ledgerTxId |
| Bank Gateway | gateway_db | disbursements, webhook_events, recon_input | partner refs |
| Payroll | payroll_db | runs, entries, rules, mandates | employeeId, orgId |
| KYC/Screening | kyc_db | submissions, screenings | userId, orgId |
| Products (EWA/Lending/Equb) | <product>_db | product aggregates | employeeId, ledgerAccountId |
| Notifications | notif_db | messages, delivery | recipient refs |
| Audit | audit_db | append-only audit log | all (by ID) |
10. Deployment Architecture (progressive: no enterprise infra early)
| Stage | Deployment |
|---|---|
| 🟢 Today | Docker Compose (but Go services aren't in it; fix at MVP) |
| 🔵 MVP | Compose or a single small managed container host / tiny K8s; services co-deployed; Ledger + Bank Gateway run as real processes; secrets via cloud secret manager (not Vault yet); rolling deploys |
| 🟡 Post-MVP | Small managed Kubernetes; namespaces by domain; HPA on CPU + Kafka lag; blue/green for Ledger; per-service CI (Nx affected) |
| ⚪ Long-term | Vault, service mesh (Linkerd) only if mTLS-everywhere / traffic-shifting demands it, canary (Argo), multi-region |
Explicitly deferred: large K8s, service mesh, Temporal, Argo Rollouts. None are MVP. Add each when a concrete pain justifies it.
11. Security Architecture (progressive)
🔵 MVP: OAuth2/OIDC + better-auth sessions at the gateway; centralized auth at the edge; HMAC on internal gRPC; Postgres RLS forced per context, proven under a non-superuser role in CI (today it's only tested under a superuser, which is a launch blocker); cloud secret manager; TLS in transit; PII minimization in events (stop leaking email/PK).
🟡 Post-MVP: Vault dynamic creds; field-level PII encryption; crypto-shred for erasure.
⚪ Long-term: mTLS service identity (supersedes HMAC); OPA policy; PCI-scoped isolated cluster if/when cards ship; NBE data-residency alignment.
12. Observability Architecture (progressive)
- 🔵 MVP: structured JSON logs (trace-correlated), basic health/metrics, a few business alerts (disbursement success rate, DLQ depth, recon breaks).
- 🟡 Post-MVP: full OpenTelemetry tracing across gRPC + Kafka headers, Prometheus + Grafana, SLO alerts, business dashboards.
- ⚪ Long-term: Tempo/Loki at scale, chaos/GameDays.
Don't stand up the full stack before there's traffic to observe.
13. FinTech Compliance Considerations
Source of truth = Ledger (record), reconciled against partner-bank statements (custody truth). Immutability (no DELETE, reversals as new entries) 🟢 enforced in ledger. Idempotency on every money command 🔵. Audit trail in the same tx via outbox 🔵. Reconciliation 🟡 proves money actually moved. SoD/maker-checker 🔵 (CrossKindSoDPolicy). Custody clarity: orchestrator, not custodian (ADR-014); Equb's simulated pool stays honestly labeled until a real partner-bank escrow account exists. Fraud hooks ⚪. NBE/AML reporting ⚪ from the audit log.
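To make "double-entry, immutable, reversals as new entries" concrete, here is a small sketch of an append-only ledger. It is written in TypeScript for consistency with the other examples even though the Ledger itself is a Go service; the class, the account names, and the debit-minus-credit balance convention are all assumptions for illustration.

```typescript
// Hypothetical model; account names and the balance convention are assumptions.
type Direction = "debit" | "credit";
interface Line {
  account: string;
  direction: Direction;
  amountMinor: number; // positive integer, minor units
}
interface LedgerTx {
  id: string;
  lines: Line[];
  reverses?: string; // id of the transaction this one reverses, if any
}

function total(lines: Line[], direction: Direction): number {
  let sum = 0;
  for (const l of lines) if (l.direction === direction) sum += l.amountMinor;
  return sum;
}

class AppendOnlyLedger {
  private readonly log: LedgerTx[] = []; // no update or delete methods exist

  post(tx: LedgerTx): void {
    const valid = tx.lines.every((l) => Number.isInteger(l.amountMinor) && l.amountMinor > 0);
    if (!valid || tx.lines.length < 2 || total(tx.lines, "debit") !== total(tx.lines, "credit")) {
      throw new Error(`transaction ${tx.id} is not a balanced double entry`);
    }
    this.log.push(tx);
  }

  // A correction is a new transaction with every direction flipped.
  reverse(originalId: string, reversalId: string): void {
    const original = this.log.find((t) => t.id === originalId);
    if (original === undefined) throw new Error(`unknown transaction ${originalId}`);
    const flipped = original.lines.map((l): Line => ({
      account: l.account,
      direction: l.direction === "debit" ? "credit" : "debit",
      amountMinor: l.amountMinor,
    }));
    this.post({ id: reversalId, lines: flipped, reverses: originalId });
  }

  balance(account: string): number {
    let net = 0;
    for (const t of this.log) {
      for (const l of t.lines) {
        if (l.account === account) net += l.direction === "debit" ? l.amountMinor : -l.amountMinor;
      }
    }
    return net;
  }

  size(): number {
    return this.log.length;
  }
}
```

After a reversal the account nets to zero while both transactions remain in the log, so the audit trail shows the mistake and its correction.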
14. Scaling Strategy
Bottleneck order: Ledger writes (partition, replicas, batch postings per run, shard by tenant later) → Payroll compute (parallel per employee; the reason to migrate to Go ⚪) → Bank Gateway (IO-bound; bulkhead per partner) → Kafka (partition by aggregate). All services stateless; tenant is the shard key. Redis for sessions/read-models, never money truth. Build none of the sharding now; design the keys so it's possible later.
Target scale assumptions: ~5M users, ~200k businesses, thousands of payroll runs, millions of ledger entries/day.
15. Disaster Recovery Strategy
- 🔵 MVP: daily snapshots + WAL archiving; tested restores; Kafka RF ≥ 3.
- 🟡 Post-MVP: money-service RPO ≈ 0 via streaming replica; documented RTOs; per-failure runbooks.
- ⚪ Long-term: active-passive region; ledger rebuild-from-log; GameDays.
Reconciliation against the partner bank is the ultimate backstop: even after a failure, the bank statement is ground truth for what moved.
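A minimal sketch of the matching step behind that backstop, assuming each disbursement carries a reference that appears on both the Ledger side and the bank statement, and that references are unique. The `Movement` and `ReconReport` shapes are hypothetical; real statements need normalization (currency, fees, value dates) before a comparison like this is meaningful.

```typescript
// Hypothetical shapes; assumes unique references shared by ledger and statement.
interface Movement {
  reference: string;
  amountMinor: number;
}
interface ReconReport {
  matched: string[];
  amountMismatch: string[];
  missingAtBank: string[]; // ledger says it moved, statement does not show it
  missingInLedger: string[]; // statement shows it, ledger has no record
}

function reconcile(ledger: Movement[], statement: Movement[]): ReconReport {
  const bankByRef = new Map<string, number>();
  for (const s of statement) bankByRef.set(s.reference, s.amountMinor);

  const report: ReconReport = { matched: [], amountMismatch: [], missingAtBank: [], missingInLedger: [] };
  for (const l of ledger) {
    const bankAmount = bankByRef.get(l.reference);
    if (bankAmount === undefined) {
      report.missingAtBank.push(l.reference);
    } else if (bankAmount !== l.amountMinor) {
      report.amountMismatch.push(l.reference);
    } else {
      report.matched.push(l.reference);
    }
    bankByRef.delete(l.reference);
  }
  // Whatever is left on the bank side was never recorded in the ledger.
  bankByRef.forEach((_amount, ref) => report.missingInLedger.push(ref));
  return report;
}
```

Any non-empty list other than `matched` is a recon break, which is one of the business alerts listed for MVP observability.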
16. Technology Recommendations (with migration paths)
| Concern | MVP | Evolves to | Migration path |
|---|---|---|---|
| Payroll | NestJS 🔵 | Go ⚪ | Behind stable gRPC proto; swap impl, consumers unaffected |
| Identity | NestJS 🔵 | Go optional ⚪ | Only if auth throughput demands; contract stable |
| Ledger | Go 🟢 | Go (Rust only if proven) | Already Go |
| Bank Gateway | Go 🟢 | Go | Already Go; new bank = new adapter, same port |
| Sync | gRPC + buf 🔵 | same | n/a |
| Async | Redpanda 🔵 | managed Kafka ⚪ | Same Kafka API |
| Orchestration | lightweight (DB state machine + outbox) 🟡 | Temporal ⚪ | Introduce only when workflow complexity/scale justifies |
| Deploy | Compose / small host 🔵 | small K8s 🟡 → mesh ⚪ | Progressive |
| Secrets | cloud secret mgr 🔵 | Vault 🟡 | n/a |
| Observability | logs + basic metrics 🔵 | OTel + Grafana 🟡 | n/a |
Hold the two-language ceiling (TS + Go; ADR-010). Rust is hypothetical, ledger-only, and only if Go can't meet latency.
17. New / Updated ADRs
- ADR-018: Progressive extraction behind stable contracts (refines ADR-001; the launch-first principle).
- ADR-019: Database-per-service; per-context schemas at MVP, separate DBs on extraction (supersedes shared Prisma schema).
- ADR-020: Contract-first integration (buf + schema registry, CI gates).
- ADR-021: Lightweight saga orchestration at MVP; Temporal deferred until justified.
- ADR-022: Kafka naming / ownership / versioning / DLQ / replay (extends ADR-017).
- ADR-023: API Gateway + BFF; REST at edge only.
- ADR-024: Custody model. Partner banks hold funds; Ledger is the record; no internal wallet (hardens ADR-006 + ADR-014). Prevents the Wallet drift from recurring.
- ADR-025: Bank Gateway adapter contract; bank-sandbox is a dev adapter, real FIs plug into the same port.
18. Risks
| Risk | Severity | Mitigation |
|---|---|---|
| Shared-schema carve-out done after launch (with live money) | Critical | Do per-context schemas pre-launch |
| Half-extracted services (Go tier built but undeployed, Kafka unconsumed) | Critical | "Not a service until deployed + wired + tested"; fix at MVP |
| RLS validated only under superuser | Critical | Prove under non-superuser in CI before launch |
| Distributed-monolith (sync chains, cross-context SQL) | High | Enforce boundary rules in CI (Nx) |
| Over-engineering MVP infra | High | Progressive adoption; this doc's maturity labels |
| Wallet/custody confusion with regulators/partners | High | ADR-024; honest Equb labeling; never claim custody |
| Saga money bugs | High | Idempotency + reconciliation backstop now; Temporal later |
19. Trade-offs
Launch-first means accepting temporary co-deployment to gain speed, paying it back via stable contracts (cheap extraction later). Database-per-service loses cross-domain joins (replaced by gRPC/projections) to gain independence. Lightweight orchestration over Temporal trades some durability tooling for far less ops burden, which is acceptable while flows are simple and gets revisited when they aren't. Partner-bank custody trades direct control of funds for not being a regulated custodian, which is the entire DemozPay thesis, not a compromise.
20. Implementation Roadmap
Phase 0: Launch MVP (now → ~8 wks). Carve shared schema → per-context schemas (Identity, Tenancy, Workforce, Payroll, KYC). Deploy Ledger + Bank Gateway (Go) for real; bank-sandbox adapter behind the Gateway abstraction. Wire Kafka with real consumers (payroll → repayment, → notify, → recon-input) + schema registry. Payroll stays NestJS behind a gRPC contract. Fix launch blockers: enable the disabled payroll consumer, fix webhook RLS, prove RLS under non-superuser, fix the gateway auth gap. No K8s / mesh / Temporal / Vault.
Phase 1: First customers (~3–6 mo). Extract Payroll → Go behind its proto (the polyglot proof). Extract Notifications (real service; delete the stub) and Reconciliation. Add Payments-Orchestrator (lightweight). Separate DBs for extracted services. Add OTel tracing, Grafana. Launch EWA/Lending/Equb as their products validate (code exists, flag-gated).
Phase 2: Growth. Small managed Kubernetes, namespaces, HPA, blue/green for Ledger, Vault, per-service CI. Extract Screening/Fraud as load warrants. Introduce Temporal if saga complexity justifies.
Phase 3: Regional scale. DR (tested restores, RF ≥ 3), active-passive region, add a second/third real bank adapter (Dashen/CBE/Telebirr/EthSwitch) behind the Gateway, and introduce Treasury to manage liquidity across them (its first real justification). New products (Savings, Merchant/QR, Bill) on stable rails.
Phase 4: Millions of users. Shard Ledger by tenant; tiered Kafka; CQRS read models; multi-region active-active with data residency; mesh if needed; PCI-scoped cluster if Cards ship; Wallet considered as a product only if stored-value ever becomes the model.
Closing
This blueprint keeps the 10-year platform vision intact while telling the truth about today: payroll on NestJS, no internal wallet, partner banks holding funds, the Ledger recording truth, the Bank Gateway as the swappable abstraction, and infrastructure adopted only as customers and load justify it. The permanent part is the contracts; everything else extracts on a schedule a startup can actually fund.
Related: SYSTEM_OVERVIEW.md · HANDBOOK.md · MONEY_FLOWS.md · BANK_ORCHESTRATION.md · ../adr/