09 — Common mistakes (and how to debug them)
A bestiary of pitfalls a new contributor will hit. Bookmark this page; come back when something doesn't work.
"My query returns zero rows even though there's data"
Most likely cause: you're outside a PrismaTransactionRunner
call, so app.tenant_id is not set, so RLS hides every row.
Diagnostic:
-- Inside psql:
SELECT current_setting('app.tenant_id', true);
-- If this returns NULL or empty string, RLS will hide all rows.
Fix:
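// ❌ Runs outside the runner, so app.tenant_id is never set:
const rows = await prisma.ewa_request.findMany({ ... });
// ✅ Runs inside the runner's transaction, with tenant context:
await runner.runInTransaction(async (tx) => {
  const rows = await tx.ewa_request.findMany({ ... });
});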
Or, for local exploration via psql:
BEGIN;
SET LOCAL app.tenant_id = 'your-business-id';  -- SET LOCAL only takes effect inside a transaction block
SELECT * FROM ewa_request;
COMMIT;
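To see why a missing tenant setting produces zero rows rather than an error, here is a toy in-memory model of the policy check. It assumes the policy compares each row's tenant column against app.tenant_id; the column name tenant_id, the TenantRow type, and the visibleRows function are illustrative and not taken from the migration.

```typescript
// Toy model of a tenant RLS policy. Assumption: the real policy compares a
// tenant column to current_setting('app.tenant_id', true).
type TenantRow = { id: number; tenant_id: string };

// `tenantSetting` stands in for current_setting('app.tenant_id', true):
// null or '' when no transaction has set it.
function visibleRows(rows: TenantRow[], tenantSetting: string | null): TenantRow[] {
  if (tenantSetting === null || tenantSetting === "") {
    // No tenant context: every row is filtered out and no error is raised.
    return [];
  }
  return rows.filter((row) => row.tenant_id === tenantSetting);
}

// Hypothetical sample data for two tenants.
const sampleTable: TenantRow[] = [
  { id: 1, tenant_id: "biz-a" },
  { id: 2, tenant_id: "biz-b" },
];
```

Calling visibleRows(sampleTable, null) returns an empty array, which is the "zero rows" symptom. Passing "biz-a" returns only that tenant's row.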
"POST /api/ewa/requests returns 500 with a session cookie"
Most likely cause: the user has no Organization membership, so
req.user.businessId is undefined, so the tenant context is
empty, so the use case fails when it tries to set the GUC.
Diagnostic:
# In your local DB:
psql "$DATABASE_URL" -c '
SELECT u.id, u.email, s."activeOrganizationId"
FROM "User" u
LEFT JOIN "Session" s ON s."userId" = u.id;
'
If activeOrganizationId is NULL, that's the issue.
Fix (today, manual): create a Business → it creates an
Organization with the same id. Then insert a Member row linking
your User to the Organization. Then call:
curl -X POST http://localhost:8000/api/auth/organization/set-active \
  -H "Cookie: $COOKIE" \
  -H 'content-type: application/json' \
  -d "{\"organizationId\": \"$BUSINESS_ID\"}"   # double quotes so the shell expands $BUSINESS_ID
Fix (eventual): Member auto-creation on first sign-up is
Planned. Until then, the manual dance is required.
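One way to make this failure easier to spot is to reject the request before the use case runs. The sketch below is hypothetical: MissingTenantError, RequestUser, and requireBusinessId are illustrative names, not part of the codebase. The only thing taken from the text above is that req.user.businessId is undefined for a user without an Organization membership.

```typescript
// Hypothetical fail-fast check; not the project's actual guard.
class MissingTenantError extends Error {
  constructor() {
    super("No active organization on this session; set one before calling tenant routes");
    this.name = "MissingTenantError";
  }
}

// Assumed minimal shape of req.user for this example.
type RequestUser = { id: string; businessId?: string };

// Returns the tenant id, or throws a descriptive error that a handler could
// map to a 4xx response instead of letting the use case fail with a 500.
function requireBusinessId(user: RequestUser): string {
  if (!user.businessId) {
    throw new MissingTenantError();
  }
  return user.businessId;
}
```

A check like this turns the opaque 500 into an error that names the actual problem. How it maps to an HTTP status is left open here.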
"Boot fails with 'DATABASE_URL is required'"
Cause: no .env file, or the variable is empty.
Fix: cp .env.example .env and fill it in. See
02-running-locally.md §"Step 3".
"Boot fails with 'JWT_SECRET must be at least 16 characters'"
Cause: the Zod schema enforces a minimum length to prevent accidentally shipping a 4-character dev secret.
Fix:
openssl rand -base64 32
Paste into .env for both JWT_SECRET and
BETTER_AUTH_SECRET.
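The two boot checks above amount to a presence check and a length check. The sketch below mirrors them in plain TypeScript so it runs without dependencies. The project itself uses a Zod schema whose exact shape is not shown on this page, so checkEnv and EnvVars are assumed names. The DATABASE_URL and JWT_SECRET messages copy the boot errors quoted above; the BETTER_AUTH_SECRET wording is an assumption.

```typescript
// Plain-TypeScript stand-in for the env validation; the real code uses Zod.
type EnvVars = Record<string, string | undefined>;

const MIN_SECRET_LENGTH = 16;

// Returns a list of human-readable problems; an empty list means the env is OK.
function checkEnv(env: EnvVars): string[] {
  const problems: string[] = [];
  if (!env["DATABASE_URL"]) {
    problems.push("DATABASE_URL is required");
  }
  for (const key of ["JWT_SECRET", "BETTER_AUTH_SECRET"]) {
    const value = env[key] ?? "";
    if (value.length < MIN_SECRET_LENGTH) {
      problems.push(`${key} must be at least ${MIN_SECRET_LENGTH} characters`);
    }
  }
  return problems;
}
```

Collecting all problems before failing means a fresh checkout reports every missing variable in one pass instead of one per boot attempt.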
"Postgres isn't reachable"
Symptoms: boot fails with postgres unreachable, or
/api/readyz shows postgres down.
Diagnostics:
docker compose ps # is the container up?
docker compose logs postgres | tail -50 # any errors?
psql "$DATABASE_URL" -c 'SELECT 1;' # can YOU reach it?
Common fixes:
- pnpm docker:up if not running.
- Port conflict on 5432 → change the host port in docker-compose.yml and update DATABASE_URL.
- Stale data → pnpm docker:down && pnpm docker:up.
"Migration fails: relation already exists"
Cause: half-applied migration state.
Fix (dev only):
pnpm prisma:reset # wipes and re-applies
Never run prisma:reset against a non-local DB. It wipes the database and there is no soft delete, so all data is gone.
"Migration fails: 'tenant_isolation policy missing on X'"
Cause: the RLS verify guard at the end of
20260526030000_apply_tenant_rls/migration.sql raised because
either:
- A new tenant table was added but not registered in the tenant_tables array, OR
- The migration partially applied and was retried.
Diagnostic:
SELECT relname, relrowsecurity, relforcerowsecurity
FROM pg_class
WHERE relkind = 'r'
AND relname IN ('ewa_request', 'loan', 'outbox_event',
'idempotency_record', 'audit_entry');
Fix: add the missing table to BOTH tenant_tables arrays in
the migration. See 08-how-to-add-things.md
§"Add a new tenant table".
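The diagnostic query above can be turned into a quick check. The sketch below takes the rows that query returns and lists the tables that are either missing or lack an RLS flag. It assumes the guard wants both relrowsecurity and relforcerowsecurity to be true, which this page does not state outright. The function name is illustrative and the table list is the one from the query.

```typescript
// Shape of one row from the pg_class diagnostic query above.
type PgClassRow = {
  relname: string;
  relrowsecurity: boolean;
  relforcerowsecurity: boolean;
};

// Table names copied from the diagnostic query.
const TENANT_TABLES = ["ewa_request", "loan", "outbox_event", "idempotency_record", "audit_entry"];

// Returns expected tables that are absent from the result or have either
// RLS flag switched off (assumption: both flags must be on).
function findUnprotectedTables(expected: string[], rows: PgClassRow[]): string[] {
  const byName = new Map<string, PgClassRow>();
  for (const row of rows) {
    byName.set(row.relname, row);
  }
  return expected.filter((name) => {
    const row = byName.get(name);
    return !row || !row.relrowsecurity || !row.relforcerowsecurity;
  });
}
```

An empty result means every listed table exists with both flags on. Any name returned is a candidate for the missing tenant_tables entry.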
"The outbox publisher logs say 'drained 0 rows' but I see rows"
Cause: the publisher is running under the API role, which is subject to RLS. With no tenant context, it sees zero rows.
Diagnostic:
# Did you set OUTBOX_DATABASE_URL?
grep OUTBOX_DATABASE_URL .env
If not, the publisher uses DATABASE_URL (the API role) and logs a
loud WARN at boot.
Fix:
- Provision the BYPASSRLS role:
psql "$DATABASE_URL" -v publisher_password="'devpass'" \
  -f infra/sql/00_create_outbox_publisher_role.sql
- Add to .env:
OUTBOX_DATABASE_URL="postgresql://demozpay_outbox_publisher:devpass@localhost:5432/demoz_pay_dev?schema=public"
- Restart the API.
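The fallback described above is easy to miss, so here is a minimal sketch of the selection logic. The function name, the types, and the warning text are assumptions. Only the two variable names and the fall-back-with-WARN behavior come from this page.

```typescript
// Assumed minimal view of the environment for this example.
type OutboxEnv = { DATABASE_URL: string; OUTBOX_DATABASE_URL?: string };

type OutboxConnection = { url: string; warning: string | null };

// Prefers the dedicated publisher URL; otherwise falls back to the API role's
// URL and returns a warning the caller can log at boot.
function resolveOutboxConnection(env: OutboxEnv): OutboxConnection {
  if (env.OUTBOX_DATABASE_URL) {
    return { url: env.OUTBOX_DATABASE_URL, warning: null };
  }
  return {
    url: env.DATABASE_URL,
    warning: "OUTBOX_DATABASE_URL is not set; publisher falls back to DATABASE_URL and is subject to RLS",
  };
}
```

If the publisher drains 0 rows, a non-null warning here is the first thing to look for in the boot log.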
"TypeScript can't find @demoz-pay/ewa"
Cause: Nx project graph cache stale, or the path alias isn't registered.
Diagnostic:
pnpm nx graph # opens an interactive graph view
jq '.compilerOptions.paths' tsconfig.base.json # is the alias listed?
Fix:
pnpm nx reset # clears Nx cache
pnpm install # re-link workspace deps
"ESLint error: 'cross-domain import not allowed'"
Cause: ADR-011 — domain packages can't import each other. You
tried to import from @demoz-pay/lending inside @demoz-pay/ewa
or vice versa.
Fix: communicate via outbox events instead. Read 07-rules-and-patterns.md §"Pattern C".
"I get a 401 on every route except /healthz"
Cause: this is correct behavior. AuthGuard is registered as
APP_GUARD. Every route is protected unless decorated @Public().
Diagnostic: the route should not be public; you need to sign in.
Fix: sign up via /api/auth/sign-up/email and use the returned
cookie. See 02-running-locally.md §"Step 7".
"I get a 500 on /api/auth/sign-up/email"
Most likely causes:
- DB not migrated: better-auth tables don't exist.
  - Fix: pnpm prisma:migrate.
- BETTER_AUTH_SECRET is too short.
  - Fix: rotate to a ≥16-char secret.
- Email/password conflict (user already exists).
  - Fix: use a different email or look up the user in DB.
"The phone-OTP send-otp endpoint succeeds but I never get a code"
Cause: SMS_PROVIDER=logger (the default) prints the OTP to
the API log instead of sending an SMS. This is intentional in dev.
Fix: check the API log for a line like:
[DEV-OTP] phone=+251xxxxxxx code=123456
For production, you need an AfricasTalking adapter
(Planned). SMS_PROVIDER=africastalking is reserved for that
implementation.
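A minimal sketch of how a provider switch like this could be wired. The SmsSender interface, LoggerSmsSender class, and createSmsSender factory are hypothetical names. The log format copies the [DEV-OTP] line above, and the africastalking branch throws because that adapter is still Planned.

```typescript
// Hypothetical provider interface; the real one may differ.
interface SmsSender {
  sendOtp(phone: string, code: string): void;
}

// Dev-only sender. It keeps lines in memory here so the example is testable;
// the real logger provider writes to the API log.
class LoggerSmsSender implements SmsSender {
  readonly lines: string[] = [];

  sendOtp(phone: string, code: string): void {
    this.lines.push(`[DEV-OTP] phone=${phone} code=${code}`);
  }
}

// Picks a sender from the SMS_PROVIDER value, defaulting to the logger.
function createSmsSender(provider: string | undefined): SmsSender {
  const name = provider ?? "logger";
  if (name === "logger") {
    return new LoggerSmsSender();
  }
  if (name === "africastalking") {
    throw new Error("africastalking adapter is not implemented yet");
  }
  throw new Error(`Unknown SMS_PROVIDER: ${name}`);
}
```

With the default in place, sendOtp never reaches a phone, which is why the endpoint succeeds while no SMS arrives.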
"An audit row didn't get written"
Cause: the use case bypassed the runner, so the audit emitter wasn't given a tx handle, so it never inserted.
Diagnostic: search the use case for prisma.<table>.create()
calls outside runner.runInTransaction(...).
Fix: wrap all state changes in runner.runInTransaction. The
audit + outbox + state changes commit together or roll back together.
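To illustrate "commit together or roll back together", here is an in-memory stand-in for a transaction runner. It is not PrismaTransactionRunner: the MemoryStore shape, the runInMemoryTransaction function, and the row contents are all hypothetical. Writes go to a staged copy and only replace the real store if the callback finishes without throwing.

```typescript
// Hypothetical in-memory "database" with the three kinds of rows.
type MemoryStore = { requests: string[]; audit: string[]; outbox: string[] };

// Runs `work` against a staged copy. If `work` throws, the real store is left
// untouched (rollback); otherwise all staged writes are applied (commit).
function runInMemoryTransaction(store: MemoryStore, work: (tx: MemoryStore) => void): void {
  const staged: MemoryStore = {
    requests: store.requests.slice(),
    audit: store.audit.slice(),
    outbox: store.outbox.slice(),
  };
  work(staged); // an exception here skips the commit below
  store.requests = staged.requests;
  store.audit = staged.audit;
  store.outbox = staged.outbox;
}

const memoryDb: MemoryStore = { requests: [], audit: [], outbox: [] };

// State change, audit row, and outbox event all go through the same handle.
runInMemoryTransaction(memoryDb, (tx) => {
  tx.requests.push("req-1");
  tx.audit.push("req-1 created");
  tx.outbox.push("ewa.requested");
});
```

A write made outside the callback has no staged handle to attach an audit row to, which is the failure mode described above.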
"Tests pass locally but CI fails"
Most likely causes:
- Hidden file-system case-sensitivity. macOS is case-insensitive by default; Linux is case-sensitive. Verify import paths match file casing exactly.
- Non-deterministic test order. Tests that depend on database state from a previous test will fail when test order changes.
- Time-zone differences. CI runs in UTC; your machine probably doesn't. Always use .toUTC() or compare with an explicit timezone.
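As an illustration of the time-zone pitfall, the sketch below derives a calendar day from a timestamp in UTC, so the result is the same on a laptop and in CI. It uses the built-in Date rather than whichever library provides .toUTC(); the helper name utcDayKey is made up for this example.

```typescript
// Returns the calendar day in UTC as YYYY-MM-DD. toISOString() always
// renders in UTC, so the machine's local time zone does not affect it.
function utcDayKey(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// 00:30 UTC on 1 January is still 31 December in any zone west of UTC, so
// local accessors such as getDate() can disagree between machines.
const justAfterMidnightUtc = new Date("2026-01-01T00:30:00Z");
const dayKey = utcDayKey(justAfterMidnightUtc);
```

Assertions written against dayKey hold in every time zone, whereas assertions written against getDate() or toLocaleDateString() hold only where the test author happens to live.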
"I committed something I shouldn't have"
If not pushed yet:
git reset --soft HEAD~1 # uncommit but keep changes
# now you can edit / split / re-commit
If pushed but no one has pulled:
git revert HEAD # creates an "undo" commit
git push
If pushed and merged and people pulled:
- Don't try to rewrite history.
- Open a new PR that undoes the change.
- Talk to the team.
"Where do I find logs from the API in production?"
Today: no production deployment exists. Logs go to stdout.
When production lands, logs will flow through Pino → JSON → log collector (TBD: Loki / Datadog / equivalent). Trace IDs from OpenTelemetry will be the join key between traces and logs.
"The Go ledger says 'connection refused'"
Cause: services/ledger isn't running.
Diagnostic:
# In services/ledger:
go run ./cmd/ledger # requires Go 1.22+ AND buf generate already run
Fix: the gRPC ledger probe in the API is advisory, not fatal — the API will boot and serve everything except money operations. If you need money operations, you need the ledger running.
Easier path for testing the schema + DB invariants without a Go toolchain:
./services/ledger/test/verify.sh
This brings up a hermetic Postgres on :5434, applies migrations,
and runs the integration tests (which exercise the store layer
directly — no Go server needed).
"I want to roll back a migration"
Don't. Forward-only migrations are the rule (informally; not yet an explicit ADR but implied by ADR-009).
The right approach when a migration is wrong:
- Write a new migration that compensates (drop_column → add_column_back).
- If data was lost, restore from backup.
prisma migrate reset is dev-only.
"I'm reading code and I don't understand why something exists"
Order of investigation:
- Search the ADRs. grep -ri 'thing-i-dont-understand' docs/adr/
- Read the closest README. Most packages have one.
- Look at the test. Tests are documentation of expected behavior.
- Read the commit message. git log --follow --pretty=format:'%h %s' -- path/to/file
- Ask in the team channel. Don't suffer in silence.
If you find a piece of code that doesn't explain itself and has no ADR / no comment / no test, add a comment in the PR that introduces your change. The codebase improves through readers.
"I'm overwhelmed by the codebase"
That's the right reaction. Per the status matrix, the codebase does roughly 30% of what its docs claim. Most of the heavy work is in the invariants, the tenant model, and the ledger schema: narrow surface, deep correctness.
Start small:
- Read apps/api/src/main.ts end-to-end. Just that one file.
- Trace one request through the layers (e.g. GET /api/healthz).
- Look at one use case (packages/ewa/backend/application/request-ewa.usecase.ts).
- Look at the test for it.
Then come back to this onboarding suite and start 07-rules-and-patterns.md. You'll get further.
Continue reading
You've finished the suite. From here:
- ../adr/ for the why behind every non-obvious choice.
- ../../INTERN_PROGRAM.md for the 16-week plan if you're an intern.
- ../../README.md and ../../CLAUDE.md for quick-reference cheat sheets.
- The codebase itself: there's no substitute.