Runbook: Bank statement parse failed
Owner: integration team
Severity: MEDIUM-HIGH. Reconciliation cannot complete for the affected partner+period until parsing succeeds. While reconciliation is blocked, the drift detector is blind for that account; a failure sustained for multiple days breaks the S4.6 soak gate.
Symptom
Any one of:
- The `DashenIngester.Parse` call returns `error` (not per-line errors, but a top-level failure). The caller logs `"statement parse failed: <reason>"` at ERROR.
- `IngestResult.Errors` contains > 10% of `LinesRead` (per-row failures: the file parsed but a large share of its rows were unusable).
- No rows appear in `bank_statement_line` for a partner after an ingestion run that we know was triggered.
- Reconciliation matcher reports zero candidates for a partner+period that should have settlement traffic (downstream symptom of zero ingested lines).
The two failure modes are distinct and need different responses:
- Top-level parse error → the file doesn't fit the contract (missing headers, empty payload, encoding bug). Nothing was inserted.
- High per-line error rate → the file's shape matches but the data is bad (negative amounts, missing references, non-RFC-3339 dates). Some rows might have been inserted; the rejected rows were dropped.
Likely causes
1. Partner shipped a new statement format (added/renamed/removed columns, switched delimiter to TSV, changed date format).
2. File is not the file we asked for: operator pulled the wrong day's report, or downloaded the HTML "summary" page instead of the CSV export.
3. Encoding mismatch: partner shipped UTF-16 BOM-ed CSV; our parser expects UTF-8.
4. The CSV is technically valid but values are out-of-policy: negative amounts (refund/chargeback lines), zero amounts (placeholder rows), empty `partner_reference` columns for fees.
5. Quoting bug at the partner side: embedded commas inside a description field with broken escaping confuse the row splitter.
6. An ingester regression on our side: a release inadvertently tightened the schema.
Diagnosis steps
- Capture the raw file. The ingester preserves `RawPayload` at the line level after a successful parse, but a top-level failure means there's nothing in the DB yet. Find the source artifact (S3 bucket / partner SFTP drop) and download the exact bytes the ingester saw. Keep them: they're evidence for the partner if this turns out to be their bug.
- Identify the failure mode by reading the error:
  - `"statement missing required columns"` → cause 1 (columns).
  - `"empty statement payload"` → cause 2 (wrong file).
  - `"read header"` followed by a binary-looking byte sequence → cause 3 (encoding).
  - `"value_date %q"` / `"partner_reference is empty"` / `"amount_santim must be positive"` per-row → cause 4 (data quality). Count them: if > 10% of rows, it's not noise.
  - `"csv read"` per-row with `error: bare \" in non-quoted-field` → cause 5 (quoting).
- Run the ingester locally against the captured file.

  ```sh
  cd services/integration-gateway
  # one-off harness: replace with `cat path/to/file.csv | go run ./cmd/test-ingest`
  # once that tool exists; until then, paste into a *_test.go and run.
  go test -v -run TestDashenIngester_ParsesWellFormedCSV ./internal/reconciliation/
  ```
- For cause 1 (new columns): diff against the previously-known-good file (keep one per partner in `infra/test-fixtures/statements/`). Document the new schema; the columns that matter are `value_date`, `partner_reference`, `amount_santim`. Anything else is optional and tolerated.
- For cause 3 (encoding): `file path/to/file.csv` reports the detected encoding. `iconv -f UTF-16 -t UTF-8` reproduces the parse-ready form.
- For cause 4 (data quality): extract the rejected lines from the per-row errors. Pattern-match the reasons; group by reason; ask the partner why those rows exist.
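The error-to-cause lookup and the "group by reason" step above could be scripted roughly as follows. This is a sketch under assumptions: the message fragments are copied from this runbook, but `causeRules`, `causeFor` and `tally` are hypothetical helpers, and plain substring matching is a simplification (for instance, it cannot check that `"read header"` is followed by binary-looking bytes).

```go
package main

import (
	"fmt"
	"strings"
)

// causeRules pairs a message fragment from the diagnosis list with its
// cause number under "Likely causes". The first matching rule wins.
var causeRules = []struct {
	fragment string
	cause    int
}{
	{"statement missing required columns", 1},
	{"empty statement payload", 2},
	{"read header", 3}, // approximate: does not inspect the trailing bytes
	{"value_date", 4},
	{"partner_reference is empty", 4},
	{"amount_santim must be positive", 4},
	{"csv read", 5},
}

// causeFor returns the likely cause number, or 0 when no rule matches and
// the message needs a human to read it.
func causeFor(msg string) int {
	for _, r := range causeRules {
		if strings.Contains(msg, r.fragment) {
			return r.cause
		}
	}
	return 0
}

// tally counts messages per cause so the per-row share can be compared
// with the 10% threshold.
func tally(msgs []string) map[int]int {
	counts := make(map[int]int)
	for _, m := range msgs {
		counts[causeFor(m)]++
	}
	return counts
}

func main() {
	// Sample messages are made up for illustration.
	msgs := []string{
		`value_date "31/12/2024"`,
		"partner_reference is empty",
		`csv read: bare " in non-quoted-field`,
		"something unexpected",
	}
	for cause, n := range tally(msgs) {
		fmt.Printf("cause %d: %d message(s)\n", cause, n)
	}
}
```

Messages that land in bucket 0 match none of the listed fragments and deserve a manual read before anything is concluded.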
Mitigation
- Top-level failure that means NO rows inserted: do nothing in the DB. Reconciliation for this period is stale; flag the operator that today's run was skipped.
- High per-line error rate: roll back the partial ingestion. The store side enforces ON CONFLICT idempotency on `(tenant_id, partner, partner_reference)`, so a successful re-ingest with a fixed parser is safe: it'll add the missing rows and skip the ones already present. The flagged-but-not-inserted rows are still missing; once you understand WHY they were rejected, decide whether to ingest them or to leave them rejected.
- Encoding issue (cause 3): convert the file once with `iconv` and re-run. Don't ship a UTF-16 path into the parser; that's a permanent invitation to break.
- Wrong file (cause 2): re-pull the correct file. Don't try to coerce the wrong one.
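To see why a re-ingest after a partial run is safe, here is a small in-memory model of the idempotency guard. It is illustrative only: `memStore` stands in for the `bank_statement_line` table, the skip-on-conflict behaviour is assumed from "skip the ones already present", and all type names and sample values are hypothetical.

```go
package main

import "fmt"

// lineKey mirrors the conflict target named in this runbook:
// (tenant_id, partner, partner_reference).
type lineKey struct {
	TenantID         string
	Partner          string
	PartnerReference string
}

// statementLine is a hypothetical, trimmed-down statement row.
type statementLine struct {
	Key          lineKey
	AmountSantim int64
}

// memStore is an in-memory stand-in for the bank_statement_line table.
type memStore struct {
	rows map[lineKey]statementLine
}

func newMemStore() *memStore {
	return &memStore{rows: make(map[lineKey]statementLine)}
}

// ingest inserts each line unless its key already exists, which models
// skip-on-conflict semantics. It never overwrites an existing row.
func (s *memStore) ingest(lines []statementLine) (added, skipped int) {
	for _, l := range lines {
		if _, exists := s.rows[l.Key]; exists {
			skipped++
			continue
		}
		s.rows[l.Key] = l
		added++
	}
	return added, skipped
}

func main() {
	mk := func(ref string, amount int64) statementLine {
		key := lineKey{TenantID: "t1", Partner: "dashen", PartnerReference: ref}
		return statementLine{Key: key, AmountSantim: amount}
	}
	full := []statementLine{mk("R1", 100), mk("R2", 250), mk("R3", 75)}

	s := newMemStore()
	added, skipped := s.ingest(full[:1]) // partial first run: only R1 landed
	fmt.Println("first run:", added, "added,", skipped, "skipped")

	added, skipped = s.ingest(full) // re-ingest the whole file
	fmt.Println("re-ingest:", added, "added,", skipped, "skipped")
}
```

Running the whole file a second time adds only the rows that were missing, which is the property the mitigation relies on.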
DO NOT patch the parser to silently accept malformed rows. The ingester's strict reject-and-flag behaviour is the whole reconciliation value-add — a "fixed" parser that silently drops rows would make drift detection blind.
Resolution
- Cause 1: write a follow-up test fixture against the new format, ship a parser update, deploy. Add a release-time fixture check that asserts a known-good file from each active partner still parses.
- Cause 2: harden the operator workflow that fetches the file, usually a `Makefile` target or a Lambda. Pin to a specific URL pattern that's unambiguous about the artifact type.
- Cause 3: add encoding detection at the boundary; `golang.org/x/text/encoding/unicode` lets us detect the BOM and convert UTF-16 to UTF-8 (`golang.org/x/text/encoding/charmap` only covers single-byte code pages, so it cannot handle UTF-16).
- Cause 4: file partner-side ticket. Negative/zero amount lines are often "informational" entries that don't belong in our reconciliation set; adjust the contract docs to filter them at ingest time rather than per-row reject.
- Cause 5: file partner-side ticket; in the meantime, ingest with `csv.Reader.LazyQuotes = true` (already set in our ingester) and a sanity check that the partner-reference field still parses cleanly.
- Cause 6: revert the regression. Add the test case that would have caught it.
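For the cause 3 fix, boundary-level encoding detection might look like the sketch below. It is a standard-library-only illustration that handles BOM-prefixed input and nothing else; a production version would more likely use a maintained package. `toUTF8` and `decodeUTF16` are hypothetical names, and payloads without a BOM are simply assumed to be UTF-8 already.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"unicode/utf16"
)

// toUTF8 normalises a statement payload to UTF-8 without a BOM before it
// reaches the parser. Only BOM-prefixed input is detected; anything else
// is returned unchanged on the assumption that it is already UTF-8.
func toUTF8(raw []byte) ([]byte, error) {
	switch {
	case bytes.HasPrefix(raw, []byte{0xEF, 0xBB, 0xBF}): // UTF-8 BOM
		return raw[3:], nil
	case bytes.HasPrefix(raw, []byte{0xFF, 0xFE}): // UTF-16 little-endian BOM
		return decodeUTF16(raw[2:], binary.LittleEndian)
	case bytes.HasPrefix(raw, []byte{0xFE, 0xFF}): // UTF-16 big-endian BOM
		return decodeUTF16(raw[2:], binary.BigEndian)
	default:
		return raw, nil
	}
}

// decodeUTF16 converts BOM-less UTF-16 bytes in the given byte order to UTF-8.
func decodeUTF16(b []byte, order binary.ByteOrder) ([]byte, error) {
	if len(b)%2 != 0 {
		return nil, errors.New("utf-16 payload has an odd byte length")
	}
	units := make([]uint16, len(b)/2)
	for i := range units {
		units[i] = order.Uint16(b[2*i:])
	}
	return []byte(string(utf16.Decode(units))), nil
}

func main() {
	// "a,b\n" encoded as UTF-16 LE with a BOM; sample data for illustration.
	raw := []byte{0xFF, 0xFE, 'a', 0x00, ',', 0x00, 'b', 0x00, '\n', 0x00}
	out, err := toUTF8(raw)
	if err != nil {
		fmt.Println("convert failed:", err)
		return
	}
	fmt.Printf("%q\n", out)
}
```

Keeping the conversion in front of the parser means the parser itself still only ever sees UTF-8, in line with the mitigation advice not to ship a UTF-16 path into it.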
Escalation
- 4 hours without diagnosis → page the platform on-call.
- 24 hours with no successful ingest for a partner → notify finance/ops; the drift detector is blind for that partner.
- 48 hours during the S4.6 soak window → release is blocked; notify engineering lead + finance lead.
- Cause 1 (new format) during partner-side-driven schema migration → no escalation needed; this is normal partner ops. File a ticket; ship the parser update on the next release.
Related
- Ingester: `services/integration-gateway/internal/reconciliation/dashen_ingester.go`.
- Per-line error model: `LineError` in `services/integration-gateway/internal/reconciliation/types.go`.
- Storage: `bank_statement_line` table, migration `services/integration-gateway/migrations/0002_bank_statement_line.up.sql`. The ON CONFLICT on `(tenant_id, partner, partner_reference)` is the load-bearing idempotency guard for re-ingest.
- Matcher (consumes successfully-ingested rows): `services/integration-gateway/internal/reconciliation/matcher.go`. Once ingestion succeeds, the next matcher run picks up the new rows automatically.