
Runbook: Bank statement parse failed

Owner: integration team
Severity: MEDIUM-HIGH. Reconciliation cannot complete for the affected partner+period until parsing succeeds. While reconciliation is blocked, the drift detector is blind for that account; if this persists for multiple days, it breaks the S4.6 soak gate.

Symptom

Any one of:

  • The DashenIngester.Parse call returns error (not per-line errors, but a top-level failure). The caller logs "statement parse failed: <reason>" at ERROR.
  • IngestResult.Errors holds entries for more than 10% of LinesRead (per-row failures: the file parsed, but a significant share of rows were unusable).
  • No rows appear in bank_statement_line for a partner after an ingestion run that we know was triggered.
  • Reconciliation matcher reports zero candidates for a partner+period that should have settlement traffic (downstream symptom of zero ingested lines).

The two failure modes are distinct and need different responses:

  • Top-level parse error → the file doesn't fit the contract (missing headers, empty payload, encoding bug). Nothing was inserted.
  • High per-line error rate → the file's shape matches but the data is bad (negative amounts, missing references, non-RFC-3339 dates). Some rows might have been inserted; most were dropped.
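
The split between the two modes can be sketched in Go. This is a minimal illustration, not the ingester's code: classifyIngest, FailureMode, and errEmptyPayload are hypothetical names, and the function takes plain counts rather than assuming the exact shape of IngestResult. The 10% threshold is the one from the Symptom list.

```go
package main

import (
	"errors"
	"fmt"
)

// FailureMode is a hypothetical label for the two failure modes in this runbook.
type FailureMode string

const (
	ModeOK       FailureMode = "ok"
	ModeTopLevel FailureMode = "top-level parse error"
	ModePerLine  FailureMode = "high per-line error rate"
)

// errEmptyPayload is a sample top-level error used only for the demo below.
var errEmptyPayload = errors.New("empty statement payload")

// classifyIngest maps the outcome of one ingestion run onto a failure mode.
// parseErr is the top-level error (nil if the file parsed), linesRead is the
// number of rows read, and lineErrors is the number of per-row failures.
func classifyIngest(parseErr error, linesRead, lineErrors int) FailureMode {
	if parseErr != nil {
		return ModeTopLevel
	}
	// Strictly more than 10% of rows rejected; integer math avoids float rounding.
	if linesRead > 0 && lineErrors*10 > linesRead {
		return ModePerLine
	}
	return ModeOK
}

func main() {
	fmt.Println(classifyIngest(errEmptyPayload, 0, 0))
	fmt.Println(classifyIngest(nil, 200, 35))
	fmt.Println(classifyIngest(nil, 200, 20))
}
```

A run with exactly 10% rejected rows is treated as noise here, because the symptom is stated as "> 10%".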

Likely causes

  1. Partner shipped a new statement format (added/renamed/removed columns, switched delimiter to TSV, changed date format).
  2. File is not the file we asked for — operator pulled the wrong day's report, or downloaded the HTML "summary" page instead of the CSV export.
  3. Encoding mismatch — partner shipped UTF-16 BOM-ed CSV; our parser expects UTF-8.
  4. The CSV is technically valid but values are out-of-policy — negative amounts (refund/chargeback lines), zero amounts (placeholder rows), empty partner_reference columns for fees.
  5. Quoting bug at the partner side — embedded commas inside a description field with broken escaping confuses the row splitter.
  6. An ingester regression on our side — a release inadvertently tightened the schema.
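
Cause 1 is the easiest to check mechanically. The sketch below reads only the header row and reports which of the three required columns (value_date, partner_reference, amount_santim) are absent. It assumes headers are matched by exact name, which may not be how the real ingester does it; missingColumns and requiredColumns are hypothetical names.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// requiredColumns are the three columns this runbook names as mandatory.
var requiredColumns = []string{"value_date", "partner_reference", "amount_santim"}

// missingColumns reads only the header row of a CSV payload and returns the
// required columns that are absent. Extra columns are tolerated.
func missingColumns(payload string) ([]string, error) {
	r := csv.NewReader(strings.NewReader(payload))
	header, err := r.Read()
	if err != nil {
		return nil, fmt.Errorf("read header: %w", err)
	}
	present := make(map[string]bool, len(header))
	for _, name := range header {
		present[strings.TrimSpace(name)] = true
	}
	var missing []string
	for _, name := range requiredColumns {
		if !present[name] {
			missing = append(missing, name)
		}
	}
	return missing, nil
}

func main() {
	good := "value_date,partner_reference,amount_santim,description\n"
	renamed := "value_date,reference,amount_santim\n"
	tsv := "value_date\tpartner_reference\tamount_santim\n"
	for _, payload := range []string{good, renamed, tsv} {
		missing, err := missingColumns(payload)
		fmt.Println(missing, err)
	}
}
```

A switch to TSV shows up as all three columns missing, because a comma-delimited reader sees the whole header as one field.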

Diagnosis steps

  1. Capture the raw file. The ingester preserves RawPayload at the line level after a successful parse, but a top-level failure means there's nothing in the DB yet. Find the source artifact (S3 bucket / partner SFTP drop) and download the exact bytes the ingester saw. Keep them — they're evidence for the partner if this turns out to be their bug.
  2. Identify the failure mode by reading the error:
    • "statement missing required columns" → cause 1 (columns).
    • "empty statement payload" → cause 2 (wrong file).
    • "read header" followed by a binary-looking byte sequence → cause 3 (encoding).
    • "value_date %q" / "partner_reference is empty" / "amount_santim must be positive" per-row → cause 4 (data quality). Count them: if > 10% of rows, it's not noise.
    • "csv read" per-row with error: bare " in non-quoted-field → cause 5 (quoting).
  3. Run the ingester locally against the captured file.
    cd services/integration-gateway
    # one-off harness — replace with `cat path/to/file.csv | go run ./cmd/test-ingest`
    # once that tool exists; until then, paste into a *_test.go and run.
    go test -v -run TestDashenIngester_ParsesWellFormedCSV ./internal/reconciliation/
  4. For cause 1 (new columns): diff against the previously-known-good file (keep one per partner in infra/test-fixtures/statements/). Document the new schema; the columns that matter are value_date, partner_reference, amount_santim — anything else is optional and tolerated.
  5. For cause 3 (encoding): running file path/to/file.csv reports the detected encoding, and iconv -f UTF-16 -t UTF-8 path/to/file.csv writes the parse-ready form to stdout.
  6. For cause 4 (data quality): extract the rejected lines from the per-row errors. Pattern-match the reasons; group by reason; ask the partner why those rows exist.
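
Steps 2 and 6 amount to matching error text against known substrings and counting by cause. A minimal sketch, assuming the messages contain the substrings quoted in step 2; causeFor, countByCause, and the sample messages in main are made up for illustration. A "read header" match only suggests cause 3 if the message also shows binary-looking bytes, so treat that rule as a hint.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// causeRules pairs a substring of the error message with the cause number
// from the "Likely causes" list. Rules are checked in order.
var causeRules = []struct {
	substr string
	cause  int
}{
	{"statement missing required columns", 1},
	{"empty statement payload", 2},
	{"read header", 3},
	{"value_date", 4},
	{"partner_reference is empty", 4},
	{"amount_santim must be positive", 4},
	{"csv read", 5},
}

// causeFor returns the likely cause for one error message, or 0 if unknown.
func causeFor(msg string) int {
	for _, rule := range causeRules {
		if strings.Contains(msg, rule.substr) {
			return rule.cause
		}
	}
	return 0
}

// countByCause tallies error messages per likely cause.
func countByCause(msgs []string) map[int]int {
	counts := make(map[int]int)
	for _, m := range msgs {
		counts[causeFor(m)]++
	}
	return counts
}

func main() {
	// Made-up sample messages; real ones come from the ingester's per-row errors.
	msgs := []string{
		`line 4: amount_santim must be positive`,
		`line 9: partner_reference is empty`,
		`line 12: csv read: bare " in non-quoted-field`,
		`line 15: value_date "03/01/2025"`,
	}
	counts := countByCause(msgs)
	causes := make([]int, 0, len(counts))
	for c := range counts {
		causes = append(causes, c)
	}
	sort.Ints(causes)
	for _, c := range causes {
		fmt.Printf("cause %d: %d message(s)\n", c, counts[c])
	}
}
```

Messages that land in cause 0 are the ones worth reading by hand; they do not match anything this runbook knows about.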

Mitigation

  1. Top-level failure (no rows were inserted): do nothing in the DB. Reconciliation for this period is stale; tell the operator that today's run was skipped.
  2. High per-line error rate: roll back the partial ingestion. The store side enforces ON CONFLICT idempotency on (tenant_id, partner, partner_reference), so a successful re-ingest with a fixed parser is safe — it'll add the missing rows and skip the ones already present. The flagged-but-not-inserted rows are still missing; once you understand WHY they were rejected, decide whether to ingest them or to leave them rejected.
  3. Encoding issue (cause 3): convert the file once with iconv and re-run. Don't ship a UTF-16 path into the parser; that's a permanent invitation to breakage.
  4. Wrong file (cause 2): re-pull the correct file. Don't try to coerce the wrong one.
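
Why a re-ingest is safe can be shown with an in-memory model of the idempotency key. This is a sketch only: memStore stands in for bank_statement_line, all type and function names are hypothetical, and it assumes the conflict clause behaves like ON CONFLICT DO NOTHING (existing rows are skipped, not updated).

```go
package main

import "fmt"

// lineKey mirrors the idempotency key named in this runbook:
// (tenant_id, partner, partner_reference).
type lineKey struct {
	TenantID         string
	Partner          string
	PartnerReference string
}

// statementLine is a simplified row; the real table has more columns.
type statementLine struct {
	Key          lineKey
	AmountSantim int64
}

// memStore is an in-memory stand-in for the bank_statement_line table.
type memStore struct {
	rows map[lineKey]statementLine
}

func newMemStore() *memStore {
	return &memStore{rows: make(map[lineKey]statementLine)}
}

// ingest inserts each line unless its key already exists, which models
// INSERT ... ON CONFLICT DO NOTHING (an assumption about the real clause).
func (s *memStore) ingest(lines []statementLine) (inserted, skipped int) {
	for _, l := range lines {
		if _, exists := s.rows[l.Key]; exists {
			skipped++
			continue
		}
		s.rows[l.Key] = l
		inserted++
	}
	return inserted, skipped
}

func main() {
	key := func(ref string) lineKey {
		return lineKey{TenantID: "tenant-1", Partner: "dashen", PartnerReference: ref}
	}
	store := newMemStore()

	// First run: only two of three rows survived the (buggy) parser.
	partial := []statementLine{{Key: key("A"), AmountSantim: 100}, {Key: key("B"), AmountSantim: 200}}
	fmt.Println(store.ingest(partial))

	// Re-ingest after the fix: the full file, including the rows already stored.
	full := append(partial, statementLine{Key: key("C"), AmountSantim: 300})
	fmt.Println(store.ingest(full))
}
```

The second run adds only the row that was missing and leaves the two existing rows untouched.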

DO NOT patch the parser to silently accept malformed rows. The ingester's strict reject-and-flag behaviour is the whole reconciliation value-add — a "fixed" parser that silently drops rows would make drift detection blind.
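
Reject-and-flag can be sketched as follows. The point is the invariant: every row read is either accepted or recorded as rejected, never dropped silently. The three checks mirror the per-row errors listed under Diagnosis; rejectedRow is a hypothetical stand-in for LineError, and validateRow and ingestRows are illustrative names, not the ingester's API.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"time"
)

// rejectedRow is a hypothetical stand-in for the ingester's LineError.
type rejectedRow struct {
	Line   int // 1-based index of the data row
	Reason string
}

// validateRow applies the three per-row checks named in this runbook.
func validateRow(valueDate, partnerRef, amountSantim string) error {
	if _, err := time.Parse(time.RFC3339, valueDate); err != nil {
		return fmt.Errorf("value_date %q", valueDate)
	}
	if partnerRef == "" {
		return errors.New("partner_reference is empty")
	}
	amount, err := strconv.ParseInt(amountSantim, 10, 64)
	if err != nil || amount <= 0 {
		return errors.New("amount_santim must be positive")
	}
	return nil
}

// ingestRows accepts valid rows and records every rejection, so that
// accepted + len(rejected) always equals the number of rows read.
func ingestRows(rows [][3]string) (accepted int, rejected []rejectedRow) {
	for i, row := range rows {
		if err := validateRow(row[0], row[1], row[2]); err != nil {
			rejected = append(rejected, rejectedRow{Line: i + 1, Reason: err.Error()})
			continue
		}
		accepted++
	}
	return accepted, rejected
}

func main() {
	rows := [][3]string{
		{"2025-01-03T00:00:00Z", "REF-1", "1500"},
		{"03/01/2025", "REF-2", "900"},
		{"2025-01-03T00:00:00Z", "", "250"},
		{"2025-01-03T00:00:00Z", "REF-4", "-400"},
	}
	accepted, rejected := ingestRows(rows)
	fmt.Printf("accepted=%d rejected=%d\n", accepted, len(rejected))
	for _, r := range rejected {
		fmt.Printf("line %d: %s\n", r.Line, r.Reason)
	}
}
```

A parser that skipped the bad rows without filling the rejected list would report the same accepted count, but the error-rate symptom would never fire.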

Resolution

  • Cause 1: write a follow-up test fixture against the new format, ship a parser update, deploy. Add a release-time fixture check that asserts a known-good file from each active partner still parses.
  • Cause 2: harden the operator workflow that fetches the file — usually a Makefile target or a Lambda. Pin to a specific URL pattern that's unambiguous about the artifact type.
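  • Cause 3: add encoding detection at the boundary. Note that golang.org/x/text/encoding/charmap only provides single-byte code pages and does no detection; for BOM-ed UTF-16 the relevant package is golang.org/x/text/encoding/unicode, whose BOMOverride transformer picks the decoder from the byte-order mark and converts to UTF-8.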
  • Cause 4: file partner-side ticket. Negative/zero amount lines are often "informational" entries that don't belong in our reconciliation set — adjust the contract docs to filter them at ingest time rather than per-row reject.
  • Cause 5: file partner-side ticket; in the meantime, ingest with csv.Reader.LazyQuotes = true (already set in our ingester) and a sanity check that the partner-reference field still parses cleanly.
  • Cause 6: revert the regression. Add the test case that would have caught it.
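
For cause 3, boundary detection can be as small as a byte-order-mark check before the bytes reach the parser. The sketch below uses only the standard library so it runs without extra modules; a production version would more likely use the golang.org/x/text packages. toUTF8 is a hypothetical helper, and input without a BOM is assumed to already be UTF-8.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"unicode/utf16"
)

// toUTF8 inspects the byte-order mark and returns UTF-8 bytes without a BOM.
// Input without a BOM is returned unchanged (assumed to already be UTF-8).
func toUTF8(raw []byte) ([]byte, error) {
	var order binary.ByteOrder
	switch {
	case bytes.HasPrefix(raw, []byte{0xEF, 0xBB, 0xBF}):
		return raw[3:], nil // UTF-8 BOM: strip it, nothing to convert
	case bytes.HasPrefix(raw, []byte{0xFF, 0xFE}):
		order = binary.LittleEndian
	case bytes.HasPrefix(raw, []byte{0xFE, 0xFF}):
		order = binary.BigEndian
	default:
		return raw, nil
	}
	body := raw[2:]
	if len(body)%2 != 0 {
		return nil, errors.New("UTF-16 payload has an odd number of bytes")
	}
	// Reassemble 16-bit code units, then decode surrogate pairs into runes.
	units := make([]uint16, len(body)/2)
	for i := range units {
		units[i] = order.Uint16(body[2*i:])
	}
	return []byte(string(utf16.Decode(units))), nil
}

func main() {
	// "a,b\n" encoded as UTF-16LE with a BOM.
	sample := []byte{0xFF, 0xFE, 'a', 0x00, ',', 0x00, 'b', 0x00, '\n', 0x00}
	out, err := toUTF8(sample)
	fmt.Printf("%q %v\n", out, err)
}
```

Converting before the parser keeps the parser itself UTF-8 only, in line with the mitigation advice not to ship a UTF-16 path into it.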

Escalation

  • 4 hours without diagnosis → page the platform on-call.
  • 24 hours with no successful ingest for a partner → notify finance/ops; the drift detector is blind for that partner.
  • 48 hours during the S4.6 soak window → release is blocked; notify engineering lead + finance lead.
  • Cause 1 (new format) during partner-side-driven schema migration → no escalation needed; this is normal partner ops. File a ticket; ship the parser update on the next release.

References

  • Ingester: services/integration-gateway/internal/reconciliation/dashen_ingester.go.
  • Per-line error model: LineError in services/integration-gateway/internal/reconciliation/types.go.
  • Storage: bank_statement_line table — migration services/integration-gateway/migrations/0002_bank_statement_line.up.sql. The ON CONFLICT on (tenant_id, partner, partner_reference) is the load-bearing idempotency guard for re-ingest.
  • Matcher (consumes successfully-ingested rows): services/integration-gateway/internal/reconciliation/matcher.go. Once ingestion succeeds, the next matcher run picks up the new rows automatically.