Notes and Applied Theory Small Data Ethics

By Laini Byfield

Practice

How to run small data systems day-to-day without turning people into collateral damage.

Operational guardrails

By pipeline stage

Stage 1 — Intake
  • Document source, fields, and intended purpose before loading
  • Validate schema; detect drift early rather than at payout time
  • Tag each load with a load date and source version
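The intake checks above can be sketched in a few lines. This is a minimal illustration rather than a prescribed implementation: the field names in `EXPECTED_SCHEMA`, the `LoadTag` record, and the `intake` helper are all hypothetical, chosen only to show schema validation and load tagging happening before any data is used.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical expected schema (an assumption): field name -> Python type.
EXPECTED_SCHEMA = {"person_id": str, "household_size": int, "income": float}


@dataclass(frozen=True)
class LoadTag:
    """Provenance recorded before loading: source, version, date, purpose."""
    source: str
    source_version: str
    load_date: date
    purpose: str


def schema_drift(rows, expected=EXPECTED_SCHEMA):
    """Return a list of human-readable drift problems (empty if none)."""
    problems = []
    for i, row in enumerate(rows):
        missing = sorted(set(expected) - set(row))
        extra = sorted(set(row) - set(expected))
        if missing:
            problems.append(f"row {i}: missing fields {missing}")
        if extra:
            problems.append(f"row {i}: unexpected fields {extra}")
        for name, typ in expected.items():
            if name in row and not isinstance(row[name], typ):
                problems.append(
                    f"row {i}: {name} is {type(row[name]).__name__}, "
                    f"expected {typ.__name__}"
                )
    return problems


def intake(rows, tag):
    """Refuse the load on drift; otherwise attach the load tag to every row."""
    problems = schema_drift(rows)
    if problems:
        raise ValueError("schema drift detected: " + "; ".join(problems))
    return [{**row, "_load": tag} for row in rows]
```

Failing loudly at intake is the design choice here: a rejected load is cheaper to explain than a wrong payout traced back to a renamed column.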
Stage 2 — Matching and merging
  • Prefer stable identifiers; define fallback rules in writing
  • Flag collisions, duplicates, and ambiguous matches for review
  • Quantify match uncertainty; set review thresholds before running
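One way to make the matching rules concrete is to score each candidate pair and route anything uncertain to a person. The sketch below is an assumption-laden example: the record fields (`stable_id`, `dob`, `name`), the fallback rule, and both thresholds are invented for illustration, and the name similarity comes from the standard library's `difflib.SequenceMatcher`, which is a convenient stand-in rather than a recommended record-linkage method.

```python
from difflib import SequenceMatcher

# Illustrative thresholds (assumptions); fix them in writing before the run.
AUTO_ACCEPT = 0.92
REVIEW_FLOOR = 0.75


def match_score(a, b):
    """Score in [0, 1]: 1.0 on a stable-ID match, else fuzzy name similarity."""
    if a.get("stable_id") and a.get("stable_id") == b.get("stable_id"):
        return 1.0
    # Written fallback rule: birth dates must agree before names are compared.
    if not a.get("dob") or a.get("dob") != b.get("dob"):
        return 0.0
    name_a, name_b = a.get("name", ""), b.get("name", "")
    if not name_a or not name_b:
        return 0.0
    return SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()


def classify(record, candidates):
    """Return ("match", c), ("review", [candidates]), or ("no_match", None)."""
    scored = [(match_score(record, c), c) for c in candidates]
    strong = [c for s, c in scored if s >= AUTO_ACCEPT]
    maybe = [c for s, c in scored if REVIEW_FLOOR <= s < AUTO_ACCEPT]
    if len(strong) == 1 and not maybe:
        return ("match", strong[0])
    if strong or maybe:
        # Collision (several strong hits) or ambiguity: a person decides.
        return ("review", strong + maybe)
    return ("no_match", None)
```

Only a single unambiguous hit is merged automatically. Two strong candidates count as a collision and go to review, even when both scores are high.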
Stage 3 — Scoring and eligibility
  • Make rules explainable in plain language before they go live
  • Log rule version and cutoff assumptions with every run
  • Define what can be appealed and what evidence counts
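A scoring run that follows these points might record its rule version, cutoff, and appeal grounds alongside each outcome. The example below is a hypothetical sketch: the `Rule` fields, the income cutoff, the version string, and the list of appealable items are placeholders, not values taken from any real programme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    version: str
    income_cutoff: float
    plain_language: str  # the explanation a caseworker could read aloud


# Hypothetical rule (assumption): every value here is a placeholder.
RULE = Rule(
    version="2024-01",
    income_cutoff=1500.0,
    plain_language=(
        "Households with monthly income at or below the cutoff are eligible."
    ),
)


def decide(person_id, income, source_file, rule=RULE):
    """Return the outcome together with everything needed to contest it."""
    return {
        "person_id": person_id,
        "eligible": income <= rule.income_cutoff,
        "rule_version": rule.version,
        "cutoff_used": rule.income_cutoff,
        "source_file": source_file,
        "explanation": rule.plain_language,
        # What can be appealed is decided up front (illustrative list).
        "appealable_on": ["income figure", "household record match"],
    }
```

Because the decision record carries its own rule version and source file, an appeal reviewer does not have to reconstruct which rules were live on the day of the run.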
Stage 4 — Communication
  • Avoid false certainty: name what may change later
  • Provide a “how to correct” path, not just outcome notices
  • Use safe reporting: suppress small-n cells where re-identification is plausible
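Small-n suppression can be illustrated with a short helper. This is a sketch under stated assumptions: the threshold of 5 is a placeholder that each programme has to set for itself, and `safe_counts` is a hypothetical function name.

```python
from collections import Counter

MIN_CELL = 5  # assumption: the right threshold depends on context


def safe_counts(groups, min_cell=MIN_CELL):
    """Count members per group, masking any cell smaller than min_cell."""
    counts = Counter(groups)
    return {
        group: (n if n >= min_cell else f"<{min_cell}")
        for group, n in counts.items()
    }
```

For example, `safe_counts(["north"] * 7 + ["south"] * 2)` reports `7` for the north group and the string `"<5"` for the south group. Note that if an overall total is published alongside the table, a single masked cell can be recovered by subtraction, so totals may need masking or rounding as well.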
The four gate questions

A practical checklist

Necessity

Is the data necessary for this purpose — what can be removed?

Contestability

Can a person contest the outcome — what is the timeline?

Traceability

Can you trace an outcome to its source file and rule version?

Repair

What are the known error modes — what is the repair plan?

These four questions are a condensed version of the ETHICMAP cycle applied to a single run. See the full ETHICMAP cycle.
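The four gate questions could also be enforced as a pre-run check that blocks a run until each one has a written answer. The sketch below assumes a simple dictionary of answers keyed by gate name; `GATES` and `unanswered_gates` are hypothetical names used only for illustration.

```python
GATES = ("necessity", "contestability", "traceability", "repair")


def unanswered_gates(answers):
    """Return the gates that have no written answer, in checklist order."""
    return [gate for gate in GATES if not answers.get(gate, "").strip()]


answers = {
    "necessity": "Dropped phone number; not needed for eligibility.",
    "contestability": "Appeals accepted for 30 days after notice.",
    "traceability": "",  # left blank, so the run should not proceed
    "repair": "Known error: stale addresses. Monthly re-check planned.",
}
blocked_by = unanswered_gates(answers)
```

Here `blocked_by` contains only `"traceability"`, so the run would wait until someone documents how outcomes trace back to a source file and rule version.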

If you cannot trace an outcome to its source file and rule version, you cannot contest it, repair it, or learn from it. Documentation is not overhead. It is the floor of ethical operations.