c-02-pairwise-harness
Pairwise Harness
Compare outputs pairwise without letting order bias quietly win.
mid
python
~35 min
README
# Pairwise Harness

Implement the tiny eval helper in `src/pairwise.py` until `tests/validate.py` passes.

## Good / Bad / Ugly

- **Good**: passes shuffled and edge-case examples without network calls.
- **Bad**: hard-codes the sample fixture and calls it done. Cute. No.
- **Ugly**: production data is partial, so validation must reject ambiguity before scoring.

Run with `npm run challenge -- pairwise-harness --track eval-writing`.
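
One way to keep order bias from quietly winning is to ask the judge twice, once per ordering, and keep a verdict only if it survives the swap. The sketch below is an assumption about shape, not the contents of `src/pairwise.py`: the `Judge` signature, `compare_debiased`, and the two toy judges are hypothetical names used for illustration.

```python
from typing import Callable

# Hypothetical judge signature: takes (first, second) and answers by position
# with "first", "second", or "tie".
Judge = Callable[[str, str], str]


def compare_debiased(judge: Judge, a: str, b: str) -> str:
    """Return "a", "b", or "tie"; a win must hold in both presentation orders."""
    # Translate each positional answer back to the underlying candidate.
    forward = {"first": "a", "second": "b", "tie": "tie"}[judge(a, b)]
    swapped = {"first": "b", "second": "a", "tie": "tie"}[judge(b, a)]
    # Disagreement between the two orders is treated as a tie, not a win.
    return forward if forward == swapped else "tie"


def always_first(first: str, second: str) -> str:
    """Toy judge with pure position bias."""
    return "first"


def longer_wins(first: str, second: str) -> str:
    """Toy judge that looks at content only, never at position."""
    if len(first) == len(second):
        return "tie"
    return "first" if len(first) > len(second) else "second"


print(compare_debiased(always_first, "x", "yy"))  # tie
print(compare_debiased(longer_wins, "x", "yy"))   # b
```

The position-biased judge collapses to a tie because its two answers contradict each other, while the content-based judge picks the same candidate in both orders.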
Hints
- Good: passes under shuffled examples and explicit edge cases.
- Bad: hard-codes the sample rows. Very brave, very detectable.
- Ugly: production data is partial; validate before scoring.
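
The "validate before scoring" hint might look roughly like the sketch below. The row schema (`id`, `a`, `b`, `winner`) and the function names are assumptions made for illustration; the real fixture format in `tests/validate.py` may differ.

```python
from collections import Counter

# Hypothetical row schema, assumed for this sketch only.
REQUIRED_FIELDS = ("id", "a", "b", "winner")
ALLOWED_WINNERS = {"a", "b", "tie"}


def validate_row(row: dict) -> dict:
    """Return the row unchanged, or raise ValueError if it is partial or ambiguous."""
    missing = [key for key in REQUIRED_FIELDS if row.get(key) in (None, "")]
    if missing:
        raise ValueError(f"row {row.get('id')!r} is missing {missing}")
    if row["winner"] not in ALLOWED_WINNERS:
        raise ValueError(f"row {row['id']!r} has unknown winner {row['winner']!r}")
    return row


def tally(rows: list[dict]) -> Counter:
    """Validate every row first, then count winners.

    If any row fails validation, nothing is scored at all.
    """
    checked = [validate_row(row) for row in rows]
    return Counter(row["winner"] for row in checked)


rows = [
    {"id": 1, "a": "x", "b": "y", "winner": "a"},
    {"id": 2, "a": "x", "b": "", "winner": "b"},  # partial: empty "b"
]
try:
    tally(rows)
except ValueError as err:
    print(err)
```

Raising instead of skipping the bad row is a deliberate choice here: silently dropping partial rows would change the denominator without anyone noticing.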
Acceptance
- `npm run challenge -- pairwise-harness --track eval-writing` exits 0
- Validator exercises good, bad, and ugly cases
- Implementation avoids network calls and nondeterminism
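
A check for order sensitivity without nondeterminism could be sketched as below, assuming a scorer that takes a list of rows. `is_order_invariant` and the two example scorers are hypothetical; the sketch uses seeded `random.Random` instances so reruns are repeatable, and always includes a reversal so an order-dependent scorer cannot slip through on a lucky shuffle.

```python
import random
from typing import Callable, Sequence


def is_order_invariant(
    score: Callable[[list], object],
    rows: Sequence,
    seeds: Sequence[int] = (0, 1, 2),
) -> bool:
    """Return True if score() gives the same result for every tested ordering."""
    baseline = score(list(rows))
    # Reversal is a fixed, guaranteed reordering; seeded shuffles add variety.
    orderings = [list(reversed(rows))]
    for seed in seeds:
        shuffled = list(rows)
        random.Random(seed).shuffle(shuffled)  # local RNG, global state untouched
        orderings.append(shuffled)
    return all(score(order) == baseline for order in orderings)


def count_a(rows: list) -> int:
    """Order-independent scorer: number of wins for "a"."""
    return sum(1 for winner in rows if winner == "a")


def first_only(rows: list) -> str:
    """Order-dependent scorer: only looks at the first row."""
    return rows[0]


winners = ["a", "b", "a", "tie"]
print(is_order_invariant(count_a, winners))     # True
print(is_order_invariant(first_only, winners))  # False
```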