Phase 77
fidelity-principles
Principles of fidelity to the published canon, versioned as YAML
Codifies in YAML the doctrinal principles previously scattered through the code: the canon-only rule, no impersonation, epistemic humility, required citations, and matters of conscience. Each principle has an id, a severity (hard or soft), an applies_to scope per agent, a source citing an official publication, and an optional regex tier for cheap detection. The jw-finetune judge and the jw-agents fidelity_wrap decorator both consult the principles at runtime. jw-agents imports them lazily to avoid a dependency cycle, since jw-eval already depends on jw-agents.
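As a rough illustration of the record shape and the cheap detection tier described above, here is a minimal sketch. It uses a stdlib dataclass in place of the project's Pydantic models to stay dependency-free; the field layout, the "*" wildcard in applies_to, the return type of violations_for, and the sample phrase and pattern are all assumptions made for the example, not the shipped schema.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Principle:
    """Hypothetical in-memory form of one YAML principle entry."""
    id: str
    severity: str                      # "hard" or "soft"
    applies_to: tuple[str, ...]        # agent names; "*" is an assumed wildcard
    source: str                        # citation of an official publication
    forbidden_phrases: tuple[str, ...] = ()
    forbidden_regex: tuple[str, ...] = ()


def violations_for(text: str, principles: list[Principle]) -> list[Principle]:
    """Return the principles whose phrases or patterns match the text.

    Literal phrases are escaped and joined with the regex patterns into one
    case-insensitive alternation per principle, so each principle costs a
    single scan of the text. Patterns that rely on numbered backreferences
    would need separate handling; this sketch does not cover them.
    """
    hits = []
    for principle in principles:
        patterns = [re.escape(p) for p in principle.forbidden_phrases]
        patterns += list(principle.forbidden_regex)
        if not patterns:
            continue  # principle has no cheap tier
        combined = re.compile(
            "|".join(f"(?:{p})" for p in patterns), re.IGNORECASE
        )
        if combined.search(text):
            hits.append(principle)
    return hits
```

A principle with no forbidden_phrases and no forbidden_regex is simply skipped here, which matches the idea that the regex tier is optional.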
What was delivered
- 5 builtin principles: PF001-canon-only, PF002-cite-before-paraphrase, PF003-citation-required, PF010-no-impersonation, PF012-respect-conscience.
- Pydantic loader with id-based override: a user YAML overrides a builtin sharing the same id.
- violations_for(text, principles) runs forbidden_phrases + forbidden_regex in one pass (case-insensitive).
- Judge.score_qa_pair() accepts principles=; a hard hit adds RejectionCode.principle_hard_violation.
- fidelity_wrap(principles=…) filters principles by agent_name, records hits in metadata as principle_hard / principle_soft, and honors on_fail (warn, reject, or annotate).
- Lazy import of jw_eval.principles from jw_agents to avoid the dependency cycle.
- Hard severity = block; soft = annotate; no hidden policies.
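The id-based override and the decorator behaviour listed above can be sketched as follows. This is an illustration under assumptions, not the delivered code: merge_by_id, FidelityRejected, the "*" wildcard, the agent_name parameter on the decorator, the dict return shape, and the exact meaning of each on_fail mode are all hypothetical. The real decorator would also import jw_eval.principles inside the function body (the lazy import); the sketch inlines everything so it runs on its own.

```python
from __future__ import annotations

import functools
import re
import warnings
from dataclasses import dataclass


@dataclass(frozen=True)
class Principle:
    """Trimmed, hypothetical stand-in for the Pydantic model."""
    id: str
    severity: str                      # "hard" or "soft"
    applies_to: tuple[str, ...]        # agent names; "*" is an assumed wildcard
    forbidden_regex: tuple[str, ...] = ()


def merge_by_id(builtin: list[Principle], user: list[Principle]) -> list[Principle]:
    """A user entry replaces the builtin entry that shares its id."""
    merged = {p.id: p for p in builtin}
    merged.update({p.id: p for p in user})
    return list(merged.values())


class FidelityRejected(Exception):
    """Raised in this sketch when a hard principle is hit under on_fail='reject'."""


def fidelity_wrap(principles, agent_name, on_fail="annotate"):
    # Keep only the principles that apply to this agent.
    active = [
        p for p in principles
        if "*" in p.applies_to or agent_name in p.applies_to
    ]

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            text = fn(*args, **kwargs)
            hits = [
                p for p in active
                if any(re.search(rx, text, re.IGNORECASE) for rx in p.forbidden_regex)
            ]
            metadata = {
                "principle_hard": [p.id for p in hits if p.severity == "hard"],
                "principle_soft": [p.id for p in hits if p.severity == "soft"],
            }
            if metadata["principle_hard"]:
                if on_fail == "reject":
                    raise FidelityRejected(", ".join(metadata["principle_hard"]))
                if on_fail == "warn":
                    warnings.warn(f"hard principle hit: {metadata['principle_hard']}")
            # In every non-rejecting mode the hits travel with the answer.
            return {"text": text, "metadata": metadata}
        return wrapper
    return decorator
```

In this sketch soft hits never block: they only appear in the metadata, in line with "soft = annotate".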