Fase 31 (Phase 31): Study Sheet Exporter (PDF / DOCX / Anki) Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Add an AgentResult → StudySheet → {markdown|pdf|docx|apkg} export pipeline. Markdown always works; PDF/DOCX/Anki are opt-in via extras. Single IR. Stable Anki GUIDs. Pluggable Jinja templates.
Architecture: New module jw_core.exporters inside packages/jw-core (no new workspace package). One IR (StudySheet) consumed by 4 exporters. Lazy imports of heavy deps. CLI command jw export + MCP tool export_study_sheet.
Tech Stack: Python 3.13 · Pydantic v2 (IR) · Jinja2 (PDF templates) · WeasyPrint (PDF, optional) · python-docx (DOCX, optional) · genanki (Anki, optional) · Typer (CLI) · FastMCP (tool).
Spec: docs/superpowers/specs/2026-05-30-fase-31-exporter-design.md.
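The architecture described above amounts to a dispatch table from format name to exporter, with heavy imports deferred until a format is actually requested. The following is a minimal sketch under stated assumptions: `export`, `EXPORTERS`, `_lazy_exporter`, and the dict-based `Sheet` are hypothetical stand-ins, not names from the plan, and the real exporters live in `jw_core.exporters.*`.

```python
from __future__ import annotations

import importlib
from pathlib import Path
from typing import Callable

# Hypothetical stand-in for the StudySheet IR: just a title and a body.
Sheet = dict[str, str]


class MissingDependencyError(Exception):
    """Stand-in for jw_core.exporters.errors.MissingDependencyError."""


def _export_markdown(sheet: Sheet, out: Path) -> Path:
    # Markdown needs no optional dependency, so it always works.
    out.write_text(f"# {sheet['title']}\n\n{sheet['body']}\n", encoding="utf-8")
    return out


def _lazy_exporter(module_name: str, extra: str) -> Callable[[Sheet, Path], Path]:
    """Build an exporter that imports `module_name` only when it is called."""

    def _export(sheet: Sheet, out: Path) -> Path:
        try:
            importlib.import_module(module_name)
        except ImportError as exc:
            raise MissingDependencyError(
                f"{module_name} is required. Install with: pip install 'jw-core[{extra}]'"
            ) from exc
        raise NotImplementedError("real rendering is out of scope for this sketch")

    return _export


EXPORTERS: dict[str, Callable[[Sheet, Path], Path]] = {
    "markdown": _export_markdown,
    "pdf": _lazy_exporter("weasyprint", "pdf"),
    "docx": _lazy_exporter("docx", "docx"),
    "apkg": _lazy_exporter("genanki", "anki"),
}


def export(sheet: Sheet, fmt: str, out: Path) -> Path:
    """Dispatch one IR value to the exporter registered for `fmt`."""
    if fmt not in EXPORTERS:
        raise ValueError(f"Unknown format {fmt!r}. Available: {sorted(EXPORTERS)}")
    return EXPORTERS[fmt](sheet, out)
```

Because every exporter takes the same IR, adding a fifth format in this sketch would mean adding one entry to the table rather than touching the conversion code.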
File map
Creates:
- packages/jw-core/src/jw_core/exporters/__init__.py
- packages/jw-core/src/jw_core/exporters/ir.py
- packages/jw-core/src/jw_core/exporters/errors.py
- packages/jw-core/src/jw_core/exporters/markdown.py
- packages/jw-core/src/jw_core/exporters/templates_resolver.py
- packages/jw-core/src/jw_core/exporters/pdf.py
- packages/jw-core/src/jw_core/exporters/docx.py
- packages/jw-core/src/jw_core/exporters/anki.py
- packages/jw-core/src/jw_core/templates/__init__.py
- packages/jw-core/src/jw_core/templates/study_sheet/__init__.py
- packages/jw-core/src/jw_core/templates/study_sheet/plain.html.j2
- packages/jw-core/src/jw_core/templates/study_sheet/study-sheet.html.j2
- packages/jw-core/tests/test_exporter_ir.py
- packages/jw-core/tests/test_exporter_markdown.py
- packages/jw-core/tests/test_exporter_templates.py
- packages/jw-core/tests/test_exporter_pdf.py
- packages/jw-core/tests/test_exporter_docx.py
- packages/jw-core/tests/test_exporter_anki.py
- packages/jw-cli/src/jw_cli/commands/export.py
- packages/jw-cli/tests/test_export_command.py
- docs/guias/exportador-hoja-de-estudio.md
Modifies:
- packages/jw-core/pyproject.toml (extras [pdf], [docx], [anki]; Jinja2 as hard dep)
- packages/jw-cli/src/jw_cli/main.py (register export command)
- packages/jw-cli/src/jw_cli/commands/__init__.py
- packages/jw-mcp/src/jw_mcp/server.py (register export_study_sheet tool)
- docs/ROADMAP.md (add Fase 31 section)
- docs/VISION_AUDIT.md (add row for #11)
- docs/README.md (link new guide)
Task 1: Scaffold jw_core.exporters module + errors + extras
Files:
- Create: packages/jw-core/src/jw_core/exporters/__init__.py
- Create: packages/jw-core/src/jw_core/exporters/errors.py
- Modify: packages/jw-core/pyproject.toml
- Step 1: Add the extras and Jinja2 to pyproject
Edit packages/jw-core/pyproject.toml:
- Append to dependencies = [...]: "jinja2>=3.1.3",
- Add new section:
[project.optional-dependencies]
pdf = [
    "weasyprint>=62.3",
]
docx = [
    "python-docx>=1.1.0",
]
anki = [
    "genanki>=0.13.1,<1.0",
]
(If [project.optional-dependencies] already exists, only append the three keys.)
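To see which of the three extras are usable in a given environment, one option is to probe for their import names without importing them. This is a sketch, not part of the plan: `installed_extras` and `EXTRA_MODULES` are hypothetical names. Note that python-docx is imported as `docx`.

```python
from __future__ import annotations

import importlib.util

# Extra name -> importable module name. python-docx is imported as `docx`.
EXTRA_MODULES = {"pdf": "weasyprint", "docx": "docx", "anki": "genanki"}


def installed_extras() -> dict[str, bool]:
    """Report which optional extras are importable, without importing them."""
    return {
        extra: importlib.util.find_spec(module) is not None
        for extra, module in EXTRA_MODULES.items()
    }


if __name__ == "__main__":
    for extra, ok in installed_extras().items():
        print(f"[{extra}] {'installed' if ok else 'missing'}")
```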
- Step 2: Create the errors module
# packages/jw-core/src/jw_core/exporters/errors.py
"""Exporter exceptions.

Every exporter that requires an optional extra raises `MissingDependencyError`
with a copy-pasteable install hint when its dependency is not importable.
"""

from __future__ import annotations


class ExportError(Exception):
    """Base class for everything raised by the exporters module."""


class MissingDependencyError(ExportError):
    """Raised when an optional dependency (weasyprint/python-docx/genanki) is missing."""
- Step 3: Create the package init
# packages/jw-core/src/jw_core/exporters/__init__.py
"""Convert AgentResult into printable study sheets and Anki decks.

Public API:
    from jw_core.exporters import StudySheet
    from jw_core.exporters.markdown import export_markdown
    from jw_core.exporters.pdf import export_pdf    # needs [pdf]
    from jw_core.exporters.docx import export_docx  # needs [docx]
    from jw_core.exporters.anki import export_apkg  # needs [anki]

Design: every exporter consumes a `StudySheet` (the single IR). The
`AgentResult → StudySheet` conversion lives in `ir.from_agent_result`.

Heavy dependencies (weasyprint, python-docx, genanki) are imported lazily
inside each exporter function, so importing this package never fails when
the extras are not installed.
"""

from jw_core.exporters.errors import ExportError, MissingDependencyError
from jw_core.exporters.ir import CitationIR, StudySection, StudySheet

__all__ = [
    "CitationIR",
    "ExportError",
    "MissingDependencyError",
    "StudySection",
    "StudySheet",
]
- Step 4: Verify install
Run: uv sync --all-packages
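Expected: no errors from the sync. Importing jw_core.exporters should succeed without [pdf]/[docx]/[anki] installed, but only once ir.py exists (Task 2): until then the import fails because __init__.py imports from jw_core.exporters.ir.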
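The errors module promises a copy-pasteable install hint whenever an optional dependency is missing. One way to keep that message consistent across exporters is a small import helper. This is a sketch only: `require_optional` is a hypothetical name and is not part of the plan, which inlines the `try`/`except` in each exporter.

```python
from __future__ import annotations

import importlib
from types import ModuleType


class ExportError(Exception):
    """Base class, mirroring the errors module."""


class MissingDependencyError(ExportError):
    """Raised when an optional dependency is not importable."""


def require_optional(module_name: str, *, extra: str) -> ModuleType:
    """Import `module_name` or raise with a copy-pasteable install hint.

    Hypothetical helper for illustration; not part of the plan.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise MissingDependencyError(
            f"{module_name} is required for this exporter. "
            f"Install with: pip install 'jw-core[{extra}]'"
        ) from exc
```

Calling it inside the exporter function, rather than at module import time, is what keeps `import jw_core.exporters` safe when no extras are installed.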
- Step 5: Commit
git add packages/jw-core/src/jw_core/exporters packages/jw-core/pyproject.toml
git commit -m "feat(exporters): scaffold jw_core.exporters module with extras"
Task 2: IR — StudySheet + from_agent_result
Files:
- Create: packages/jw-core/src/jw_core/exporters/ir.py
- Create: packages/jw-core/tests/test_exporter_ir.py
- Step 1: Write the failing tests
# packages/jw-core/tests/test_exporter_ir.py
"""Tests for jw_core.exporters.ir: the StudySheet IR and AgentResult conversion."""

from __future__ import annotations

import pytest

from jw_agents.base import AgentResult, Citation, Finding
from jw_core.exporters.ir import CitationIR, StudySection, StudySheet


def _sample_result() -> AgentResult:
    return AgentResult(
        query="Es la Trinidad bíblica?",
        agent_name="apologetics",
        findings=[
            Finding(
                summary="La Biblia presenta a Jehová como el único Dios verdadero.",
                citation=Citation(
                    url="https://wol.jw.org/es/wol/d/r4/lp-s/1101989140",
                    title="¿Qué enseña la Biblia sobre la Trinidad?",
                    kind="article",
                    metadata={"source": "topic_index"},
                ),
                excerpt="Jehová es uno solo (Deuteronomio 6:4).",
                metadata={"source": "topic_index"},
            ),
            Finding(
                summary="Jesús siempre se distinguió de su Padre.",
                citation=Citation(
                    url="https://wol.jw.org/es/wol/b/r4/lp-s/nwt/E/2024/43/14",
                    title="Juan 14:28",
                    kind="verse",
                ),
            ),
        ],
        warnings=["Cobertura parcial en idiomas LSN."],
        metadata={"language": "es"},
    )


def test_studysheet_construct_directly() -> None:
    sheet = StudySheet(
        title="Demo",
        sections=[StudySection(heading="Punto 1", body="Contenido.")],
    )
    assert sheet.title == "Demo"
    assert len(sheet.sections) == 1
    assert sheet.language == "es"


def test_citation_ir_defaults() -> None:
    cite = CitationIR(url="https://wol.jw.org/x")
    assert cite.title == ""
    assert cite.kind == ""
    assert cite.short_label == ""


def test_from_agent_result_minimal() -> None:
    sheet = StudySheet.from_agent_result(_sample_result())
    assert sheet.title == "Es la Trinidad bíblica?"
    assert "apologetics" in sheet.subtitle.lower() or "apologé" in sheet.subtitle.lower()
    assert sheet.language == "es"
    assert len(sheet.sections) == 2


def test_from_agent_result_explicit_title_wins() -> None:
    sheet = StudySheet.from_agent_result(_sample_result(), title="Mi título")
    assert sheet.title == "Mi título"


def test_from_agent_result_truncates_long_title() -> None:
    long_q = "Por qué " + "muy largo " * 50
    sheet = StudySheet.from_agent_result(
        AgentResult(query=long_q, agent_name="apologetics")
    )
    assert len(sheet.title) <= 80


def test_from_agent_result_warnings_become_footer() -> None:
    sheet = StudySheet.from_agent_result(_sample_result())
    assert "Cobertura parcial" in sheet.footer_note
    assert "Advertencias" in sheet.footer_note


def test_from_agent_result_no_citations_when_disabled() -> None:
    sheet = StudySheet.from_agent_result(_sample_result(), include_citations=False)
    assert all(section.citations == [] for section in sheet.sections)


def test_from_agent_result_keeps_excerpt() -> None:
    sheet = StudySheet.from_agent_result(_sample_result())
    assert sheet.sections[0].excerpt.startswith("Jehová es uno solo")


def test_from_agent_result_empty_findings() -> None:
    empty = AgentResult(query="vacío", agent_name="apologetics", findings=[])
    sheet = StudySheet.from_agent_result(empty)
    assert len(sheet.sections) == 1
    assert "sin resultados" in sheet.sections[0].heading.lower()


def test_from_agent_result_accepts_dict() -> None:
    """`from_agent_result` must accept the dict form (AgentResult.to_dict())."""
    raw = _sample_result().to_dict()
    sheet = StudySheet.from_agent_result(raw)
    assert sheet.title == "Es la Trinidad bíblica?"
    assert len(sheet.sections) == 2


def test_citation_short_label_is_built() -> None:
    sheet = StudySheet.from_agent_result(_sample_result())
    labels = [c.short_label for s in sheet.sections for c in s.citations]
    assert any(labels)  # at least one non-empty short label
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-core/tests/test_exporter_ir.py -v
Expected: FAIL — module ir missing.
- Step 3: Implement the IR
# packages/jw-core/src/jw_core/exporters/ir.py
"""StudySheet: the single intermediate representation consumed by every exporter.

Conversion `AgentResult → StudySheet` happens here and ONLY here. Every
exporter consumes a StudySheet directly, never an AgentResult.

Why a separate IR:
- Decouples "what to render" from "how to render".
- Lets us swap the upstream shape (AgentResult, future agents, scraped
  data) without rewriting four exporters.
- Tests for exporters are fully synthetic (no agent execution needed).
"""

from __future__ import annotations

from typing import TYPE_CHECKING, Any, Literal

from pydantic import BaseModel, Field

if TYPE_CHECKING:
    from jw_agents.base import AgentResult

CitationStyle = Literal["inline-paren", "footnote", "bibliography"]

_MAX_TITLE = 80
_MAX_HEADING = 100

_AGENT_SUBTITLES = {
    "apologetics": "Análisis apologético",
    "verse_explainer": "Explicación del versículo",
    "research_topic": "Investigación temática",
    "meeting_helper": "Preparación de reunión",
    "workbook_helper": "Guía de actividad",
    "conversation_assistant": "Asistente de conversación",
    "presentation_builder": "Presentación",
    "public_talk_outline": "Discurso público — bosquejo",
    "reverse_citation_lookup": "Cita inversa",
    "study_conductor": "Conductor del estudio",
    "student_part_helper": "Parte del estudiante",
    "letter_composer": "Composición de carta",
    "life_topics": "Tema de vida",
}


class CitationIR(BaseModel):
    """Citation normalized for every exporter."""

    url: str
    title: str = ""
    kind: str = ""
    short_label: str = ""
    metadata: dict[str, Any] = Field(default_factory=dict)


class StudySection(BaseModel):
    """One section of the study sheet."""

    heading: str
    body: str
    excerpt: str = ""
    citations: list[CitationIR] = Field(default_factory=list)


class StudySheet(BaseModel):
    """Intermediate representation. All exporters consume this."""

    title: str
    subtitle: str = ""
    language: str = "es"
    sections: list[StudySection] = Field(default_factory=list)
    footer_note: str = ""
    metadata: dict[str, Any] = Field(default_factory=dict)

    @classmethod
    def from_agent_result(
        cls,
        result: "AgentResult | dict[str, Any]",
        *,
        title: str | None = None,
        language: str | None = None,
        include_citations: bool = True,
    ) -> StudySheet:
        """Single conversion AgentResult (or its dict form) → StudySheet."""
        data = result if isinstance(result, dict) else result.to_dict()
        # Tolerate a missing or null "metadata" key in the dict form.
        metadata = data.get("metadata") or {}

        # ── title ──
        if title:
            final_title = title
        else:
            final_title = metadata.get("title") or data.get("query") or "(sin título)"
        if len(final_title) > _MAX_TITLE:
            final_title = final_title[: _MAX_TITLE - 1].rstrip() + "…"

        # ── subtitle ──
        agent_name = data.get("agent_name", "")
        subtitle = _AGENT_SUBTITLES.get(agent_name, agent_name)

        # ── language ──
        lang = language or metadata.get("language") or "es"

        # ── sections ──
        sections: list[StudySection] = []
        for f in data.get("findings") or []:
            summary = (f.get("summary") or "").strip()
            heading = summary.splitlines()[0] if summary else "(sin resumen)"
            if len(heading) > _MAX_HEADING:
                heading = heading[: _MAX_HEADING - 1].rstrip() + "…"
            citations: list[CitationIR] = []
            if include_citations:
                cite_raw = f.get("citation") or {}
                if cite_raw.get("url"):
                    citations.append(_citation_from_dict(cite_raw))
            sections.append(
                StudySection(
                    heading=heading,
                    body=summary,
                    excerpt=(f.get("excerpt") or "").strip(),
                    citations=citations,
                )
            )
        if not sections:
            sections.append(
                StudySection(
                    heading="(sin resultados)",
                    body="El agente no devolvió resultados.",
                )
            )

        # ── footer (warnings + provenance) ──
        warnings = data.get("warnings") or []
        footer_parts: list[str] = []
        if warnings:
            footer_parts.append("Advertencias: " + " · ".join(warnings))
        footer_parts.append("Generado por jw-agent-toolkit.")
        footer_note = "\n".join(footer_parts)

        return cls(
            title=final_title,
            subtitle=subtitle,
            language=lang,
            sections=sections,
            footer_note=footer_note,
            metadata=metadata,
        )


def _citation_from_dict(raw: dict[str, Any]) -> CitationIR:
    """Map a serialized Citation dict to CitationIR, building a short_label."""
    title = (raw.get("title") or "").strip()
    kind = (raw.get("kind") or "").strip()
    meta = raw.get("metadata") or {}
    # Build a compact label. Verses prefer the title (e.g. "Juan 3:16");
    # articles use the truncated title; default = last path segment of the URL.
    short = ""
    if kind == "verse" and title:
        short = title
    elif title:
        short = title if len(title) <= 60 else title[:59] + "…"
    else:
        url = raw.get("url", "")
        short = url.rsplit("/", 1)[-1] if url else ""
    return CitationIR(
        url=raw.get("url", ""),
        title=title,
        kind=kind,
        short_label=short,
        metadata=meta,
    )
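The title and heading limits in `from_agent_result` both use the same rule: reserve one character for the ellipsis, then strip trailing whitespace. A standalone sketch of that rule follows; the helper name `truncate` is an assumption, since the plan inlines the expression rather than factoring it out.

```python
def truncate(text: str, limit: int) -> str:
    """Cut `text` to at most `limit` characters, ending with an ellipsis."""
    if len(text) <= limit:
        return text
    # Reserve one character for the ellipsis, then drop trailing whitespace
    # so the result never ends in " …".
    return text[: limit - 1].rstrip() + "…"
```

The result is never longer than `limit`, and can be shorter when the cut lands on whitespace.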
- Step 4: Run tests until green
Run: uv run pytest packages/jw-core/tests/test_exporter_ir.py -v
Expected: 11 passed.
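The `short_label` precedence in `_citation_from_dict` (verse title, then truncated title, then last URL segment) can be exercised without Pydantic. The function below is an illustrative restatement under that reading of the code, not a replacement for it; the free function `short_label` is a hypothetical name.

```python
from __future__ import annotations


def short_label(url: str, title: str = "", kind: str = "") -> str:
    """Restate the label precedence used when building a CitationIR."""
    title = title.strip()
    if kind == "verse" and title:
        return title  # verse references are already compact
    if title:
        return title if len(title) <= 60 else title[:59] + "…"
    # No title: fall back to the last path segment of the URL.
    return url.rsplit("/", 1)[-1] if url else ""


print(short_label("https://wol.jw.org/es/wol/d/r4/lp-s/1101989140"))  # 1101989140
```

One consequence worth noting: a URL that ends in `/` yields an empty label, because the last path segment is empty.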
- Step 5: Commit
git add packages/jw-core/src/jw_core/exporters/ir.py packages/jw-core/tests/test_exporter_ir.py
git commit -m "feat(exporters): StudySheet IR + from_agent_result conversion"
Task 3: Markdown exporter (3 citation styles)
Files:
- Create: packages/jw-core/src/jw_core/exporters/markdown.py
- Create: packages/jw-core/tests/test_exporter_markdown.py
- Step 1: Write the failing tests
# packages/jw-core/tests/test_exporter_markdown.py
"""Tests for jw_core.exporters.markdown."""

from __future__ import annotations

from pathlib import Path

import pytest

from jw_core.exporters.ir import CitationIR, StudySection, StudySheet
from jw_core.exporters.markdown import export_markdown, render_markdown


def _sheet() -> StudySheet:
    return StudySheet(
        title="Trinidad",
        subtitle="Análisis apologético",
        language="es",
        sections=[
            StudySection(
                heading="Jehová es el único Dios",
                body="La Biblia es clara: hay un solo Dios verdadero.",
                excerpt="Deuteronomio 6:4 — Escucha, Israel.",
                citations=[
                    CitationIR(
                        url="https://wol.jw.org/es/wol/d/r4/lp-s/1101989140",
                        title="¿Qué enseña la Biblia sobre la Trinidad?",
                        kind="article",
                        short_label="Trinidad — folleto",
                    )
                ],
            ),
            StudySection(
                heading="Jesús no es el Padre",
                body="Jesús siempre se distinguió del Padre.",
                citations=[
                    CitationIR(
                        url="https://wol.jw.org/es/wol/b/r4/lp-s/nwt/E/2024/43/14",
                        title="Juan 14:28",
                        kind="verse",
                        short_label="Juan 14:28",
                    )
                ],
            ),
        ],
        footer_note="Generado por jw-agent-toolkit.",
    )


def test_render_markdown_has_title() -> None:
    out = render_markdown(_sheet())
    assert out.startswith("# Trinidad")
    assert "## Jehová es el único Dios" in out


def test_render_footnote_style_default() -> None:
    out = render_markdown(_sheet(), citation_style="footnote")
    # Footnote markers appear in body
    assert "[^1]" in out
    assert "[^2]" in out
    # Footnote definitions appear at the end
    assert "[^1]:" in out
    assert "wol.jw.org" in out


def test_render_inline_paren_style() -> None:
    out = render_markdown(_sheet(), citation_style="inline-paren")
    assert "(Trinidad — folleto, https://wol.jw.org" in out
    assert "[^1]" not in out  # no footnotes when inline


def test_render_bibliography_style() -> None:
    out = render_markdown(_sheet(), citation_style="bibliography")
    assert "## Fuentes" in out or "## Bibliografía" in out
    assert "Juan 14:28" in out


def test_render_includes_excerpt_as_blockquote() -> None:
    out = render_markdown(_sheet())
    assert "> Deuteronomio 6:4" in out


def test_render_includes_footer() -> None:
    out = render_markdown(_sheet())
    assert "Generado por jw-agent-toolkit" in out


def test_render_empty_sections() -> None:
    sheet = StudySheet(title="Vacío", sections=[])
    out = render_markdown(sheet)
    assert "# Vacío" in out


def test_export_markdown_writes_file(tmp_path: Path) -> None:
    out = tmp_path / "demo.md"
    written = export_markdown(_sheet(), out=out)
    assert written == out
    assert out.exists()
    assert out.read_text(encoding="utf-8").startswith("# Trinidad")


def test_export_markdown_creates_parent_dirs(tmp_path: Path) -> None:
    out = tmp_path / "deep" / "nested" / "demo.md"
    export_markdown(_sheet(), out=out)
    assert out.exists()


def test_render_escapes_dangerous_chars_in_body() -> None:
    sheet = StudySheet(
        title="Inj",
        sections=[StudySection(heading="x", body="text with [bracket] and (paren)")],
    )
    out = render_markdown(sheet)
    # Brackets and parens get escaped in body to avoid accidental markdown links
    assert "\\[bracket\\]" in out or "[bracket]" in out  # accept either escape policy
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-core/tests/test_exporter_markdown.py -v
Expected: FAIL — markdown module missing.
- Step 3: Implement the markdown exporter
# packages/jw-core/src/jw_core/exporters/markdown.py
"""Markdown exporter.

Three citation styles:
- inline-paren: "...text (label, url)."
- footnote: "...text[^1]." with definitions at the end.
- bibliography: body without inline cites; numbered list at the end.

Pure-Python, no external dependencies. Output is CommonMark-compatible,
except for the footnote style: `[^1]` markers are not part of CommonMark and
need a renderer with a footnote extension (e.g. GitHub or Pandoc Markdown).
"""

from __future__ import annotations

import re
from pathlib import Path

from jw_core.exporters.ir import CitationIR, StudySection, StudySheet

CitationStyleStr = str  # 'inline-paren' | 'footnote' | 'bibliography'


def export_markdown(
    sheet: StudySheet,
    *,
    out: Path,
    citation_style: CitationStyleStr = "footnote",
) -> Path:
    """Render `sheet` as Markdown and write it to `out`. Returns `out`."""
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(render_markdown(sheet, citation_style=citation_style), encoding="utf-8")
    return out


def render_markdown(
    sheet: StudySheet,
    *,
    citation_style: CitationStyleStr = "footnote",
) -> str:
    """Pure-string render of `sheet`. Easier to test than file I/O."""
    lines: list[str] = []
    lines.append(f"# {sheet.title}")
    if sheet.subtitle:
        lines.append(f"## {sheet.subtitle}")
    lines.append(f"_idioma: {sheet.language}_")
    lines.append("")

    # Footnote definitions and bibliography entries are collected across all
    # sections and emitted once at the end. `counter` is a one-element list so
    # the helper can advance the footnote number in place.
    footnote_defs: list[str] = []
    bibliography: list[CitationIR] = []
    counter = [0]

    for section in sheet.sections:
        lines.append(f"## {_escape_heading(section.heading)}")
        body = _escape_body(section.body)
        if citation_style == "inline-paren":
            body = _append_inline_citations(body, section.citations)
        elif citation_style == "footnote":
            body, fns = _attach_footnote_markers(body, section.citations, counter)
            footnote_defs.extend(fns)
        elif citation_style == "bibliography":
            bibliography.extend(section.citations)
        lines.append(body)
        if section.excerpt:
            lines.append("")
            for excerpt_line in section.excerpt.splitlines():
                lines.append(f"> {excerpt_line}")
        lines.append("")

    if citation_style == "footnote" and footnote_defs:
        lines.append("")
        lines.extend(footnote_defs)

    if citation_style == "bibliography" and bibliography:
        lines.append("")
        lines.append("## Fuentes")
        for i, cite in enumerate(bibliography, 1):
            lines.append(f"{i}. [{cite.short_label or cite.title or cite.url}]({cite.url})")

    if sheet.footer_note:
        lines.append("")
        lines.append("---")
        lines.append(f"_{sheet.footer_note}_")

    return "\n".join(lines).rstrip() + "\n"


# ── helpers ──

_DANGEROUS_MD = re.compile(r"([\[\]\(\)])")


def _escape_heading(text: str) -> str:
    """Collapse newlines so a heading stays on one line; nothing else is escaped."""
    return text.replace("\n", " ").strip()


def _escape_body(text: str) -> str:
    """Escape brackets/parens to avoid accidental markdown link injection."""
    return _DANGEROUS_MD.sub(r"\\\1", text)


def _append_inline_citations(body: str, citations: list[CitationIR]) -> str:
    if not citations:
        return body
    parens = ", ".join(f"{c.short_label or c.title or 'fuente'}, {c.url}" for c in citations)
    # Keep the sentence's final period after the parenthetical.
    if body.endswith("."):
        return f"{body[:-1]} ({parens})."
    return f"{body} ({parens})"


def _attach_footnote_markers(
    body: str,
    citations: list[CitationIR],
    counter: list[int],
) -> tuple[str, list[str]]:
    """Append [^N] markers to the body and return the footnote definitions."""
    if not citations:
        return body, []
    markers: list[str] = []
    defs: list[str] = []
    for cite in citations:
        counter[0] += 1
        n = counter[0]
        markers.append(f"[^{n}]")
        label = cite.short_label or cite.title or cite.url
        defs.append(f"[^{n}]: [{label}]({cite.url})")
    marker_str = "".join(markers)
    # Place the markers before a trailing period so they stay inside the sentence.
    if body.endswith("."):
        body = body[:-1] + marker_str + "."
    else:
        body = body + marker_str
    return body, defs
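To compare the three citation styles side by side, the per-citation formatting can be reduced to one small function. This is a simplified sketch that handles a single citation per call; `cite` is a hypothetical name, and the real exporter handles several citations and numbers footnotes across sections.

```python
from __future__ import annotations


def cite(body: str, label: str, url: str, style: str, n: int = 1) -> tuple[str, list[str]]:
    """Return (body, trailing_lines) for one citation in the given style."""
    # Split off a final period so the citation lands inside the sentence.
    stem, dot = (body[:-1], ".") if body.endswith(".") else (body, "")
    if style == "inline-paren":
        return f"{stem} ({label}, {url}){dot}", []
    if style == "footnote":
        return f"{stem}[^{n}]{dot}", [f"[^{n}]: [{label}]({url})"]
    if style == "bibliography":
        # The body is left untouched; the source is listed at the end.
        return body, [f"{n}. [{label}]({url})"]
    raise ValueError(f"unknown citation style: {style!r}")


for style in ("inline-paren", "footnote", "bibliography"):
    print(style, cite("Texto.", "Juan 14:28", "https://wol.jw.org/x", style))
```

Unlike the exporter, this sketch rejects an unknown style instead of silently rendering the body without citations.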
- Step 4: Run tests until green
Run: uv run pytest packages/jw-core/tests/test_exporter_markdown.py -v
Expected: 10 passed.
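The escaping policy in `_escape_body` can be checked in isolation. The snippet below reuses the same character class and replacement string; the public names `DANGEROUS_MD` and `escape_body` are used only for this illustration.

```python
import re

# Same character class as `_DANGEROUS_MD`: square brackets and parentheses.
DANGEROUS_MD = re.compile(r"([\[\]\(\)])")


def escape_body(text: str) -> str:
    """Prefix each bracket or parenthesis with a backslash."""
    return DANGEROUS_MD.sub(r"\\\1", text)


print(escape_body("see [docs](https://example.org)"))
# prints: see \[docs\]\(https://example.org\)
```

Because the body is escaped before citation markers are attached, the exporter's own `[^1]` markers and inline parentheses are not affected.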
- Step 5: Commit
git add packages/jw-core/src/jw_core/exporters/markdown.py packages/jw-core/tests/test_exporter_markdown.py
git commit -m "feat(exporters): markdown exporter with 3 citation styles"
Task 4: Template resolver + Jinja2 templates
Files:
- Create: packages/jw-core/src/jw_core/templates/__init__.py
- Create: packages/jw-core/src/jw_core/templates/study_sheet/__init__.py
- Create: packages/jw-core/src/jw_core/templates/study_sheet/plain.html.j2
- Create: packages/jw-core/src/jw_core/templates/study_sheet/study-sheet.html.j2
- Create: packages/jw-core/src/jw_core/exporters/templates_resolver.py
- Create: packages/jw-core/tests/test_exporter_templates.py
- Step 1: Write the failing tests
# packages/jw-core/tests/test_exporter_templates.py
"""Tests for the template resolver."""

from __future__ import annotations

from pathlib import Path

import pytest

from jw_core.exporters.errors import ExportError
from jw_core.exporters.ir import StudySection, StudySheet
from jw_core.exporters.templates_resolver import (
    list_builtin_templates,
    render_html,
    resolve_template_path,
)


def _sheet() -> StudySheet:
    return StudySheet(
        title="T",
        sections=[StudySection(heading="h", body="b")],
    )


def test_list_builtin_templates_includes_two() -> None:
    names = list_builtin_templates()
    assert "plain.html.j2" in names
    assert "study-sheet.html.j2" in names


def test_resolve_builtin_template() -> None:
    p = resolve_template_path("plain.html.j2")
    assert p.exists()
    assert p.name == "plain.html.j2"


def test_resolve_user_override(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
    user_dir = tmp_path / ".jw-agent-toolkit" / "templates"
    user_dir.mkdir(parents=True)
    user_tpl = user_dir / "plain.html.j2"
    user_tpl.write_text("<html>USER</html>", encoding="utf-8")
    # Path.home() reads HOME on POSIX; Windows uses USERPROFILE instead.
    monkeypatch.setenv("HOME", str(tmp_path))
    p = resolve_template_path("plain.html.j2")
    # User override wins
    assert p == user_tpl


def test_resolve_missing_raises() -> None:
    with pytest.raises(ExportError):
        resolve_template_path("does-not-exist.html.j2")


def test_render_html_contains_title_and_body() -> None:
    html = render_html(_sheet(), template_name="plain.html.j2")
    assert "T" in html
    assert "<html" in html.lower()


def test_render_html_escapes_html_in_body() -> None:
    sheet = StudySheet(
        title="T",
        sections=[StudySection(heading="h", body="<script>alert(1)</script>")],
    )
    html = render_html(sheet, template_name="plain.html.j2")
    # The raw tag must be gone and its entity-escaped form must be present.
    assert "<script>" not in html
    assert "&lt;script&gt;" in html
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-core/tests/test_exporter_templates.py -v
Expected: FAIL — templates_resolver missing.
- Step 3: Implement resolver + templates
# packages/jw-core/src/jw_core/templates/__init__.py
"""Packaged Jinja2 templates for the exporters module."""
# packages/jw-core/src/jw_core/templates/study_sheet/__init__.py
"""Study-sheet HTML templates rendered by jw_core.exporters.pdf."""
{# packages/jw-core/src/jw_core/templates/study_sheet/plain.html.j2 #}
<!doctype html>
<html lang="{{ sheet.language }}">
<head>
<meta charset="utf-8">
<title>{{ sheet.title }}</title>
<style>
@page { margin: 2cm; }
body { font-family: "Inter", -apple-system, BlinkMacSystemFont, sans-serif; line-height: 1.5; color: #222; }
h1 { font-size: 24pt; margin-bottom: 0; }
h2.subtitle { font-size: 14pt; color: #666; margin-top: 4pt; font-weight: 400; }
h3 { font-size: 14pt; margin-top: 20pt; }
.body { font-size: 11pt; word-wrap: break-word; }
.excerpt { border-left: 3px solid #999; padding-left: 10pt; color: #555; margin: 8pt 0; font-style: italic; }
.cite-list { font-size: 9pt; color: #666; margin-top: 4pt; }
a { color: #1a5fb4; word-wrap: break-word; }
.footer { margin-top: 30pt; border-top: 1px solid #ccc; padding-top: 6pt; font-size: 9pt; color: #888; }
</style>
</head>
<body>
<h1>{{ sheet.title }}</h1>
{% if sheet.subtitle %}<h2 class="subtitle">{{ sheet.subtitle }}</h2>{% endif %}
{% for section in sheet.sections %}
<section>
<h3>{{ section.heading }}</h3>
<div class="body">{{ section.body }}</div>
{% if section.excerpt %}<div class="excerpt">{{ section.excerpt }}</div>{% endif %}
{% if section.citations %}
<ul class="cite-list">
{% for c in section.citations %}
<li><a href="{{ c.url }}">{{ c.short_label or c.title or c.url }}</a></li>
{% endfor %}
</ul>
{% endif %}
</section>
{% endfor %}
{% if sheet.footer_note %}
<div class="footer">{{ sheet.footer_note }}</div>
{% endif %}
</body>
</html>
{# packages/jw-core/src/jw_core/templates/study_sheet/study-sheet.html.j2 #}
<!doctype html>
<html lang="{{ sheet.language }}">
<head>
<meta charset="utf-8">
<title>{{ sheet.title }}</title>
<style>
@page { margin: 1.8cm 2.5cm; }
body { font-family: "Charter", "Source Serif Pro", Georgia, serif; line-height: 1.55; color: #1a1a1a; }
h1 { font-size: 26pt; margin-bottom: 0; border-bottom: 2px solid #1a1a1a; padding-bottom: 6pt; }
h2.subtitle { font-size: 13pt; color: #555; margin-top: 6pt; font-weight: 400; font-style: italic; }
h3 { font-size: 14pt; margin-top: 22pt; color: #0a3a6a; }
.body { font-size: 11.5pt; word-wrap: break-word; text-align: justify; hyphens: auto; }
.excerpt { border-left: 4px solid #c9a64f; background: #faf7f0; padding: 6pt 10pt; margin: 10pt 0; color: #333; font-style: italic; }
.cite-list { font-size: 9pt; color: #555; margin-top: 6pt; list-style: square; }
.cite-list li { margin-bottom: 2pt; }
a { color: #0a3a6a; word-wrap: break-word; }
.footer { margin-top: 36pt; border-top: 1px solid #aaa; padding-top: 8pt; font-size: 9pt; color: #777; text-align: center; }
</style>
</head>
<body>
<h1>{{ sheet.title }}</h1>
{% if sheet.subtitle %}<h2 class="subtitle">{{ sheet.subtitle }}</h2>{% endif %}
{% for section in sheet.sections %}
<section>
<h3>{{ section.heading }}</h3>
<div class="body">{{ section.body }}</div>
{% if section.excerpt %}<div class="excerpt">{{ section.excerpt }}</div>{% endif %}
{% if section.citations %}
<ul class="cite-list">
{% for c in section.citations %}
<li><a href="{{ c.url }}">{{ c.short_label or c.title or c.url }}</a></li>
{% endfor %}
</ul>
{% endif %}
</section>
{% endfor %}
{% if sheet.footer_note %}
<div class="footer">{{ sheet.footer_note }}</div>
{% endif %}
</body>
</html>
# packages/jw-core/src/jw_core/exporters/templates_resolver.py
"""Resolve Jinja2 templates, honoring user overrides at ~/.jw-agent-toolkit/templates/.

Lookup order:
1. ~/.jw-agent-toolkit/templates/<name>   (user override)
2. jw_core.templates.study_sheet.<name>   (packaged default)
"""

from __future__ import annotations

from pathlib import Path

from jinja2 import Environment, FileSystemLoader, StrictUndefined, select_autoescape

from jw_core.exporters.errors import ExportError
from jw_core.exporters.ir import StudySheet


def _packaged_dir() -> Path:
    return Path(__file__).parent.parent / "templates" / "study_sheet"


def _user_dir() -> Path:
    return Path.home() / ".jw-agent-toolkit" / "templates"


def list_builtin_templates() -> list[str]:
    """Return names of packaged Jinja2 templates."""
    return sorted(p.name for p in _packaged_dir().glob("*.html.j2"))


def resolve_template_path(name: str) -> Path:
    """Return the path of the template, user override wins. Raises if missing."""
    candidate = _user_dir() / name
    if candidate.exists():
        return candidate
    candidate = _packaged_dir() / name
    if candidate.exists():
        return candidate
    raise ExportError(
        f"Template {name!r} not found (looked in {_user_dir()} and {_packaged_dir()})"
    )


def render_html(sheet: StudySheet, *, template_name: str = "plain.html.j2") -> str:
    """Render `sheet` to HTML using the given Jinja2 template."""
    path = resolve_template_path(template_name)
    env = Environment(
        loader=FileSystemLoader(path.parent),
        # "j2" is listed so that names ending in ".html.j2" are autoescaped.
        autoescape=select_autoescape(["html", "j2"]),
        undefined=StrictUndefined,
        trim_blocks=True,
        lstrip_blocks=True,
    )
    template = env.get_template(path.name)
    return template.render(sheet=sheet)
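The lookup order (user override first, packaged default second) generalizes to "first existing candidate wins". A self-contained sketch of that rule follows. The function `resolve` and its `search_dirs` parameter are assumptions for illustration, and it raises `FileNotFoundError` where the plan's resolver raises `ExportError`.

```python
from __future__ import annotations

from pathlib import Path


def resolve(name: str, search_dirs: list[Path]) -> Path:
    """Return the first existing `directory / name`; earlier directories win."""
    for directory in search_dirs:
        candidate = directory / name
        if candidate.exists():
            return candidate
    looked = ", ".join(str(d) for d in search_dirs)
    raise FileNotFoundError(f"Template {name!r} not found (looked in {looked})")
```

Passing the directories as an argument also makes the override behavior testable without patching `HOME`.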
- Step 4: Run tests until green
Run: uv run pytest packages/jw-core/tests/test_exporter_templates.py -v
Expected: 6 passed.
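The escaping test relies on Jinja2 autoescaping turning `<` and `>` into entities. The effect can be previewed with the standard library's `html.escape`, used here as a stand-in rather than as what Jinja2 calls internally; the two may encode quote characters differently, but they agree on angle brackets.

```python
from html import escape

payload = "<script>alert(1)</script>"
escaped = escape(payload)

# The raw tag is gone and only its entity-escaped form remains.
assert "<script>" not in escaped
assert "&lt;script&gt;" in escaped
print(escaped)  # &lt;script&gt;alert(1)&lt;/script&gt;
```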
- Step 5: Verify templates are packaged
Run:
uv run python -c "
from jw_core.exporters.templates_resolver import list_builtin_templates
print(list_builtin_templates())
"
Expected: ['plain.html.j2', 'study-sheet.html.j2'].
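If empty, the templates were not picked up. First check that both .html.j2 files exist under packages/jw-core/src/jw_core/templates/study_sheet/, since Hatchling normally ships every file inside the package directory. If a built wheel still lacks them, edit packages/jw-core/pyproject.toml and add a force-include table (not shared-data, which installs files under the environment prefix instead of inside the package):
[tool.hatch.build.targets.wheel.force-include]
"src/jw_core/templates" = "jw_core/templates"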
- Step 6: Commit
git add packages/jw-core/src/jw_core/templates packages/jw-core/src/jw_core/exporters/templates_resolver.py packages/jw-core/tests/test_exporter_templates.py packages/jw-core/pyproject.toml
git commit -m "feat(exporters): Jinja2 template resolver with user-override + 2 built-in themes"
Task 5: PDF exporter (WeasyPrint)
Files:
- Create: packages/jw-core/src/jw_core/exporters/pdf.py
- Create: packages/jw-core/tests/test_exporter_pdf.py
- Step 1: Write the failing test (skipped if weasyprint missing)
# packages/jw-core/tests/test_exporter_pdf.py
"""Tests for jw_core.exporters.pdf.

Skipped if weasyprint is not installed (the [pdf] extra is optional).
"""

from __future__ import annotations

import importlib.util
from pathlib import Path

import pytest

from jw_core.exporters.errors import MissingDependencyError
from jw_core.exporters.ir import CitationIR, StudySection, StudySheet

WEASY_AVAILABLE = importlib.util.find_spec("weasyprint") is not None

pytestmark = pytest.mark.skipif(
    not WEASY_AVAILABLE,
    reason="weasyprint not installed (install jw-core[pdf])",
)


def _sheet() -> StudySheet:
    return StudySheet(
        title="Trinidad",
        subtitle="Análisis apologético",
        sections=[
            StudySection(
                heading="Jehová es uno",
                body="La Biblia es clara: hay un solo Dios.",
                excerpt="Deuteronomio 6:4",
                citations=[
                    CitationIR(
                        url="https://wol.jw.org/x",
                        title="Trinidad",
                        kind="article",
                        short_label="Trinidad",
                    )
                ],
            )
        ],
        footer_note="Generado por jw-agent-toolkit.",
    )


def test_export_pdf_writes_valid_file(tmp_path: Path) -> None:
    from jw_core.exporters.pdf import export_pdf

    out = tmp_path / "demo.pdf"
    written = export_pdf(_sheet(), out=out)
    assert written == out
    assert out.exists()
    head = out.read_bytes()[:4]
    assert head == b"%PDF"


def test_export_pdf_study_sheet_theme(tmp_path: Path) -> None:
    from jw_core.exporters.pdf import export_pdf

    out = tmp_path / "demo.pdf"
    export_pdf(_sheet(), out=out, theme="study-sheet")
    assert out.read_bytes()[:4] == b"%PDF"


def test_export_pdf_creates_parent_dirs(tmp_path: Path) -> None:
    from jw_core.exporters.pdf import export_pdf

    out = tmp_path / "deep" / "nested" / "demo.pdf"
    export_pdf(_sheet(), out=out)
    assert out.exists()


def test_export_pdf_unknown_theme_raises(tmp_path: Path) -> None:
    from jw_core.exporters.errors import ExportError
    from jw_core.exporters.pdf import export_pdf

    out = tmp_path / "x.pdf"
    with pytest.raises(ExportError):
        export_pdf(_sheet(), out=out, theme="nope")


# Simulates a missing dependency even when weasyprint IS installed.
# Note: the module-level skipif above still skips this test when weasyprint
# is absent, so it is not an always-on test.
def test_missing_dependency_when_weasyprint_absent(
    monkeypatch: pytest.MonkeyPatch,
    tmp_path: Path,
) -> None:
    import builtins

    real_import = builtins.__import__

    def _ban_weasy(name: str, *a, **kw):
        if name == "weasyprint" or name.startswith("weasyprint."):
            raise ImportError("simulated")
        return real_import(name, *a, **kw)

    monkeypatch.setattr(builtins, "__import__", _ban_weasy)

    from jw_core.exporters.pdf import export_pdf

    with pytest.raises(MissingDependencyError):
        export_pdf(_sheet(), out=tmp_path / "x.pdf")
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-core/tests/test_exporter_pdf.py -v
Expected: FAIL — module pdf missing (or all skipped if weasyprint not installed; install with uv pip install weasyprint for the rest of this task).
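The missing-dependency test works because the import statement always goes through builtins.__import__, even for modules already cached in sys.modules. A minimal standalone sketch of that technique follows. The names banned_import and export_with_lazy_dep are hypothetical, json stands in for the optional heavy dependency, and MissingDependencyError is a local stand-in for the class in jw_core.exporters.errors.

```python
import builtins
from contextlib import contextmanager


class MissingDependencyError(RuntimeError):
    """Local stand-in for jw_core.exporters.errors.MissingDependencyError."""


@contextmanager
def banned_import(prefix: str):
    """Temporarily make importing `prefix` (and its submodules) raise ImportError."""
    real_import = builtins.__import__

    def _ban(name, *args, **kwargs):
        if name == prefix or name.startswith(prefix + "."):
            raise ImportError(f"simulated: {name} is not installed")
        return real_import(name, *args, **kwargs)

    builtins.__import__ = _ban
    try:
        yield
    finally:
        # Always restore the real hook, even if the body raised.
        builtins.__import__ = real_import


def export_with_lazy_dep() -> str:
    """Hypothetical exporter that imports its dependency only when called."""
    try:
        import json as heavy  # stands in for weasyprint / docx / genanki
    except ImportError as exc:
        raise MissingDependencyError("install the matching extra first") from exc
    return heavy.dumps({"ok": True})
```

Inside a pytest test, monkeypatch.setattr(builtins, "__import__", ...) plays the role of the context manager and restores the original hook automatically at teardown.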
- Step 3: Implement the PDF exporter
# packages/jw-core/src/jw_core/exporters/pdf.py
"""PDF exporter via WeasyPrint.

Renders the StudySheet through a Jinja2 template (theme) into HTML, then
WeasyPrint converts the HTML to PDF.

Themes available out of the box:
- "plain": minimalist, sans-serif.
- "study-sheet": serif notebook style.

User can override the template by dropping a file with the same name
under ~/.jw-agent-toolkit/templates/.
"""
from __future__ import annotations

from pathlib import Path
from typing import Literal

from jw_core.exporters.errors import ExportError, MissingDependencyError
from jw_core.exporters.ir import StudySheet
from jw_core.exporters.templates_resolver import render_html

Theme = Literal["plain", "study-sheet"]

_THEME_TO_TEMPLATE: dict[str, str] = {
    "plain": "plain.html.j2",
    "study-sheet": "study-sheet.html.j2",
}


def export_pdf(
    sheet: StudySheet,
    *,
    out: Path,
    theme: Theme = "study-sheet",
) -> Path:
    """Render `sheet` as PDF and write it to `out`. Returns `out`.

    Requires the [pdf] extra. Raises `MissingDependencyError` otherwise.
    """
    try:
        from weasyprint import HTML  # noqa: PLC0415 (lazy by design)
    except ImportError as exc:
        raise MissingDependencyError(
            "weasyprint is required for PDF export. "
            "Install with: pip install 'jw-core[pdf]'"
        ) from exc

    if theme not in _THEME_TO_TEMPLATE:
        raise ExportError(f"Unknown PDF theme {theme!r}. Available: {sorted(_THEME_TO_TEMPLATE)}")

    template_name = _THEME_TO_TEMPLATE[theme]
    html_body = render_html(sheet, template_name=template_name)
    out.parent.mkdir(parents=True, exist_ok=True)
    HTML(string=html_body).write_pdf(target=str(out))
    return out
- Step 4: Run tests until green
If weasyprint is installed: uv run pytest packages/jw-core/tests/test_exporter_pdf.py -v
Expected: 5 passed.
If not installed: 5 skipped (the module-level pytestmark also skips the missing-dep test).
- Step 5: Commit
git add packages/jw-core/src/jw_core/exporters/pdf.py packages/jw-core/tests/test_exporter_pdf.py
git commit -m "feat(exporters): PDF exporter via WeasyPrint with 2 themes"
Task 6: DOCX exporter (python-docx)
Files:
- Create: packages/jw-core/src/jw_core/exporters/docx.py
- Create: packages/jw-core/tests/test_exporter_docx.py
- Step 1: Write the failing test
# packages/jw-core/tests/test_exporter_docx.py
"""Tests for jw_core.exporters.docx."""
from __future__ import annotations

import importlib.util
import zipfile
from pathlib import Path

import pytest

from jw_core.exporters.errors import MissingDependencyError
from jw_core.exporters.ir import CitationIR, StudySection, StudySheet

DOCX_AVAILABLE = importlib.util.find_spec("docx") is not None

pytestmark = pytest.mark.skipif(
    not DOCX_AVAILABLE,
    reason="python-docx not installed (install jw-core[docx])",
)


def _sheet() -> StudySheet:
    return StudySheet(
        title="Trinidad",
        subtitle="Análisis",
        sections=[
            StudySection(
                heading="Jehová es uno",
                body="La Biblia es clara.",
                excerpt="Deut 6:4",
                citations=[
                    CitationIR(url="https://wol.jw.org/x", short_label="Folleto Trinidad")
                ],
            )
        ],
        footer_note="Generado por jw-agent-toolkit.",
    )


def test_export_docx_writes_valid_zip(tmp_path: Path) -> None:
    from jw_core.exporters.docx import export_docx

    out = tmp_path / "demo.docx"
    written = export_docx(_sheet(), out=out)
    assert written == out
    assert out.exists()
    # DOCX is a ZIP
    assert zipfile.is_zipfile(out)
    with zipfile.ZipFile(out) as zf:
        names = zf.namelist()
    assert "word/document.xml" in names


def test_export_docx_contains_title_and_heading(tmp_path: Path) -> None:
    from jw_core.exporters.docx import export_docx

    out = tmp_path / "demo.docx"
    export_docx(_sheet(), out=out)
    with zipfile.ZipFile(out) as zf:
        xml = zf.read("word/document.xml").decode("utf-8")
    assert "Trinidad" in xml
    assert "Jehová es uno" in xml


def test_export_docx_includes_citation_hyperlink(tmp_path: Path) -> None:
    from jw_core.exporters.docx import export_docx

    out = tmp_path / "demo.docx"
    export_docx(_sheet(), out=out)
    with zipfile.ZipFile(out) as zf:
        rels = zf.read("word/_rels/document.xml.rels").decode("utf-8")
    assert "wol.jw.org" in rels


def test_export_docx_creates_parent_dirs(tmp_path: Path) -> None:
    from jw_core.exporters.docx import export_docx

    out = tmp_path / "deep" / "x.docx"
    export_docx(_sheet(), out=out)
    assert out.exists()


def test_missing_dependency_when_pythondocx_absent(
    monkeypatch: pytest.MonkeyPatch, tmp_path: Path
) -> None:
    import builtins

    real_import = builtins.__import__

    def _ban(name: str, *a, **kw):
        if name == "docx" or name.startswith("docx."):
            raise ImportError("simulated")
        return real_import(name, *a, **kw)

    monkeypatch.setattr(builtins, "__import__", _ban)
    from jw_core.exporters.docx import export_docx

    with pytest.raises(MissingDependencyError):
        export_docx(_sheet(), out=tmp_path / "x.docx")
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-core/tests/test_exporter_docx.py -v
Expected: FAIL — docx exporter module missing.
- Step 3: Implement DOCX exporter
# packages/jw-core/src/jw_core/exporters/docx.py
"""DOCX exporter via python-docx.

Uses python-docx's programmatic API directly (no template: DOCX templating
adds complexity without value at our structure level).
"""
from __future__ import annotations

from pathlib import Path

from jw_core.exporters.errors import MissingDependencyError
from jw_core.exporters.ir import CitationIR, StudySheet


def export_docx(sheet: StudySheet, *, out: Path) -> Path:
    """Render `sheet` as DOCX and write it to `out`. Returns `out`.

    Requires the [docx] extra. Raises `MissingDependencyError` otherwise.
    """
    try:
        from docx import Document  # noqa: PLC0415 (lazy)
        from docx.oxml import OxmlElement  # noqa: PLC0415
        from docx.oxml.ns import qn  # noqa: PLC0415
    except ImportError as exc:
        raise MissingDependencyError(
            "python-docx is required for DOCX export. "
            "Install with: pip install 'jw-core[docx]'"
        ) from exc

    doc = Document()

    # Title
    doc.add_heading(sheet.title, level=0)
    if sheet.subtitle:
        p = doc.add_paragraph()
        run = p.add_run(sheet.subtitle)
        run.italic = True

    # Sections
    for section in sheet.sections:
        doc.add_heading(section.heading, level=2)
        doc.add_paragraph(section.body)
        if section.excerpt:
            p = doc.add_paragraph(section.excerpt)
            p.style = doc.styles["Intense Quote"]
        for cite in section.citations:
            _add_citation_paragraph(doc, cite, qn, OxmlElement)

    # Footer: blank line, centered separator, italic note
    if sheet.footer_note:
        doc.add_paragraph()
        sep = doc.add_paragraph("\u2014" * 30)
        sep.alignment = 1  # center
        p = doc.add_paragraph()
        run = p.add_run(sheet.footer_note)
        run.italic = True

    out.parent.mkdir(parents=True, exist_ok=True)
    doc.save(str(out))
    return out


def _add_citation_paragraph(doc, cite: CitationIR, qn, OxmlElement) -> None:
    """Add a paragraph holding a hyperlink to the citation URL."""
    p = doc.add_paragraph()
    label = cite.short_label or cite.title or cite.url

    # Register an external hyperlink relationship on the document part.
    part = p.part
    rid = part.relate_to(
        cite.url,
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink",
        is_external=True,
    )

    # Build <w:hyperlink r:id="..."><w:r><w:rPr/><w:t/></w:r></w:hyperlink> by hand.
    hyperlink = OxmlElement("w:hyperlink")
    hyperlink.set(qn("r:id"), rid)
    new_run = OxmlElement("w:r")
    r_pr = OxmlElement("w:rPr")
    color = OxmlElement("w:color")
    color.set(qn("w:val"), "0A3A6A")
    r_pr.append(color)
    u = OxmlElement("w:u")
    u.set(qn("w:val"), "single")
    r_pr.append(u)
    new_run.append(r_pr)
    t = OxmlElement("w:t")
    t.text = f" • {label}"
    new_run.append(t)
    hyperlink.append(new_run)
    p._p.append(hyperlink)
- Step 4: Run tests until green
Run: uv run pytest packages/jw-core/tests/test_exporter_docx.py -v
Expected: 5 passed (if python-docx installed).
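The hyperlink test above only checks that the URL substring appears in the relationships part. When a failure needs debugging, it can help to list the external hyperlink targets explicitly. The following is a rough stdlib-only sketch; external_hyperlinks is a hypothetical helper, not part of the plan's module, and it assumes the standard OPC layout (word/_rels/document.xml.rels).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of OPC relationship parts (.rels files).
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)


def external_hyperlinks(docx_bytes: bytes) -> list[str]:
    """Return the targets of external hyperlink relationships in a DOCX."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        rels_xml = zf.read("word/_rels/document.xml.rels")
    root = ET.fromstring(rels_xml)
    return [
        rel.attrib["Target"]
        for rel in root.findall(f"{{{REL_NS}}}Relationship")
        if rel.attrib.get("Type") == HYPERLINK_TYPE
        and rel.attrib.get("TargetMode") == "External"
    ]
```

In a test it could be called as external_hyperlinks(out.read_bytes()) and compared against the citation URLs of the sheet.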
- Step 5: Commit
git add packages/jw-core/src/jw_core/exporters/docx.py packages/jw-core/tests/test_exporter_docx.py
git commit -m "feat(exporters): DOCX exporter via python-docx with hyperlink citations"
Task 7: Anki exporter (genanki) with stable GUIDs
Files:
- Create: packages/jw-core/src/jw_core/exporters/anki.py
- Create: packages/jw-core/tests/test_exporter_anki.py
- Step 1: Write the failing test
# packages/jw-core/tests/test_exporter_anki.py
"""Tests for jw_core.exporters.anki."""
from __future__ import annotations

import importlib.util
import zipfile
from pathlib import Path

import pytest

from jw_core.exporters.errors import MissingDependencyError
from jw_core.exporters.ir import CitationIR, StudySection, StudySheet

ANKI_AVAILABLE = importlib.util.find_spec("genanki") is not None

pytestmark = pytest.mark.skipif(
    not ANKI_AVAILABLE,
    reason="genanki not installed (install jw-core[anki])",
)


def _sheet() -> StudySheet:
    return StudySheet(
        title="Trinidad: repaso",
        sections=[
            StudySection(
                heading="Jehová es uno",
                body="La Biblia presenta un solo Dios verdadero.",
                citations=[
                    CitationIR(url="https://wol.jw.org/x", short_label="Folleto Trinidad"),
                    CitationIR(url="https://wol.jw.org/y", short_label="Juan 17:3"),
                ],
            ),
            StudySection(
                heading="Jesús no es el Padre",
                body="Jesús siempre se distinguió del Padre.",
            ),
        ],
    )


def test_export_apkg_writes_valid_zip(tmp_path: Path) -> None:
    from jw_core.exporters.anki import export_apkg

    out = tmp_path / "deck.apkg"
    written = export_apkg(_sheet(), out=out)
    assert written == out
    assert out.exists()
    assert zipfile.is_zipfile(out)


def test_export_apkg_default_one_note_per_section(tmp_path: Path) -> None:
    from jw_core.exporters.anki import build_deck

    deck = build_deck(_sheet(), per_citation_cards=False)
    assert len(deck.notes) == 2  # one per section


def test_export_apkg_per_citation_cards(tmp_path: Path) -> None:
    from jw_core.exporters.anki import build_deck

    deck = build_deck(_sheet(), per_citation_cards=True)
    # 2 section notes + 2 citation notes (first section has 2 citations, second has 0)
    assert len(deck.notes) == 4


def test_export_apkg_guid_stable_across_runs(tmp_path: Path) -> None:
    from jw_core.exporters.anki import build_deck

    d1 = build_deck(_sheet())
    d2 = build_deck(_sheet())
    g1 = sorted(n.guid for n in d1.notes)
    g2 = sorted(n.guid for n in d2.notes)
    assert g1 == g2


def test_export_apkg_guid_changes_when_content_changes(tmp_path: Path) -> None:
    from jw_core.exporters.anki import build_deck

    d1 = build_deck(_sheet())
    sheet2 = _sheet()
    sheet2.sections[0].heading = "Otro encabezado"
    d2 = build_deck(sheet2)
    g1 = sorted(n.guid for n in d1.notes)
    g2 = sorted(n.guid for n in d2.notes)
    assert g1 != g2


def test_export_apkg_deck_id_stable(tmp_path: Path) -> None:
    from jw_core.exporters.anki import build_deck

    d1 = build_deck(_sheet())
    d2 = build_deck(_sheet())
    assert d1.deck_id == d2.deck_id


def test_export_apkg_creates_parent_dirs(tmp_path: Path) -> None:
    from jw_core.exporters.anki import export_apkg

    out = tmp_path / "deep" / "deck.apkg"
    export_apkg(_sheet(), out=out)
    assert out.exists()


def test_missing_dependency_when_genanki_absent(
    monkeypatch: pytest.MonkeyPatch, tmp_path: Path
) -> None:
    import builtins

    real_import = builtins.__import__

    def _ban(name: str, *a, **kw):
        if name == "genanki" or name.startswith("genanki."):
            raise ImportError("simulated")
        return real_import(name, *a, **kw)

    monkeypatch.setattr(builtins, "__import__", _ban)
    from jw_core.exporters.anki import export_apkg

    with pytest.raises(MissingDependencyError):
        export_apkg(_sheet(), out=tmp_path / "x.apkg")
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-core/tests/test_exporter_anki.py -v
Expected: FAIL — anki exporter module missing.
- Step 3: Implement Anki exporter
# packages/jw-core/src/jw_core/exporters/anki.py
"""Anki exporter via genanki.

GUID strategy (stable across re-exports):
    guid = sha256("|".join([sheet.title, section.heading, section.body[:200]])).hexdigest()[:32]

Re-exporting an unchanged StudySheet therefore UPDATES the existing notes in
Anki instead of duplicating them. Edits to the excerpt, the citations, or the
body beyond its first 200 characters keep the GUID; any change to the title,
the heading, or the first 200 characters of the body (even a typo fix)
produces a new note.

The deck ID is derived from sheet.title and the model ID from the model name,
both via sha256, so the same deck always lands in the same place in Anki's tree.
"""
from __future__ import annotations

import hashlib
from pathlib import Path

from jw_core.exporters.errors import MissingDependencyError
from jw_core.exporters.ir import CitationIR, StudySection, StudySheet

_MODEL_NAME = "jw-agent-toolkit study sheet"


def export_apkg(
    sheet: StudySheet,
    *,
    out: Path,
    deck_name: str | None = None,
    per_citation_cards: bool = False,
) -> Path:
    """Render `sheet` as an Anki package (.apkg) and write it to `out`."""
    try:
        import genanki  # noqa: PLC0415
    except ImportError as exc:
        raise MissingDependencyError(
            "genanki is required for Anki export. "
            "Install with: pip install 'jw-core[anki]'"
        ) from exc

    deck = build_deck(sheet, deck_name=deck_name, per_citation_cards=per_citation_cards)
    out.parent.mkdir(parents=True, exist_ok=True)
    genanki.Package(deck).write_to_file(str(out))
    return out


def build_deck(
    sheet: StudySheet,
    *,
    deck_name: str | None = None,
    per_citation_cards: bool = False,
):
    """Build (but don't write) the genanki.Deck. Useful for tests."""
    try:
        import genanki  # noqa: PLC0415
    except ImportError as exc:
        raise MissingDependencyError(
            "genanki is required for Anki export."
        ) from exc

    model_id = _id_from(_MODEL_NAME)
    deck_id = _id_from(sheet.title)
    model = genanki.Model(
        model_id=model_id,
        name=_MODEL_NAME,
        fields=[{"name": "Front"}, {"name": "Back"}],
        templates=[
            {
                "name": "card",
                "qfmt": "{{Front}}",
                "afmt": '{{FrontSide}}<hr id="answer">{{Back}}',
            }
        ],
    )
    name = deck_name or sheet.title
    deck = genanki.Deck(deck_id=deck_id, name=name)

    for section in sheet.sections:
        deck.add_note(_section_note(genanki, model, sheet, section))
        if per_citation_cards:
            # One extra note per citation; sections without citations add none.
            for cite in section.citations:
                deck.add_note(_citation_note(genanki, model, sheet, section, cite))
    return deck


# ── helpers ──
def _id_from(text: str) -> int:
    """Derive a stable non-negative 31-bit int ID from text via sha256."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") & 0x7FFFFFFF


def _guid(*parts: str) -> str:
    """Join the parts with "|" and return the first 32 hex chars of their sha256."""
    raw = "|".join(parts).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:32]


def _section_note(genanki, model, sheet: StudySheet, section: StudySection):
    """Build the main note for a section."""
    front = section.heading
    back_parts: list[str] = [section.body.replace("\n", "<br>")]
    if section.excerpt:
        back_parts.append(f"<blockquote>{section.excerpt}</blockquote>")
    if section.citations:
        items = "".join(
            f'<li><a href="{c.url}">{c.short_label or c.title or c.url}</a></li>'
            for c in section.citations
        )
        back_parts.append(f"<ul>{items}</ul>")
    back = "".join(back_parts)
    return genanki.Note(
        model=model,
        fields=[front, back],
        guid=_guid(sheet.title, section.heading, section.body[:200]),
    )


def _citation_note(
    genanki,
    model,
    sheet: StudySheet,
    section: StudySection,
    cite: CitationIR,
):
    """Build an extra note focused on a single citation (when per_citation_cards=True)."""
    front = cite.short_label or cite.title or cite.url
    back = f'{section.heading}<br><a href="{cite.url}">{cite.url}</a>'
    return genanki.Note(
        model=model,
        fields=[front, back],
        guid=_guid(sheet.title, section.heading, "cite", cite.url),
    )
- Step 4: Run tests until green
Run: uv run pytest packages/jw-core/tests/test_exporter_anki.py -v
Expected: 8 passed (if genanki installed).
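To make the GUID behaviour concrete, here is a small stdlib-only sketch that mirrors the _guid and _id_from helpers from the implementation above. The names note_guid and stable_id are illustrative copies, not part of the module. It shows which edits keep a note's identity and which ones create a new note.

```python
import hashlib


def note_guid(title: str, heading: str, body: str) -> str:
    """Mirror of _guid(sheet.title, section.heading, section.body[:200])."""
    raw = "|".join((title, heading, body[:200])).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:32]


def stable_id(text: str) -> int:
    """Mirror of _id_from: first 4 digest bytes, masked to 31 bits."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") & 0x7FFFFFFF


body = "x" * 250
base = note_guid("Trinidad", "Jehová es uno", body)

# Editing text beyond the first 200 characters of the body keeps the GUID.
same = note_guid("Trinidad", "Jehová es uno", body[:200] + "edited tail")

# Editing the heading (or the first 200 characters) yields a different GUID,
# so Anki would treat the re-imported note as a new one.
different = note_guid("Trinidad", "Jehova es uno", body)
```

The practical consequence is that a heading typo fixed after the first import leaves the old note in the collection next to the corrected one.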
- Step 5: Commit
git add packages/jw-core/src/jw_core/exporters/anki.py packages/jw-core/tests/test_exporter_anki.py
git commit -m "feat(exporters): Anki exporter via genanki with stable GUIDs"
Task 8: CLI command jw export
Files:
- Create: packages/jw-cli/src/jw_cli/commands/export.py
- Create: packages/jw-cli/tests/test_export_command.py
- Modify: packages/jw-cli/src/jw_cli/main.py
- Step 1: Write the failing test
# packages/jw-cli/tests/test_export_command.py
"""End-to-end tests for `jw export`."""
from __future__ import annotations

import importlib.util
import json
from pathlib import Path

import pytest
from typer.testing import CliRunner

from jw_cli.main import app

RUNNER = CliRunner()


def _agent_result_json() -> dict:
    return {
        "query": "Es la Trinidad bíblica?",
        "agent_name": "apologetics",
        "warnings": [],
        "metadata": {"language": "es"},
        "findings": [
            {
                "summary": "Jehová es el único Dios verdadero.",
                "excerpt": "",
                "metadata": {},
                "citation": {
                    "url": "https://wol.jw.org/x",
                    "title": "Trinidad",
                    "kind": "article",
                    "metadata": {},
                },
            }
        ],
    }


def _write(tmp_path: Path) -> Path:
    p = tmp_path / "result.json"
    p.write_text(json.dumps(_agent_result_json()), encoding="utf-8")
    return p


def test_export_markdown_smoke(tmp_path: Path) -> None:
    src = _write(tmp_path)
    out = tmp_path / "demo.md"
    result = RUNNER.invoke(app, ["export", str(src), "--format", "markdown", "--out", str(out)])
    assert result.exit_code == 0, result.stdout
    assert out.exists()
    text = out.read_text(encoding="utf-8")
    assert "trinidad" in text.lower()


def test_export_unknown_format_fails(tmp_path: Path) -> None:
    src = _write(tmp_path)
    # Keep the output inside tmp_path so a buggy run cannot litter /tmp.
    out = tmp_path / "x"
    result = RUNNER.invoke(app, ["export", str(src), "--format", "bogus", "--out", str(out)])
    assert result.exit_code != 0


def test_export_missing_input_fails(tmp_path: Path) -> None:
    missing = tmp_path / "does-not-exist.json"
    out = tmp_path / "x.md"
    result = RUNNER.invoke(
        app, ["export", str(missing), "--format", "markdown", "--out", str(out)]
    )
    assert result.exit_code != 0


def test_export_title_override(tmp_path: Path) -> None:
    src = _write(tmp_path)
    out = tmp_path / "demo.md"
    result = RUNNER.invoke(
        app,
        ["export", str(src), "--format", "markdown", "--out", str(out), "--title", "MiHoja"],
    )
    assert result.exit_code == 0
    assert out.read_text(encoding="utf-8").startswith("# MiHoja")


@pytest.mark.skipif(
    importlib.util.find_spec("weasyprint") is None,
    reason="weasyprint not installed",
)
def test_export_pdf_smoke(tmp_path: Path) -> None:
    src = _write(tmp_path)
    out = tmp_path / "demo.pdf"
    result = RUNNER.invoke(app, ["export", str(src), "--format", "pdf", "--out", str(out)])
    assert result.exit_code == 0, result.stdout
    assert out.read_bytes()[:4] == b"%PDF"
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-cli/tests/test_export_command.py -v
Expected: FAIL — export command not registered.
- Step 3: Implement the command
# packages/jw-cli/src/jw_cli/commands/export.py
"""`jw export`: convert AgentResult JSON into markdown/pdf/docx/apkg."""
from __future__ import annotations

import json
import sys
from pathlib import Path
from typing import Annotated

import typer

from jw_core.exporters.errors import ExportError, MissingDependencyError
from jw_core.exporters.ir import StudySheet
from jw_core.exporters.markdown import export_markdown


def export_cmd(
    source: Annotated[
        str,
        typer.Argument(help="Path to a JSON file with AgentResult.to_dict(), or '-' for stdin."),
    ],
    format: Annotated[
        str,
        typer.Option("--format", "-f", help="Output format: markdown | pdf | docx | apkg"),
    ] = "markdown",
    out: Annotated[
        Path,
        typer.Option("--out", "-o", help="Output path."),
    ] = Path("out.md"),
    title: Annotated[
        str | None, typer.Option("--title", help="Override the sheet title.")
    ] = None,
    language: Annotated[
        str | None, typer.Option("--language", "-l", help="Override the sheet language.")
    ] = None,
    citation_style: Annotated[
        str,
        typer.Option(
            "--citation-style",
            help="inline-paren | footnote | bibliography",
        ),
    ] = "footnote",
    include_citations: Annotated[
        bool, typer.Option("--include-citations/--no-citations")
    ] = True,
    theme: Annotated[
        str, typer.Option("--theme", help="PDF theme: plain | study-sheet")
    ] = "study-sheet",
    per_citation_cards: Annotated[
        bool,
        typer.Option(
            "--per-citation-cards/--no-per-citation-cards",
            help="Anki: emit one extra card per citation.",
        ),
    ] = False,
) -> None:
    """Convert an AgentResult JSON into a printable study sheet or Anki deck."""
    # Load AgentResult JSON from stdin or from a file.
    if source == "-":
        raw, origin = sys.stdin.read(), "stdin"
    else:
        path = Path(source)
        if not path.exists():
            typer.secho(f"File not found: {path}", fg=typer.colors.RED, err=True)
            raise typer.Exit(code=2)
        raw, origin = path.read_text(encoding="utf-8"), str(path)

    # Invalid JSON is reported the same way for both input sources.
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        typer.secho(f"Invalid JSON in {origin}: {exc}", fg=typer.colors.RED, err=True)
        raise typer.Exit(code=2) from exc

    sheet = StudySheet.from_agent_result(
        payload,
        title=title,
        language=language,
        include_citations=include_citations,
    )

    # Exit codes: 2 = bad input or unknown format, 3 = missing extra, 4 = export failure.
    try:
        if format == "markdown":
            written = export_markdown(sheet, out=out, citation_style=citation_style)
        elif format == "pdf":
            from jw_core.exporters.pdf import export_pdf  # lazy

            written = export_pdf(sheet, out=out, theme=theme)  # type: ignore[arg-type]
        elif format == "docx":
            from jw_core.exporters.docx import export_docx

            written = export_docx(sheet, out=out)
        elif format == "apkg":
            from jw_core.exporters.anki import export_apkg

            written = export_apkg(sheet, out=out, per_citation_cards=per_citation_cards)
        else:
            typer.secho(
                f"Unknown format {format!r}. Use: markdown | pdf | docx | apkg",
                fg=typer.colors.RED,
                err=True,
            )
            raise typer.Exit(code=2)
    except MissingDependencyError as exc:
        typer.secho(str(exc), fg=typer.colors.RED, err=True)
        raise typer.Exit(code=3) from exc
    except ExportError as exc:
        typer.secho(f"Export failed: {exc}", fg=typer.colors.RED, err=True)
        raise typer.Exit(code=4) from exc

    typer.secho(f"Wrote {written} ({written.stat().st_size} bytes)", fg=typer.colors.GREEN)
- Step 4: Register in main.py
Edit packages/jw-cli/src/jw_cli/main.py:
- Add to the import block: from jw_cli.commands import export
- After the existing app.command(...) lines, add: app.command(name="export")(export.export_cmd)
- Step 5: Run test until green
Run: uv run pytest packages/jw-cli/tests/test_export_command.py -v
Expected: 5 passed, or 4 passed + 1 skipped (the PDF smoke test) if weasyprint is missing.
- Step 6: Smoke test the CLI
echo '{"query":"demo","agent_name":"apologetics","findings":[],"warnings":[],"metadata":{}}' \
| uv run jw export - --format markdown --out /tmp/demo.md
cat /tmp/demo.md
Expected: file printed with # demo header.
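The command reports failures through distinct exit codes (2, 3 and 4 in the implementation above), which matters once jw export is called from scripts. The sketch below summarizes that mapping. ExitCode and exit_code_for are hypothetical names used only for illustration, and the two exception classes are local stand-ins for the ones in jw_core.exporters.errors.

```python
from __future__ import annotations

from enum import IntEnum


class ExportError(Exception):
    """Local stand-in for jw_core.exporters.errors.ExportError."""


class MissingDependencyError(Exception):
    """Local stand-in for jw_core.exporters.errors.MissingDependencyError."""


class ExitCode(IntEnum):
    OK = 0
    BAD_INPUT = 2      # missing file, invalid JSON, unknown format
    MISSING_EXTRA = 3  # optional dependency not installed
    EXPORT_FAILED = 4  # exporter raised ExportError


def exit_code_for(exc: BaseException | None) -> ExitCode:
    """Map an outcome to the exit code the command is described as using."""
    if exc is None:
        return ExitCode.OK
    # Checked first, mirroring the order of the except clauses in export_cmd.
    if isinstance(exc, MissingDependencyError):
        return ExitCode.MISSING_EXTRA
    if isinstance(exc, ExportError):
        return ExitCode.EXPORT_FAILED
    # json.JSONDecodeError is a ValueError subclass.
    if isinstance(exc, (FileNotFoundError, ValueError)):
        return ExitCode.BAD_INPUT
    raise exc
```

A wrapper script could, for instance, treat code 3 as a prompt to install the matching extra and retry, while codes 2 and 4 indicate a problem with the input or the export itself.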
- Step 7: Commit
git add packages/jw-cli/src/jw_cli/commands/export.py packages/jw-cli/src/jw_cli/main.py packages/jw-cli/tests/test_export_command.py
git commit -m "feat(cli): jw export command for markdown/pdf/docx/apkg"
Task 9: MCP tool export_study_sheet
Files:
- Modify: packages/jw-mcp/src/jw_mcp/server.py
- Step 1: Register the tool
Find the section of server.py that registers existing @app.tool() handlers and append:
from pathlib import Path
from typing import Any, Literal

from jw_core.exporters.errors import ExportError, MissingDependencyError
from jw_core.exporters.ir import StudySheet
from jw_core.exporters.markdown import export_markdown


@app.tool()
def export_study_sheet(
    agent_result: dict[str, Any],
    format: Literal["markdown", "pdf", "docx", "apkg"],
    out_path: str,
    title: str | None = None,
    language: str | None = None,
    citation_style: Literal["inline-paren", "footnote", "bibliography"] = "footnote",
    include_citations: bool = True,
    theme: Literal["plain", "study-sheet"] = "study-sheet",
    per_citation_cards: bool = False,
) -> dict[str, Any]:
    """Convert an AgentResult dict into a printable study sheet (md/pdf/docx/apkg).

    Returns {"out": str, "format": str, "bytes_written": int} on success,
    or {"error": "..."} on failure.
    """
    sheet = StudySheet.from_agent_result(
        agent_result,
        title=title,
        language=language,
        include_citations=include_citations,
    )
    out = Path(out_path).expanduser()

    # Errors are returned as data instead of raised, so the MCP client always gets a dict.
    try:
        if format == "markdown":
            written = export_markdown(sheet, out=out, citation_style=citation_style)
        elif format == "pdf":
            from jw_core.exporters.pdf import export_pdf

            written = export_pdf(sheet, out=out, theme=theme)
        elif format == "docx":
            from jw_core.exporters.docx import export_docx

            written = export_docx(sheet, out=out)
        elif format == "apkg":
            from jw_core.exporters.anki import export_apkg

            written = export_apkg(sheet, out=out, per_citation_cards=per_citation_cards)
        else:
            return {"error": f"unknown format {format!r}"}
    except MissingDependencyError as exc:
        return {"error": str(exc)}
    except ExportError as exc:
        return {"error": f"export failed: {exc}"}

    return {
        "out": str(written),
        "format": format,
        "bytes_written": written.stat().st_size,
    }
(The function needs Any, Literal and Path. If any of them are already imported at the top of server.py, skip the redundant imports and add only the missing ones alongside the existing import block.)
- Step 2: Smoke-test the tool registration
Run:
uv run python -c "
from jw_mcp.server import app
tools = [t.name for t in app._tools.values()] if hasattr(app, '_tools') else []
print('Has export_study_sheet:', 'export_study_sheet' in tools)
"
(The exact FastMCP introspection may vary; alternatively start the server and list tools via the MCP protocol.)
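Because the tool returns errors as data rather than raising, callers have to branch on the shape of the result. A minimal sketch of that branching, based only on the two return shapes documented in the tool's docstring; summarize_result is a hypothetical helper name:

```python
from typing import Any


def summarize_result(result: dict[str, Any]) -> str:
    """Turn an export_study_sheet result dict into a one-line status message."""
    if "error" in result:
        # Failure shape: {"error": "..."}
        return f"export failed: {result['error']}"
    # Success shape: {"out": str, "format": str, "bytes_written": int}
    return (
        f"wrote {result['bytes_written']} bytes "
        f"to {result['out']} ({result['format']})"
    )
```

Checking for the "error" key first keeps the success branch free of defensive lookups.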
- Step 3: Commit
git add packages/jw-mcp/src/jw_mcp/server.py
git commit -m "feat(mcp): export_study_sheet tool wrapping the exporters module"
Task 10: Documentation, ROADMAP, VISION_AUDIT
Files:
- Create: docs/guias/exportador-hoja-de-estudio.md
- Modify: docs/ROADMAP.md
- Modify: docs/VISION_AUDIT.md
- Modify: docs/README.md
- Step 1: Write the user guide
# Exportador de hoja de estudio (PDF / DOCX / Anki / Markdown)
> Fase 31 — convierte cualquier `AgentResult` en un entregable imprimible o
> un mazo Anki de repaso espaciado. Markdown siempre disponible; los demás
> formatos son opt-in vía extras.
## Instalación
```bash
# baseline (markdown siempre)
uv sync --all-packages
# con extras opcionales
uv pip install 'jw-core[pdf]' # WeasyPrint
uv pip install 'jw-core[docx]' # python-docx
uv pip install 'jw-core[anki]'   # genanki
```
WeasyPrint requiere librerías nativas (cairo, pango). Ver https://doc.courtbouillon.org/weasyprint/stable/first_steps.html para instrucciones por plataforma.
Uso (CLI)
# 1) Generar el AgentResult
uv run jw apologetics "Trinidad" --json > /tmp/trinity.json
# 2) Convertir
uv run jw export /tmp/trinity.json --format markdown --out hoja.md
uv run jw export /tmp/trinity.json --format pdf --out hoja.pdf --theme study-sheet
uv run jw export /tmp/trinity.json --format docx --out hoja.docx
uv run jw export /tmp/trinity.json --format apkg --out mazo.apkg --per-citation-cards
Pipeline en una sola línea:
uv run jw apologetics "Trinidad" --json | uv run jw export - -f pdf -o /tmp/x.pdf
Estilos de cita
- --citation-style inline-paren: citas entre paréntesis dentro del cuerpo.
- --citation-style footnote (default): marcadores [^1] con definiciones al final.
- --citation-style bibliography: cuerpo limpio + lista de fuentes al final.
Plantillas personalizadas
Coloca un Jinja2 con el mismo nombre que un template built-in en
~/.jw-agent-toolkit/templates/ para sobrescribirlo:
~/.jw-agent-toolkit/templates/study-sheet.html.j2
El resolver siempre prefiere la versión del usuario.
Anki — re-export idempotente
El GUID de cada tarjeta deriva de sha256(title + heading + body[:200]).
Re-exportar el mismo AgentResult y reimportar el .apkg en Anki:
actualiza las notas existentes, no duplica.
MCP
{
"tool": "export_study_sheet",
"arguments": {
"agent_result": { ... },
"format": "pdf",
"out_path": "~/Documents/hoja.pdf",
"theme": "study-sheet",
"citation_style": "footnote"
}
}
Devuelve {"out": "...", "format": "...", "bytes_written": N} o {"error": "..."}.
Diseño
Una IR única (StudySheet) intermedia. Cuatro exporters consumen la IR; nunca un
AgentResult directamente. Las dependencias pesadas se importan lazy, así que
importar jw_core.exporters nunca falla aunque falten los extras.
- [ ] **Step 2: Update ROADMAP and VISION_AUDIT**
Edit `docs/ROADMAP.md`:
- Append a section "## Fase 31 — Exportador hoja de estudio (PDF / DOCX / Anki)" with a one-paragraph summary and a link to the spec.
Edit `docs/VISION_AUDIT.md`:
- Locate the row for item `#11` (or the most semantically close to "exportador"). Mark its status as ✅ implemented in Fase 31 and add the path `jw_core.exporters`.
Edit `docs/README.md`:
- Add a bullet under the "Guías" section linking to `guias/exportador-hoja-de-estudio.md`.
- [ ] **Step 3: Commit**
```bash
git add docs/guias/exportador-hoja-de-estudio.md docs/ROADMAP.md docs/VISION_AUDIT.md docs/README.md
git commit -m "docs(fase-31): exporter user guide + roadmap + vision audit"
```
Task 11: Full regression
- Step 1: Run the entire test suite
uv run pytest -q
Expected: every previous test still green; new tests added (≈45 new tests).
- Step 2: Check no module imports fail without extras
uv run python -c "
import jw_core.exporters
from jw_core.exporters import StudySheet
from jw_core.exporters.markdown import export_markdown
print('jw_core.exporters imports cleanly without extras.')
"
Expected: clean import.
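Alongside the clean-import check, it can be handy to see which optional extras are actually present in the current environment. The sketch below reuses the importlib.util.find_spec probe that the test modules already rely on. The module names come from those tests; EXTRA_MODULES and available_extras are hypothetical names.

```python
import importlib.util

# Extra name -> top-level module that the corresponding exporter imports lazily.
EXTRA_MODULES = {"pdf": "weasyprint", "docx": "docx", "anki": "genanki"}


def available_extras() -> dict[str, bool]:
    """Report which optional exporter dependencies can be imported."""
    return {
        extra: importlib.util.find_spec(module) is not None
        for extra, module in EXTRA_MODULES.items()
    }


if __name__ == "__main__":
    for extra, present in available_extras().items():
        status = "installed" if present else "missing"
        print(f"jw-core[{extra}]: {status}")
```

find_spec only locates the module without importing it, so the probe stays cheap even for a heavy dependency such as WeasyPrint.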
- Step 3: Lint / format
uv run ruff check packages/jw-core/src/jw_core/exporters packages/jw-cli/src/jw_cli/commands/export.py
uv run ruff format packages/jw-core/src/jw_core/exporters packages/jw-cli/src/jw_cli/commands/export.py
Expected: no lint errors, no diff after format (or diff applied).
- Step 4: Type-check (if mypy / pyright configured)
uv run mypy packages/jw-core/src/jw_core/exporters 2>&1 || true
Expected: no new errors (lazy imports may yield “module not installed” — acceptable when extras are absent).
- Step 5: Final commit if anything changed
git status
# if anything pending after lint/format:
git add -A
git commit -m "chore(fase-31): lint and format pass"
Self-review
- ✅ No LLM in critical path: every exporter is deterministic, no model calls.
- ✅ Citations verifiable: URL is preserved verbatim from Finding.citation.url. All exporters render URL as hyperlink.
- ✅ Local-first: all output paths are local. No telemetry, no network.
- ✅ No network in tests: every test uses synthetic StudySheets. WeasyPrint reads only the inline HTML string.
- ✅ en/es/pt: StudySheet.language propagates to <html lang="">. CLI accepts --language.
- ✅ Spanish prose, English identifiers: docstrings/comments in English (matching the rest of the codebase), user-facing copy and the guide in Spanish.
- ✅ GPL-3.0 / Hatchling / src layout / Python 3.13: respected throughout.
- ✅ Single conversion AgentResult → StudySheet: only in ir.from_agent_result.
- ✅ Stable Anki GUIDs: sha256-derived; re-export updates instead of duplicating.
- ✅ Pluggable templates: user override at ~/.jw-agent-toolkit/templates/ wins.
- ✅ Optional extras are truly optional: importing jw_core.exporters without [pdf]/[docx]/[anki] never errors; each exporter raises MissingDependencyError with copy-pasteable install hint.
Edge cases covered
- Empty findings → one placeholder section.
- Long query → title truncated.
- HTML injection in body → escaped by Jinja2 autoescape=True and by markdown escape.
- Citation with no title → short_label fallback to last URL segment.
- Re-export same content → identical GUIDs and deck_id (proven by test).
- Re-export with content changed → different GUIDs (proven by test).
- Bad template name → ExportError with both lookup paths in the message.
- Missing extra → MissingDependencyError with install hint.
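The citation-label fallback listed above could look roughly like the following sketch. fallback_label is a hypothetical helper, and the precedence order (short_label, then title, then last URL segment) is an assumption inferred from how the exporters pick labels; the real logic lives in the IR module and may differ.

```python
from urllib.parse import urlsplit


def fallback_label(url: str, title: str = "", short_label: str = "") -> str:
    """Pick a display label for a citation, falling back to the URL's last segment."""
    if short_label:
        return short_label
    if title:
        return title
    # Drop a trailing slash, then keep what follows the last remaining "/".
    path = urlsplit(url).path.rstrip("/")
    segment = path.rsplit("/", 1)[-1]
    # A bare domain has no path segment; fall back to the full URL.
    return segment or url
```

For a URL such as https://wol.jw.org/x with no title, this yields the label "x".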
Execution choice
This plan is structured for superpowers:executing-plans (one developer, sequential). Each task is independently committable; the test suite is green at every commit. For subagent-driven-development, Tasks 5, 6 and 7 (the three optional exporters) can be dispatched in parallel after Task 4 (templates) lands — they share no state beyond the IR.
Recommended sequence:
- Solo execution: Tasks 1 → 2 → 3 → 4.
- Optional parallelization: Tasks 5, 6, 7 in parallel.
- Solo execution: Tasks 8 → 9 → 10 → 11.
Open questions
None blocking. Two non-blocking calls to make during implementation:
- PDF font fallback for non-Latin scripts: ship Noto Sans CJK inside the package, or document the install? Decision deferred — start with system fallback, revisit if a user files an issue.
- Anki model evolution: if we want to add a third field later (e.g. “Source”), we’ll need a migration plan because the model ID is derived from the model name. Out of scope for v1.