Independent project. Not affiliated with, sponsored by, or endorsed by the Watch Tower Bible and Tract Society or Jehovah's Witnesses.
jw-agent-toolkit

Release F57-F66 · 13 phases in a single continuous session

Ten packages, 2.6k tests, 110 MCP tools

Thirteen phases that take the project from "jw-core multilingual complete" to "toolkit with two new packages and live-meeting shipped". F57 ships the clean-room jw-meeting-media (a WOL HTML parser designed from DevTools inspection, plus a Tauri presenter). F58 adds the JW-pure Bible Knowledge Graph (607 B.C.E., NOT 587/586) to jw-brain. F61 lands opt-in persistent memory with optional Fernet encryption. F62 adds external loaders (historical Watchtowers with marker, Office docs with markitdown). F64 adds whisperX with diarization. F66 exposes the brain via MCP and fixes the _EXPECTED_TOOLS drift. Seven MVP+1 follow-ups complete the release (drag-drop, external monitor, multi-congregation, geocoords, headwords, auto-recap, speaker-name mapping).

Phases: 13 · New tests: 237+ · Commits: 72+

Phase 57

jw-meeting-media (clean-room)

Live-meeting layer with a WOL HTML parser designed from DevTools

✅ Shipped 🧪 48 tests ⊜ 13 commits T1 New layer + biblical KG
Technical guide →

New subpackage that delivers the missing 'live-meeting' layer: automatic discovery of the weekly mwb/w program, media downloads, and a Tauri presenter. Strict clean-room implementation: no AGPL-3.0 M³ source files (.ts/.vue) were opened. The HTML parser was designed by inspecting the real WOL DOM.

What shipped

  • MeetingProgramClient + BeautifulSoup parser over semantic WOL HTML.
  • MediaResolver wraps PubMediaClient for video/audio refs.
  • Idempotent Downloader by sha256 with <root>/<lang>/<year>/<week>/ cache.
  • MeetingStorage sqlite + Thumbnailer (Pillow + ffmpeg).
  • PresenterManager FSM with multi-session in-memory.
  • CLI: jw meeting {discover, download, list}.
  • REST: /presenter/sessions/{sid}/{state, play, pause, next, prev, stop}.
  • Tauri 2.x window 'presenter' vanilla JS with keyboard shortcuts.
  • 4 MCP tools meeting_*.
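
The idempotent download step can be illustrated with a minimal sketch. It assumes the cache layout quoted above; the function names, the unpadded week number, and the "skip when the cached bytes hash the same" rule are assumptions made for illustration, not the package's actual API.

```python
import hashlib
from pathlib import Path


def cache_path(root: Path, lang: str, year: int, week: int, filename: str) -> Path:
    """Build <root>/<lang>/<year>/<week>/<filename> (week formatting is assumed)."""
    return root / lang / str(year) / str(week) / filename


def sha256_of(data: bytes) -> str:
    """Hex digest used as the identity of a media file."""
    return hashlib.sha256(data).hexdigest()


def store_idempotent(target: Path, data: bytes) -> bool:
    """Write data unless an identical file is already cached.

    Returns True when bytes were written and False when the write was skipped.
    """
    if target.exists() and sha256_of(target.read_bytes()) == sha256_of(data):
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return True
```

Running the same download twice then touches the disk only once, which is the property the sha256 check is meant to provide.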

Phase 58

bible-knowledge-graph (JW-pure)

Biblical KG from Insight + NWT, NOT academic theographic

✅ Shipped 🧪 39 tests ⊜ 12 commits T1 New layer + biblical KG
Technical guide →

Builds a biblical knowledge graph in jw-brain from pure JW sources (Insight on the Scriptures + NWT/NWTsty). Schema extended with Period and Passage. Strict JW chronology: 607 B.C.E. for the destruction of Jerusalem triple-anchored in code, comments and guide. Watch Tower attribution visible in the guide.

What shipped

  • Extended TJ schema: Period + Passage nodes + 5 temporal edges.
  • 10 JW chronology periods (607 B.C.E. for Jerusalem, NOT 587/586).
  • Procedural BibleLoader (NO LLM): import_periods + import_insight(jwpub).
  • Initial MVP PERSON_HEADWORDS (28) + PLACE_HEADWORDS (16).
  • Port BibleRef.from_wol_url to Python (parity with jw-core-js F56.5).
  • Synthetic fixture it_mini.jwpub reusing JwpubBuilder (F50).
  • CLI: jw brain import-bible --insight <jwpub> --symbol it --meps-language.
  • Helper DuckDBBackend.query_persons_in_book (E2E: Abraham in Genesis ✓).
  • Watch Tower Bible and Tract Society of Pennsylvania attribution.
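
One way to anchor a chronology date in code is a frozen catalog entry plus a guard that rejects the excluded years. The sketch below is illustrative only: the `Period` fields, the slug, the negative-year convention for B.C.E., and `check_anchor` are assumptions, not the jw-brain schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    slug: str          # hypothetical identifier
    label: str
    anchor_year: int   # negative means B.C.E. (convention assumed here)


# The release anchors the destruction of Jerusalem at 607 B.C.E.
JERUSALEM_DESTROYED = Period(
    slug="destruction-of-jerusalem",
    label="Destruction of Jerusalem",
    anchor_year=-607,
)

# Years the catalog must never use for this event.
REJECTED_YEARS = frozenset({-587, -586})


def check_anchor(period: Period) -> None:
    """Fail loudly if a catalog entry drifts to a rejected date."""
    if period.anchor_year in REJECTED_YEARS:
        raise ValueError(f"{period.slug}: {period.anchor_year} is not the catalog date")


def format_year(year: int) -> str:
    """Render a signed year with the B.C.E./C.E. suffix."""
    return f"{-year} B.C.E." if year < 0 else f"{year} C.E."
```

A test that calls `check_anchor` on every catalog entry would give one more place where the date is pinned.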

Phase 61

letta-memory (opt-in)

Persistent conversational memory with opt-in Fernet

✅ Shipped 🧪 31 tests ⊜ 7 commits T2 Memory, loaders, diarized ASR
Technical guide →

Module jw_agents.memory with MemoryStore Protocol + 3 backends (FakeMemoryStore, default SqliteMemoryStore with opt-in Fernet via JW_MEMORY_KEY, opt-in LettaMemoryStore for multi-device). Wire-up in conversation_assistant preserves compatibility (memory=None → legacy behavior). Pattern inherited from F25 RevisitStore.

What shipped

  • MemoryStore Protocol + MemoryRecord frozen dataclass.
  • FakeMemoryStore in-memory (default tests).
  • SqliteMemoryStore at ~/.jw-agent-toolkit/memory.db.
  • Opt-in Fernet via env JW_MEMORY_KEY (F25 precedent).
  • Opt-in LettaMemoryStore with letta-client (memory-letta extra).
  • build_memory_store() env-driven factory.
  • Wire-up conversation_assistant with memory: MemoryStore | None.
  • MCP tools: memory_record, memory_recall, memory_forget_session.
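
The Protocol-plus-backends pattern and the `memory=None` compatibility rule can be sketched as follows. The record fields, method names, and the `answer` stand-in for conversation_assistant are assumptions; encryption with Fernet is left out so the example needs only the standard library.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class MemoryRecord:
    # Field names are assumptions for this sketch.
    session_id: str
    kind: str
    text: str
    timestamp: float


class MemoryStore(Protocol):
    def record(self, rec: MemoryRecord) -> None: ...
    def recall(self, session_id: str) -> list[MemoryRecord]: ...
    def forget_session(self, session_id: str) -> int: ...


class FakeMemoryStore:
    """In-memory backend, suitable for tests."""

    def __init__(self) -> None:
        self._records: list[MemoryRecord] = []

    def record(self, rec: MemoryRecord) -> None:
        self._records.append(rec)

    def recall(self, session_id: str) -> list[MemoryRecord]:
        return [r for r in self._records if r.session_id == session_id]

    def forget_session(self, session_id: str) -> int:
        """Drop every record of a session and return how many were removed."""
        before = len(self._records)
        self._records = [r for r in self._records if r.session_id != session_id]
        return before - len(self._records)


def answer(question: str, memory: MemoryStore | None = None) -> str:
    """Hypothetical stand-in for the assistant: memory=None keeps legacy behavior."""
    if memory is not None:
        memory.record(MemoryRecord("s1", "question", question, 0.0))
    return f"echo: {question}"
```

Because `FakeMemoryStore` satisfies the Protocol structurally, callers never need to import a concrete backend.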

Phase 62

marker-markitdown (loaders)

OCR historical Watchtowers + brotherhood Office docs

✅ Shipped 🧪 9 tests ⊜ 8 commits T2 Memory, loaders, diarized ASR
Technical guide →

Two new loaders in jw-rag: marker for historical PDFs (scanned pre-EPUB Watchtowers/Awake) and markitdown for Office docs shared in the brotherhood (.docx/.pptx/.xlsx). Both opt-in via extras. sha256 idempotency + automatic JW signature detection (Watch Tower, jw.org, Watchtower).

What shipped

  • pdf_marker.ingest_pdf with marker-pdf (Apache-2.0).
  • docs_markitdown.ingest_office_doc for .docx/.pptx/.xlsx (MIT).
  • sha256 idempotency: source_id pdf:<hash8> and doc:<ext>:<hash8>.
  • JW signature regex: watch tower|jw.org|atalaya|kingdom hall|...
  • metadata.is_jw=True when signature matches.
  • GPU/LLM opt-in: env JW_MARKER_USE_GPU + JW_MARKER_USE_LLM.
  • pyproject extras: [pdf-marker], [doc-markitdown], [loaders-all].
  • MCP tools ingest_pdf + ingest_office_doc; CLI jw rag ingest-pdf|ingest-office.
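
The source_id scheme and the signature check can be sketched like this. The pattern below includes only the alternatives spelled out above (the real list is longer), and treating hash8 as the first eight hex characters of the sha256 digest is an assumption, as are the function names.

```python
import hashlib
import re
from pathlib import Path

# Partial pattern: only the alternatives named in the release notes.
JW_SIGNATURE = re.compile(r"watch tower|jw\.org|atalaya|kingdom hall", re.IGNORECASE)


def source_id_for(path: Path, data: bytes) -> str:
    """Return pdf:<hash8> for PDFs and doc:<ext>:<hash8> for other documents."""
    hash8 = hashlib.sha256(data).hexdigest()[:8]
    ext = path.suffix.lower().lstrip(".")
    if ext == "pdf":
        return f"pdf:{hash8}"
    return f"doc:{ext}:{hash8}"


def build_metadata(text: str) -> dict:
    """Flag the document as JW material when the signature pattern matches."""
    return {"is_jw": bool(JW_SIGNATURE.search(text))}
```

Since the id depends only on file contents, ingesting the same file from two different paths yields the same source_id.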

Phase 64

whisperX-asr (diarization)

Word-level timestamps + speaker diarization

✅ Shipped 🧪 16 tests ⊜ 6 commits T2 Memory, loaders, diarized ASR
Technical guide →

WhisperX provider (BSD-4), opt-in via the [asr-whisperx] extras. It is NOT added to DEFAULT_ASR_CHAIN (3 GB models); select it explicitly with JW_ASR_PROVIDER=whisperx or name='whisperx'. transcribe_diarized returns a DiarizedResult (extends TranscriptionResult) with a per-segment speaker_id, plus optional enrich_with_bible_refs using parse_reference. The pyannote diarization model requires HF_TOKEN.

What shipped

  • DiarizedSegment(speaker_id, bible_refs) extends TranscriptionSegment.
  • DiarizedResult(speaker_count) extends TranscriptionResult.
  • WhisperXProvider with is_available + transcribe + transcribe_diarized.
  • Per-segment BibleRef enrichment via parse_all_references.
  • WhisperXDiarizationError when HF_TOKEN is missing.
  • Lazy model cache (_asr_model, _align_model, _diarize_model).
  • CLI: jw audio transcribe --diarize --bible-refs.
  • MCP tool: transcribe_audio_diarized.
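
A rough sketch of how the diarized segment type and the HF_TOKEN check could fit together. The base-class fields, the defaults, and the `require_hf_token` helper are assumptions made for illustration; no whisperX call is made here.

```python
from __future__ import annotations

import os
from dataclasses import dataclass, field


class WhisperXDiarizationError(RuntimeError):
    """Raised when diarization cannot run, for example when HF_TOKEN is missing."""


@dataclass
class TranscriptionSegment:
    # Base fields are assumptions for this sketch.
    start: float
    end: float
    text: str


@dataclass
class DiarizedSegment(TranscriptionSegment):
    # New fields carry defaults so existing positional construction still works.
    speaker_id: str | None = None
    bible_refs: list[str] = field(default_factory=list)


def require_hf_token(env=None) -> str:
    """Return the Hugging Face token or raise before any model is loaded."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "")
    if not token:
        raise WhisperXDiarizationError("HF_TOKEN is required for the diarization model")
    return token
```

Checking the token up front keeps the failure cheap: the error surfaces before multi-gigabyte models are loaded.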

Phase 66

mcp-jw-brain

Brain exposed via MCP + _EXPECTED_TOOLS drift fix

✅ Shipped 🧪 5 tests ⊜ 5 commits T3 Brain MCP + drift fix
Technical guide →

Exposes second-brain operations as MCP tools. F49 had already landed the wrappers (second_brain_status/compile/query/lint/snapshot, each with a brain_path: str signature), so F66 covers only Task 4 (E2E tests) and Task 5 (docs). It also fixes the pre-existing _EXPECTED_TOOLS drift (get_trace from F43 and translate_preserving_refs from F54 were missing).

What shipped

  • Audit: the 5 second_brain_* tools were already registered (F49).
  • E2E tests with DuckDB temp brain + monkeypatched sandbox registry.
  • Fix _EXPECTED_TOOLS drift: get_trace + translate_preserving_refs.
  • Doc 'Phase 66 — Second Brain tools' in docs/referencia/jw-mcp.md.
  • Degraded mode: dict {'error': '...'} if brain not configured.
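
The drift check and the degraded-mode return can be sketched as below. `tool_drift`, the result keys, the error message text, and this simplified `second_brain_status` body are assumptions, not the jw-mcp implementation.

```python
from __future__ import annotations


def tool_drift(expected: set[str], registered: set[str]) -> dict[str, list[str]]:
    """Compare a hand-maintained expected list with the tools actually registered."""
    return {
        "missing_from_expected": sorted(registered - expected),
        "not_registered": sorted(expected - registered),
    }


def second_brain_status(brain_path: str) -> dict:
    """Degraded mode: return an error dict instead of raising."""
    if not brain_path:
        return {"error": "brain not configured"}
    return {"brain_path": brain_path, "ok": True}
```

A test asserting that both lists are empty would catch the kind of drift described above as soon as a new tool is registered without updating the expected set.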

Phase 57.14

F57.14 · drag-and-drop UI

Sidebar with queue + reorder + dropzone

✅ Shipped 🧪 17 tests ⊜ 4 commits T4 MVP+1 follow-ups
Technical guide →

Tauri presenter window with a side UI: the sidebar lists all queue items with thumbnails. Native HTML5 drag-drop handles reordering with smart cursor follow: if you drag the active item, the cursor follows it; if a move shifts the range around the active item, the cursor index adjusts accordingly. A dropzone adds external files from the system. Vanilla JS without libraries.

What shipped

  • PresenterManager: reorder(from, to), add_item(item), jump_to(index).
  • REST endpoints /presenter/sessions/{sid}/{reorder, add, jump}.
  • Sidebar HTML/CSS: queue-panel with draggable queue-list + dropzone.
  • JS handlers: dragstart/dragover/drop + cursor follow logic.
  • renderQueue() with hash to avoid DOM rebuild on each poll.
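
The cursor-follow rule can be written down as a small pure function. This is one plausible reading of the behavior described above, not the PresenterManager code; the function signature and the tuple return are assumptions.

```python
from __future__ import annotations


def reorder(queue: list[str], cursor: int, src: int, dst: int) -> tuple[list[str], int]:
    """Move queue[src] to index dst and return (new_queue, new_cursor).

    The cursor keeps pointing at the same item it pointed at before the move.
    """
    if not (0 <= src < len(queue) and 0 <= dst < len(queue)):
        raise IndexError("src/dst out of range")
    items = list(queue)
    items.insert(dst, items.pop(src))
    if cursor == src:
        new_cursor = dst            # the active item itself was dragged
    elif src < cursor <= dst:
        new_cursor = cursor - 1     # an item above the cursor moved below it
    elif dst <= cursor < src:
        new_cursor = cursor + 1     # an item below the cursor moved above it
    else:
        new_cursor = cursor         # the move did not cross the cursor
    return items, new_cursor
```

Keeping the rule free of side effects makes it easy to cover the three crossing cases with unit tests.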

Phase 57.15

F57.15 · automatic external monitor

Tauri commands list_monitors + move_presenter_to_monitor

✅ Shipped ⊜ 3 commits T4 MVP+1 follow-ups
Technical guide →

External monitor detection, move-to-monitor, and fullscreen via the Tauri 2.x window API. The sidebar has a dropdown selector (🖥 icon) and a fullscreen checkbox; clicking outside closes the menu. Graceful fallback: with only 1 monitor, or if detection fails, the buttons stay greyed out without crashing.

What shipped

  • Rust: list_monitors() → Vec<MonitorInfo> with name/width/height/x/y/scale/is_primary.
  • Rust: move_presenter_to_monitor(name, fullscreen) repositions + fullscreen + focus.
  • JS: refreshMonitorList() invokes tauri.core.invoke('list_monitors').
  • UI: dropdown #monitor-menu with #monitor-list + #fullscreen-checkbox.
  • Hide selector when !window.__TAURI__ (vite dev mode).

Phase 57.16

F57.16 · multi-congregation

TOML registry + CLI subcommands + --congregation flag

✅ Shipped 🧪 22 tests ⊜ 4 commits T4 MVP+1 follow-ups
Technical guide →

Support for multiple simultaneous congregations. The registry lives at ~/.jw-agent-toolkit/meetings/congregations.toml. Each congregation has its own storage, namespaced under <root>/<name>/. Resolution rules: name → lookup; no name + 1 entry → auto; no name + multiple → ValueError. Backwards-compat: no registry → Congregation('default'), which uses the legacy cache root with no migration needed.

What shipped

  • Congregation dataclass + load_registry + save_congregation + remove.
  • resolve_congregation with 4 rules including backwards-compat.
  • CLI: jw meeting congregation {add, list, remove}.
  • Flag --congregation/-c in discover/download/list.
  • Factored _cache_root_for(name).
  • MCP tools meeting_list_congregations + meeting_add_congregation.
  • Optional congregation param in existing meeting_* tools.
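
The four resolution rules can be sketched as a single function. The dict-shaped registry, the one-field `Congregation`, the order in which the rules are checked, and the ValueError on an unknown name are assumptions for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Congregation:
    name: str


def resolve_congregation(
    registry: dict[str, Congregation], name: str | None = None
) -> Congregation:
    """Apply the lookup, default, auto, and ambiguity rules in that order."""
    if name is not None:
        # Rule 1: an explicit name is looked up (unknown names raise; assumed).
        if name not in registry:
            raise ValueError(f"unknown congregation: {name}")
        return registry[name]
    if not registry:
        # Rule 4: no registry falls back to the legacy default.
        return Congregation("default")
    if len(registry) == 1:
        # Rule 2: a single entry is selected automatically.
        return next(iter(registry.values()))
    # Rule 3: several entries and no name is ambiguous.
    raise ValueError("multiple congregations registered; pass --congregation")
```

Raising on ambiguity rather than picking the first entry avoids writing one congregation's media into another's cache.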

Phase 58.13

F58.13 · place geocoords

16 curated places with lat/lon + region + modern_name + eras

✅ Shipped 🧪 7 tests ⊜ 2 commits T4 MVP+1 follow-ups
Technical guide →

Curated catalog analogous to period_catalog. PlaceGeoData with slug/region/modern_name/latitude/longitude/eras_active. 16 main places: Jerusalem (31.78N, 35.24E, Judea, modern='Jerusalem/Al-Quds'), Babylon (32.54N, 44.42E, modern='Hillah, Iraq'), Rome, Athens, Ephesus, Bethlehem, Nazareth, etc. BibleLoader.import_insight enriches automatically.

What shipped

  • PlaceGeoData frozen dataclass.
  • ALL_PLACES tuple with 16 entries.
  • get_place_geodata(slug) → PlaceGeoData | None.
  • BibleLoader._process_entry place branch enriches with geodata.
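
A minimal sketch of the catalog shape, using the field names listed above and the Jerusalem values quoted in this section. The slug spelling, the field types, and the empty eras_active are assumptions; the other 15 entries are not reproduced.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceGeoData:
    slug: str
    region: str
    modern_name: str
    latitude: float
    longitude: float
    eras_active: tuple[str, ...] = ()  # values not given in the notes


# Single illustrative entry; the real ALL_PLACES tuple has 16.
ALL_PLACES: tuple[PlaceGeoData, ...] = (
    PlaceGeoData(
        slug="jerusalem",
        region="Judea",
        modern_name="Jerusalem/Al-Quds",
        latitude=31.78,
        longitude=35.24,
    ),
)

_BY_SLUG = {place.slug: place for place in ALL_PLACES}


def get_place_geodata(slug: str) -> PlaceGeoData | None:
    """Return curated geodata for a slug, or None when the place is not curated."""
    return _BY_SLUG.get(slug)
```

Returning None for unknown slugs lets the loader enrich a place node when data exists and leave it untouched otherwise.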

Phase 58.14

F58.14 · headwords expansion

250 persons + 150 places canon × ES+EN + audit CLI

✅ Shipped 🧪 27 tests ⊜ 3 commits T4 MVP+1 follow-ups
Technical guide →

Built-in catalog expanded to 475 person entries (~250 canon figures × ES+EN variants) and 259 place entries (~150 places × ES+EN). headword_extractor is audit-only: jw brain learn-headwords --insight <jwpub> extracts headwords from the user's JWPUB and persists them LOCALLY in <brain>/extracted_headwords.json; they are NOT redistributed. It reports the % coverage of the built-in catalog.

What shipped

  • EXPANDED_PERSON_HEADWORDS frozenset 475 entries.
  • EXPANDED_PLACE_HEADWORDS frozenset 259 entries.
  • Zero person/place overlap verified (amón king vs ammon land).
  • classify_entry_kind: union built-in + expanded.
  • extract_headwords_from_jwpub + persist_to_brain + load_extracted_headwords.
  • CLI jw brain learn-headwords with coverage stats.
  • Legal catalog: only factual public names of the biblical canon.
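
The coverage statistic and the local persistence step might look like the sketch below. It reads "coverage" as the share of built-in headwords that also appear in the extracted set, which is an assumption, as are the function signatures, the casefold normalization, and the JSON layout (a sorted list). The three sample headwords are taken from examples mentioned in these notes.

```python
from __future__ import annotations

import json
from pathlib import Path

# Tiny samples standing in for the 475-entry and 259-entry catalogs.
PERSON_SAMPLE = frozenset({"abraham", "amón"})
PLACE_SAMPLE = frozenset({"ammon"})


def coverage_percent(builtin: frozenset[str], extracted: set[str]) -> float:
    """Percentage of built-in headwords found among the extracted ones."""
    if not builtin:
        return 0.0
    normalized = {h.strip().casefold() for h in extracted}
    hits = sum(1 for h in builtin if h.casefold() in normalized)
    return 100.0 * hits / len(builtin)


def persist_to_brain(brain_dir: Path, headwords: set[str]) -> Path:
    """Write extracted headwords to <brain>/extracted_headwords.json (local only)."""
    target = brain_dir / "extracted_headwords.json"
    payload = json.dumps(sorted(headwords), ensure_ascii=False, indent=2)
    target.write_text(payload, encoding="utf-8")
    return target
```

The accent in "amón" is what keeps the person and place samples disjoint, which mirrors the zero-overlap check listed above.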

Phase 61.8

F61.8 · auto-recap

Procedural cross-session memory summarization

✅ Shipped 🧪 5 tests ⊜ 3 commits T4 MVP+1 follow-ups
Technical guide →

New agent recap_previous_session does NOT use an LLM (architectural decision). Groups MemoryStore records by session_id, sorts by timestamp desc, returns findings with short summary + excerpts_by_kind in metadata. Useful when starting a new conversation: 'let's continue with yesterday's session X'.

What shipped

  • Procedural agent recap_session.recap_previous_session().
  • Filters out the current session; sorts by last_timestamp desc.
  • Configurable max_excerpts_per_kind.
  • MCP tool recap_previous_session.
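
The procedural recap can be sketched without any model call. Records are plain dicts here to keep the example self-contained; the key names, the default of 3 excerpts, and the summary wording are assumptions.

```python
from __future__ import annotations

from collections import defaultdict


def recap_previous_session(
    records: list[dict], current_session_id: str, max_excerpts_per_kind: int = 3
) -> list[dict]:
    """Group records by session, newest session first, skipping the current one."""
    sessions: dict[str, list[dict]] = defaultdict(list)
    for rec in records:
        if rec["session_id"] != current_session_id:
            sessions[rec["session_id"]].append(rec)

    findings = []
    for sid, recs in sessions.items():
        recs.sort(key=lambda r: r["timestamp"], reverse=True)
        by_kind: dict[str, list[str]] = defaultdict(list)
        for r in recs:
            # Keep only the newest excerpts of each kind.
            if len(by_kind[r["kind"]]) < max_excerpts_per_kind:
                by_kind[r["kind"]].append(r["text"])
        findings.append({
            "session_id": sid,
            "last_timestamp": recs[0]["timestamp"],
            "summary": f"{len(recs)} records across {len(by_kind)} kinds",
            "metadata": {"excerpts_by_kind": dict(by_kind)},
        })
    findings.sort(key=lambda f: f["last_timestamp"], reverse=True)
    return findings
```

Because the output is derived purely from stored records, the same input always produces the same recap.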

Phase 64.7

F64.7 · speaker names mapping

Voiceprint sqlite + opt-in Fernet + cosine similarity

✅ Shipped 🧪 11 tests ⊜ 2 commits T4 MVP+1 follow-ups
Technical guide →

Opt-in speaker_id → real name mapping. VoiceprintStore is a sqlite store of voice embeddings (numpy float32, typically 192-dim). Opt-in Fernet via JW_VOICEPRINT_KEY (F61 precedent). SpeakerNameMapper uses cosine similarity with a default similarity_threshold of 0.75. DiarizedSegment is extended with an optional speaker_name without breaking existing callers. The whisperx pyannote integration is deferred to F64.8.

What shipped

  • VoiceprintStore at ~/.jw-agent-toolkit/voiceprints.db.
  • Opt-in Fernet via env JW_VOICEPRINT_KEY.
  • Voiceprint(name, embedding, enrolled_at_iso).
  • SpeakerNameMapper.identify(embedding) → name | None.
  • Backward-compatible DiarizedSegment.speaker_name.
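
The matching step can be sketched with plain Python lists in place of numpy arrays. The constructor shape, the "best match at or above the threshold" rule, and the example names are assumptions; only the 0.75 default and the cosine metric come from the notes above.

```python
from __future__ import annotations

import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SpeakerNameMapper:
    def __init__(
        self, voiceprints: dict[str, list[float]], similarity_threshold: float = 0.75
    ) -> None:
        self._voiceprints = voiceprints
        self.similarity_threshold = similarity_threshold

    def identify(self, embedding: list[float]) -> str | None:
        """Return the best-matching enrolled name, or None below the threshold."""
        best_name, best_score = None, -1.0
        for name, reference in self._voiceprints.items():
            score = cosine_similarity(embedding, reference)
            if score > best_score:
                best_name, best_score = name, score
        return best_name if best_score >= self.similarity_threshold else None
```

Returning None for weak matches keeps the anonymous speaker_id in place instead of attaching a wrong name to a segment.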