Specs y planes
Fase 41 — plugin-sdk Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (
- [ ]) syntax for tracking.
Goal: Build jw_core.plugins — a five-group entry-point discovery layer that lets third parties extend agents, parsers, embedders, VLM providers and Gen providers without forking the monorepo.
Architecture: New subpackage packages/jw-core/src/jw_core/plugins/ exposing get_plugins(group), verify_plugin(name, group), clear_plugin_cache(). Discovery via importlib.metadata.entry_points. Conflict policy defaults to NAMESPACED. Fail-soft by default, fail-hard via JW_PLUGINS_STRICT=1. Five surfaces integrate: jw-eval (default_agent_registry), jw-rag (_instantiate_registry), jw-mcp (register_tools), jw-cli (jw plugins {list,verify,disable}). Test fixture package installed at test time via subprocess uv pip install -e.
Tech Stack: Python 3.13 · importlib.metadata · packaging.requirements / packaging.version · pydantic/dataclasses · pytest + monkeypatch · subprocess + uv pip install -e for fixture install · typer (CLI).
Spec: docs/superpowers/specs/2026-05-31-fase-41-plugin-sdk-design.md.
File map
Creates:
packages/jw-core/src/jw_core/plugins/__init__.pypackages/jw-core/src/jw_core/plugins/errors.pypackages/jw-core/src/jw_core/plugins/contracts.pypackages/jw-core/src/jw_core/plugins/policy.pypackages/jw-core/src/jw_core/plugins/registry.pypackages/jw-core/src/jw_core/plugins/verify.pypackages/jw-core/src/jw_core/plugins/factory.pypackages/jw-core/tests/test_plugins_errors.pypackages/jw-core/tests/test_plugins_contracts.pypackages/jw-core/tests/test_plugins_policy.pypackages/jw-core/tests/test_plugins_registry.pypackages/jw-core/tests/test_plugins_verify.pypackages/jw-core/tests/test_plugins_factory.pypackages/jw-core/tests/test_plugins_e2e.pypackages/jw-core/tests/conftest_plugins.pypackages/jw-core/tests/fixtures/plugin_sample/pyproject.tomlpackages/jw-core/tests/fixtures/plugin_sample/README.mdpackages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/__init__.pypackages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/agent.pypackages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/parser.pypackages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/embedder.pypackages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/vlm.pypackages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/gen.pypackages/jw-cli/src/jw_cli/commands/plugins.pydocs/plugin-sdk/overview.mddocs/plugin-sdk/security.mddocs/plugin-sdk/capabilities.mddocs/plugin-sdk/authoring.md
Modifies:
packages/jw-core/src/jw_core/__init__.py— re-export__version__(already exists), addfrom jw_core.plugins import ...(lazy).packages/jw-eval/src/jw_eval/cli.py— merge plugins intodefault_agent_registry.packages/jw-rag/src/jw_rag/embed_providers/factory.py— merge embedder plugins into_instantiate_registry.packages/jw-mcp/src/jw_mcp/server.py— register plugin tools.packages/jw-cli/src/jw_cli/main.py— registerpluginssubcommand.packages/jw-cli/src/jw_cli/commands/__init__.py— exportpluginscommand..github/workflows/ci.yml— addplugin-sdkoffline job.docs/VISION_AUDIT.md— Fase 41 row.docs/ROADMAP.md— Fase 41 section.docs/README.md— link plugin-sdk guides.
Task 1: Scaffold jw_core.plugins and errors
Files:
-
Create:
packages/jw-core/src/jw_core/plugins/__init__.py -
Create:
packages/jw-core/src/jw_core/plugins/errors.py -
Create:
packages/jw-core/tests/test_plugins_errors.py -
Step 1: Write the failing test
# packages/jw-core/tests/test_plugins_errors.py
"""Tests for jw_core.plugins.errors."""
from __future__ import annotations
import pytest
from jw_core.plugins.errors import (
PluginConflictError,
PluginContractError,
PluginError,
PluginVersionMismatch,
)
def test_plugin_error_is_base() -> None:
assert issubclass(PluginConflictError, PluginError)
assert issubclass(PluginContractError, PluginError)
assert issubclass(PluginVersionMismatch, PluginError)
def test_plugin_conflict_error_carries_names() -> None:
err = PluginConflictError(
name="dup",
group="jw_agent_toolkit.agents",
dist_names=("pkg-a", "pkg-b"),
policy="reject",
)
assert err.name == "dup"
assert err.dist_names == ("pkg-a", "pkg-b")
assert "dup" in str(err)
assert "pkg-a" in str(err)
assert "pkg-b" in str(err)
def test_plugin_version_mismatch_carries_constraint() -> None:
err = PluginVersionMismatch(
plugin_name="foo",
constraint="jw-agent-toolkit>=99.0",
installed_version="0.1.0",
)
assert err.constraint == "jw-agent-toolkit>=99.0"
assert "99.0" in str(err)
assert "0.1.0" in str(err)
def test_plugin_contract_error_carries_missing() -> None:
err = PluginContractError(
plugin_name="foo",
group="jw_agent_toolkit.agents",
missing=["__call__"],
extra={"reason": "not callable"},
)
assert err.missing == ["__call__"]
assert "__call__" in str(err)
def test_can_raise_and_catch() -> None:
with pytest.raises(PluginError):
raise PluginConflictError("a", "g", ("x", "y"), "reject")
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-core/tests/test_plugins_errors.py -v
Expected: FAIL — ModuleNotFoundError: No module named 'jw_core.plugins'.
- Step 3: Implement the package init and errors
# packages/jw-core/src/jw_core/plugins/__init__.py
"""jw_core.plugins — entry-point discovery for community extensions.
Public API:
from jw_core.plugins import (
get_plugins,
clear_plugin_cache,
verify_plugin,
PluginError,
PluginConflictError,
PluginContractError,
PluginVersionMismatch,
)
Five extension-point groups (PEP 621 entry points):
jw_agent_toolkit.agents
jw_agent_toolkit.parsers
jw_agent_toolkit.embedders
jw_agent_toolkit.vlm_providers
jw_agent_toolkit.gen_providers
"""
from __future__ import annotations
from jw_core.plugins.errors import (
PluginConflictError,
PluginContractError,
PluginError,
PluginVersionMismatch,
)
from jw_core.plugins.factory import clear_plugin_cache, get_plugins
from jw_core.plugins.verify import verify_plugin
__all__ = [
"PluginConflictError",
"PluginContractError",
"PluginError",
"PluginVersionMismatch",
"clear_plugin_cache",
"get_plugins",
"verify_plugin",
]
# packages/jw-core/src/jw_core/plugins/errors.py
"""Exception hierarchy for the plugin SDK."""
from __future__ import annotations
class PluginError(Exception):
"""Base for every plugin-SDK error."""
class PluginConflictError(PluginError):
"""Two plugins registered the same name and the conflict policy is REJECT."""
def __init__(
self,
name: str,
group: str,
dist_names: tuple[str, ...],
policy: str,
) -> None:
self.name = name
self.group = group
self.dist_names = dist_names
self.policy = policy
super().__init__(
f"plugin name conflict: {name!r} in group {group!r} "
f"claimed by distributions {list(dist_names)} (policy={policy})"
)
class PluginVersionMismatch(PluginError):
"""A plugin declares a jw-agent-toolkit constraint that the current install violates."""
def __init__(
self,
plugin_name: str,
constraint: str,
installed_version: str,
) -> None:
self.plugin_name = plugin_name
self.constraint = constraint
self.installed_version = installed_version
super().__init__(
f"plugin {plugin_name!r} requires {constraint!r} "
f"but installed jw-agent-toolkit version is {installed_version!r}"
)
class PluginContractError(PluginError):
"""A plugin fails the Protocol contract for its group."""
def __init__(
self,
plugin_name: str,
group: str,
missing: list[str],
extra: dict[str, str] | None = None,
) -> None:
self.plugin_name = plugin_name
self.group = group
self.missing = list(missing)
self.extra = dict(extra or {})
joined = ", ".join(missing) or "<none>"
super().__init__(
f"plugin {plugin_name!r} in group {group!r} missing required: [{joined}]"
)
- Step 4: Stub the downstream modules so
__init__imports work
We have to bootstrap factory.py and verify.py with empty stubs so the __init__ import succeeds during this task. Real implementation lands in Tasks 5/6.
# packages/jw-core/src/jw_core/plugins/factory.py
"""STUB — replaced in Task 6."""
from __future__ import annotations
from typing import Any
def get_plugins(group: str) -> dict[str, Any]: # noqa: ARG001
return {}
def clear_plugin_cache() -> None:
return None
# packages/jw-core/src/jw_core/plugins/verify.py
"""STUB — replaced in Task 5."""
from __future__ import annotations
from typing import Any
def verify_plugin(name: str, group: str) -> Any: # noqa: ARG001
raise NotImplementedError
- Step 5: Run test to verify it passes
Run: uv run pytest packages/jw-core/tests/test_plugins_errors.py -v
Expected: 5 passed.
- Step 6: Commit
git add packages/jw-core/src/jw_core/plugins packages/jw-core/tests/test_plugins_errors.py
git commit -m "feat(plugin-sdk): scaffold jw_core.plugins package + error hierarchy"
Task 2: Protocols (contracts.py)
Files:
-
Create:
packages/jw-core/src/jw_core/plugins/contracts.py -
Create:
packages/jw-core/tests/test_plugins_contracts.py -
Step 1: Write the failing test
# packages/jw-core/tests/test_plugins_contracts.py
"""Tests for jw_core.plugins.contracts — the 5 Protocols + EntryPointSpec."""
from __future__ import annotations
from typing import Any
import pytest
from jw_core.plugins.contracts import (
AgentPlugin,
EmbedderPlugin,
EntryPointSpec,
GenProviderPlugin,
ParserPlugin,
VLMProviderPlugin,
)
def test_entry_point_spec_namespaced_name() -> None:
spec = EntryPointSpec(
name="my_agent",
group="jw_agent_toolkit.agents",
module="my_pkg.mod",
attr="my_agent",
dist_name="my-pkg",
dist_version="1.0.0",
)
assert spec.namespaced_name == "my-pkg:my_agent"
def test_entry_point_spec_is_hashable() -> None:
spec = EntryPointSpec(
name="x",
group="g",
module="m",
attr="a",
dist_name="d",
dist_version="1",
)
{spec} # smoke: frozen+hashable
def test_agent_plugin_protocol_accepts_async_callable() -> None:
async def my_agent(**kwargs: Any) -> Any:
return {"ok": True, "kwargs": kwargs}
my_agent.__name__ = "my_agent"
assert isinstance(my_agent, AgentPlugin)
def test_agent_plugin_protocol_rejects_sync_only_object() -> None:
class NotAnAgent:
pass
assert not isinstance(NotAnAgent(), AgentPlugin)
def test_parser_plugin_protocol() -> None:
def parse(raw: bytes | str, *, source_url: str | None = None) -> Any:
return {"raw": raw, "source_url": source_url}
assert isinstance(parse, ParserPlugin)
def test_embedder_plugin_protocol_shape() -> None:
class FakeEmb:
name = "fake"
target = "cpu"
dim = 8
def is_available(self) -> bool:
return True
def embed(self, texts: list[str]) -> list[list[float]]:
return [[0.0] * self.dim for _ in texts]
assert isinstance(FakeEmb(), EmbedderPlugin)
def test_vlm_provider_protocol_shape() -> None:
class FakeVLM:
name = "fake-vlm"
def is_available(self) -> bool:
return True
def describe(self, image_bytes: bytes, *, language: str = "en") -> str:
return f"fake[{language}] len={len(image_bytes)}"
assert isinstance(FakeVLM(), VLMProviderPlugin)
def test_gen_provider_protocol_shape() -> None:
class FakeGen:
name = "fake-gen"
def is_available(self) -> bool:
return True
def generate(self, prompt: str, *, max_tokens: int = 128) -> str:
return f"fake[{max_tokens}]: {prompt}"
assert isinstance(FakeGen(), GenProviderPlugin)
def test_entry_point_spec_resolve_calls_loader(monkeypatch: pytest.MonkeyPatch) -> None:
sentinel = object()
calls: list[tuple[str, str]] = []
def fake_import(module: str) -> Any:
calls.append(("import", module))
class _M:
def __getattr__(self, name: str) -> Any:
calls.append(("get", name))
return sentinel
return _M()
monkeypatch.setattr("jw_core.plugins.contracts.import_module", fake_import)
spec = EntryPointSpec(
name="x",
group="g",
module="my.pkg",
attr="x",
dist_name="d",
dist_version="1",
)
got = spec.resolve()
assert got is sentinel
assert ("import", "my.pkg") in calls
assert ("get", "x") in calls
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-core/tests/test_plugins_contracts.py -v
Expected: FAIL — cannot import name 'AgentPlugin'.
- Step 3: Implement contracts
# packages/jw-core/src/jw_core/plugins/contracts.py
"""Five Protocols + EntryPointSpec dataclass.
Protocols are `runtime_checkable` and intentionally structural — third-party
plugins don't need to import anything from jw-agent-toolkit, they just need to
match the shape.
"""
from __future__ import annotations
from dataclasses import dataclass
from importlib import import_module
from typing import Any, Protocol, runtime_checkable
# ---------------------------------------------------------------------------
# Entry-point spec
# ---------------------------------------------------------------------------
@dataclass(frozen=True)
class EntryPointSpec:
"""Lazy descriptor for one entry point.
The actual callable / object is resolved on demand via `.resolve()`.
Stays frozen + hashable so we can de-dup by identity in the registry.
"""
name: str
group: str
module: str
attr: str
dist_name: str
dist_version: str
@property
def namespaced_name(self) -> str:
"""Stable disambiguation key under the NAMESPACED conflict policy."""
return f"{self.dist_name}:{self.name}"
def resolve(self) -> Any:
"""Load the entry-point target. Importing is deferred until this point."""
mod = import_module(self.module)
return getattr(mod, self.attr)
# ---------------------------------------------------------------------------
# Protocols — one per extension-point group
# ---------------------------------------------------------------------------
@runtime_checkable
class AgentPlugin(Protocol):
"""A pluggable agent.
Required:
- `__name__: str` (attribute) — Python callables have this for free.
- `__call__(**kwargs) -> Awaitable[Any]` — async callable.
Optional (detected via hasattr at use-site, never required):
- `languages: list[str]`
- `version: str`
- `cost_estimate(**kwargs) -> int` (since v1.3, opt-in)
"""
__name__: str
def __call__(self, **kwargs: Any) -> Any: ...
@runtime_checkable
class ParserPlugin(Protocol):
"""A pluggable document parser.
Required:
- `__call__(raw, *, source_url=None) -> ParsedDocument-like`
Optional:
- `extensions: list[str]`
- `mime_types: list[str]`
"""
def __call__(
self,
raw: bytes | str,
*,
source_url: str | None = None,
) -> Any: ...
@runtime_checkable
class EmbedderPlugin(Protocol):
"""Mirrors jw_rag.embed_providers.factory.EmbedProvider for plugin registration."""
name: str
target: str
dim: int
def is_available(self) -> bool: ...
def embed(self, texts: list[str]) -> Any: ...
@runtime_checkable
class VLMProviderPlugin(Protocol):
"""Mirrors jw_core.vision.VLMProvider."""
name: str
def is_available(self) -> bool: ...
def describe(self, image_bytes: bytes, *, language: str = "en") -> str: ...
@runtime_checkable
class GenProviderPlugin(Protocol):
"""Mirrors jw_gen.GenerationProvider."""
name: str
def is_available(self) -> bool: ...
def generate(self, prompt: str, *, max_tokens: int = 128) -> str: ...
- Step 4: Run test to verify it passes
Run: uv run pytest packages/jw-core/tests/test_plugins_contracts.py -v
Expected: 9 passed.
- Step 5: Commit
git add packages/jw-core/src/jw_core/plugins/contracts.py packages/jw-core/tests/test_plugins_contracts.py
git commit -m "feat(plugin-sdk): contracts — 5 Protocols + EntryPointSpec"
Task 3: Conflict policy + env helpers (policy.py)
Files:
-
Create:
packages/jw-core/src/jw_core/plugins/policy.py -
Create:
packages/jw-core/tests/test_plugins_policy.py -
Step 1: Write the failing test
# packages/jw-core/tests/test_plugins_policy.py
"""Tests for jw_core.plugins.policy."""
from __future__ import annotations
import pytest
from jw_core.plugins.contracts import EntryPointSpec
from jw_core.plugins.errors import PluginConflictError
from jw_core.plugins.policy import (
ConflictPolicy,
apply_conflict_policy,
read_env_set,
read_policy_from_env,
)
def _spec(name: str, dist: str) -> EntryPointSpec:
return EntryPointSpec(
name=name,
group="jw_agent_toolkit.agents",
module=f"{dist}.mod",
attr=name,
dist_name=dist,
dist_version="1.0.0",
)
def test_conflict_policy_enum_values() -> None:
assert ConflictPolicy.FIRST_WINS.value == "first_wins"
assert ConflictPolicy.LAST_WINS.value == "last_wins"
assert ConflictPolicy.NAMESPACED.value == "namespaced"
assert ConflictPolicy.REJECT.value == "reject"
def test_read_env_set_missing(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.delenv("JW_PLUGINS_ALLOW_LIST", raising=False)
assert read_env_set("JW_PLUGINS_ALLOW_LIST") is None
def test_read_env_set_empty_treated_as_none(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setenv("JW_PLUGINS_ALLOW_LIST", "")
assert read_env_set("JW_PLUGINS_ALLOW_LIST") is None
def test_read_env_set_parses_csv(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setenv("JW_PLUGINS_ALLOW_LIST", "a, b ,c")
assert read_env_set("JW_PLUGINS_ALLOW_LIST") == {"a", "b", "c"}
def test_read_policy_from_env_default(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.delenv("JW_PLUGINS_CONFLICT_POLICY", raising=False)
assert read_policy_from_env() == ConflictPolicy.NAMESPACED
def test_read_policy_from_env_explicit(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setenv("JW_PLUGINS_CONFLICT_POLICY", "first_wins")
assert read_policy_from_env() == ConflictPolicy.FIRST_WINS
def test_read_policy_from_env_invalid_falls_back(
monkeypatch: pytest.MonkeyPatch,
caplog: pytest.LogCaptureFixture,
) -> None:
monkeypatch.setenv("JW_PLUGINS_CONFLICT_POLICY", "weird")
with caplog.at_level("WARNING"):
assert read_policy_from_env() == ConflictPolicy.NAMESPACED
assert any("weird" in r.message for r in caplog.records)
def test_apply_first_wins_keeps_existing() -> None:
current = {"x": _spec("x", "pkg-a")}
new = _spec("x", "pkg-b")
out = apply_conflict_policy(current, new, ConflictPolicy.FIRST_WINS)
assert out["x"].dist_name == "pkg-a"
def test_apply_last_wins_replaces() -> None:
current = {"x": _spec("x", "pkg-a")}
new = _spec("x", "pkg-b")
out = apply_conflict_policy(current, new, ConflictPolicy.LAST_WINS)
assert out["x"].dist_name == "pkg-b"
def test_apply_namespaced_emits_both_under_qualified_names() -> None:
current = {"x": _spec("x", "pkg-a")}
new = _spec("x", "pkg-b")
out = apply_conflict_policy(current, new, ConflictPolicy.NAMESPACED)
# The bare "x" is removed; both live under their dist-qualified names.
assert "x" not in out
assert out["pkg-a:x"].dist_name == "pkg-a"
assert out["pkg-b:x"].dist_name == "pkg-b"
def test_apply_namespaced_no_conflict_keeps_bare_name() -> None:
current: dict = {}
new = _spec("x", "pkg-a")
out = apply_conflict_policy(current, new, ConflictPolicy.NAMESPACED)
assert "x" in out
assert "pkg-a:x" not in out
def test_apply_reject_raises() -> None:
current = {"x": _spec("x", "pkg-a")}
new = _spec("x", "pkg-b")
with pytest.raises(PluginConflictError) as exc_info:
apply_conflict_policy(current, new, ConflictPolicy.REJECT)
assert "pkg-a" in str(exc_info.value)
assert "pkg-b" in str(exc_info.value)
def test_apply_logs_warning_on_conflict(caplog: pytest.LogCaptureFixture) -> None:
current = {"x": _spec("x", "pkg-a")}
new = _spec("x", "pkg-b")
with caplog.at_level("WARNING"):
apply_conflict_policy(current, new, ConflictPolicy.FIRST_WINS)
assert any("conflict" in r.message.lower() for r in caplog.records)
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-core/tests/test_plugins_policy.py -v
Expected: FAIL — cannot import name 'ConflictPolicy'.
- Step 3: Implement policy
# packages/jw-core/src/jw_core/plugins/policy.py
"""Conflict policies + env-var helpers."""
from __future__ import annotations
import logging
import os
from enum import StrEnum
from jw_core.plugins.contracts import EntryPointSpec
from jw_core.plugins.errors import PluginConflictError
logger = logging.getLogger(__name__)
class ConflictPolicy(StrEnum):
"""How to behave when two plugins register the same name in the same group."""
FIRST_WINS = "first_wins"
LAST_WINS = "last_wins"
NAMESPACED = "namespaced"
REJECT = "reject"
ENV_POLICY = "JW_PLUGINS_CONFLICT_POLICY"
ENV_ALLOW = "JW_PLUGINS_ALLOW_LIST"
ENV_DENY = "JW_PLUGINS_DENY_LIST"
ENV_DISABLED = "JW_PLUGINS_DISABLED"
ENV_STRICT = "JW_PLUGINS_STRICT"
def read_env_set(var: str) -> set[str] | None:
"""Parse a CSV env var. Missing or empty → None (no filter)."""
raw = os.getenv(var, "").strip()
if not raw:
return None
return {piece.strip() for piece in raw.split(",") if piece.strip()}
def read_policy_from_env() -> ConflictPolicy:
"""Resolve the conflict policy from env; default NAMESPACED."""
raw = os.getenv(ENV_POLICY, "").strip().lower()
if not raw:
return ConflictPolicy.NAMESPACED
try:
return ConflictPolicy(raw)
except ValueError:
logger.warning(
"ignoring invalid %s=%r — falling back to NAMESPACED", ENV_POLICY, raw
)
return ConflictPolicy.NAMESPACED
def apply_conflict_policy(
current: dict[str, EntryPointSpec],
new: EntryPointSpec,
policy: ConflictPolicy,
) -> dict[str, EntryPointSpec]:
"""Return an updated mapping after applying `policy` to (current, new).
`current` is a fresh dict to mutate-and-return; callers should treat the
return value as authoritative.
"""
out = dict(current)
existing = out.get(new.name)
if existing is None:
out[new.name] = new
return out
if existing.dist_name == new.dist_name and existing.module == new.module:
# Same plugin coming back through different scans — no real conflict.
return out
logger.warning(
"plugin name conflict in group %s: %r claimed by both %s and %s (policy=%s)",
new.group,
new.name,
existing.dist_name,
new.dist_name,
policy.value,
)
if policy is ConflictPolicy.FIRST_WINS:
return out
if policy is ConflictPolicy.LAST_WINS:
out[new.name] = new
return out
if policy is ConflictPolicy.NAMESPACED:
out.pop(new.name, None)
out[existing.namespaced_name] = existing
out[new.namespaced_name] = new
return out
if policy is ConflictPolicy.REJECT:
raise PluginConflictError(
name=new.name,
group=new.group,
dist_names=(existing.dist_name, new.dist_name),
policy=policy.value,
)
return out # pragma: no cover (exhaustive enum)
- Step 4: Run test to verify it passes
Run: uv run pytest packages/jw-core/tests/test_plugins_policy.py -v
Expected: 13 passed.
- Step 5: Commit
git add packages/jw-core/src/jw_core/plugins/policy.py packages/jw-core/tests/test_plugins_policy.py
git commit -m "feat(plugin-sdk): conflict policies + env helpers"
Task 4: Registry — discovery with monkey-patched entry points
Files:
-
Create:
packages/jw-core/src/jw_core/plugins/registry.py -
Create:
packages/jw-core/tests/conftest_plugins.py -
Create:
packages/jw-core/tests/test_plugins_registry.py -
Step 1: Write the autouse conftest for cache reset
# packages/jw-core/tests/conftest_plugins.py
"""Shared fixtures for plugin tests.
This file is auto-loaded via plain `conftest.py` re-export if the package
already has one; otherwise it's imported explicitly by individual modules.
The critical bit is the `_clear_plugin_cache` autouse fixture: without it,
`lru_cache` would leak across tests.
"""
from __future__ import annotations
import pytest
@pytest.fixture(autouse=True)
def _clear_plugin_cache() -> None:
from jw_core.plugins import clear_plugin_cache
clear_plugin_cache()
yield
clear_plugin_cache()
Hook it into packages/jw-core/tests/conftest.py (append, do not overwrite existing fixtures):
# packages/jw-core/tests/conftest.py — APPEND
from tests.conftest_plugins import _clear_plugin_cache # noqa: F401
If tests/conftest.py does not yet exist, create it with:
# packages/jw-core/tests/conftest.py
"""Shared pytest fixtures for jw-core tests."""
from tests.conftest_plugins import _clear_plugin_cache # noqa: F401
- Step 2: Write the failing test
# packages/jw-core/tests/test_plugins_registry.py
"""Tests for jw_core.plugins.registry — discovery with monkey-patched entry points."""
from __future__ import annotations
from importlib.metadata import EntryPoint
from typing import Any
import pytest
from jw_core.plugins.registry import _discover, _entry_points_for_group
def _ep(name: str, group: str, dist_name: str = "pkg-a", value: str | None = None) -> EntryPoint:
"""Build an EntryPoint pointing to a real callable in this test module."""
return EntryPoint(
name=name,
value=value or f"tests.fakes.agent_module:my_agent",
group=group,
)
def _patch_entry_points(
monkeypatch: pytest.MonkeyPatch,
mapping: dict[str, list[tuple[EntryPoint, str, str]]],
) -> None:
"""Patch importlib.metadata.entry_points + distribution lookups.
`mapping` is group → list of (ep, dist_name, dist_version).
"""
def fake_eps(*, group: str | None = None, **_: Any):
if group is None:
flat: list[EntryPoint] = []
for vals in mapping.values():
flat.extend(ep for ep, _, _ in vals)
return flat
return [ep for ep, _, _ in mapping.get(group, [])]
def fake_dist_for_ep(ep: EntryPoint) -> tuple[str, str]:
for vals in mapping.values():
for got_ep, name, ver in vals:
if got_ep is ep:
return name, ver
return "unknown", "0.0.0"
monkeypatch.setattr("jw_core.plugins.registry.entry_points", fake_eps)
monkeypatch.setattr(
"jw_core.plugins.registry._distribution_for_entry_point", fake_dist_for_ep
)
def test_entry_points_for_group_returns_list(monkeypatch: pytest.MonkeyPatch) -> None:
ep = _ep("foo", "jw_agent_toolkit.agents")
_patch_entry_points(monkeypatch, {"jw_agent_toolkit.agents": [(ep, "pkg-a", "1.0")]})
got = _entry_points_for_group("jw_agent_toolkit.agents")
assert [e.name for e in got] == ["foo"]
def test_discover_returns_dict_keyed_by_name(monkeypatch: pytest.MonkeyPatch) -> None:
ep = _ep("translation_helper", "jw_agent_toolkit.agents")
_patch_entry_points(
monkeypatch,
{"jw_agent_toolkit.agents": [(ep, "trans-pkg", "1.2.3")]},
)
out = _discover("jw_agent_toolkit.agents")
assert "translation_helper" in out
spec = out["translation_helper"]
assert spec.dist_name == "trans-pkg"
assert spec.dist_version == "1.2.3"
def test_discover_filtered_by_allow_list(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setenv("JW_PLUGINS_ALLOW_LIST", "wanted")
eps = [
_ep("wanted", "jw_agent_toolkit.agents"),
_ep("not_wanted", "jw_agent_toolkit.agents"),
]
_patch_entry_points(
monkeypatch,
{"jw_agent_toolkit.agents": [(eps[0], "pkg-a", "1"), (eps[1], "pkg-b", "1")]},
)
out = _discover("jw_agent_toolkit.agents")
assert set(out.keys()) == {"wanted"}
def test_discover_filtered_by_deny_list(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setenv("JW_PLUGINS_DENY_LIST", "banned")
eps = [
_ep("ok", "jw_agent_toolkit.agents"),
_ep("banned", "jw_agent_toolkit.agents"),
]
_patch_entry_points(
monkeypatch,
{"jw_agent_toolkit.agents": [(eps[0], "pkg-a", "1"), (eps[1], "pkg-b", "1")]},
)
out = _discover("jw_agent_toolkit.agents")
assert set(out.keys()) == {"ok"}
def test_discover_conflict_namespaced_default(monkeypatch: pytest.MonkeyPatch) -> None:
eps = [
_ep("dup", "jw_agent_toolkit.agents"),
_ep("dup", "jw_agent_toolkit.agents"),
]
_patch_entry_points(
monkeypatch,
{"jw_agent_toolkit.agents": [(eps[0], "pkg-a", "1"), (eps[1], "pkg-b", "1")]},
)
out = _discover("jw_agent_toolkit.agents")
assert "dup" not in out
assert "pkg-a:dup" in out
assert "pkg-b:dup" in out
def test_discover_first_wins_via_env(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setenv("JW_PLUGINS_CONFLICT_POLICY", "first_wins")
eps = [
_ep("dup", "jw_agent_toolkit.agents"),
_ep("dup", "jw_agent_toolkit.agents"),
]
_patch_entry_points(
monkeypatch,
{"jw_agent_toolkit.agents": [(eps[0], "pkg-a", "1"), (eps[1], "pkg-b", "1")]},
)
out = _discover("jw_agent_toolkit.agents")
assert "dup" in out
assert out["dup"].dist_name == "pkg-a"
def test_discover_disabled_short_circuits(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setenv("JW_PLUGINS_DISABLED", "1")
eps = [_ep("foo", "jw_agent_toolkit.agents")]
_patch_entry_points(monkeypatch, {"jw_agent_toolkit.agents": [(eps[0], "pkg-a", "1")]})
assert _discover("jw_agent_toolkit.agents") == {}
def test_discover_unknown_group_returns_empty(monkeypatch: pytest.MonkeyPatch) -> None:
_patch_entry_points(monkeypatch, {})
assert _discover("jw_agent_toolkit.bogus") == {}
- Step 3: Run test to verify it fails
Run: uv run pytest packages/jw-core/tests/test_plugins_registry.py -v
Expected: FAIL — cannot import name '_discover'.
- Step 4: Implement registry
# packages/jw-core/src/jw_core/plugins/registry.py
"""Discovery via `importlib.metadata.entry_points`.
Steps:
1. Read group from importlib.metadata.entry_points(group=...).
2. Resolve distribution name + version per EntryPoint.
3. Build EntryPointSpec for each.
4. Apply ALLOW_LIST/DENY_LIST filters.
5. Fold via the active ConflictPolicy.
Cache lives in factory.py — registry is pure functions, no module-level state.
"""
from __future__ import annotations
import logging
import os
from importlib.metadata import EntryPoint, distributions, entry_points
from jw_core.plugins.contracts import EntryPointSpec
from jw_core.plugins.policy import (
ENV_DISABLED,
apply_conflict_policy,
read_env_set,
read_policy_from_env,
)
logger = logging.getLogger(__name__)
GROUPS: tuple[str, ...] = (
"jw_agent_toolkit.agents",
"jw_agent_toolkit.parsers",
"jw_agent_toolkit.embedders",
"jw_agent_toolkit.vlm_providers",
"jw_agent_toolkit.gen_providers",
)
def _entry_points_for_group(group: str) -> list[EntryPoint]:
"""Tiny wrapper for test seam — return list[EntryPoint] for the group."""
return list(entry_points(group=group))
def _distribution_for_entry_point(ep: EntryPoint) -> tuple[str, str]:
"""Find which distribution declared `ep`.
Returns (dist_name, dist_version). Falls back to ("unknown", "0.0.0") when
the EntryPoint was constructed standalone (tests, dynamic registration).
"""
target_module = ep.value.split(":", 1)[0]
for dist in distributions():
try:
dist_eps = list(dist.entry_points)
except Exception: # pragma: no cover (defensive)
continue
for d_ep in dist_eps:
if (
d_ep.name == ep.name
and d_ep.group == ep.group
and d_ep.value.split(":", 1)[0] == target_module
):
return dist.metadata["Name"], dist.metadata["Version"]
return "unknown", "0.0.0"
def _discover(group: str) -> dict[str, EntryPointSpec]:
"""Discover all plugins for `group` post-policy. Pure: no caching."""
if os.getenv(ENV_DISABLED, "").strip() == "1":
return {}
allow = read_env_set("JW_PLUGINS_ALLOW_LIST")
deny = read_env_set("JW_PLUGINS_DENY_LIST") or set()
policy = read_policy_from_env()
out: dict[str, EntryPointSpec] = {}
for ep in _entry_points_for_group(group):
if allow is not None and ep.name not in allow:
continue
if ep.name in deny:
continue
try:
module, _, attr = ep.value.partition(":")
if not module or not attr:
logger.warning(
"skipping malformed entry point %r in group %r (value=%r)",
ep.name,
group,
ep.value,
)
continue
dist_name, dist_version = _distribution_for_entry_point(ep)
spec = EntryPointSpec(
name=ep.name,
group=group,
module=module,
attr=attr,
dist_name=dist_name,
dist_version=dist_version,
)
out = apply_conflict_policy(out, spec, policy)
except Exception as exc: # noqa: BLE001
logger.warning(
"failed to register plugin %r in group %r: %s", ep.name, group, exc
)
return out
- Step 5: Run test to verify it passes
Run: uv run pytest packages/jw-core/tests/test_plugins_registry.py -v
Expected: 8 passed.
- Step 6: Commit
git add packages/jw-core/src/jw_core/plugins/registry.py packages/jw-core/tests/test_plugins_registry.py packages/jw-core/tests/conftest_plugins.py packages/jw-core/tests/conftest.py
git commit -m "feat(plugin-sdk): registry — entry-point discovery with policy"
Task 5: Verify contracts + version + report (verify.py)
Files:
-
Modify:
packages/jw-core/src/jw_core/plugins/verify.py -
Modify:
packages/jw-core/src/jw_core/plugins/contracts.py(addVerifyReport) -
Create:
packages/jw-core/tests/test_plugins_verify.py -
Step 1: Append
VerifyReportto contracts.py
Append to packages/jw-core/src/jw_core/plugins/contracts.py:
# ---------------------------------------------------------------------------
# VerifyReport — structured outcome of verify_plugin()
# ---------------------------------------------------------------------------
@dataclass(frozen=True)
class VerifyReport:
"""Structured report from `verify_plugin(name, group)`."""
name: str
group: str
dist_name: str
dist_version: str
ok: bool
required_present: tuple[str, ...]
required_missing: tuple[str, ...]
optional_present: tuple[str, ...]
optional_missing: tuple[str, ...]
version_constraint: str | None
version_satisfied: bool
errors: tuple[str, ...]
- Step 2: Write the failing test
# packages/jw-core/tests/test_plugins_verify.py
"""Tests for jw_core.plugins.verify."""
from __future__ import annotations
from typing import Any
import pytest
from jw_core.plugins.contracts import EntryPointSpec
from jw_core.plugins.errors import (
PluginContractError,
PluginError,
PluginVersionMismatch,
)
from jw_core.plugins.verify import (
REQUIRED_BY_GROUP,
OPTIONAL_BY_GROUP,
_verify_spec,
verify_plugin,
)
# ---- shape-only test seams ------------------------------------------------
class _GoodAgent:
__name__ = "good_agent"
async def __call__(self, **kwargs: Any) -> Any:
return kwargs
class _BadAgent:
"""Lacks __call__ entirely."""
__name__ = "bad_agent"
class _GoodEmbedder:
name = "good_emb"
target = "cpu"
dim = 8
def is_available(self) -> bool:
return True
def embed(self, texts: list[str]) -> list[list[float]]:
return [[0.0] * self.dim for _ in texts]
def _spec(group: str, name: str = "x", dist_name: str = "demo", version: str = "1.0.0") -> EntryPointSpec:
return EntryPointSpec(
name=name,
group=group,
module="some.mod",
attr=name,
dist_name=dist_name,
dist_version=version,
)
def test_required_by_group_keys_match_known_groups() -> None:
assert set(REQUIRED_BY_GROUP) == {
"jw_agent_toolkit.agents",
"jw_agent_toolkit.parsers",
"jw_agent_toolkit.embedders",
"jw_agent_toolkit.vlm_providers",
"jw_agent_toolkit.gen_providers",
}
def test_optional_by_group_subset_of_required_keys() -> None:
assert set(OPTIONAL_BY_GROUP) == set(REQUIRED_BY_GROUP)
def test_verify_spec_happy_agent() -> None:
report = _verify_spec(_spec("jw_agent_toolkit.agents"), target=_GoodAgent())
assert report.ok
assert "__call__" in report.required_present
assert report.required_missing == ()
def test_verify_spec_missing_call() -> None:
report = _verify_spec(_spec("jw_agent_toolkit.agents"), target=_BadAgent())
assert not report.ok
assert "__call__" in report.required_missing
def test_verify_spec_optional_present() -> None:
class Agent:
__name__ = "withlang"
languages = ["en", "es"]
async def __call__(self, **kwargs: Any) -> Any:
return kwargs
report = _verify_spec(_spec("jw_agent_toolkit.agents"), target=Agent())
assert "languages" in report.optional_present
assert "version" in report.optional_missing
def test_verify_spec_embedder_happy() -> None:
report = _verify_spec(_spec("jw_agent_toolkit.embedders"), target=_GoodEmbedder())
assert report.ok
assert set(report.required_present) == {"name", "target", "dim", "is_available", "embed"}
def test_verify_spec_version_constraint_satisfied(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setattr("jw_core.__version__", "1.5.0", raising=False)
spec = _spec("jw_agent_toolkit.agents")
report = _verify_spec(
spec,
target=_GoodAgent(),
plugin_dependencies=("jw-agent-toolkit>=1.0,<2.0",),
)
assert report.version_satisfied
assert report.version_constraint == "jw-agent-toolkit>=1.0,<2.0"
def test_verify_spec_version_constraint_violated(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setattr("jw_core.__version__", "0.1.0", raising=False)
spec = _spec("jw_agent_toolkit.agents")
report = _verify_spec(
spec,
target=_GoodAgent(),
plugin_dependencies=("jw-agent-toolkit>=99.0",),
)
assert not report.version_satisfied
assert not report.ok
def test_verify_plugin_strict_raises_on_contract(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setenv("JW_PLUGINS_STRICT", "1")
spec = _spec("jw_agent_toolkit.agents", name="bad")
def fake_resolve_spec(name: str, group: str) -> tuple[EntryPointSpec, Any]: # noqa: ARG001
return spec, _BadAgent()
monkeypatch.setattr("jw_core.plugins.verify._resolve_spec", fake_resolve_spec)
with pytest.raises(PluginContractError):
verify_plugin("bad", "jw_agent_toolkit.agents")
def test_verify_plugin_strict_raises_on_version(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setenv("JW_PLUGINS_STRICT", "1")
monkeypatch.setattr("jw_core.__version__", "0.1.0", raising=False)
spec = _spec("jw_agent_toolkit.agents", name="vmiss")
def fake_resolve_spec(name: str, group: str) -> tuple[EntryPointSpec, Any]: # noqa: ARG001
return spec, _GoodAgent()
def fake_deps(_: EntryPointSpec) -> tuple[str, ...]:
return ("jw-agent-toolkit>=99.0",)
monkeypatch.setattr("jw_core.plugins.verify._resolve_spec", fake_resolve_spec)
monkeypatch.setattr("jw_core.plugins.verify._plugin_dependencies", fake_deps)
with pytest.raises(PluginVersionMismatch):
verify_plugin("vmiss", "jw_agent_toolkit.agents")
def test_verify_plugin_soft_returns_report(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.delenv("JW_PLUGINS_STRICT", raising=False)
spec = _spec("jw_agent_toolkit.agents", name="bad")
def fake_resolve_spec(name: str, group: str) -> tuple[EntryPointSpec, Any]: # noqa: ARG001
return spec, _BadAgent()
monkeypatch.setattr("jw_core.plugins.verify._resolve_spec", fake_resolve_spec)
report = verify_plugin("bad", "jw_agent_toolkit.agents")
assert not report.ok
assert "__call__" in report.required_missing
def test_verify_plugin_unknown_raises_plugin_error(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setattr(
"jw_core.plugins.verify.get_plugins",
lambda group: {}, # noqa: ARG005
)
with pytest.raises(PluginError):
verify_plugin("ghost", "jw_agent_toolkit.agents")
- Step 3: Run test to verify it fails
Run: uv run pytest packages/jw-core/tests/test_plugins_verify.py -v
Expected: FAIL — cannot import name 'REQUIRED_BY_GROUP'.
- Step 4: Implement verify
Replace packages/jw-core/src/jw_core/plugins/verify.py:
# packages/jw-core/src/jw_core/plugins/verify.py
"""Contract + version verification.
`verify_plugin(name, group)` returns a `VerifyReport` describing exactly which
required/optional attributes were found, and whether the plugin's
`jw-agent-toolkit` version constraint is satisfied.
Under `JW_PLUGINS_STRICT=1`, the function raises instead of returning a
report whose `ok` field is False.
"""
from __future__ import annotations
import logging
import os
from importlib.metadata import distribution
from typing import Any
from packaging.requirements import Requirement
from packaging.version import Version
import jw_core
from jw_core.plugins.contracts import EntryPointSpec, VerifyReport
from jw_core.plugins.errors import (
PluginContractError,
PluginError,
PluginVersionMismatch,
)
from jw_core.plugins.factory import get_plugins
from jw_core.plugins.policy import ENV_STRICT
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Per-group required + optional attribute lists.
# Required = MUST be present (verify fails otherwise).
# Optional = MAY be present (capability matrix; degrade gracefully).
# ---------------------------------------------------------------------------
REQUIRED_BY_GROUP: dict[str, tuple[str, ...]] = {
"jw_agent_toolkit.agents": ("__call__",),
"jw_agent_toolkit.parsers": ("__call__",),
"jw_agent_toolkit.embedders": ("name", "target", "dim", "is_available", "embed"),
"jw_agent_toolkit.vlm_providers": ("name", "is_available", "describe"),
"jw_agent_toolkit.gen_providers": ("name", "is_available", "generate"),
}
OPTIONAL_BY_GROUP: dict[str, tuple[str, ...]] = {
"jw_agent_toolkit.agents": ("languages", "version", "cost_estimate"),
"jw_agent_toolkit.parsers": ("extensions", "mime_types"),
"jw_agent_toolkit.embedders": ("max_tokens",),
"jw_agent_toolkit.vlm_providers": ("languages",),
"jw_agent_toolkit.gen_providers": ("max_tokens", "supports_streaming"),
}
def _plugin_dependencies(spec: EntryPointSpec) -> tuple[str, ...]:
"""Read declared `Requires-Dist` for the plugin's distribution.
Returns the raw requirement strings; the caller parses with packaging.
Empty tuple if distribution can't be resolved (test environments).
"""
try:
dist = distribution(spec.dist_name)
except Exception: # noqa: BLE001
return ()
return tuple(dist.requires or ())
def _check_version_constraint(
requirements: tuple[str, ...],
installed_version: str,
) -> tuple[str | None, bool]:
"""Find the jw-agent-toolkit constraint (if any) and check it."""
for raw in requirements:
try:
req = Requirement(raw)
except Exception: # noqa: BLE001
continue
if req.name.lower().replace("_", "-") != "jw-agent-toolkit":
continue
if req.specifier and not req.specifier.contains(
Version(installed_version), prereleases=True
):
return raw, False
return raw, True
return None, True
def _verify_spec(
spec: EntryPointSpec,
*,
target: Any,
plugin_dependencies: tuple[str, ...] | None = None,
) -> VerifyReport:
"""Compute a VerifyReport for an already-resolved plugin target."""
required = REQUIRED_BY_GROUP.get(spec.group, ())
optional = OPTIONAL_BY_GROUP.get(spec.group, ())
req_present = tuple(a for a in required if hasattr(target, a))
req_missing = tuple(a for a in required if not hasattr(target, a))
opt_present = tuple(a for a in optional if hasattr(target, a))
opt_missing = tuple(a for a in optional if not hasattr(target, a))
deps = plugin_dependencies if plugin_dependencies is not None else _plugin_dependencies(spec)
installed = jw_core.__version__
constraint, satisfied = _check_version_constraint(deps, installed)
ok = not req_missing and satisfied
return VerifyReport(
name=spec.name,
group=spec.group,
dist_name=spec.dist_name,
dist_version=spec.dist_version,
ok=ok,
required_present=req_present,
required_missing=req_missing,
optional_present=opt_present,
optional_missing=opt_missing,
version_constraint=constraint,
version_satisfied=satisfied,
errors=(),
)
def _resolve_spec(name: str, group: str) -> tuple[EntryPointSpec, Any]:
"""Pull the spec from the registry and load its target."""
plugins = get_plugins(group)
spec = plugins.get(name)
if spec is None:
# Also accept namespaced lookups (dist:name).
for v in plugins.values():
if v.namespaced_name == name:
spec = v
break
if spec is None:
raise PluginError(f"plugin {name!r} not found in group {group!r}")
return spec, spec.resolve()
def verify_plugin(name: str, group: str) -> VerifyReport:
"""Verify a discovered plugin. Strict mode raises; soft mode returns report."""
spec, target = _resolve_spec(name, group)
report = _verify_spec(spec, target=target)
strict = os.getenv(ENV_STRICT, "").strip() == "1"
if not report.version_satisfied:
if strict:
raise PluginVersionMismatch(
plugin_name=name,
constraint=report.version_constraint or "<unknown>",
installed_version=jw_core.__version__,
)
logger.warning(
"plugin %r requires %r but installed %r — skipping",
name,
report.version_constraint,
jw_core.__version__,
)
if report.required_missing:
if strict:
raise PluginContractError(
plugin_name=name,
group=group,
missing=list(report.required_missing),
)
logger.warning(
"plugin %r in group %r missing required attrs: %s",
name,
group,
list(report.required_missing),
)
return report
- Step 5: Run test to verify it passes
Run: uv run pytest packages/jw-core/tests/test_plugins_verify.py -v
Expected: 11 passed.
- Step 6: Commit
git add packages/jw-core/src/jw_core/plugins/verify.py packages/jw-core/src/jw_core/plugins/contracts.py packages/jw-core/tests/test_plugins_verify.py
git commit -m "feat(plugin-sdk): verify_plugin + VerifyReport + version check"
Task 6: Factory — cached get_plugins + clear_plugin_cache
Files:
-
Modify:
packages/jw-core/src/jw_core/plugins/factory.py -
Create:
packages/jw-core/tests/test_plugins_factory.py -
Step 1: Write the failing test
# packages/jw-core/tests/test_plugins_factory.py
"""Tests for jw_core.plugins.factory."""
from __future__ import annotations
from importlib.metadata import EntryPoint
import pytest
from jw_core.plugins import clear_plugin_cache, get_plugins
from jw_core.plugins.errors import PluginError
def _patch_eps(
monkeypatch: pytest.MonkeyPatch,
mapping: dict[str, list[tuple[EntryPoint, str, str]]],
) -> list[int]:
calls: list[int] = []
def fake_eps(*, group: str | None = None, **_):
calls.append(1)
if group is None:
return []
return [ep for ep, _, _ in mapping.get(group, [])]
def fake_dist(ep: EntryPoint) -> tuple[str, str]:
for vals in mapping.values():
for got, name, ver in vals:
if got is ep:
return name, ver
return "unknown", "0.0.0"
monkeypatch.setattr("jw_core.plugins.registry.entry_points", fake_eps)
monkeypatch.setattr("jw_core.plugins.registry._distribution_for_entry_point", fake_dist)
return calls
def test_get_plugins_returns_dict(monkeypatch: pytest.MonkeyPatch) -> None:
ep = EntryPoint(name="foo", value="some.mod:foo", group="jw_agent_toolkit.agents")
_patch_eps(monkeypatch, {"jw_agent_toolkit.agents": [(ep, "pkg", "1.0")]})
out = get_plugins("jw_agent_toolkit.agents")
assert "foo" in out
assert out["foo"].dist_name == "pkg"
def test_get_plugins_is_cached(monkeypatch: pytest.MonkeyPatch) -> None:
ep = EntryPoint(name="foo", value="some.mod:foo", group="jw_agent_toolkit.agents")
calls = _patch_eps(monkeypatch, {"jw_agent_toolkit.agents": [(ep, "pkg", "1.0")]})
get_plugins("jw_agent_toolkit.agents")
get_plugins("jw_agent_toolkit.agents")
get_plugins("jw_agent_toolkit.agents")
assert len(calls) == 1 # only first call hits entry_points
def test_clear_plugin_cache_forces_rediscovery(monkeypatch: pytest.MonkeyPatch) -> None:
ep = EntryPoint(name="foo", value="some.mod:foo", group="jw_agent_toolkit.agents")
calls = _patch_eps(monkeypatch, {"jw_agent_toolkit.agents": [(ep, "pkg", "1.0")]})
get_plugins("jw_agent_toolkit.agents")
clear_plugin_cache()
get_plugins("jw_agent_toolkit.agents")
assert len(calls) == 2
def test_get_plugins_rejects_unknown_group(monkeypatch: pytest.MonkeyPatch) -> None:
_patch_eps(monkeypatch, {})
with pytest.raises(PluginError):
get_plugins("jw_agent_toolkit.totally_made_up")
def test_get_plugins_empty_when_no_entries(monkeypatch: pytest.MonkeyPatch) -> None:
_patch_eps(monkeypatch, {})
assert get_plugins("jw_agent_toolkit.agents") == {}
def test_get_plugins_returns_copy(monkeypatch: pytest.MonkeyPatch) -> None:
ep = EntryPoint(name="foo", value="some.mod:foo", group="jw_agent_toolkit.agents")
_patch_eps(monkeypatch, {"jw_agent_toolkit.agents": [(ep, "pkg", "1.0")]})
out_a = get_plugins("jw_agent_toolkit.agents")
out_a["INJECTED"] = out_a["foo"]
out_b = get_plugins("jw_agent_toolkit.agents")
assert "INJECTED" not in out_b
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-core/tests/test_plugins_factory.py -v
Expected: FAIL — factory still stubbed, won’t actually discover.
- Step 3: Implement factory
Replace packages/jw-core/src/jw_core/plugins/factory.py:
# packages/jw-core/src/jw_core/plugins/factory.py
"""Public facade: cached `get_plugins(group)` + `clear_plugin_cache`."""
from __future__ import annotations
from functools import lru_cache
from jw_core.plugins.contracts import EntryPointSpec
from jw_core.plugins.errors import PluginError
from jw_core.plugins.registry import GROUPS, _discover
@lru_cache(maxsize=None)
def _cached_discover(group: str) -> tuple[tuple[str, EntryPointSpec], ...]:
"""Internal cached layer. Returns sorted-tuple form so `lru_cache` is happy.
We can't cache a `dict` directly (mutable, unhashable). Tuple-of-pairs
round-trips cheaply.
"""
if group not in GROUPS:
raise PluginError(
f"unknown plugin group {group!r}; expected one of {list(GROUPS)}"
)
discovered = _discover(group)
return tuple(sorted(discovered.items()))
def get_plugins(group: str) -> dict[str, EntryPointSpec]:
"""Return all plugins for `group`, post-policy + post-filter. Cached per process.
The returned dict is a fresh copy each call — callers can mutate freely.
"""
return dict(_cached_discover(group))
def clear_plugin_cache() -> None:
"""Reset the discovery cache. Useful in tests; idempotent."""
_cached_discover.cache_clear()
- Step 4: Run test to verify it passes
Run: uv run pytest packages/jw-core/tests/test_plugins_factory.py -v
Expected: 6 passed.
- Step 5: Run the full plugins test set to catch regressions
Run: uv run pytest packages/jw-core/tests/test_plugins_*.py -v
Expected: all green so far (~47 tests).
- Step 6: Commit
git add packages/jw-core/src/jw_core/plugins/factory.py packages/jw-core/tests/test_plugins_factory.py
git commit -m "feat(plugin-sdk): cached get_plugins + clear_plugin_cache"
Task 7: Fixture plugin package (plugin_sample)
Files:
-
Create:
packages/jw-core/tests/fixtures/plugin_sample/pyproject.toml -
Create:
packages/jw-core/tests/fixtures/plugin_sample/README.md -
Create:
packages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/__init__.py -
Create:
packages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/agent.py -
Create:
packages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/parser.py -
Create:
packages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/embedder.py -
Create:
packages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/vlm.py -
Create:
packages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/gen.py -
Step 1: Create the fixture
pyproject.toml
# packages/jw-core/tests/fixtures/plugin_sample/pyproject.toml
[project]
name = "plugin-sample"
version = "0.1.0"
description = "Test fixture: a third-party plugin for jw-agent-toolkit"
requires-python = ">=3.13"
dependencies = []
[project.entry-points."jw_agent_toolkit.agents"]
plugin_sample_agent = "plugin_sample.agent:sample_agent"
[project.entry-points."jw_agent_toolkit.parsers"]
plugin_sample_parser = "plugin_sample.parser:sample_parser"
[project.entry-points."jw_agent_toolkit.embedders"]
plugin_sample_embedder = "plugin_sample.embedder:SampleEmbedder"
[project.entry-points."jw_agent_toolkit.vlm_providers"]
plugin_sample_vlm = "plugin_sample.vlm:SampleVLM"
[project.entry-points."jw_agent_toolkit.gen_providers"]
plugin_sample_gen = "plugin_sample.gen:SampleGen"
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.hatch.build.targets.wheel]
packages = ["src/plugin_sample"]
- Step 2: Create the README
# plugin_sample
Test fixture package used by `jw-core`'s plugin-SDK e2e tests. NOT for
publication. Installed locally via `uv pip install -e` from the test runner.
Registers one entry point in each of the 5 jw_agent_toolkit.* groups.
- Step 3: Create the package init
# packages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/__init__.py
"""plugin_sample — fixture used by jw-core's plugin SDK tests."""
__version__ = "0.1.0"
- Step 4: Create the 5 plugin modules
# packages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/agent.py
"""Agent stub. Returns a deterministic payload."""
from __future__ import annotations
from typing import Any
async def sample_agent(**kwargs: Any) -> dict[str, Any]:
"""Plugin agent — echoes its kwargs in a shape compatible with AgentResult."""
return {"findings": [], "echo": kwargs, "agent": "plugin_sample_agent"}
sample_agent.__name__ = "plugin_sample_agent"
sample_agent.languages = ["en", "es"] # type: ignore[attr-defined]
sample_agent.version = "0.1.0" # type: ignore[attr-defined]
# packages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/parser.py
"""Parser stub. Returns a dict with the raw payload."""
from __future__ import annotations
def sample_parser(raw: bytes | str, *, source_url: str | None = None) -> dict:
"""Returns a ParsedDocument-like dict."""
text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
return {
"text": text,
"source_url": source_url,
"parser": "plugin_sample_parser",
}
sample_parser.extensions = [".sample"] # type: ignore[attr-defined]
sample_parser.mime_types = ["application/x-plugin-sample"] # type: ignore[attr-defined]
# packages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/embedder.py
"""Embedder stub. Deterministic zero vectors."""
from __future__ import annotations
class SampleEmbedder:
name = "plugin_sample_embedder"
target = "cpu"
dim = 8
def is_available(self) -> bool:
return True
def embed(self, texts: list[str]) -> list[list[float]]:
return [[0.0] * self.dim for _ in texts]
# packages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/vlm.py
"""VLM stub."""
from __future__ import annotations
class SampleVLM:
name = "plugin_sample_vlm"
def is_available(self) -> bool:
return True
def describe(self, image_bytes: bytes, *, language: str = "en") -> str:
return f"plugin_sample_vlm[{language}] len={len(image_bytes)}"
# packages/jw-core/tests/fixtures/plugin_sample/src/plugin_sample/gen.py
"""Gen provider stub."""
from __future__ import annotations
class SampleGen:
name = "plugin_sample_gen"
def is_available(self) -> bool:
return True
def generate(self, prompt: str, *, max_tokens: int = 128) -> str:
return f"plugin_sample_gen[{max_tokens}]: {prompt}"
- Step 5: Verify the fixture package builds
Run:
cd packages/jw-core/tests/fixtures/plugin_sample && uv build 2>&1 | tail -5
Expected: a wheel under dist/. Discard the wheel (we install editable, not the wheel).
- Step 6: Commit
git add packages/jw-core/tests/fixtures/plugin_sample
git commit -m "test(plugin-sdk): fixture package 'plugin_sample' registering 5 entry points"
Task 8: E2E test — install fixture in subprocess and verify discovery
Files:
-
Create:
packages/jw-core/tests/test_plugins_e2e.py -
Step 1: Write the failing test
# packages/jw-core/tests/test_plugins_e2e.py
"""End-to-end: install plugin_sample with `uv pip install -e` in a subprocess
that creates an ephemeral venv, then run discovery from inside that venv via
`-c` invocations.
Why subprocess: `importlib.metadata` is process-cached; we need a clean
interpreter to see the fixture's entry points without leaking into other tests.
"""
from __future__ import annotations
import json
import os
import shutil
import subprocess
import sys
from pathlib import Path
import pytest
FIXTURE = Path(__file__).parent / "fixtures" / "plugin_sample"
REPO_ROOT = Path(__file__).resolve().parents[3]
def _have_uv() -> bool:
return shutil.which("uv") is not None
pytestmark = pytest.mark.skipif(not _have_uv(), reason="uv not installed")
@pytest.fixture(scope="module")
def plugin_venv(tmp_path_factory: pytest.TempPathFactory) -> Path:
"""Create an isolated venv with jw-core + plugin_sample installed editable."""
venv = tmp_path_factory.mktemp("plugin_venv")
subprocess.run(["uv", "venv", str(venv)], check=True, capture_output=True)
py = venv / ("Scripts" if sys.platform == "win32" else "bin") / "python"
# Install jw-core editable + packaging
subprocess.run(
[
"uv",
"pip",
"install",
"--python",
str(py),
"-e",
str(REPO_ROOT / "packages" / "jw-core"),
"packaging",
],
check=True,
capture_output=True,
)
# Install the fixture editable
subprocess.run(
["uv", "pip", "install", "--python", str(py), "-e", str(FIXTURE)],
check=True,
capture_output=True,
)
return py
def _run_in_venv(py: Path, code: str, env: dict[str, str] | None = None) -> str:
"""Run `code` in the venv and return stdout (stripped)."""
full_env = {**os.environ, **(env or {})}
out = subprocess.run(
[str(py), "-c", code],
check=True,
capture_output=True,
env=full_env,
text=True,
)
return out.stdout.strip()
def test_e2e_agent_discovered(plugin_venv: Path) -> None:
code = (
"import json\n"
"from jw_core.plugins import get_plugins\n"
"plugins = get_plugins('jw_agent_toolkit.agents')\n"
"print(json.dumps(sorted(plugins.keys())))\n"
)
out = _run_in_venv(plugin_venv, code)
names = json.loads(out)
assert "plugin_sample_agent" in names
def test_e2e_all_five_groups_discovered(plugin_venv: Path) -> None:
code = (
"import json\n"
"from jw_core.plugins import get_plugins\n"
"groups = ["
" 'jw_agent_toolkit.agents',"
" 'jw_agent_toolkit.parsers',"
" 'jw_agent_toolkit.embedders',"
" 'jw_agent_toolkit.vlm_providers',"
" 'jw_agent_toolkit.gen_providers',"
"]\n"
"out = {g: sorted(get_plugins(g).keys()) for g in groups}\n"
"print(json.dumps(out))\n"
)
out = _run_in_venv(plugin_venv, code)
parsed = json.loads(out)
assert "plugin_sample_agent" in parsed["jw_agent_toolkit.agents"]
assert "plugin_sample_parser" in parsed["jw_agent_toolkit.parsers"]
assert "plugin_sample_embedder" in parsed["jw_agent_toolkit.embedders"]
assert "plugin_sample_vlm" in parsed["jw_agent_toolkit.vlm_providers"]
assert "plugin_sample_gen" in parsed["jw_agent_toolkit.gen_providers"]
def test_e2e_verify_plugin_reports_ok(plugin_venv: Path) -> None:
code = (
"import json\n"
"from jw_core.plugins import verify_plugin\n"
"rep = verify_plugin('plugin_sample_agent', 'jw_agent_toolkit.agents')\n"
"print(json.dumps({"
" 'ok': rep.ok,"
" 'required_present': list(rep.required_present),"
" 'required_missing': list(rep.required_missing),"
" 'optional_present': list(rep.optional_present),"
" 'dist_name': rep.dist_name,"
" 'version_satisfied': rep.version_satisfied,"
"}))\n"
)
out = _run_in_venv(plugin_venv, code)
rep = json.loads(out)
assert rep["ok"]
assert "__call__" in rep["required_present"]
assert "languages" in rep["optional_present"]
assert rep["dist_name"] == "plugin-sample"
def test_e2e_disabled_env_short_circuits(plugin_venv: Path) -> None:
code = (
"import json\n"
"from jw_core.plugins import get_plugins\n"
"print(json.dumps(list(get_plugins('jw_agent_toolkit.agents').keys())))\n"
)
out = _run_in_venv(plugin_venv, code, env={"JW_PLUGINS_DISABLED": "1"})
assert json.loads(out) == []
def test_e2e_allow_list_filters(plugin_venv: Path) -> None:
code = (
"import json\n"
"from jw_core.plugins import get_plugins\n"
"print(json.dumps(sorted(get_plugins('jw_agent_toolkit.agents').keys())))\n"
)
out = _run_in_venv(
plugin_venv, code, env={"JW_PLUGINS_ALLOW_LIST": "nonexistent_only"}
)
assert json.loads(out) == []
def test_e2e_resolve_runs_callable(plugin_venv: Path) -> None:
code = (
"import asyncio, json\n"
"from jw_core.plugins import get_plugins\n"
"spec = get_plugins('jw_agent_toolkit.agents')['plugin_sample_agent']\n"
"fn = spec.resolve()\n"
"got = asyncio.run(fn(question='hi'))\n"
"print(json.dumps({'agent': got['agent'], 'q': got['echo']['question']}))\n"
)
out = _run_in_venv(plugin_venv, code)
parsed = json.loads(out)
assert parsed["agent"] == "plugin_sample_agent"
assert parsed["q"] == "hi"
- Step 2: Run test to verify it fails (then passes after fixture install)
Run: uv run pytest packages/jw-core/tests/test_plugins_e2e.py -v -s
Expected: 6 passed. If uv is not on PATH it skips cleanly.
- Step 3: Commit
git add packages/jw-core/tests/test_plugins_e2e.py
git commit -m "test(plugin-sdk): e2e — fixture install + discovery in subprocess venv"
Task 9: Wire jw_core.plugins API into jw_core top-level
Files:
-
Modify:
packages/jw-core/src/jw_core/__init__.py -
Step 1: Read current init
Run: uv run python -c "import jw_core; print(jw_core.__file__)" to confirm path.
- Step 2: Append the re-export block
Append to packages/jw-core/src/jw_core/__init__.py:
# ---- Plugin SDK (Fase 41) -------------------------------------------------
# Exposed as `jw_core.plugins.*`. We import the submodule eagerly here so
# `from jw_core import plugins` works, but the heavy discovery is still lazy.
from jw_core import plugins as plugins # noqa: E402, F401
- Step 3: Smoke-test from a fresh interpreter
Run:
uv run python -c "
from jw_core import plugins
from jw_core.plugins import get_plugins, verify_plugin, clear_plugin_cache
from jw_core.plugins import PluginError, PluginConflictError, PluginContractError, PluginVersionMismatch
print('OK')
"
Expected: OK.
- Step 4: Commit
git add packages/jw-core/src/jw_core/__init__.py
git commit -m "feat(plugin-sdk): re-export jw_core.plugins from jw_core package init"
Task 10: Integrate plugins into jw-eval.default_agent_registry
Files:
-
Modify:
packages/jw-eval/src/jw_eval/cli.py -
Create:
packages/jw-eval/tests/test_plugin_integration.py -
Step 1: Write the failing test
# packages/jw-eval/tests/test_plugin_integration.py
"""Tests for jw-eval ↔ jw_core.plugins integration."""
from __future__ import annotations
from typing import Any
import pytest
from jw_core.plugins import clear_plugin_cache
from jw_core.plugins.contracts import EntryPointSpec
from jw_eval.cli import default_agent_registry
async def _fake_plugin_agent(**kwargs: Any) -> Any:
return {"findings": [], "echo": kwargs, "agent": "fake_plugin"}
@pytest.fixture(autouse=True)
def _reset_cache() -> None:
clear_plugin_cache()
yield
clear_plugin_cache()
def test_default_registry_includes_hardcoded_agents() -> None:
reg = default_agent_registry()
assert "apologetics" in reg
assert "verse_explainer" in reg
def test_default_registry_merges_plugin_agents(monkeypatch: pytest.MonkeyPatch) -> None:
spec = EntryPointSpec(
name="fake_plugin",
group="jw_agent_toolkit.agents",
module="dummy",
attr="fake_plugin",
dist_name="fake-pkg",
dist_version="0.1.0",
)
def fake_resolve(self: EntryPointSpec) -> Any: # noqa: ARG001
return _fake_plugin_agent
monkeypatch.setattr(EntryPointSpec, "resolve", fake_resolve, raising=True)
monkeypatch.setattr(
"jw_eval.cli.get_plugins",
lambda group: {"fake_plugin": spec} if group == "jw_agent_toolkit.agents" else {},
)
reg = default_agent_registry()
assert "fake_plugin" in reg
def test_plugin_does_not_override_core_agent(monkeypatch: pytest.MonkeyPatch) -> None:
# Plugin with same name as a hardcoded agent should NOT replace it.
spec = EntryPointSpec(
name="apologetics",
group="jw_agent_toolkit.agents",
module="dummy",
attr="apologetics",
dist_name="bad-pkg",
dist_version="0.1.0",
)
def fake_resolve(self: EntryPointSpec) -> Any: # noqa: ARG001
return _fake_plugin_agent
monkeypatch.setattr(EntryPointSpec, "resolve", fake_resolve, raising=True)
monkeypatch.setattr(
"jw_eval.cli.get_plugins",
lambda group: {"apologetics": spec} if group == "jw_agent_toolkit.agents" else {},
)
reg = default_agent_registry()
# core wins
assert reg["apologetics"] is not _fake_plugin_agent
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-eval/tests/test_plugin_integration.py -v
Expected: FAIL — get_plugins not imported in jw_eval.cli.
- Step 3: Patch
default_agent_registryincli.py
Edit packages/jw-eval/src/jw_eval/cli.py. Add to imports near the top, after the existing from jw_eval.suite import Suite:
from jw_core.plugins import get_plugins
Replace the body of default_agent_registry (everything from def default_agent_registry to return registry) with:
def default_agent_registry() -> dict[str, Callable[[dict[str, Any]], Any]]:
"""Return the registry of real agents from jw-agents wrapped for sync invocation.
Hardcoded core agents take precedence; community plugins are merged after.
Plugins with the same name as a core agent are silently ignored — they
remain accessible via their namespaced form (dist:name).
"""
from jw_agents.apologetics import apologetics # type: ignore[import-not-found]
from jw_agents.conversation_assistant import conversation_assistant # type: ignore[import-not-found]
from jw_agents.letter_composer import letter_composer # type: ignore[import-not-found]
from jw_agents.life_topics import life_topics # type: ignore[import-not-found]
from jw_agents.meeting_helper import meeting_helper # type: ignore[import-not-found]
from jw_agents.news_monitor import news_monitor # type: ignore[import-not-found]
from jw_agents.research_topic import research_topic # type: ignore[import-not-found]
from jw_agents.student_part_helper import student_part_helper # type: ignore[import-not-found]
from jw_agents.study_conductor import prepare_lesson # type: ignore[import-not-found]
from jw_agents.verse_explainer import verse_explainer # type: ignore[import-not-found]
registry: dict[str, Callable[[dict[str, Any]], Any]] = {
"apologetics": _make_sync_wrapper(apologetics),
"conversation_assistant": _make_sync_wrapper(conversation_assistant),
"letter_composer": _make_sync_wrapper(letter_composer),
"life_topics": _make_sync_wrapper(life_topics),
"meeting_helper": _make_sync_wrapper(meeting_helper),
"news_monitor": _make_sync_wrapper(news_monitor),
"research_topic": _make_sync_wrapper(research_topic),
"study_conductor": _make_sync_wrapper(prepare_lesson),
"student_part_helper": _make_sync_wrapper(student_part_helper),
"verse_explainer": _make_sync_wrapper(verse_explainer),
}
# Fase 41 — merge community plugins. Core agents win; plugin sharing a
# name is silently skipped (still available via namespaced lookup).
for name, spec in get_plugins("jw_agent_toolkit.agents").items():
if name in registry:
continue
try:
registry[name] = _make_sync_wrapper(spec.resolve())
except Exception: # noqa: BLE001
# Plugin failed to load — exclude from registry, do not crash eval.
continue
return registry
- Step 4: Run test to verify it passes
Run: uv run pytest packages/jw-eval/tests/test_plugin_integration.py -v
Expected: 3 passed.
- Step 5: Run jw-eval regression
Run: uv run pytest packages/jw-eval/tests -v --tb=short
Expected: prior tests stay green. No regression in Fase 22 cases.
- Step 6: Commit
git add packages/jw-eval/src/jw_eval/cli.py packages/jw-eval/tests/test_plugin_integration.py
git commit -m "feat(jw-eval): merge plugin-SDK agents into default_agent_registry"
Task 11: Integrate plugins into jw-rag.embed_providers.factory
Files:
-
Modify:
packages/jw-rag/src/jw_rag/embed_providers/factory.py -
Create:
packages/jw-rag/tests/test_embed_plugins.py -
Step 1: Write the failing test
# packages/jw-rag/tests/test_embed_plugins.py
"""Tests for jw-rag ↔ jw_core.plugins embedder integration."""
from __future__ import annotations
from typing import Any
import numpy as np
import pytest
from jw_core.plugins import clear_plugin_cache
from jw_core.plugins.contracts import EntryPointSpec
from jw_rag.embed_providers.factory import _instantiate_registry
class _PluginEmbedder:
name = "plugin_test_emb"
target = "cpu"
dim = 4
def is_available(self) -> bool:
return True
def embed(self, texts: list[str]) -> np.ndarray:
return np.zeros((len(texts), self.dim), dtype=np.float32)
@pytest.fixture(autouse=True)
def _reset_cache() -> None:
clear_plugin_cache()
yield
clear_plugin_cache()
def test_instantiate_registry_includes_plugin(monkeypatch: pytest.MonkeyPatch) -> None:
spec = EntryPointSpec(
name="plugin_test_emb",
group="jw_agent_toolkit.embedders",
module="dummy",
attr="PluginEmb",
dist_name="plugin-pkg",
dist_version="0.1.0",
)
def fake_resolve(self: EntryPointSpec) -> Any: # noqa: ARG001
return _PluginEmbedder
monkeypatch.setattr(EntryPointSpec, "resolve", fake_resolve, raising=True)
monkeypatch.setattr(
"jw_rag.embed_providers.factory.get_plugins",
lambda group: {"plugin_test_emb": spec} if group == "jw_agent_toolkit.embedders" else {},
)
registry = _instantiate_registry()
names = [p.name for p in registry]
assert "plugin_test_emb" in names
def test_instantiate_registry_skips_broken_plugin(monkeypatch: pytest.MonkeyPatch) -> None:
spec = EntryPointSpec(
name="broken_emb",
group="jw_agent_toolkit.embedders",
module="dummy",
attr="X",
dist_name="broken",
dist_version="0.1.0",
)
def fake_resolve(self: EntryPointSpec) -> Any: # noqa: ARG001
raise RuntimeError("import failed")
monkeypatch.setattr(EntryPointSpec, "resolve", fake_resolve, raising=True)
monkeypatch.setattr(
"jw_rag.embed_providers.factory.get_plugins",
lambda group: {"broken_emb": spec} if group == "jw_agent_toolkit.embedders" else {},
)
registry = _instantiate_registry() # MUST NOT raise
names = [p.name for p in registry]
assert "broken_emb" not in names
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-rag/tests/test_embed_plugins.py -v
Expected: FAIL — get_plugins not imported in factory.
- Step 3: Patch
_instantiate_registry
Edit packages/jw-rag/src/jw_rag/embed_providers/factory.py.
Add to imports (after import numpy as np):
from jw_core.plugins import get_plugins
Replace _instantiate_registry (the existing function body) with:
def _instantiate_registry() -> list[EmbedProvider]:
"""Build the full provider registry (real + fakes + plugins).
Plugin embedders from `jw_agent_toolkit.embedders` are appended after the
hardcoded ones. Plugins that raise during resolution are dropped silently
(logged at WARN — see jw_core.plugins). Plugins that don't satisfy the
structural EmbedProvider Protocol are also dropped.
"""
from jw_rag.embed_providers.bge_m3 import BGEM3Provider
from jw_rag.embed_providers.cohere import CohereEmbedV3Provider
from jw_rag.embed_providers.fakes import (
FakeBGEM3,
FakeCohereEmbed,
FakeJinaEmbed,
FakeMultilingualE5,
FakeOllamaEmbed,
FakeVoyageEmbed,
)
from jw_rag.embed_providers.jina import JinaEmbeddingsV3Provider
from jw_rag.embed_providers.multilingual_e5 import MultilingualE5Provider
from jw_rag.embed_providers.ollama import OllamaEmbedProvider
from jw_rag.embed_providers.voyage import VoyageMultilingualProvider
registry: list[EmbedProvider] = [
CohereEmbedV3Provider(),
JinaEmbeddingsV3Provider(),
VoyageMultilingualProvider(),
BGEM3Provider(),
MultilingualE5Provider(),
OllamaEmbedProvider(),
FakeBGEM3(),
FakeMultilingualE5(),
FakeJinaEmbed(),
FakeCohereEmbed(),
FakeVoyageEmbed(),
FakeOllamaEmbed(),
]
for _name, spec in get_plugins("jw_agent_toolkit.embedders").items():
try:
target = spec.resolve()
instance = target() if isinstance(target, type) else target
except Exception: # noqa: BLE001
logger.warning("plugin embedder %r failed to load — skipping", _name)
continue
if not isinstance(instance, EmbedProvider):
logger.warning(
"plugin embedder %r does not satisfy EmbedProvider Protocol — skipping",
_name,
)
continue
registry.append(instance)
return registry
- Step 4: Run test to verify it passes
Run: uv run pytest packages/jw-rag/tests/test_embed_plugins.py -v
Expected: 2 passed.
- Step 5: Run jw-rag regression
Run: uv run pytest packages/jw-rag/tests -v --tb=short
Expected: prior tests stay green.
- Step 6: Commit
git add packages/jw-rag/src/jw_rag/embed_providers/factory.py packages/jw-rag/tests/test_embed_plugins.py
git commit -m "feat(jw-rag): merge plugin-SDK embedders into provider registry"
Task 12: Integrate plugins into jw-mcp.server
Files:
-
Modify:
packages/jw-mcp/src/jw_mcp/server.py -
Create:
packages/jw-mcp/tests/test_plugin_tools.py -
Step 1: Write the failing test
# packages/jw-mcp/tests/test_plugin_tools.py
"""Tests for jw-mcp ↔ jw_core.plugins tool registration."""
from __future__ import annotations
from typing import Any
import pytest
from jw_core.plugins import clear_plugin_cache
from jw_core.plugins.contracts import EntryPointSpec
from jw_mcp.server import register_plugin_tools
async def _fake_agent(**kwargs: Any) -> Any:
return {"echo": kwargs}
class _FakeMCP:
def __init__(self) -> None:
self.registered: list[tuple[str, str]] = []
def tool(self, name: str | None = None):
def deco(fn):
self.registered.append((name or fn.__name__, fn.__doc__ or ""))
return fn
return deco
@pytest.fixture(autouse=True)
def _reset_cache() -> None:
clear_plugin_cache()
yield
clear_plugin_cache()
def test_register_plugin_tools_emits_one_tool_per_plugin(
monkeypatch: pytest.MonkeyPatch,
) -> None:
spec = EntryPointSpec(
name="myagent",
group="jw_agent_toolkit.agents",
module="dummy",
attr="myagent",
dist_name="x",
dist_version="1",
)
def fake_resolve(self: EntryPointSpec) -> Any: # noqa: ARG001
return _fake_agent
monkeypatch.setattr(EntryPointSpec, "resolve", fake_resolve, raising=True)
monkeypatch.setattr(
"jw_mcp.server.get_plugins",
lambda group: {"myagent": spec} if group == "jw_agent_toolkit.agents" else {},
)
mcp = _FakeMCP()
register_plugin_tools(mcp)
names = [n for n, _ in mcp.registered]
assert "agent.myagent" in names
def test_register_plugin_tools_handles_broken_plugin(
monkeypatch: pytest.MonkeyPatch,
) -> None:
spec = EntryPointSpec(
name="bad",
group="jw_agent_toolkit.agents",
module="dummy",
attr="bad",
dist_name="x",
dist_version="1",
)
def fake_resolve(self: EntryPointSpec) -> Any: # noqa: ARG001
raise RuntimeError("boom")
monkeypatch.setattr(EntryPointSpec, "resolve", fake_resolve, raising=True)
monkeypatch.setattr(
"jw_mcp.server.get_plugins",
lambda group: {"bad": spec} if group == "jw_agent_toolkit.agents" else {},
)
mcp = _FakeMCP()
register_plugin_tools(mcp) # MUST NOT raise
assert mcp.registered == []
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-mcp/tests/test_plugin_tools.py -v
Expected: FAIL — register_plugin_tools does not exist.
- Step 3: Add
register_plugin_toolstojw_mcp.server
Append to packages/jw-mcp/src/jw_mcp/server.py:
# ---------------------------------------------------------------------------
# Plugin-SDK integration (Fase 41) ------------------------------------------
# ---------------------------------------------------------------------------
import asyncio as _asyncio
import inspect as _inspect
import logging as _logging
from typing import Any as _Any
from jw_core.plugins import get_plugins
_logger = _logging.getLogger(__name__)
def _make_mcp_tool(name: str, fn: _Any):
"""Wrap a plugin agent into an MCP tool callable.
MCP tools are async; the wrapper auto-handles sync agents by offloading to
a thread, and async ones by awaiting directly.
"""
is_coro = _inspect.iscoroutinefunction(fn)
async def tool_fn(**kwargs: _Any) -> _Any:
if is_coro:
return await fn(**kwargs)
return await _asyncio.to_thread(fn, **kwargs)
tool_fn.__name__ = name
tool_fn.__doc__ = (fn.__doc__ or f"Plugin agent: {name}").strip()
return tool_fn
def register_plugin_tools(mcp: _Any) -> None:
"""Register every discovered agent plugin as an MCP tool named `agent.<name>`."""
for plug_name, spec in get_plugins("jw_agent_toolkit.agents").items():
try:
target = spec.resolve()
except Exception: # noqa: BLE001
_logger.warning(
"skipping plugin agent %r — failed to resolve target", plug_name
)
continue
tool_name = f"agent.{plug_name}"
wrapped = _make_mcp_tool(tool_name, target)
try:
mcp.tool(name=tool_name)(wrapped)
except Exception: # noqa: BLE001
_logger.warning(
"skipping plugin agent %r — MCP refused tool registration", plug_name
)
continue
If the server has a single register_tools() entry point, also call register_plugin_tools(mcp) at the end of it. Locate the existing call site by:
grep -n "def register_tools" packages/jw-mcp/src/jw_mcp/server.py
Then inside that function, near the end, add:
register_plugin_tools(mcp)
- Step 4: Run test to verify it passes
Run: uv run pytest packages/jw-mcp/tests/test_plugin_tools.py -v
Expected: 2 passed.
- Step 5: Run jw-mcp regression
Run: uv run pytest packages/jw-mcp/tests -v --tb=short
Expected: prior tests stay green.
- Step 6: Commit
git add packages/jw-mcp/src/jw_mcp/server.py packages/jw-mcp/tests/test_plugin_tools.py
git commit -m "feat(jw-mcp): register plugin agents as agent.<name> MCP tools"
Task 13: CLI — jw plugins list / verify / disable
Files:
-
Create:
packages/jw-cli/src/jw_cli/commands/plugins.py -
Modify:
packages/jw-cli/src/jw_cli/commands/__init__.py -
Modify:
packages/jw-cli/src/jw_cli/main.py -
Create:
packages/jw-cli/tests/test_plugins_cli.py -
Step 1: Write the failing test
# packages/jw-cli/tests/test_plugins_cli.py
"""Tests for `jw plugins {list,verify,disable}`."""
from __future__ import annotations
import json
from typing import Any
import pytest
from typer.testing import CliRunner
from jw_cli.main import app
from jw_core.plugins import clear_plugin_cache
from jw_core.plugins.contracts import EntryPointSpec, VerifyReport
runner = CliRunner()
@pytest.fixture(autouse=True)
def _reset_cache() -> None:
clear_plugin_cache()
yield
clear_plugin_cache()
def _spec(name: str = "demo") -> EntryPointSpec:
return EntryPointSpec(
name=name,
group="jw_agent_toolkit.agents",
module="m",
attr=name,
dist_name="demo-pkg",
dist_version="1.0.0",
)
def test_plugins_list_default_human(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setattr(
"jw_cli.commands.plugins.get_plugins",
lambda group: {"demo": _spec()} if group == "jw_agent_toolkit.agents" else {},
)
result = runner.invoke(app, ["plugins", "list"])
assert result.exit_code == 0
assert "demo" in result.stdout
assert "demo-pkg" in result.stdout
def test_plugins_list_json(monkeypatch: pytest.MonkeyPatch) -> None:
monkeypatch.setattr(
"jw_cli.commands.plugins.get_plugins",
lambda group: {"demo": _spec()} if group == "jw_agent_toolkit.agents" else {},
)
result = runner.invoke(app, ["plugins", "list", "--json"])
assert result.exit_code == 0
data = json.loads(result.stdout)
assert "jw_agent_toolkit.agents" in data
assert data["jw_agent_toolkit.agents"][0]["name"] == "demo"
def test_plugins_verify_ok(monkeypatch: pytest.MonkeyPatch) -> None:
rep = VerifyReport(
name="demo",
group="jw_agent_toolkit.agents",
dist_name="demo-pkg",
dist_version="1.0.0",
ok=True,
required_present=("__call__",),
required_missing=(),
optional_present=("languages",),
optional_missing=("version",),
version_constraint=None,
version_satisfied=True,
errors=(),
)
def fake_verify(name: str, group: str) -> Any: # noqa: ARG001
return rep
monkeypatch.setattr("jw_cli.commands.plugins.verify_plugin", fake_verify)
result = runner.invoke(app, ["plugins", "verify", "demo"])
assert result.exit_code == 0
assert "ok" in result.stdout.lower()
def test_plugins_verify_fail(monkeypatch: pytest.MonkeyPatch) -> None:
rep = VerifyReport(
name="bad",
group="jw_agent_toolkit.agents",
dist_name="bad-pkg",
dist_version="1.0.0",
ok=False,
required_present=(),
required_missing=("__call__",),
optional_present=(),
optional_missing=("languages",),
version_constraint=None,
version_satisfied=True,
errors=(),
)
monkeypatch.setattr(
"jw_cli.commands.plugins.verify_plugin", lambda n, g: rep # noqa: ARG005
)
result = runner.invoke(app, ["plugins", "verify", "bad"])
assert result.exit_code == 2 # non-zero on failure
def test_plugins_disable_writes_config(tmp_path: pytest.MonkeyPatch) -> None:
cfg = tmp_path / "plugins.toml"
result = runner.invoke(
app, ["plugins", "disable", "spammy", "--config", str(cfg)]
)
assert result.exit_code == 0
assert cfg.exists()
text = cfg.read_text()
assert "spammy" in text
assert "[deny]" in text or "deny" in text
def test_plugins_disable_appends(tmp_path: pytest.MonkeyPatch) -> None:
cfg = tmp_path / "plugins.toml"
runner.invoke(app, ["plugins", "disable", "a", "--config", str(cfg)])
runner.invoke(app, ["plugins", "disable", "b", "--config", str(cfg)])
text = cfg.read_text()
assert "a" in text and "b" in text
- Step 2: Run test to verify it fails
Run: uv run pytest packages/jw-cli/tests/test_plugins_cli.py -v
Expected: FAIL — plugins subcommand missing.
- Step 3: Implement the command module
# packages/jw-cli/src/jw_cli/commands/plugins.py
"""`jw plugins` — list / verify / disable community plugins."""
from __future__ import annotations
import json
from pathlib import Path
import typer
from jw_core.plugins import get_plugins, verify_plugin
from jw_core.plugins.errors import PluginError
from jw_core.plugins.registry import GROUPS
app = typer.Typer(help="Manage community plugins (Fase 41).", no_args_is_help=True)
DEFAULT_CONFIG = Path.home() / ".jw-agent-toolkit" / "plugins.toml"
@app.command("list")
def list_plugins(
json_out: bool = typer.Option(False, "--json", help="Emit JSON instead of a table."),
) -> None:
"""List all discovered plugins, grouped by extension point."""
by_group: dict[str, list[dict[str, str]]] = {}
for group in GROUPS:
try:
specs = get_plugins(group)
except PluginError:
specs = {}
by_group[group] = [
{
"name": s.name,
"dist": s.dist_name,
"version": s.dist_version,
"module": s.module,
"attr": s.attr,
}
for s in specs.values()
]
if json_out:
typer.echo(json.dumps(by_group, indent=2, sort_keys=True))
return
for group, items in by_group.items():
typer.echo(f"\n## {group}")
if not items:
typer.echo(" (no plugins)")
continue
for it in items:
typer.echo(
f" {it['name']:30s} {it['dist']:25s} v{it['version']} {it['module']}:{it['attr']}"
)
@app.command("verify")
def verify_plugin_cmd(
name: str = typer.Argument(..., help="Plugin name (or dist:name for disambiguation)."),
group: str = typer.Option(
"jw_agent_toolkit.agents",
"--group",
help="Entry-point group to look the plugin up in.",
),
) -> None:
"""Run the contract + version check on a plugin and print the report."""
try:
rep = verify_plugin(name, group)
except PluginError as exc:
typer.echo(f"ERROR: {exc}", err=True)
raise typer.Exit(code=2)
typer.echo(f"plugin: {rep.name} ({rep.dist_name} v{rep.dist_version})")
typer.echo(f" group: {rep.group}")
typer.echo(f" required present: {list(rep.required_present)}")
typer.echo(f" required missing: {list(rep.required_missing)}")
typer.echo(f" optional present: {list(rep.optional_present)}")
typer.echo(f" optional missing: {list(rep.optional_missing)}")
typer.echo(f" version constraint: {rep.version_constraint}")
typer.echo(f" version satisfied: {rep.version_satisfied}")
typer.echo(f" status: {'OK' if rep.ok else 'FAIL'}")
if not rep.ok:
raise typer.Exit(code=2)
@app.command("disable")
def disable(
name: str = typer.Argument(..., help="Plugin name to deny-list persistently."),
config: Path = typer.Option(
DEFAULT_CONFIG, "--config", help="Path to persistent deny-list TOML."
),
) -> None:
"""Append a plugin name to the persistent deny list."""
config = Path(config)
config.parent.mkdir(parents=True, exist_ok=True)
existing: list[str] = []
if config.exists():
for line in config.read_text().splitlines():
line = line.strip()
if not line or line.startswith("#") or line.startswith("["):
continue
if "=" in line:
continue
existing.append(line.strip('"').strip("'"))
continue
# Naive parser — we keep it dependency-free.
for line in config.read_text().splitlines():
if line.strip().startswith('"') and line.strip().endswith('",'):
existing.append(line.strip().strip('"').rstrip(",").strip('"'))
if name in existing:
typer.echo(f"plugin {name!r} already in deny list at {config}")
return
existing.append(name)
body = '[deny]\nplugins = [\n' + "".join(f' "{n}",\n' for n in existing) + "]\n"
config.write_text(body)
typer.echo(f"plugin {name!r} added to {config}")
- Step 4: Wire the subcommand into the umbrella CLI
Edit packages/jw-cli/src/jw_cli/commands/__init__.py. Append:
from jw_cli.commands.plugins import app as plugins_app # noqa: F401
Edit packages/jw-cli/src/jw_cli/main.py. Add (after other app.add_typer(...) calls):
from jw_cli.commands.plugins import app as plugins_app
app.add_typer(plugins_app, name="plugins")
- Step 5: Run test to verify it passes
Run: uv run pytest packages/jw-cli/tests/test_plugins_cli.py -v
Expected: 6 passed.
- Step 6: Commit
git add packages/jw-cli/src/jw_cli/commands/plugins.py packages/jw-cli/src/jw_cli/commands/__init__.py packages/jw-cli/src/jw_cli/main.py packages/jw-cli/tests/test_plugins_cli.py
git commit -m "feat(jw-cli): add jw plugins list/verify/disable subcommand"
Task 14: CI job plugin-sdk (offline)
Files:
-
Modify:
.github/workflows/ci.yml -
Step 1: Find existing jobs in ci.yml
Run: grep -n "^ [a-z-]*:" .github/workflows/ci.yml | head -10 to see job names.
- Step 2: Append the new job
Append (or insert near other test jobs) the following snippet inside the jobs: block:
plugin-sdk:
needs: test
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.13"
- name: Install uv
run: pipx install uv
- name: Sync workspace
run: uv sync --all-packages --frozen
- name: Install plugin fixture (editable)
run: uv pip install -e packages/jw-core/tests/fixtures/plugin_sample
- name: Run plugin SDK tests
run: uv run pytest packages/jw-core/tests/test_plugins_*.py -v
- name: Smoke: jw plugins list --json
run: |
uv run jw plugins list --json > plugins.json
test -s plugins.json
uv run python -c "import json; d=json.load(open('plugins.json')); assert any('plugin_sample_agent' in [p['name'] for p in g] for g in d.values()), d"
- name: Smoke: JW_PLUGINS_DISABLED=1 empties registry
env:
JW_PLUGINS_DISABLED: "1"
run: |
uv run jw plugins list --json > plugins_off.json
uv run python -c "import json; d=json.load(open('plugins_off.json')); assert all(v==[] for v in d.values()), d"
- Step 3: Validate CI YAML locally
Run: python -c "import yaml; yaml.safe_load(open('.github/workflows/ci.yml'))" — must exit 0.
- Step 4: Commit
git add .github/workflows/ci.yml
git commit -m "ci(plugin-sdk): offline job installing plugin_sample fixture + smoke checks"
Task 15: Docs — overview / security / capabilities / authoring
Files:
-
Create:
docs/plugin-sdk/overview.md -
Create:
docs/plugin-sdk/security.md -
Create:
docs/plugin-sdk/capabilities.md -
Create:
docs/plugin-sdk/authoring.md -
Modify:
docs/README.md -
Step 1: Write
overview.md
# Plugin SDK — overview
> Tier 2 (comunidad). Fase 41. Habilita Fase 42 (scaffolding).
El plugin SDK convierte cinco puntos de extensión del toolkit en superficies de
contribución externa: terceros publican un paquete Python con un entry point
declarado en `pyproject.toml` y el toolkit lo descubre en runtime.
## Grupos de entry points
| Group | Contrato | Ejemplo |
|---|---|---|
| `jw_agent_toolkit.agents` | async callable `(**kwargs) -> AgentResult` | nuevo agente |
| `jw_agent_toolkit.parsers` | `(raw, *, source_url=None) -> ParsedDocument` | parser de formato exótico |
| `jw_agent_toolkit.embedders` | `EmbedProvider` (Fase 33) | embedder dedicado |
| `jw_agent_toolkit.vlm_providers` | `VLMProvider` | proveedor VLM extra |
| `jw_agent_toolkit.gen_providers` | `GenerationProvider` (Fase 38) | proveedor Gen extra |
## Uso desde el toolkit
```python
from jw_core.plugins import get_plugins, verify_plugin
agents = get_plugins("jw_agent_toolkit.agents")
print(agents["my_agent"].dist_name, agents["my_agent"].dist_version)
report = verify_plugin("my_agent", "jw_agent_toolkit.agents")
print(report.ok, report.required_missing)
CLI
uv run jw plugins list # human
uv run jw plugins list --json # CI-friendly
uv run jw plugins verify foo --group jw_agent_toolkit.agents
uv run jw plugins disable bar
Variables de entorno
| Variable | Default | Efecto |
|---|---|---|
JW_PLUGINS_DISABLED | unset | si =1, ningún plugin se descubre |
JW_PLUGINS_STRICT | unset | si =1, errores de contrato/versión abortan |
JW_PLUGINS_ALLOW_LIST | unset | CSV; sólo estos se cargan |
JW_PLUGINS_DENY_LIST | unset | CSV; estos no se cargan |
JW_PLUGINS_CONFLICT_POLICY | namespaced | first_wins | last_wins | namespaced | reject |
- [ ] **Step 2: Write `security.md`**
```markdown
# Plugin SDK — seguridad
> Instalar un plugin del SDK = ejecutar código arbitrario. Verifica la fuente.
## Modelo de confianza
El plugin corre en el proceso del host con todos los privilegios. No hay
sandboxing real (sin subprocesos / WASM / seccomp). El modelo es **igual que
`pip install`**: cualquier paquete Python instalable puede:
- leer secretos del entorno (`os.environ`)
- escribir y leer archivos
- hacer red
El SDK no mitiga esto. Lo que sí ofrece:
| Mitigación | Cómo |
|---|---|
| Desactivar discovery completo | `JW_PLUGINS_DISABLED=1` |
| Allow-list explícito | `JW_PLUGINS_ALLOW_LIST="trusted_a,trusted_b"` |
| Deny-list (post-incident) | `JW_PLUGINS_DENY_LIST="bad"` o `jw plugins disable bad` |
| Trazabilidad | `verify_plugin` reporta `dist_name`, `dist_version` |
| Reject de duplicados | `JW_PLUGINS_CONFLICT_POLICY=reject` |
## Recomendaciones por entorno
- **Dev local**: default permisivo + `verify_plugin` antes de producir.
- **CI público**: `JW_PLUGINS_DISABLED=1` por defecto. Tests propios usan
`uv pip install -e` explícito.
- **Auditoría / producción sensible**: `JW_PLUGINS_ALLOW_LIST` cerrado +
`JW_PLUGINS_STRICT=1` para forzar fail-hard.
## Lo que NO ofrecemos
- Bloqueo de red por plugin.
- Bloqueo de filesystem por plugin.
- Sandbox de imports.
Esas mitigaciones requieren subprocesos + IPC y no entran en Fase 41. Queda
documentado en el ROADMAP.
- Step 3: Write
capabilities.md
# Plugin SDK — capability matrix
| Group | Required attrs | Optional attrs |
|---|---|---|
| `jw_agent_toolkit.agents` | `__call__` | `languages`, `version`, `cost_estimate` |
| `jw_agent_toolkit.parsers` | `__call__` | `extensions`, `mime_types` |
| `jw_agent_toolkit.embedders` | `name`, `target`, `dim`, `is_available`, `embed` | `max_tokens` |
| `jw_agent_toolkit.vlm_providers` | `name`, `is_available`, `describe` | `languages` |
| `jw_agent_toolkit.gen_providers` | `name`, `is_available`, `generate` | `max_tokens`, `supports_streaming` |
## Política de evolución
- Los Protocols son **aditivos** por contrato dentro de un major.
- Atributos opcionales se detectan vía `hasattr`, **no** isinstance check.
- Cualquier nuevo método **requerido** fuerza un bump de major del toolkit.
- `verify_plugin` reporta `optional_present` / `optional_missing` para que el
autor sepa qué features puede activar.
- Step 4: Write
authoring.md
# Plugin SDK — authoring guide
## 1. Esqueleto mínimo (agente)
my-jw-plugin/ ├── pyproject.toml └── src/my_jw_plugin/ ├── init.py └── agent.py
`pyproject.toml`:
```toml
[project]
name = "my-jw-plugin"
version = "0.1.0"
requires-python = ">=3.13"
dependencies = [
"jw-agent-toolkit>=1.0,<2.0", # capability matrix de la major actual
]
[project.entry-points."jw_agent_toolkit.agents"]
translation_helper = "my_jw_plugin.agent:translation_helper"
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.hatch.build.targets.wheel]
packages = ["src/my_jw_plugin"]
agent.py:
from typing import Any
async def translation_helper(**kwargs: Any) -> dict[str, Any]:
text = kwargs.get("text", "")
return {"findings": [], "translation": text.upper()}
translation_helper.languages = ["en", "es", "pt"]
2. Instalar local
uv pip install -e ./my-jw-plugin
uv run jw plugins list
uv run jw plugins verify translation_helper
uv run jw eval --layer 1 --filter agent=translation_helper
3. Convenciones
- Nombre del entry point: snake_case, descriptivo, único en tu paquete.
- Si tu nombre choca con uno core, queda accesible como
<dist>:<name>. - No hagas side-effects en import time del módulo entry point.
- No hagas red durante
is_available().
4. Versión constraint
Declara jw-agent-toolkit>=X,<Y para asegurar compatibilidad. verify_plugin
y el SDK rechazan tu plugin si la versión instalada cae fuera del rango.
- [ ] **Step 5: Link the new docs from `docs/README.md`**
Append to the "Guías por tema" list:
```markdown
- [Plugin SDK — overview](plugin-sdk/overview.md) — 5 extension points para terceros.
- [Plugin SDK — seguridad](plugin-sdk/security.md) — modelo de confianza y mitigaciones.
- [Plugin SDK — capability matrix](plugin-sdk/capabilities.md) — required vs optional por grupo.
- [Plugin SDK — authoring](plugin-sdk/authoring.md) — guía para publicar un plugin.
- Step 6: Commit
git add docs/plugin-sdk docs/README.md
git commit -m "docs(plugin-sdk): overview/security/capabilities/authoring (Fase 41)"
Task 16: Final audit — full suite green + no regressions
Files: none (verification only).
- Step 1: Run lint + format
Run:
uv run ruff check packages/jw-core packages/jw-eval packages/jw-rag packages/jw-mcp packages/jw-cli
uv run ruff format --check packages/jw-core packages/jw-eval packages/jw-rag packages/jw-mcp packages/jw-cli
Expected: zero violations.
- Step 2: Run mypy (best-effort)
Run: uv run mypy packages/jw-core/src/jw_core/plugins
Expected: errors only on # type: ignore lines.
- Step 3: Run the entire test suite
Run: uv run pytest packages/ -v --tb=short
Expected: all previous tests + ~47 new plugin-SDK tests + ~6 e2e tests + per-package integration tests = green. No regressions.
- Step 4: End-to-end CLI smoke (with fixture installed)
Run:
uv pip install -e packages/jw-core/tests/fixtures/plugin_sample
uv run jw plugins list
uv run jw plugins verify plugin_sample_agent
uv run jw plugins verify plugin_sample_embedder --group jw_agent_toolkit.embedders
JW_PLUGINS_DISABLED=1 uv run jw plugins list --json | python -c "import json,sys; d=json.load(sys.stdin); assert all(v==[] for v in d.values()); print('disabled OK')"
Expected: list shows plugin_sample_* in each group; verify is OK on both; disabled returns empty groups.
- Step 5: Append Fase 41 to VISION_AUDIT and ROADMAP
Edit docs/VISION_AUDIT.md. Append row to the summary table:
| Fase 41 (plugin SDK) | ✅ Nuevo | `jw_core.plugins` — 5 groups + verify + CLI |
Edit docs/ROADMAP.md. Append section:
## Fase 41 — Plugin SDK ✅
> Tier 2 comunidad. Spec: `docs/superpowers/specs/2026-05-31-fase-41-plugin-sdk-design.md`.
- ✅ Subpaquete nuevo `packages/jw-core/src/jw_core/plugins/`.
- ✅ 5 Protocols + EntryPointSpec + VerifyReport.
- ✅ Discovery via `importlib.metadata.entry_points`, cached con `lru_cache`.
- ✅ Conflict policy: `namespaced` (default), `first_wins`, `last_wins`, `reject`.
- ✅ Env opt-out: `JW_PLUGINS_DISABLED`, `JW_PLUGINS_ALLOW_LIST`, `JW_PLUGINS_DENY_LIST`.
- ✅ Fail-soft default; `JW_PLUGINS_STRICT=1` para fail-hard.
- ✅ Fixture `plugin_sample/` con entry points en los 5 groups.
- ✅ Integración: `jw-eval`, `jw-rag`, `jw-mcp`, `jw-cli`.
- ✅ CLI `jw plugins {list,verify,disable}`.
- ✅ CI job `plugin-sdk` offline.
- ✅ Docs `docs/plugin-sdk/{overview,security,capabilities,authoring}.md`.
### Cobertura de tests
- ✅ ~47 tests nuevos del módulo `jw_core.plugins`.
- ✅ ~6 tests e2e subprocess + fixture install.
- ✅ Integración: `jw-eval`, `jw-rag`, `jw-mcp`, `jw-cli`.
- ✅ Sin regresiones en la suite global.
- Step 6: Final commit
git add docs/VISION_AUDIT.md docs/ROADMAP.md
git commit -m "docs(roadmap): land Fase 41 — plugin SDK"
Self-review summary
- Spec coverage: every spec section maps to a task above:
- Architecture / module layout → Task 1
- Errors → Task 1
- Protocols + EntryPointSpec → Task 2
- Conflict policy + env helpers → Task 3
- Discovery (
registry.py) → Task 4 - Verify + VerifyReport + version constraint → Task 5
- Cached factory → Task 6
- Fixture package → Task 7
- E2E subprocess install → Task 8
jw_corere-export → Task 9- jw-eval integration → Task 10
- jw-rag integration → Task 11
- jw-mcp integration → Task 12
- CLI integration → Task 13
- CI offline job → Task 14
- Docs en español (overview/security/capabilities/authoring) → Task 15
- VISION_AUDIT + ROADMAP rows → Task 16
- No placeholders: every Python and YAML block is fully written, every CLI command has its exact invocation and expected output.
- Type consistency:
EntryPointSpecis the single source-of-truth dataclass returned byget_pluginsand consumed byverify_plugin, jw-eval, jw-rag, jw-mcp, jw-cli.VerifyReportis frozen dataclass with explicitok: boolfield used identically by CLI and tests. All Protocols areruntime_checkableand mirror existing toolkit contracts (EmbedProvider,VLMProvider,GenerationProvider). - Test-first discipline: every task starts with a failing test before implementation. Task 4 introduces the autouse
clear_plugin_cachefixture solru_cachecannot leak across tests. - Determinism: the fixture package is installed editable from a local path (no network). The e2e test creates an ephemeral venv per test module so the host venv stays untouched. CI mirrors the same pattern.
- No-objective enforcement: no sandboxing, no marketplace, no hot-reload, no JS plugins — every “non-goal” from the spec stays absent and is called out explicitly in
security.md(Task 15).
Execution choice
Plan completo. Dos opciones de ejecución:
- Subagent-driven (recomendado) — dispatch fresh sub-agente por tarea, review entre tareas, iteración rápida (
superpowers:subagent-driven-development). - Inline — ejecuto tareas en esta sesión con checkpoints (
superpowers:executing-plans).
¿Cuál prefieres?
Editar esta página en docs/superpowers/plans/2026-05-31-fase-41-plugin-sdk-plan.md