Merge pull request #42 from ayhammouda/release/v0.2.0

ayhammouda · web-flow · commit 0ec7ef706764 · 2026-05-29T11:14:13.000+02:00
chore(release): v0.2.0
diff --git a/.github/TEST-STRATEGY.md b/.github/TEST-STRATEGY.md
@@ -0,0 +1,101 @@
+# Test Strategy
+
+Canonical map of **what we test, at which layer, and where the gaps are**.
+Pairs with the how-to-run instructions in `CONTRIBUTING.md` and the manual
+client runbook in `.github/INTEGRATION-TEST.md`.
+
+Last verified: 2026-05-29 — 284 tests passing, ruff clean, pyright 0 errors.
+
+## 1. The pyramid (current shape)
+
+```
+        /  stdio E2E   \      9 tests   — real MCP process over stdio
+       /  integration   \    ~38 tests  — multi-version, publish, cache, phase1
+      /   unit / service  \  ~230 tests — services, retrieval, ingestion, compare
+     /  contract + regression \  38 tests — schema snapshots, stability, curated cases
+```
+
+This is the right shape: a wide, fast base and a thin, slow top. Push new
+tests **down** the pyramid: add an stdio E2E test only when a bug can
+manifest *only* across the process boundary (framing, lifespan DI, stdout
+hygiene).
+
+## 2. Expected features → coverage map
+
+The server exposes **6 MCP tools**. Every tool must have at least one
+behavioral test and appear in the schema snapshot.
+
+| Tool / feature         | Primary tests                                                        | Layer            | Status |
+|------------------------|----------------------------------------------------------------------|------------------|--------|
+| `search_docs`          | `test_services`, `test_retrieval`, `test_synonyms`, `test_stability` | unit + regression| STRONG |
+| `get_docs`             | `test_services`, `test_retrieval`, `test_persistent_docs_cache`, `test_mcp_get_docs_cache_smoke` | unit + integration | STRONG |
+| `list_versions`        | `test_services`, `test_multi_version`                                | unit + integration| GOOD   |
+| `compare_versions`     | `test_compare_versions` (15), `test_services`                        | unit             | GOOD   |
+| `lookup_package_docs`  | `test_package_docs` (8)                                              | unit only        | THIN   |
+| `detect_python_version`| `test_detection` (12)                                               | unit             | GOOD   |
+
+Cross-cutting coverage:
+
+- **Schema contract**: `test_schema.py`, `test_schema_snapshot.py` — input/output
+  JSON schemas for each tool are frozen as fixtures; a wire-shape change fails CI.
+- **Multi-version routing**: `test_multi_version.py` — version param resolution and
+  default fallback across indexed doc sets.
+- **Regression**: `test_retrieval_regression.py` (curated query→expected cases) and
+  `test_stability.py` (property-based invariants that survive CPython doc revisions).
+- **Process hygiene**: `test_stdio_smoke.py`, `test_stdio_hygiene.py` — confirm a real
+  stdio server starts, answers, and keeps stdout free of non-protocol noise.
+- **Packaging / CI**: `test_packaging.py`, `test_ci_workflows.py` — installable
+  artifact + workflow file invariants.
+
+## 3. What to test, by component type
+
+- **Services** (`services/`): business logic in isolation against a `tmp_path`
+  SQLite fixture. Cover the happy path, every error branch (`DocsServerError`
+  subclasses), and token-budget trimming.
+- **Retrieval/ranking** (`retrieval/`): query parsing, FTS5 behavior, ranker
+  ordering. Use property assertions (`>= 1 result`, substring match) over exact
+  content so upstream doc edits don't break the suite.
+- **Ingestion** (`ingestion/`): parse valid + deliberately broken `.fjson`
+  fixtures; assert idempotency on re-publish.
+- **Server layer** (`server.py`): thin — it only delegates to services and maps
+  `DocsServerError → ToolError`. Cover that mapping via stdio smoke, not unit tests.
+- **Detection** (`detection.py`): pure environment probing; the former coverage gap is closed (see sections 5 and 6).
+
+## 4. Coverage targets
+
+No line-coverage gate is enforced (no `pytest-cov` in the dev deps). The bar is
+**behavioral**, not numeric:
+
+- Every public tool has ≥1 happy-path + ≥1 error-path test.
+- Every `errors.py` exception type is raised by at least one test.
+- Every wire-facing model is pinned by a schema snapshot.
+
+Adopt these as the definition of done for new tools. A line-coverage gate is
+optional future work; if added, target the `services/` and `retrieval/`
+packages, not `server.py` (intentionally thin) or `__main__.py`.
+
+## 5. Known gaps
+
+1. **`detection.py` — CLOSED (2026-05-29).** `tests/test_detection.py` now
+   covers all three branches of the fallback chain (`.python-version` file →
+   `python3` in PATH → `sys.version_info`), `_parse_major_minor` parsing, and
+   `match_to_indexed` — 12 tests. The isolation pattern (`monkeypatch.chdir` to
+   escape a real `.python-version`, `monkeypatch.setattr` on `subprocess.run`)
+   is the reference for testing order-dependent environment probing.
+2. **`lookup_package_docs` has no stdio smoke test (LOW).** Covered at the service
+   layer only; the PyPI-allowlist trust boundary is never exercised end-to-end.
+3. **No negative version-resolution E2E (LOW).** Unknown-version errors are
+   unit-tested but not asserted over the stdio boundary.
+
+## 6. Reference cases — `detection.py` (now implemented in `test_detection.py`)
+
+| Case                                   | Expectation                              |
+|----------------------------------------|------------------------------------------|
+| `.python-version` file present in cwd  | returns `(version, ".python-version file")` |
+| `.python-version` malformed / empty    | falls through to next source, no crash   |
+| no file, `python3` on PATH             | returns `(version, "python3 in PATH")`   |
+| no file, no `python3`                  | returns runtime `(X.Y, "server runtime")`|
+| `_parse_major_minor("Python 3.13.2")`  | `"3.13"`                                  |
+| `_parse_major_minor("no digits here")` | `None`                                    |
+| `match_to_indexed("3.13", ["3.13"])`   | `"3.13"`                                  |
+| `match_to_indexed("3.9", ["3.13"])`    | `None`                                    |
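The reference cases in section 6 imply a fallback chain roughly like the sketch below. It is a minimal, hypothetical reimplementation written only from the table and the tests in this PR; the real `detection.py` is not part of this diff. The names `parse_major_minor` and `detect`, the regex, the explicit `cwd` parameter, and the 5-second timeout are all assumptions made for illustration.

```python
from __future__ import annotations

import re
import subprocess
import sys
from pathlib import Path

# Assumed pattern: the first "X.Y" pair found anywhere in the string.
_MAJOR_MINOR = re.compile(r"(\d+)\.(\d+)")


def parse_major_minor(raw: str) -> str | None:
    """Return "X.Y" from strings like "Python 3.13.2", or None if absent."""
    match = _MAJOR_MINOR.search(raw)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def match_to_indexed(version: str, indexed: list[str]) -> str | None:
    """Return the version only if it is exactly one of the indexed versions."""
    return version if version in indexed else None


def detect(cwd: Path) -> tuple[str, str]:
    """Walk the three sources in order and report which one answered."""
    # 1. A .python-version file in the working directory.
    pin = cwd / ".python-version"
    if pin.is_file():
        version = parse_major_minor(pin.read_text())
        if version:
            return version, ".python-version file"
        # Malformed or empty file: fall through instead of raising.

    # 2. Whatever python3 is first on PATH.
    try:
        proc = subprocess.run(
            ["python3", "--version"],
            capture_output=True,
            text=True,
            timeout=5,  # assumed value, not taken from the source
            check=False,
        )
        version = parse_major_minor(proc.stdout or proc.stderr)
        if version:
            return version, "python3 in PATH"
    except (OSError, subprocess.TimeoutExpired):
        # OSError covers FileNotFoundError when python3 is missing.
        pass

    # 3. The interpreter running this process.
    return f"{sys.version_info.major}.{sys.version_info.minor}", "server runtime"
```

Taking `cwd` as an argument is a choice made here so the sketch can be exercised without `monkeypatch.chdir`; the tests in this PR call the real `detect_python_version()` with no arguments.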
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,6 +4,40 @@ All notable changes to `python-docs-mcp-server` are documented here.
 Format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/);
 this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.2.0] — 2026-05-29
+
+### Added
+
+- **New MCP tool: `compare_versions(symbol, v1, v2)`** (Phase 09). Diffs a Python
+  stdlib symbol between two indexed versions and returns a structured result with
+  `change=added|removed|changed|unchanged` plus optional `new_in`, `changed_in`,
+  `deprecated_in`, `signature_delta` (advisory heuristic), `see_also_added`,
+  `see_also_removed`, `section_diff`, and `note` fields. Token-frugal by design —
+  emits only changed fields, not full page content. Both versions must be indexed;
+  an unknown version raises an actionable error naming the available versions. This
+  brings the server to a **six-tool surface**. ([#41](https://github.com/ayhammouda/python-docs-mcp-server/pull/41))
+
+### Security
+
+- Bumped two transitive dependencies to patched releases:
+  - `idna` 3.13 → 3.17 — resolves CVE-2026-45409 (ReDoS in `idna.encode()`).
+  - `starlette` 1.0.0 → 1.2.0 — resolves PYSEC-2026-161 ("BadHost", a `Host`-header
+    auth bypass that explicitly affects MCP servers).
+  Both arrive via the `mcp` / `sse-starlette` chain; no direct-dependency or API
+  changes. `pip-audit` reports no known vulnerabilities after the bump.
+
+### Changed
+
+- `services/compare.py` extractors simplified — precompiled the four Sphinx-directive
+  regexes and collapsed three near-identical `_extract_*` helpers into one.
+
+### Docs
+
+- README tools table and `.github/INTEGRATION-TEST.md` updated to document the full
+  six-tool surface including `compare_versions`.
+- Added `.github/TEST-STRATEGY.md` — canonical map of test layers, the feature→coverage
+  matrix, and known gaps.
+
 ## [0.1.6] — 2026-05-14
 
 ### Fixed
diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "python-docs-mcp-server"
-version = "0.1.6"
+version = "0.2.0"
 description = "The canonical Python stdlib oracle for AI coding agents — exact symbols, exact sections, exact versions, offline, always free, always MIT, token-frugal."
 readme = "README.md"
 license = "MIT"
diff --git a/server.json b/server.json
@@ -8,13 +8,13 @@
     "url": "https://github.com/ayhammouda/python-docs-mcp-server",
     "source": "github"
   },
-  "version": "0.1.6",
+  "version": "0.2.0",
   "packages": [
     {
       "registryType": "pypi",
       "registryBaseUrl": "https://pypi.org",
       "identifier": "python-docs-mcp-server",
-      "version": "0.1.6",
+      "version": "0.2.0",
       "runtimeHint": "uvx",
       "transport": {
         "type": "stdio"
diff --git a/tests/test_detection.py b/tests/test_detection.py
@@ -0,0 +1,125 @@
+"""Unit tests for environment Python-version detection (detection.py).
+
+Closes the long-standing gap: ``detect_python_version`` backs one of the six
+public MCP tools but had no dedicated coverage. See ``.github/TEST-STRATEGY.md``
+sections 5 and 6.
+
+Detection is a *fallback chain*:
+    1. ``.python-version`` file in cwd
+    2. ``python3 --version`` on PATH
+    3. ``sys.version_info`` (server runtime)
+
+To test any branch in isolation we must neutralize the branches *above* it:
+escape the dev machine's real ``.python-version`` with ``monkeypatch.chdir``,
+and control the ``python3`` probe by patching ``subprocess.run``.
+"""
+from __future__ import annotations
+
+import subprocess
+import sys
+
+import pytest
+
+from mcp_server_python_docs import detection
+from mcp_server_python_docs.detection import (
+    _parse_major_minor,
+    detect_python_version,
+    match_to_indexed,
+)
+
+# ── _parse_major_minor: pure regex extraction ──────────────────────
+
+@pytest.mark.parametrize(
+    "raw, expected",
+    [
+        ("3.13.2", "3.13"),
+        ("Python 3.13.2", "3.13"),
+        ("cpython-3.13", "3.13"),
+        ("3.9", "3.9"),
+        ("no digits here", None),
+        ("", None),
+    ],
+)
+def test_parse_major_minor(raw: str, expected: str | None) -> None:
+    assert _parse_major_minor(raw) == expected
+
+
+# ── match_to_indexed: only return exact, indexed matches ───────────
+
+def test_match_to_indexed_returns_exact_match() -> None:
+    assert match_to_indexed("3.13", ["3.12", "3.13"]) == "3.13"
+
+
+def test_match_to_indexed_returns_none_when_absent() -> None:
+    assert match_to_indexed("3.9", ["3.12", "3.13"]) is None
+
+
+# ── detect_python_version: the fallback chain ──────────────────────
+
+def test_detects_from_python_version_file(tmp_path, monkeypatch) -> None:
+    """Branch 1: a .python-version file in cwd wins over everything else."""
+    monkeypatch.chdir(tmp_path)
+    (tmp_path / ".python-version").write_text("3.11.4\n")
+
+    version, source = detect_python_version()
+
+    assert version == "3.11"
+    assert source == ".python-version file"
+
+
+def test_malformed_version_file_falls_through(tmp_path, monkeypatch) -> None:
+    """Branch 1 with no parseable version must NOT crash — it falls through.
+
+    We stub the PATH probe so the assertion is deterministic regardless of
+    what ``python3`` the host actually has.
+    """
+    monkeypatch.chdir(tmp_path)
+    (tmp_path / ".python-version").write_text("not-a-version\n")
+
+    def fake_run(*args, **_kwargs):
+        return subprocess.CompletedProcess(args, 0, stdout="Python 3.12.1\n", stderr="")
+
+    monkeypatch.setattr(detection.subprocess, "run", fake_run)
+
+    version, source = detect_python_version()
+
+    assert version == "3.12"
+    assert source == "python3 in PATH"
+
+
+def test_detects_from_path_probe(tmp_path, monkeypatch) -> None:
+    """Branch 2: no .python-version file, so the python3 PATH probe wins.
+
+    chdir to an empty tmp dir to neutralize branch 1 (any real
+    .python-version on the host), then stub the probe deterministically.
+    """
+    monkeypatch.chdir(tmp_path)
+
+    def fake_run(*args, **_kwargs):
+        return subprocess.CompletedProcess(args, 0, stdout="Python 3.10.9\n", stderr="")
+
+    monkeypatch.setattr(detection.subprocess, "run", fake_run)
+
+    version, source = detect_python_version()
+
+    assert version == "3.10"
+    assert source == "python3 in PATH"
+
+
+def test_falls_back_to_runtime_when_no_python3(tmp_path, monkeypatch) -> None:
+    """Branch 3: no file and no python3 on PATH -> server's own interpreter.
+
+    Neutralize branch 1 (empty cwd) and force branch 2 to fail by making the
+    probe raise FileNotFoundError, exactly as a missing python3 would.
+    """
+    monkeypatch.chdir(tmp_path)
+
+    def boom(*args, **_kwargs):
+        raise FileNotFoundError("python3 not found")
+
+    monkeypatch.setattr(detection.subprocess, "run", boom)
+
+    version, source = detect_python_version()
+
+    assert source == "server runtime"
+    assert version == f"{sys.version_info.major}.{sys.version_info.minor}"
diff --git a/uv.lock b/uv.lock