Skip to content

fix: small correctness/security cleanups in experimental skill scripts#534

Open
jamesbroadhead wants to merge 5 commits into
databricks-solutions:experimentalfrom
jamesbroadhead:ace-review-fixes-experimental
Open

fix: small correctness/security cleanups in experimental skill scripts#534
jamesbroadhead wants to merge 5 commits into
databricks-solutions:experimentalfrom
jamesbroadhead:ace-review-fixes-experimental

Conversation

@jamesbroadhead
Copy link
Copy Markdown
Contributor

@jamesbroadhead jamesbroadhead commented May 12, 2026

Summary

Batch of small, contained fixes surfaced by a parallel GPT 5.4 + Gemini 3.1 Pro code review. None of these are introduced by recent changes — they are pre-existing issues that ought to land regardless. Each fix is local and review-by-eye.

Commit history

  • 6a3bdb1 — initial set (see per-file breakdown below)
  • c28037e — round-2: round-1 self-review findings
    • mas_manager._build_agent_list: reject empty / whitespace-only UC segments (catalog..fn)
    • mas_manager.main(): catch ValueError from _build_agent_list so malformed configs exit cleanly via stderr
    • compute.get_cluster_status: add success: True to happy-path return for contract coherence
    • compute.cmd_manage_cluster: drop redundant DatabricksError clause + unused import
    • .tests/test_agent_bricks_manager.py: drop stale add_examples_queued import (was failing collection), fix sys.path, add TestBuildAgentListValidation / TestAddExamplesBatch / TestDeleteResponseHandling
  • 2fbbb4b — round-3: neighbouring consistency + polish
    • mas_manager: typed MASNotFound exception (replaces "does not exist" in str(e) substring matching in MASManager.get())
    • mas_manager: factor empty-body / non-JSON tolerance into _safe_json, apply symmetrically to _get / _post / _patch (was _delete-only)
    • compute.cmd_list_compute: apply the same NotFound handling cmd_manage_cluster got — was missing
    • fm-parallel-calls.py: explicit "Speedup: N/A (total_time below resolution)" instead of silently dropping the line
    • 5-serving-and-vector-search.py: replace dim-3 placeholder with [0.0] * 768 and a comment listing common embedding dimensions
    • .tests/test_agent_bricks_manager.py: add TestResponseErrorHandling covering the typed-404 path and symmetric empty-body handling

Per-file summary

databricks-skills/databricks-agent-bricks/scripts/mas_manager.py

  • _delete tolerates empty / non-JSON DELETE response bodies; _get / _post / _patch do too via shared _safe_json helper.
  • add_examples_batch short-circuits on empty input (ThreadPoolExecutor rejects max_workers=0).
  • _build_agent_list validates uc_function_name has 3 non-empty dotted parts and rejects missing endpoint_name with a clear error.
  • main() wraps dispatch in try/except ValueError for clean stderr/exit on bad agent configs.
  • MASManager.get() uses typed MASNotFound instead of substring-matching error messages.

databricks-skills/databricks-execution-compute/scripts/compute.py

  • manage-cluster --action get keys off the SDK's typed NotFound exception (was substring-matching "does not exist").
  • Same NotFound handling extended to cmd_list_compute clusters resource path.
  • Happy-path get_cluster_status includes success: True so the contract is consistent across all return paths.

databricks-skills/databricks-app-python/examples/llm_config.py (security)

  • OAuth error no longer interpolates the full token-endpoint payload (which can contain id_token / refresh material). Logs the present key names instead.
  • DATABRICKS_MODEL validation error drops the response.text[:300] echo so server bodies don't end up in operator-visible error text.

databricks-skills/databricks-app-python/examples/fm-minimal-chat.py

  • Docstring + app.yaml examples reference the actual filename (fm-minimal-chat.py), not 2-minimal-chat-app.py.

databricks-skills/databricks-app-python/examples/fm-parallel-calls.py

  • Guard Speedup division on total_time > 0; prints "Speedup: N/A (total_time below resolution)" when the metric can't be computed.
  • Convert the trailing standalone triple-quoted string (dead code) to real # comments.

databricks-skills/databricks-python-sdk/examples/5-serving-and-vector-search.py

  • Replace the dim-3 literal-Ellipsis [0.1, 0.2, 0.3, ...] with [0.0] * 768 and a comment naming common embedding dimensions (768 / 1024 / 1536).

databricks-skills/.tests/test_agent_bricks_manager.py

  • Unbreak collection (stale add_examples_queued import + wrong sys.path).
  • New negative tests for _build_agent_list ValueError branches, add_examples_batch([]), _delete empty/non-JSON bodies, MASNotFound typed-404 path, and _post symmetric empty-body handling.

CI caveat

Nothing under databricks-skills/.tests/ is currently wired into CI — .github/workflows/ci.yml only lints databricks-tools-core, databricks-mcp-server, and .test/src. That's why the broken add_examples_queued import had gone unnoticed. The new tests run via local pytest databricks-skills/.tests/ only. Wiring them into CI is a separate concern.

Companion PR

Equivalent (smaller) PR for mainmas_manager.py and compute.py don't exist under databricks-skills/ on main: #535

Test plan

  • python3 -m py_compile on all modified files
  • WORKTREE=<branch> python3 /tmp/tdd_verify.py — 12 / 12 pass post-fix on this branch
  • Full test_agent_bricks_manager.py run — 18 / 18 unit tests pass
  • cmd_list_compute NotFound path verified via mocked SDK
  • CI green

This pull request and its description were written by Isaac.

@jamesbroadhead
Copy link
Copy Markdown
Contributor Author

Hi @calreynolds — could you take a look at this when you have a moment? Small set of fixes from a code review. Companion PR for main is #535.

— this comment was written by Claude

jamesbroadhead added a commit to jamesbroadhead/ai-dev-kit that referenced this pull request May 15, 2026
…ain)

Round-3 review surfaced that nothing under databricks-skills/.tests/ was
being run by CI on either branch. This wires up a unit-tests job for
forward compatibility: when the .tests/ tree lands on main (currently
on experimental only — companion PR databricks-solutions#534), CI will start running it
automatically without a separate plumbing PR.

The job guards on the directory's existence so it no-ops cleanly on
branches that don't have .tests/ yet (i.e. main today). Once the tree
arrives, the guard becomes a no-op and pytest runs.

uvx + pytest + databricks-sdk + requests, deselecting
@pytest.mark.integration (those need a real Databricks workspace).

Co-authored-by: Isaac
Batch of small fixes surfaced by a parallel GPT 5.4 + Gemini 3.1 Pro
review. None of these are introduced by recent changes; they are
pre-existing issues that ought to land regardless.

databricks-agent-bricks/scripts/mas_manager.py:
- `_delete` now tolerates empty / non-JSON DELETE response bodies
  instead of raising JSONDecodeError on success.
- `add_examples_batch` short-circuits on empty input (ThreadPoolExecutor
  rejects max_workers=0).
- `_build_agent_list` validates `uc_function_name` has 3 dotted parts
  and rejects missing `endpoint_name` with a clear error, instead of
  IndexError / `{"name": null}` on the API.
- `main()` uses a `_parse_json_arg` helper that exits cleanly on
  malformed JSON CLI args, replacing raw `json.loads` tracebacks.

databricks-execution-compute/scripts/compute.py:
- `manage-cluster --action get` now keys off the SDK's `NotFound`
  exception type instead of substring-matching "does not exist" in the
  message. Also returns `success: false` for the DELETED state, so
  callers gating on `success` don't treat a missing cluster as a
  successful lookup.

databricks-app-python/examples/llm_config.py:
- OAuth error no longer interpolates the full token-endpoint payload
  (which can contain `id_token` / refresh material). Logs the present
  key names instead.
- DATABRICKS_MODEL validation error drops the `response.text[:300]`
  echo so server bodies don't end up in operator-visible error text.

databricks-app-python/examples/fm-minimal-chat.py:
- Docstring + `app.yaml` examples reference the actual filename
  (`fm-minimal-chat.py`), not `2-minimal-chat-app.py`.

databricks-app-python/examples/fm-parallel-calls.py:
- Guard `Speedup` division on `total_time > 0` to avoid
  ZeroDivisionError on fast paths.
- Convert the trailing standalone triple-quoted string (dead code) to
  real `#` comments.

databricks-python-sdk/examples/5-serving-and-vector-search.py:
- Replace the `[0.1, 0.2, 0.3, ...]` literal-Ellipsis vector with a
  named placeholder + comment explaining it's a stand-in.

This pull request was AI-assisted by Isaac.

Co-authored-by: Isaac
Follow-up to the ACE multi-model self-review on this branch.

mas_manager.py
- _build_agent_list: reject UC function names with empty / whitespace-only
  segments (e.g. "catalog..fn"). The previous len==3 check let these through
  and they reached the API as schema="".
- main(): catch ValueError from _build_agent_list so malformed agent configs
  exit cleanly via stderr instead of dumping a raw traceback — consistent
  with the JSON-arg handling _parse_json_arg already provides.

compute.py
- get_cluster_status: add success=True to the happy-path return so the
  manage-cluster get contract is coherent (success absent on success vs.
  success=False on missing/error was confusing).
- cmd_manage_cluster: drop the redundant DatabricksError clause (identical
  body to the generic Exception fallback) and the now-unused import.

.tests/test_agent_bricks_manager.py
- Drop the stale `add_examples_queued` import (function does not exist),
  which was causing the entire test module to fail collection.
- Fix sys.path to point at databricks-agent-bricks/scripts (not the parent),
  so mas_manager imports actually resolve.
- Add TestBuildAgentListValidation, TestAddExamplesBatch, and
  TestDeleteResponseHandling covering every new error path / contract
  change introduced in the previous commit on this branch.

Co-authored-by: Isaac
Sweep of issues neighboring the round-1/2 fixes that were called out in
the round-3 review.

mas_manager.py
- Introduce typed MASNotFound exception. _handle_response_error now raises
  it on HTTP 404 so MASManager.get() can branch on existence cleanly
  instead of substring-matching "does not exist" / "not found" in the
  error message (the same anti-pattern the round-1 fix removed from
  compute.py).
- Factor empty-body / non-JSON tolerance from _delete into a shared
  _safe_json helper and apply it symmetrically to _get / _post / _patch.
  Eliminates a class of latent JSONDecodeError on empty 2xx bodies.

compute.py
- cmd_list_compute's clusters resource path also called get_cluster_status
  with no try/except, so a stale --cluster-id raw-tracebacked. Apply the
  same typed-NotFound handling cmd_manage_cluster already uses.

fm-parallel-calls.py
- When total_time is below resolution, print "Speedup: N/A
  (total_time below resolution)" instead of silently omitting the line.

5-serving-and-vector-search.py
- Replace the dim-3 [0.1, 0.2, 0.3] placeholder with [0.0] * 768 and a
  comment listing common embedding dimensions (768 / 1024 / 1536). The
  earlier value was small enough to mislead copy-pasters into thinking
  any short list works.

.tests/test_agent_bricks_manager.py
- Add TestResponseErrorHandling covering the 4 new behaviors: 404 raises
  MASNotFound, MASManager.get() returns None on 404, non-404 errors still
  propagate, and _post tolerates empty success bodies.

Note: nothing under databricks-skills/.tests/ is currently wired into
CI (.github/workflows/ci.yml only lints databricks-tools-core,
databricks-mcp-server, .test/src). The tests run via local pytest only.
Separate concern, separate PR.

Co-authored-by: Isaac
Round-3 review surfaced that nothing under databricks-skills/.tests/ was
being run by CI — which is why the stale add_examples_queued import
(collection failure) had gone unnoticed in this branch's parent.

Add a skills-unit-tests job that runs the .tests/ tree via uvx with
pytest + databricks-sdk + requests, deselecting @pytest.mark.integration
(those need a real Databricks workspace).

Also add experimental to the pull_request branches list so PRs targeting
experimental — like this one — trigger CI at all. Previously CI only
fired on PRs to main, leaving experimental-branch work entirely
unvalidated.

Local trial run: 18 unit tests pass, 3 integration tests correctly
deselected, ~25s wall-clock including dependency install.

Co-authored-by: Isaac
The script this targets — databricks-genie/scripts/conversation.py —
was removed in 8fd7f31 ("Replace Genie conversation.py script with pure
CLI flow"). The test file's `from conversation import ...` has been
unimportable ever since, but nothing was running tests in this tree
(see ci.yml until two commits ago), so the breakage went unnoticed.

With the new skills-unit-tests CI job pointed at databricks-skills/.tests/,
this file's collection error halts the whole run before any real test
executes. Removing it.
Co-authored-by: Isaac
@jamesbroadhead jamesbroadhead force-pushed the ace-review-fixes-experimental branch from b64fc16 to 4b0bf26 Compare May 15, 2026 14:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant