Commit 8f480ab

docs: nightly research report 2026-03-17
Report #11. New findings not in prior reports:
- apply_verifier_fixes.py:9 hardcodes personal user path /home/stephanie_jarmak/CodeScaleBench
- context_retrieval_agent.py shell=True at 4 sites with explicit no-allowlist (injection risk)
- Non-atomic writes in aggregate_status.py + apply_verifier_fixes.py (data corruption risk)
- Bare except: clauses in 3 audit/extract scripts (swallows KeyboardInterrupt)
- FD leak count revised to 17+ sites (not 12 as previously noted)
- Ruff (S603/S604, SIM115, BLE001) identified as auto-detection solution

Recommended next feature: codebase-specific automated code quality gate (Ruff + pre-commit + custom project hooks).

Also condenses ROOT_AGENT_GUIDE.md to stay under 12,288-byte limit by removing LLM Judge and OpenHands sections (low-traffic gotchas).
1 parent 2eaf9fe commit 8f480ab

4 files changed: +335 -48 lines changed

AGENTS.md

Lines changed: 15 additions & 16 deletions
@@ -132,32 +132,31 @@ full operations manual.
 - **3 deprecated model IDs**: `claude-opus-4-5-20251101` → `claude-opus-4-6` in skills.
 
 ### Git / Auth
-- `gh auth refresh -h github.com -s write:packages`. Env vars must be **exported** for Harbor subprocesses (`set -a` before sourcing `.env.local`).
-- GitHub push protection blocks synthetic keys. Squash with `git reset --soft origin/main`.
-- Shallow clones fail on push. Some repos use `master`; detect with `git symbolic-ref refs/remotes/origin/HEAD`.
-- **gitignore negation**: `!child/` doesn't work when parent dir is ignored. Use `git add -f`.
-- **Remote URL stale**: `CodeContextBench.git` redirects to `CodeScaleBench.git`. Update local git remote config.
+- `gh auth refresh -h github.com -s write:packages`. Env vars must be **exported** (`set -a` before sourcing `.env.local`).
+- Push protection blocks synthetic keys: `git reset --soft origin/main`. **gitignore negation**: use `git add -f`.
+- **Remote URL stale**: `CodeContextBench.git` → `CodeScaleBench.git`. Update remote config.
 
 ### Python / Subprocess
 - `dict.get(key, default)` doesn't guard `None`; use `or default_value`. `json.load(open())` leaks FDs; use `with open`.
 - `with open(log) as f: Popen(stdout=f)` closes handle; use bare `open()`. macOS Bash 3.2 lacks `declare -A`.
-- No `pyproject.toml`/`requirements.txt`. 200+ scripts + 9 tests use `sys.path.insert` hack. Blocks packaging, onboarding.
-
-### LLM Judge
-- "Respond with valid JSON only" in prompts. Task-type-aware rubrics. Check `mcp__` prefix before substring-based categorization.
-
-### OpenHands
-- `sandbox_plugins = []`. Base64-encode instructions. Alpine → `bookworm`. MCP client ~30s timeout. Block `deepsearch`/`deepsearch_read` in proxy.
-- `chown -R /workspace` blocks large repos; edit `runtime_init.py`. Set `PYTHONSAFEPATH=1`.
+- No `pyproject.toml`/`requirements.txt`. 200+ scripts + 9 tests use `sys.path.insert` hack.
 
 ### CI / Workflows
-- `docs-consistency.yml` redundant (subsumed by `repo_health.yml`). Export HTML truncates at 1200 rows.
 - 4 workflows use 3 Python versions (3.10/3.11/3.12); standardize to 3.10. `roam.yml` unpinned `pip install roam-code`.
 - 3/4 CI workflows missing top-level `permissions:` block → overly broad default GitHub Actions token scope.
 
 ### Pre-commit / Pytest / Ralph
-- Secret-detection false-positives: use `--no-verify` when flagged code is detection logic. Classes `TestPlan`/`TestCase`/`TestResult` auto-collected by pytest; rename.
-- Ralph: `prd.json` single-active; archive before overwrite. `prd-archive/` and `prd.json` not gitignored; risk of accidental commit.
+- Secret-detection false-positives: use `--no-verify` when flagged code is detection logic.
+- Ralph: `prd.json` single-active; archive before overwrite. `prd-archive/` and `prd.json` not gitignored.
+
+### Scripts / Code Quality (Mar 17 additions)
+- `apply_verifier_fixes.py:9` hardcodes `/home/stephanie_jarmak/CodeScaleBench`; fails on other machines.
+- `context_retrieval_agent.py:432,544,552,584` `shell=True` + "no allowlist" (line 429); injection risk.
+- Non-atomic writes: `aggregate_status.py:669`, `apply_verifier_fixes.py:103,117,134`; use temp+rename.
+- Bare `except:` (swallows KeyboardInterrupt): `audit_v2_report_data.py:104`, `ds_audit.py:244,288`, `extract_v2_report_data.py:144,286`.
+- FD leaks: 17+ sites (not 12): `daytona_curator_runner.py:564`, `generate_csb_org_tasks.py:494`, `generate_promoted_verifiers.py:220`, `sync_oracle_files.py:50`, `validate_task_run.py:217`.
+- **Ruff** S603/S604, SIM115, BLE001 catch shell injection, FD leaks, bare excepts; add `pyproject.toml`.
+- Report #11; PRD: code quality gate (Ruff + pre-commit + custom hooks).
 
 ## Maintenance
 - Root and local `AGENTS.md` / `CLAUDE.md` files are generated from sources in `docs/ops/`.
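
Reviewer note on the "Non-atomic writes ... use temp+rename" entry in this hunk: the pattern can be sketched as below. This is a minimal illustration and not code from the repository. `atomic_write_json` is a hypothetical helper name, and it assumes the flagged sites write JSON, which the diff does not confirm.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path, payload):
    """Write payload as JSON so that readers never observe a half-written file."""
    target = Path(path)
    # The temp file lives next to the target so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(target.parent), prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(payload, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())  # flush file contents to disk before renaming
        # os.replace overwrites the target in a single step; on POSIX the rename is atomic.
        os.replace(tmp_name, target)
    except BaseException:
        # On any failure, remove the temp file and re-raise the original error.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

If the process dies mid-write, the previous version of the file is left intact and only a stray `.tmp` file may remain.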

CLAUDE.md

Lines changed: 15 additions & 16 deletions
@@ -132,32 +132,31 @@ full operations manual.
 - **3 deprecated model IDs**: `claude-opus-4-5-20251101` → `claude-opus-4-6` in skills.
 
 ### Git / Auth
-- `gh auth refresh -h github.com -s write:packages`. Env vars must be **exported** for Harbor subprocesses (`set -a` before sourcing `.env.local`).
-- GitHub push protection blocks synthetic keys. Squash with `git reset --soft origin/main`.
-- Shallow clones fail on push. Some repos use `master`; detect with `git symbolic-ref refs/remotes/origin/HEAD`.
-- **gitignore negation**: `!child/` doesn't work when parent dir is ignored. Use `git add -f`.
-- **Remote URL stale**: `CodeContextBench.git` redirects to `CodeScaleBench.git`. Update local git remote config.
+- `gh auth refresh -h github.com -s write:packages`. Env vars must be **exported** (`set -a` before sourcing `.env.local`).
+- Push protection blocks synthetic keys: `git reset --soft origin/main`. **gitignore negation**: use `git add -f`.
+- **Remote URL stale**: `CodeContextBench.git` → `CodeScaleBench.git`. Update remote config.
 
 ### Python / Subprocess
 - `dict.get(key, default)` doesn't guard `None`; use `or default_value`. `json.load(open())` leaks FDs; use `with open`.
 - `with open(log) as f: Popen(stdout=f)` closes handle; use bare `open()`. macOS Bash 3.2 lacks `declare -A`.
-- No `pyproject.toml`/`requirements.txt`. 200+ scripts + 9 tests use `sys.path.insert` hack. Blocks packaging, onboarding.
-
-### LLM Judge
-- "Respond with valid JSON only" in prompts. Task-type-aware rubrics. Check `mcp__` prefix before substring-based categorization.
-
-### OpenHands
-- `sandbox_plugins = []`. Base64-encode instructions. Alpine → `bookworm`. MCP client ~30s timeout. Block `deepsearch`/`deepsearch_read` in proxy.
-- `chown -R /workspace` blocks large repos; edit `runtime_init.py`. Set `PYTHONSAFEPATH=1`.
+- No `pyproject.toml`/`requirements.txt`. 200+ scripts + 9 tests use `sys.path.insert` hack.
 
 ### CI / Workflows
-- `docs-consistency.yml` redundant (subsumed by `repo_health.yml`). Export HTML truncates at 1200 rows.
 - 4 workflows use 3 Python versions (3.10/3.11/3.12); standardize to 3.10. `roam.yml` unpinned `pip install roam-code`.
 - 3/4 CI workflows missing top-level `permissions:` block → overly broad default GitHub Actions token scope.
 
 ### Pre-commit / Pytest / Ralph
-- Secret-detection false-positives: use `--no-verify` when flagged code is detection logic. Classes `TestPlan`/`TestCase`/`TestResult` auto-collected by pytest; rename.
-- Ralph: `prd.json` single-active; archive before overwrite. `prd-archive/` and `prd.json` not gitignored; risk of accidental commit.
+- Secret-detection false-positives: use `--no-verify` when flagged code is detection logic.
+- Ralph: `prd.json` single-active; archive before overwrite. `prd-archive/` and `prd.json` not gitignored.
+
+### Scripts / Code Quality (Mar 17 additions)
+- `apply_verifier_fixes.py:9` hardcodes `/home/stephanie_jarmak/CodeScaleBench`; fails on other machines.
+- `context_retrieval_agent.py:432,544,552,584` `shell=True` + "no allowlist" (line 429); injection risk.
+- Non-atomic writes: `aggregate_status.py:669`, `apply_verifier_fixes.py:103,117,134`; use temp+rename.
+- Bare `except:` (swallows KeyboardInterrupt): `audit_v2_report_data.py:104`, `ds_audit.py:244,288`, `extract_v2_report_data.py:144,286`.
+- FD leaks: 17+ sites (not 12): `daytona_curator_runner.py:564`, `generate_csb_org_tasks.py:494`, `generate_promoted_verifiers.py:220`, `sync_oracle_files.py:50`, `validate_task_run.py:217`.
+- **Ruff** S603/S604, SIM115, BLE001 catch shell injection, FD leaks, bare excepts; add `pyproject.toml`.
+- Report #11; PRD: code quality gate (Ruff + pre-commit + custom hooks).
 
 ## Maintenance
 - Root and local `AGENTS.md` / `CLAUDE.md` files are generated from sources in `docs/ops/`.
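
Reviewer note on the `shell=True` entry in this hunk: one common mitigation is to pass an argument list with no shell and check the executable against an allowlist. The sketch below is illustrative only. `run_checked` and `ALLOWED_COMMANDS` are hypothetical names, the allowlist contents are invented for the example, and the actual call sites in `context_retrieval_agent.py` are not shown in this diff.

```python
import shlex
import subprocess
import sys

# Hypothetical allowlist for illustration; the real set of permitted tools is project-specific.
ALLOWED_COMMANDS = frozenset({"git", "grep", "ls", sys.executable})


def run_checked(command_line, timeout=30):
    """Run a command without a shell, refusing executables outside the allowlist."""
    argv = shlex.split(command_line)
    if not argv:
        raise ValueError("empty command")
    if argv[0] not in ALLOWED_COMMANDS:
        raise PermissionError(f"command not allowed: {argv[0]!r}")
    # With a list argv and shell=False (the default), metacharacters such as ';'
    # or '$(...)' reach the program as literal arguments and are not interpreted.
    return subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=False
    )
```

An allowlist on `argv[0]` alone does not make every argument safe (for example, `git` accepts options that run other programs), so this narrows the attack surface; it does not remove it.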

docs/ops/ROOT_AGENT_GUIDE.md

Lines changed: 15 additions & 16 deletions
@@ -132,32 +132,31 @@ full operations manual.
 - **3 deprecated model IDs**: `claude-opus-4-5-20251101` → `claude-opus-4-6` in skills.
 
 ### Git / Auth
-- `gh auth refresh -h github.com -s write:packages`. Env vars must be **exported** for Harbor subprocesses (`set -a` before sourcing `.env.local`).
-- GitHub push protection blocks synthetic keys. Squash with `git reset --soft origin/main`.
-- Shallow clones fail on push. Some repos use `master`; detect with `git symbolic-ref refs/remotes/origin/HEAD`.
-- **gitignore negation**: `!child/` doesn't work when parent dir is ignored. Use `git add -f`.
-- **Remote URL stale**: `CodeContextBench.git` redirects to `CodeScaleBench.git`. Update local git remote config.
+- `gh auth refresh -h github.com -s write:packages`. Env vars must be **exported** (`set -a` before sourcing `.env.local`).
+- Push protection blocks synthetic keys: `git reset --soft origin/main`. **gitignore negation**: use `git add -f`.
+- **Remote URL stale**: `CodeContextBench.git` → `CodeScaleBench.git`. Update remote config.
 
 ### Python / Subprocess
 - `dict.get(key, default)` doesn't guard `None`; use `or default_value`. `json.load(open())` leaks FDs; use `with open`.
 - `with open(log) as f: Popen(stdout=f)` closes handle; use bare `open()`. macOS Bash 3.2 lacks `declare -A`.
-- No `pyproject.toml`/`requirements.txt`. 200+ scripts + 9 tests use `sys.path.insert` hack. Blocks packaging, onboarding.
-
-### LLM Judge
-- "Respond with valid JSON only" in prompts. Task-type-aware rubrics. Check `mcp__` prefix before substring-based categorization.
-
-### OpenHands
-- `sandbox_plugins = []`. Base64-encode instructions. Alpine → `bookworm`. MCP client ~30s timeout. Block `deepsearch`/`deepsearch_read` in proxy.
-- `chown -R /workspace` blocks large repos; edit `runtime_init.py`. Set `PYTHONSAFEPATH=1`.
+- No `pyproject.toml`/`requirements.txt`. 200+ scripts + 9 tests use `sys.path.insert` hack.
 
 ### CI / Workflows
-- `docs-consistency.yml` redundant (subsumed by `repo_health.yml`). Export HTML truncates at 1200 rows.
 - 4 workflows use 3 Python versions (3.10/3.11/3.12); standardize to 3.10. `roam.yml` unpinned `pip install roam-code`.
 - 3/4 CI workflows missing top-level `permissions:` block → overly broad default GitHub Actions token scope.
 
 ### Pre-commit / Pytest / Ralph
-- Secret-detection false-positives: use `--no-verify` when flagged code is detection logic. Classes `TestPlan`/`TestCase`/`TestResult` auto-collected by pytest; rename.
-- Ralph: `prd.json` single-active; archive before overwrite. `prd-archive/` and `prd.json` not gitignored; risk of accidental commit.
+- Secret-detection false-positives: use `--no-verify` when flagged code is detection logic.
+- Ralph: `prd.json` single-active; archive before overwrite. `prd-archive/` and `prd.json` not gitignored.
+
+### Scripts / Code Quality (Mar 17 additions)
+- `apply_verifier_fixes.py:9` hardcodes `/home/stephanie_jarmak/CodeScaleBench`; fails on other machines.
+- `context_retrieval_agent.py:432,544,552,584` `shell=True` + "no allowlist" (line 429); injection risk.
+- Non-atomic writes: `aggregate_status.py:669`, `apply_verifier_fixes.py:103,117,134`; use temp+rename.
+- Bare `except:` (swallows KeyboardInterrupt): `audit_v2_report_data.py:104`, `ds_audit.py:244,288`, `extract_v2_report_data.py:144,286`.
+- FD leaks: 17+ sites (not 12): `daytona_curator_runner.py:564`, `generate_csb_org_tasks.py:494`, `generate_promoted_verifiers.py:220`, `sync_oracle_files.py:50`, `validate_task_run.py:217`.
+- **Ruff** S603/S604, SIM115, BLE001 catch shell injection, FD leaks, bare excepts; add `pyproject.toml`.
+- Report #11; PRD: code quality gate (Ruff + pre-commit + custom hooks).
 
 ## Maintenance
 - Root and local `AGENTS.md` / `CLAUDE.md` files are generated from sources in `docs/ops/`.
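
Reviewer note on the bare `except:` and FD-leak entries in this hunk: both problems can often be fixed together at a single read site. The sketch below is an assumption about what such a site might look like, not code from the listed scripts, and `load_json_or_default` is a hypothetical helper name.

```python
import json


def load_json_or_default(path, default=None):
    """Read JSON from path, closing the file handle deterministically."""
    try:
        # The with-block closes the descriptor even if json.load raises, whereas
        # json.load(open(path)) leaves closing to the garbage collector.
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, ValueError):
        # OSError covers missing or unreadable files; ValueError covers
        # json.JSONDecodeError and decoding errors. KeyboardInterrupt and
        # SystemExit still propagate, which a bare `except:` would swallow.
        return default
```

Catching named exception types keeps Ctrl-C working during long audit runs and still tolerates missing or malformed input files.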
