Add proactive port conflict detection before container start#62
Add proactive port conflict detection before container start#62
Conversation
Before running docker compose up, check if the workspace port is already in use by a non-Docker process. If a conflict is found, return it to the frontend which shows "Kill & Start" and "Start Anyway" buttons. This avoids cryptic Docker errors and gives users a clear resolution path. Backend: - Add port pre-check to POST /api/docker/start with force/killConflicts opts - Extract checkPortInUse() and killPort() shared helpers from existing handlers - Skip docker-proxy/com.docker processes (container itself) Frontend: - Handle portConflicts response in startWorkspace(), startContainer(), and startAfterRebuild() with Kill & Start / Start Anyway buttons - Remove dead killPortProcess() and killPortAndRetryWizard() functions Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Review: PR #62 — Proactive port conflict detection
The overall approach is solid — moving conflict detection ahead of Docker start gives a much better UX, and extracting checkPortInUse()/killPort() is a good deduplication. A few issues need fixing before merge.
Bug: startContainer() doesn't reset button text on port conflict
setup.html — startContainer():
btn.disabled = true;
btn.textContent = 'Starting...'; // set here
// ...if port conflict:
btn.disabled = false; // re-enabled, but text stays "Starting..."
return;The button is re-enabled with the text still showing "Starting..." — users see a clickable button labelled "Starting...". startWorkspace() correctly resets btn.textContent = 'Start' in the same path; startContainer() must do the same. The button's original label is "Start Container".
XSS: c.port and c.pid interpolated into innerHTML without escaping
In three places (startWorkspace, startContainer, startAfterRebuild):
const procInfo = c.process ? `${esc(c.process)} (PID ${c.pid})` : 'unknown process';
statusEl.innerHTML = `Port ${c.port} is in use by ${procInfo}. ...`;c.process is correctly escaped, but c.port and c.pid are not passed through esc(). In practice they are an integer and parsed-integer respectively, so there's no realistic attack vector here. But this breaks the consistent escaping discipline used throughout the file. Use esc(String(c.port)) / esc(String(c.pid)) to be safe and consistent.
Fragile DOM traversal in inline onclick attributes
onclick="startWorkspace('${escAttr(name)}', this.closest('.ws-actions').querySelector('.btn-success, .btn-primary'), {killConflicts:true})"this inside an onclick attribute refers to the button the click fires on (the dynamically-added "Kill & Start" button), not the original Start button. this.closest('.ws-actions') traverses up to find the actions container, then queries for the original Start button. This is fragile:
- If
.ws-actionsis not an ancestor (e.g. the button renders outside that container),closest()returnsnull, causing a crash. - The selector
'.btn-success, .btn-primary'matches the first green/primary button — any layout change could grab the wrong element.
Consider giving the original Start button a stable id or data- attribute so the onclick can reference it directly, or pass the workspace name and look it up by a data-workspace selector.
500 ms kill wait may be insufficient
await new Promise(r => setTimeout(r, 500));
// then immediately runs docker compose upThe old retry flow waited 1500 ms. Some processes (especially under WSL) can take longer to fully release a port after SIGKILL. Consider polling checkPortInUse() in a short loop instead of a fixed sleep, or at minimum extend the wait to 1000–1500 ms.
No automated tests for backend logic
The new checkPortInUse() / killPort() helpers and the port-conflict pre-check in /api/docker/start have no unit tests. The PR test plan is entirely manual. Given this is process-killing code, a few tests mocking execSync to verify: conflict detection returns correct shape, docker-proxy is skipped, killConflicts triggers kill then proceeds, force skips the check entirely, would meaningfully reduce regression risk.
Minor: docker-compose.yml changes are unrelated to the PR title
The docker-compose rename (discovery/editormain → play) and volume naming changes are infrastructure refactoring unrelated to port conflict detection. Worth splitting into a separate PR so this one stays focused and reviewable.
Summary: The port conflict bug in startContainer() and the escaping inconsistency are the must-fix items. The DOM traversal fragility is worth addressing before this ships. The 500 ms wait and test coverage are lower priority but worth noting.
There was a problem hiding this comment.
Review: Add proactive port conflict detection before container start
Overall the approach is solid — moving detection before docker compose up gives much better UX than parsing Docker error strings after the fact. The helper extraction is a clean refactor. A few issues worth addressing before merging.
Security: Missing port validation in checkPortInUse (Medium)
// src/setup-gui.js
function checkPortInUse(port) {
// ...
output = execSync(`ss -tlnp 2>/dev/null | grep ':${port} ' || true`, ...)
}The function uses port directly in a shell string without validating it's a safe integer. The existing /api/port-check and /api/port-kill handlers validate with parseInt before calling in, but the new internal call in /api/docker/start passes ws.port from config without validation:
const info = checkPortInUse(ws.port); // ws.port comes straight from workspaces configIf a workspace config had a tampered port value (e.g. "port": "3847; malicious_command"), it would execute shell commands. The attack surface is limited to local config files, but the helper itself should be defensive. Fix: add port = parseInt(port, 10); if (!port || port < 1 || port > 65535) return { inUse: false }; at the top of checkPortInUse.
Bug: Null button dereference in inline onclick handlers (Medium)
The conflict UI injects onclick strings that try to find the original Start button via DOM traversal:
statusEl.innerHTML = `... onclick="startWorkspace('${escAttr(name)}',
this.closest('.ws-actions').querySelector('.btn-success, .btn-primary'),
{killConflicts:true})" ...`If .ws-actions doesn't exist in the parent chain, or the button selector doesn't match (e.g. class changes), querySelector returns null. Then inside startWorkspace:
btn.disabled = true; // TypeError: Cannot set properties of nullThe function handles statusEl being null but assumes btn is always valid. This could silently break the Kill & Start flow. Consider passing the button reference differently, or adding a null guard at the top of startWorkspace.
Bug: startAfterRebuild has no button disable/re-enable (Low)
startWorkspace and startContainer both disable their button before the async fetch to prevent double-submission. startAfterRebuild does not disable anything — the Kill & Start / Start Anyway buttons remain clickable during the in-flight request, which can trigger duplicate concurrent starts.
Unrelated changes in docker-compose.yml (Note)
The diff renames workspaces (discovery/editormain → play), reduces memory limits, and adds named volumes. These aren't mentioned in the PR description and appear to be workspace-specific config changes bundled with the feature. If intentional, worth calling out explicitly; if not, should be a separate commit/PR.
Minor: Helper functions defined after their call sites
checkPortInUse and killPort are defined around line 1463 but called from the /api/docker/start handler at ~line 1244. JavaScript function hoisting makes this work, but it's easier to follow if helpers are defined before use.
What's working well
- Proactive detection before Docker start is the right approach
- Filtering out
docker-proxy/com.dockerprocesses is correct - Helper extraction removes real duplication from the port-check/port-kill handlers
- Frontend fallback path (parsing Docker error strings) is preserved as a safety net
escAttrusage in onclick attributes andescin displayed text looks correct
There was a problem hiding this comment.
Review: Add proactive port conflict detection before container start
Overall the approach is solid — moving detection before docker compose up gives much better UX than parsing Docker error strings after the fact. The helper extraction is a clean refactor. A few issues worth addressing before merging.
Security: Missing port validation in checkPortInUse (Medium)
The function uses port directly in a shell string without validating it is a safe integer. The existing /api/port-check and /api/port-kill handlers validate with parseInt before calling in, but the new internal call in /api/docker/start passes ws.port from config without validation:
const info = checkPortInUse(ws.port); // ws.port comes straight from workspaces config
If a workspace config had a tampered port value (e.g. port: "3847; malicious_command"), it would execute shell commands. The attack surface is limited to local config files, but the helper itself should be defensive.
Fix: add port validation at the top of checkPortInUse:
function checkPortInUse(port) {
port = parseInt(port, 10);
if (!port || port < 1 || port > 65535) return { inUse: false };
// ...
}
Bug: Null button dereference in inline onclick handlers (Medium)
The conflict UI injects onclick strings that find the original Start button via DOM traversal:
statusEl.innerHTML = `... onclick="startWorkspace('...',
this.closest('.ws-actions').querySelector('.btn-success, .btn-primary'),
{killConflicts:true})" ...`
If .ws-actions does not exist in the parent chain, querySelector returns null. Inside startWorkspace:
btn.disabled = true; // TypeError: Cannot set properties of null
The function handles statusEl being null but assumes btn is always valid. Add a null guard at the top of startWorkspace, or pass the btn reference more robustly.
Bug: startAfterRebuild has no button disable/re-enable (Low)
startWorkspace and startContainer both disable their button before the async fetch to prevent double-submission. startAfterRebuild does not — the Kill & Start / Start Anyway buttons remain clickable during the in-flight request, risking duplicate concurrent starts.
Unrelated changes in docker-compose.yml (Note)
The diff renames workspaces (discovery/editormain to play), reduces memory limits, and adds named volumes. These are not mentioned in the PR description and appear to be workspace-specific config changes bundled with the feature. If intentional, worth calling out explicitly; if not, should be a separate commit/PR.
Minor: Helpers defined after their call sites
checkPortInUse and killPort are defined around line 1463 but called from the /api/docker/start handler at ~line 1244. JavaScript hoisting makes this work but harder to follow. Consider moving helpers above their first call site.
What is working well
- Proactive detection before Docker start is the right approach
- Filtering out docker-proxy/com.docker processes is correct
- Helper extraction removes real duplication from the port-check/port-kill handlers
- Frontend fallback path (parsing Docker error strings) is preserved as a safety net
- escAttr usage in onclick attributes and esc in displayed text looks correct
- Fix startContainer() not resetting button text on port conflict
- Escape c.port and c.pid with esc() in all innerHTML interpolations
- Use data-workspace attribute for stable Start button lookup instead
of fragile this.closest('.ws-actions') DOM traversal
- Increase kill wait from 500ms to 1000ms for WSL process release time
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Addressed review feedback in 342351d:
Not addressed (lower priority):
|
- Validate port as integer in checkPortInUse() to prevent shell injection from tampered workspace config values - Add null guard at top of startWorkspace() for missing btn reference - Disable buttons in startAfterRebuild() during in-flight request to prevent duplicate concurrent starts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Addressed second review round in e1ebd05:
|
There was a problem hiding this comment.
PR Review: Add proactive port conflict detection before container start
This PR has gone through two prior review rounds and most of the previously flagged issues have been addressed well. The core approach is correct and the escaping discipline in the new proactive path is solid. However, there are a few remaining issues that need fixes before merging.
Must-fix issues
1. startContainer() does not re-enable the button in the fallback Docker error path
File: public/setup.html — startContainer(), the else block (fallback Docker error string parsing)
The proactive conflict path correctly resets btn.textContent and btn.disabled, but the older fallback path (which parses Docker error output for address already in use) was missed. If the fallback fires and the port regex fails to match, the user is left with a permanently disabled "Starting..." button and must reload the page.
Fix: add btn.textContent = 'Start Container'; btn.disabled = false; before each status.innerHTML/status.textContent assignment in the fallback else block — consistent with how startWorkspace handles errors.
2. startAfterRebuild missing null guard for status element
File: public/setup.html — startAfterRebuild()
function startAfterRebuild(name, opts) {
const status = document.getElementById('ws-rebuild-status-' + name);
status.querySelectorAll('button')... // TypeError if status is nullIf the element doesn't exist (UI re-rendered between click and function invocation), this crashes with a TypeError. startWorkspace and startContainer both guard with if (!statusEl) return;. One-line fix: add if (!status) return; at the top of startAfterRebuild.
3. killPort() — no integer validation on pid before shell interpolation
File: src/setup-gui.js — killPort() function
pid comes from checkPortInUse() which stores it as parseInt(pidMatch[1], 10) — so it's an integer in practice. However, killPort itself doesn't assert this before interpolating into the shell string. If checkPortInUse is ever changed, this becomes a shell injection vector. Suggested guard:
if (typeof pid !== 'number' || !Number.isFinite(pid) || pid <= 0) return { killed: false };Low-priority notes (no fix required)
startAfterRebuild button disable is moot: The querySelectorAll('button').forEach(b => b.disabled = true) runs right before status.textContent = '...' which removes all child nodes, discarding the buttons. The double-submission window is effectively closed (buttons are gone), but the code is misleading. Not a bug but worth a cleanup comment.
Only the first port conflict is shown: The backend returns portConflicts as an array but the frontend always uses portConflicts[0]. Multi-workspace starts could have multiple conflicts; the user resolves them one at a time. Functional but worth a follow-up.
No unit tests for checkPortInUse/killPort: Acknowledged by the author as out of scope. These functions kill OS processes — please file a follow-up issue for regression coverage.
docker-compose.yml changes are unrelated to the feature (workspace rename, memory limit tweaks, named volumes). Fine to include if intentional.
What's working well
- The pre-check-then-act pattern is the right approach; parsing Docker errors after the fact was brittle.
- The
docker-proxy/com.dockerfilter correctly avoids killing Docker's own port bindings. escAttr()used inonclickattributes,esc()used in displayed text — escaping is consistent in the new path.data-workspaceattribute for button lookup is stable;if (!btn) return;guards are present instartWorkspace.- Port integer validation in
checkPortInUsedefends against shell injection from malformed config.
Summary
docker compose up, the backend now checks if workspace ports are in use by non-Docker processescheckPortInUse()andkillPort()helpers, removing duplicate code from existing port-check/port-kill API handlersTest plan
wsl -- bash -c "node -e 'require(\"net\").createServer().listen(3847)'")🤖 Generated with Claude Code