feat: add MCP server example for sandboxed JavaScript execution by simongdavies · Pull Request #35 · hyperlight-dev/hyperlight-js

simongdavies · 2026-03-03T18:16:01Z

Add an MCP (Model Context Protocol) server that exposes an execute_javascript tool, allowing AI agents to run arbitrary JavaScript inside an isolated Hyperlight micro-VM sandbox with strict CPU time limits and automatic snapshot/restore recovery after timeouts.

Includes server implementation, demo scripts (PowerShell and Bash), vitest test suite, and documentation.

Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>

Copilot

Pull request overview

Adds a new example MCP (Model Context Protocol) server under src/js-host-api/examples/mcp-server that lets MCP clients execute JavaScript inside a Hyperlight sandbox with configurable resource limits, plus demo scripts, documentation, and a Vitest-based integration test suite.

Changes:

Introduces an MCP stdio server (execute_javascript) that compiles/runs JS inside a reusable Hyperlight sandbox with CPU + wall-clock timeouts, snapshot/restore recovery, and optional timing/code logs.
Adds Vitest config + multiple integration-style test suites covering tool behavior, timeouts/recovery, env-var configurability, and timing log output.
Adds end-to-end demo scripts (bash + PowerShell) and a README describing setup and client configuration.

Reviewed changes

Copilot reviewed 11 out of 13 changed files in this pull request and generated 10 comments.

File	Description
src/js-host-api/examples/mcp-server/server.js	MCP server implementation; sandbox lifecycle, limits, logging, and tool registration.
src/js-host-api/examples/mcp-server/package.json	Example package definition with MCP SDK, Zod, and Vitest.
src/js-host-api/examples/mcp-server/vitest.config.js	Vitest configuration for the example’s tests and timeouts.
src/js-host-api/examples/mcp-server/tests/mcp-server.test.js	End-to-end MCP protocol/tool integration tests via stdio NDJSON.
src/js-host-api/examples/mcp-server/tests/config.test.js	Tests for env-configurable limits, defaults, and stderr warnings.
src/js-host-api/examples/mcp-server/tests/timing.test.js	Tests for `HYPERLIGHT_TIMING_LOG` JSONL output and timing fields.
src/js-host-api/examples/mcp-server/tests/prompt-examples.test.js	Large suite validating outputs for “README prompt” examples.
src/js-host-api/examples/mcp-server/demo-copilot-cli.sh	Bash demo script to run prompts via Copilot CLI with MCP config.
src/js-host-api/examples/mcp-server/demo-copilot-cli.ps1	PowerShell demo script to run prompts via Copilot CLI with MCP config.
src/js-host-api/examples/mcp-server/README.md	End-user documentation for the example server and demos.
src/js-host-api/eslint.config.mjs	Adds `performance` as an allowed global (used by the new server).

Copilot

Pull request overview

Copilot reviewed 11 out of 13 changed files in this pull request and generated 6 comments.

Copilot

Pull request overview

Copilot reviewed 11 out of 13 changed files in this pull request and generated 5 comments.

Copilot

Pull request overview

Copilot reviewed 11 out of 13 changed files in this pull request and generated 6 comments.

Copilot

Pull request overview

Copilot reviewed 11 out of 13 changed files in this pull request and generated 3 comments.

Copilot

Pull request overview

Copilot reviewed 11 out of 13 changed files in this pull request and generated 4 comments.

Copilot

Pull request overview

Copilot reviewed 11 out of 13 changed files in this pull request and generated 4 comments.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 11 out of 13 changed files in this pull request and generated 8 comments.

Copilot · 2026-03-27T23:02:45Z

+    it('should have all timing values as non-negative integers', async () => {
+        const records = readTimingRecords();
+        expect(records.length).toBeGreaterThanOrEqual(1);
+
+        const record = records[records.length - 1];


This test (and several below) assumes a previous test already executed the tool and wrote at least one timing record. That makes the suite order-dependent and can fail when running a single test or if the runner ever shuffles tests. Consider ensuring each test arranges its own tool invocation (or add a beforeEach that performs one call and/or clears the timing log).
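
One way to read the suggestion above is a small arrange helper that every test calls for itself, so no test depends on records left behind by an earlier one. The sketch below is only an illustration: `arrangeTimingRecord`, `makeFakeTimingLog`, and the `totalMs` field are hypothetical names, and the in-memory fake stands in for the real server and timing log file so the snippet runs on its own.

```javascript
// Hypothetical helper: each test calls this instead of relying on timing
// records written by an earlier test.
async function arrangeTimingRecord({ clearLog, callTool, readRecords }) {
    await clearLog();                // start from an empty timing log
    await callTool('return 1 + 1;'); // this test's own tool invocation
    const records = await readRecords();
    if (records.length !== 1) {
        throw new Error(`expected exactly 1 timing record, got ${records.length}`);
    }
    return records[0];
}

// In-memory stand-in (assumption) so the sketch runs without a server or a
// log file. Records are stored as JSON lines, like a JSONL timing log.
function makeFakeTimingLog() {
    const lines = [];
    return {
        clearLog: async () => {
            lines.length = 0;
        },
        callTool: async (code) => {
            lines.push(JSON.stringify({ code, totalMs: 3 }));
        },
        readRecords: async () => lines.map((line) => JSON.parse(line)),
    };
}
```

In a Vitest suite the same helper could be called from a `beforeEach` hook, with the three callbacks wired to the real client and log file.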

Copilot · 2026-03-27T23:02:46Z

+let inside = 0;
+const N = 100000;
+for (let i = 0; i < N; i++) {
+    const x = Math.random();
+    const y = Math.random();


This prompt implementation uses Math.random() inside the sandbox, which makes the test nondeterministic and potentially flaky (even with wide bounds), and the 100k samples also increase runtime. For CI stability, consider swapping in a small deterministic PRNG with a fixed seed (or otherwise removing randomness) so outputs and performance are reproducible.

Suggested change

-let inside = 0;
-const N = 100000;
-for (let i = 0; i < N; i++) {
-    const x = Math.random();
-    const y = Math.random();
+// Deterministic PRNG (xorshift32) for reproducible tests
+let seed = 123456789;
+function rand() {
+    seed ^= seed << 13;
+    seed ^= seed >>> 17;
+    seed ^= seed << 5;
+    return (seed >>> 0) / 0x100000000;
+}
+let inside = 0;
+const N = 100000;
+for (let i = 0; i < N; i++) {
+    const x = rand();
+    const y = rand();
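
For context, the snippet under review appears to be a Monte Carlo estimate of pi. A self-contained sketch of the seeded approach might look like the following; `makeXorshift32` and `estimatePi` are illustrative names that do not come from the PR, and keeping the generator state in a closure rather than a global is a design choice of this sketch, not something the suggestion requires.

```javascript
// Factory for a seeded xorshift32 generator; the state lives in the closure.
function makeXorshift32(seed) {
    let state = seed | 0;
    if (state === 0) state = 1; // xorshift32 never leaves the all-zero state
    return function rand() {
        state ^= state << 13;
        state ^= state >>> 17;
        state ^= state << 5;
        return (state >>> 0) / 0x100000000; // uniform in [0, 1)
    };
}

// Monte Carlo estimate of pi: the fraction of sampled points that fall
// inside the unit quarter circle, multiplied by 4.
function estimatePi(samples, seed) {
    const rand = makeXorshift32(seed);
    let inside = 0;
    for (let i = 0; i < samples; i++) {
        const x = rand();
        const y = rand();
        if (x * x + y * y <= 1) inside++;
    }
    return (4 * inside) / samples;
}
```

Because the seed is fixed, two runs with the same arguments return the same estimate, so a test can compare against a recorded value or a tight range instead of a wide probabilistic bound.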

Copilot · 2026-03-27T23:02:46Z

+            // 100 particles with random velocities over 1000 steps will
+            // always produce bounces — the probability of zero bounces
+            // is vanishingly small (each particle has ~50% chance of
+            // bouncing per axis per step).
+            expect(result.totalBounces).toBeGreaterThan(0);


expect(result.totalBounces).toBeGreaterThan(0) is probabilistic because initial positions/velocities are random; it can fail (rarely) if no particle crosses a boundary. To avoid flaky CI, make the simulation deterministic (seeded PRNG) or assert only deterministic invariants (e.g., allInBounds, array sizes, numeric types/ranges).

Suggested change

-            // 100 particles with random velocities over 1000 steps will
-            // always produce bounces — the probability of zero bounces
-            // is vanishingly small (each particle has ~50% chance of
-            // bouncing per axis per step).
-            expect(result.totalBounces).toBeGreaterThan(0);
+            // totalBounces should be a non-negative integer; the exact
+            // value depends on random initial conditions and velocities.
+            expect(result.totalBounces).toBeGreaterThanOrEqual(0);
+            expect(Number.isInteger(result.totalBounces)).toBe(true);
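
The other option named in the comment, removing randomness entirely, could be sketched as below. This is a hypothetical stand-in for the simulation in the test, not the PR's code: `simulateParticles`, the box size, and the index-derived positions and velocities are all assumptions. Every velocity component is non-zero by construction, so a strict `totalBounces > 0` assertion becomes deterministic rather than merely very likely.

```javascript
// Hypothetical deterministic particle simulation in a [0, size] square box
// with reflecting walls. Initial conditions derive from the particle index.
function simulateParticles({ particles = 100, steps = 1000, size = 100 } = {}) {
    const ps = Array.from({ length: particles }, (_, i) => ({
        x: (i * 37) % size,
        y: (i * 53) % size,
        vx: (i % 7) - 3 + 0.5, // one of -2.5 .. 3.5, never zero
        vy: (i % 5) - 2 + 0.25, // one of -1.75 .. 2.25, never zero
    }));
    let totalBounces = 0;
    for (let step = 0; step < steps; step++) {
        for (const p of ps) {
            p.x += p.vx;
            p.y += p.vy;
            if (p.x < 0 || p.x > size) {
                p.vx = -p.vx; // reflect and clamp back into the box
                p.x = Math.min(size, Math.max(0, p.x));
                totalBounces++;
            }
            if (p.y < 0 || p.y > size) {
                p.vy = -p.vy;
                p.y = Math.min(size, Math.max(0, p.y));
                totalBounces++;
            }
        }
    }
    const allInBounds = ps.every(
        (p) => p.x >= 0 && p.x <= size && p.y >= 0 && p.y <= size
    );
    return { totalBounces, allInBounds };
}
```

With this setup particle 0 starts at x = 0 with a negative velocity, so it bounces on the very first step.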

Copilot · 2026-03-27T23:02:46Z

+    #   --available-tools               Restrict model to ONLY our MCP tool plus
+    #                                   internal tools the agent needs to function
+    #                                   (task_complete, report_intent). The model
+    #                                   cannot call shell, file write, web fetch,
+    #                                   or any other tool. This is the security
+    #                                   layer — even though --allow-all-tools is
+    #                                   set, only whitelisted tools are visible.


The comments describe using --available-tools to restrict the model’s visible toolset, but the actual copilot invocation doesn’t include --available-tools. Either add the flag (if still supported) or update the comments, since this currently overstates the security restrictions being applied.

Suggested change

-    #   --available-tools               Restrict model to ONLY our MCP tool plus
-    #                                   internal tools the agent needs to function
-    #                                   (task_complete, report_intent). The model
-    #                                   cannot call shell, file write, web fetch,
-    #                                   or any other tool. This is the security
-    #                                   layer — even though --allow-all-tools is
-    #                                   set, only whitelisted tools are visible.
+    #   --available-tools               (Optional) Can be used to restrict the model
+    #                                   to ONLY specific tools plus internal tools
+    #                                   the agent needs to function (for example,
+    #                                   task_complete, report_intent). When set,
+    #                                   the model cannot call shell, file write,
+    #                                   web fetch, or any other non-whitelisted
+    #                                   tool. NOTE: this demo script does NOT
+    #                                   currently pass --available-tools; do not
+    #                                   assume such a restriction is in effect.

Copilot · 2026-03-27T23:02:46Z

+            type: 'stdio',
+            command: 'node',
+            args: ['${SERVER_JS}'],
+            env,
+        };


This embedded node -e snippet interpolates ${SERVER_JS} (and other values) directly into a JavaScript string. Paths containing spaces, quotes, or backslashes can break the generated JS/JSON. Prefer passing values via environment variables/stdin, or JSON-escaping them (e.g., via JSON.stringify) before embedding.
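
A minimal sketch of the JSON-escaping approach: build the config as a plain object and let `JSON.stringify` do the quoting, instead of splicing the path into a string. The `mcpServers` and `sandbox` keys and both function names are assumptions made for illustration; only `type`, `command`, `args`, and `env` appear in the snippet under review.

```javascript
// Build the MCP config as an object and serialize it in one step, so
// spaces, quotes, and backslashes in the path are escaped correctly.
function buildMcpConfigJson(serverJs, env) {
    const config = {
        mcpServers: {
            // "sandbox" is a hypothetical server name
            sandbox: {
                type: 'stdio',
                command: 'node',
                args: [serverJs],
                env,
            },
        },
    };
    return JSON.stringify(config, null, 2);
}

// For contrast: naive string interpolation, which produces invalid JSON as
// soon as the path contains a quote or a backslash.
function naiveConfigJson(serverJs) {
    return '{"args": ["' + serverJs + '"]}';
}
```

Inside a `node -e` script the path could then be read from an environment variable (for example `process.env.SERVER_JS`, assuming the shell script exports it) rather than being interpolated by the shell.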

Copilot · 2026-03-27T23:02:47Z

+    try {
+        # Use --% (stop-parsing token) to prevent PS from mangling
+        # the native command arguments. Pass prompt via temp file.
+        $rawOutput = & $script:CopilotBin `
+            -p $fullPrompt `
+            -s `
+            --additional-mcp-config "@$mcpTmp" `


The comment says to use PowerShell’s --% stop-parsing token to prevent argument mangling, but the command invocation below doesn’t use --%. Either adjust the comment or actually use the recommended mechanism (or a temp file via @file) so the rationale matches the implementation and avoids confusion when debugging.

Copilot · 2026-03-27T23:02:47Z

+                // Track objects during traversal to detect true circular
+                // references. We use a replacer that adds objects on entry
+                // and removes them on exit (post-order), so DAG-shared refs
+                // (e.g. { a: obj, b: obj }) are correctly duplicated rather
+                // than replaced with "[Circular]".
+                const ancestors = new Set();
+                return JSON.stringify(
+                    value,
+                    function (key, val) {
+                        if (typeof val === 'bigint') {
+                            return val.toString();
+                        }
+                        if (typeof val === 'object' && val !== null) {
+                            if (ancestors.has(val)) {
+                                return '[Circular]';
+                            }
+                            ancestors.add(val);
+                            // Schedule removal after this subtree is fully traversed.
+                            // JSON.stringify calls the replacer depth-first, so by the
+                            // time we return from this key the children are already
+                            // processed. We use a finally-scheduled microtask to
+                            // remove after the current synchronous stringify pass.
+                            // Actually — JSON.stringify is synchronous, so we can
+                            // lean on the fact that the replacer is called in-order
+                            // and use a post-processing cleanup. For simplicity,
+                            // just leave the Set as-is — true cycles will be caught,
+                            // and shared non-cyclic refs in practice don't occur in
+                            // sandbox return values (they're freshly JSON-parsed).
+                        }
+                        return val;
+                    },
+                    2
+                );


safeStringifyResult marks any repeated object reference as "[Circular]" because ancestors is never cleared during traversal. This will incorrectly replace shared (non-cyclic) references and can also grow memory for large results. Consider using a stack-based cycle detector (add on entry + remove on exit) or a proven circular-safe serializer so only true cycles are replaced.

Suggested change

-                // Track objects during traversal to detect true circular
-                // references. We use a replacer that adds objects on entry
-                // and removes them on exit (post-order), so DAG-shared refs
-                // (e.g. { a: obj, b: obj }) are correctly duplicated rather
-                // than replaced with "[Circular]".
-                const ancestors = new Set();
-                return JSON.stringify(
-                    value,
-                    function (key, val) {
-                        if (typeof val === 'bigint') {
-                            return val.toString();
-                        }
-                        if (typeof val === 'object' && val !== null) {
-                            if (ancestors.has(val)) {
-                                return '[Circular]';
-                            }
-                            ancestors.add(val);
-                            // Schedule removal after this subtree is fully traversed.
-                            // JSON.stringify calls the replacer depth-first, so by the
-                            // time we return from this key the children are already
-                            // processed. We use a finally-scheduled microtask to
-                            // remove after the current synchronous stringify pass.
-                            // Actually — JSON.stringify is synchronous, so we can
-                            // lean on the fact that the replacer is called in-order
-                            // and use a post-processing cleanup. For simplicity,
-                            // just leave the Set as-is — true cycles will be caught,
-                            // and shared non-cyclic refs in practice don't occur in
-                            // sandbox return values (they're freshly JSON-parsed).
-                        }
-                        return val;
-                    },
-                    2
-                );
+                // Track the current traversal path to detect true circular
+                // references. Objects are added on entry and removed on exit,
+                // so only objects that appear again on the active stack are
+                // treated as "[Circular]". Shared, non-cyclic references are
+                // serialized normally.
+                const ancestors = [];
+                const serialize = (val) => {
+                    if (typeof val === 'bigint') {
+                        return val.toString();
+                    }
+                    if (val === null || typeof val !== 'object') {
+                        return val;
+                    }
+                    if (ancestors.includes(val)) {
+                        return '[Circular]';
+                    }
+                    ancestors.push(val);
+                    try {
+                        if (Array.isArray(val)) {
+                            return val.map((item) => serialize(item));
+                        }
+                        const resultObj = {};
+                        for (const [k, v] of Object.entries(val)) {
+                            resultObj[k] = serialize(v);
+                        }
+                        return resultObj;
+                    } finally {
+                        ancestors.pop();
+                    }
+                };
+                const processed = serialize(value);
+                return JSON.stringify(processed, null, 2);
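
An alternative sketch that keeps the work inside `JSON.stringify` itself, and therefore keeps `toJSON` handling (for example for `Date` values), is a replacer that uses its `this` binding to maintain the ancestor stack. The technique follows the cycle-handling replacer described in MDN's documentation on cyclic object values; `safeStringify` is an illustrative name and this is not the PR's implementation.

```javascript
// Stringify with true-cycle detection. Shared, non-cyclic references are
// serialized normally; only an object found on the active path is replaced.
function safeStringify(value, indent = 2) {
    const ancestors = []; // objects on the path from the root to the current value
    return JSON.stringify(
        value,
        function (key, val) {
            if (typeof val === 'bigint') {
                return val.toString(); // JSON has no BigInt representation
            }
            if (typeof val !== 'object' || val === null) {
                return val;
            }
            // `this` is the object that holds `val`. Pop entries until the
            // holder is on top, which discards subtrees already finished.
            while (ancestors.length > 0 && ancestors[ancestors.length - 1] !== this) {
                ancestors.pop();
            }
            if (ancestors.includes(val)) {
                return '[Circular]';
            }
            ancestors.push(val);
            return val;
        },
        indent
    );
}
```

The replacer must be a regular `function` rather than an arrow function, since it depends on `this` being the holder object that `JSON.stringify` supplies.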

Copilot · 2026-03-27T23:02:47Z

+/** Guest heap size in bytes. Override with HYPERLIGHT_HEAP_SIZE_MB (megabytes). */
+const HEAP_SIZE_BYTES = Math.min(
+    parsePositiveInt(process.env.HYPERLIGHT_HEAP_SIZE_MB, DEFAULT_HEAP_SIZE_MB) * 1024 * 1024,
+    MAX_BYTES
+);
+
+/** Guest scratch size in bytes. Override with HYPERLIGHT_SCRATCH_SIZE_MB (megabytes).
+ *  Maps to setScratchSize() on the SandboxBuilder API. */
+const SCRATCH_SIZE_BYTES = Math.min(
+    parsePositiveInt(process.env.HYPERLIGHT_SCRATCH_SIZE_MB, DEFAULT_SCRATCH_SIZE_MB) * 1024 * 1024,
+    MAX_BYTES
+);


Clamping heap/scratch bytes with Math.min(..., 0xffffffff) can produce a non-integer MiB value (e.g., a configured 4096 MB is clamped to 4294967295 bytes, about 4095.999 MiB), so the byte size passed on may not be aligned to a MiB boundary. Prefer clamping at the MB level (or rounding down to a MiB boundary) before converting to bytes so the configured and printed sizes stay consistent and predictable.

Suggested change

-/** Guest heap size in bytes. Override with HYPERLIGHT_HEAP_SIZE_MB (megabytes). */
-const HEAP_SIZE_BYTES = Math.min(
-    parsePositiveInt(process.env.HYPERLIGHT_HEAP_SIZE_MB, DEFAULT_HEAP_SIZE_MB) * 1024 * 1024,
-    MAX_BYTES
-);
-
-/** Guest scratch size in bytes. Override with HYPERLIGHT_SCRATCH_SIZE_MB (megabytes).
- *  Maps to setScratchSize() on the SandboxBuilder API. */
-const SCRATCH_SIZE_BYTES = Math.min(
-    parsePositiveInt(process.env.HYPERLIGHT_SCRATCH_SIZE_MB, DEFAULT_SCRATCH_SIZE_MB) * 1024 * 1024,
-    MAX_BYTES
-);
+/** Maximum heap/scratch size in mebibytes that still fits within MAX_BYTES. */
+const MAX_MIB = Math.floor(MAX_BYTES / (1024 * 1024));
+
+/** Guest heap size in bytes. Override with HYPERLIGHT_HEAP_SIZE_MB (megabytes). */
+const HEAP_SIZE_BYTES =
+    Math.min(
+        parsePositiveInt(process.env.HYPERLIGHT_HEAP_SIZE_MB, DEFAULT_HEAP_SIZE_MB),
+        MAX_MIB
+    ) *
+    1024 *
+    1024;
+
+/** Guest scratch size in bytes. Override with HYPERLIGHT_SCRATCH_SIZE_MB (megabytes).
+ *  Maps to setScratchSize() on the SandboxBuilder API. */
+const SCRATCH_SIZE_BYTES =
+    Math.min(
+        parsePositiveInt(process.env.HYPERLIGHT_SCRATCH_SIZE_MB, DEFAULT_SCRATCH_SIZE_MB),
+        MAX_MIB
+    ) *
+    1024 *
+    1024;
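
The arithmetic behind this suggestion can be checked with a small standalone sketch. The `parsePositiveInt` body below is a guess at what such a helper might do (its real implementation is not shown in this thread), and `sizeBytesFromMiB` and `naiveSizeBytes` are illustrative names; `MAX_BYTES` is taken to be 0xffffffff as stated in the review comment.

```javascript
const MIB = 1024 * 1024;
const MAX_BYTES = 0xffffffff; // 4294967295, per the review comment

// Hypothetical stand-in for the server's parsePositiveInt helper.
function parsePositiveInt(raw, fallback) {
    const n = Number.parseInt(raw ?? '', 10);
    return Number.isInteger(n) && n > 0 ? n : fallback;
}

// Clamp in MiB first, then convert, so the result is always MiB-aligned.
function sizeBytesFromMiB(raw, defaultMiB) {
    const maxMiB = Math.floor(MAX_BYTES / MIB); // 4095
    return Math.min(parsePositiveInt(raw, defaultMiB), maxMiB) * MIB;
}

// The original ordering: convert to bytes, then clamp.
function naiveSizeBytes(raw, defaultMiB) {
    return Math.min(parsePositiveInt(raw, defaultMiB) * MIB, MAX_BYTES);
}
```

For an input of `'4096'`, the MiB-first version yields 4293918720 bytes (exactly 4095 MiB), while the bytes-first version yields 4294967295 bytes, which is one byte short of 4096 MiB and not aligned to a MiB boundary.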

simongdavies added the kind/enhancement New feature or improvement label Mar 3, 2026

simongdavies force-pushed the add-mcp-example branch from 8e32b4b to 62d98d0 Compare March 3, 2026 18:29

simongdavies requested a review from Copilot March 26, 2026 13:23

Copilot started reviewing on behalf of simongdavies March 26, 2026 13:23

Copilot AI reviewed Mar 26, 2026

simongdavies requested a review from Copilot March 26, 2026 13:56

Copilot started reviewing on behalf of simongdavies March 26, 2026 13:57

Copilot AI reviewed Mar 26, 2026

simongdavies requested a review from Copilot March 26, 2026 22:58

Copilot started reviewing on behalf of simongdavies March 26, 2026 22:58

simongdavies force-pushed the add-mcp-example branch from 1b8889b to 8980f35 Compare March 26, 2026 23:00

Copilot AI reviewed Mar 26, 2026

simongdavies requested a review from Copilot March 27, 2026 14:25

Copilot started reviewing on behalf of simongdavies March 27, 2026 14:26

Copilot AI reviewed Mar 27, 2026

simongdavies force-pushed the add-mcp-example branch from 8980f35 to 5d3985b Compare March 27, 2026 15:09

simongdavies requested a review from Copilot March 27, 2026 15:10

Copilot started reviewing on behalf of simongdavies March 27, 2026 15:11

Copilot AI reviewed Mar 27, 2026

Comment thread src/js-host-api/examples/mcp-server/server.js

Comment thread src/js-host-api/examples/mcp-server/server.js

Comment thread src/js-host-api/examples/mcp-server/tests/timing.test.js Outdated

simongdavies force-pushed the add-mcp-example branch from 5d3985b to 90267e4 Compare March 27, 2026 16:31

simongdavies requested a review from Copilot March 27, 2026 16:32

Copilot started reviewing on behalf of simongdavies March 27, 2026 16:33

Copilot AI reviewed Mar 27, 2026

Comment thread src/js-host-api/examples/mcp-server/server.js

Comment thread src/js-host-api/examples/mcp-server/server.js

Comment thread src/js-host-api/examples/mcp-server/server.js

Comment thread src/js-host-api/examples/mcp-server/tests/mcp-server.test.js

simongdavies closed this Mar 27, 2026

simongdavies deleted the add-mcp-example branch March 27, 2026 19:03

simongdavies restored the add-mcp-example branch March 27, 2026 19:32

simongdavies reopened this Mar 27, 2026

simongdavies force-pushed the add-mcp-example branch from 90267e4 to e0a8f26 Compare March 27, 2026 21:59

simongdavies requested a review from Copilot March 27, 2026 22:00

Copilot started reviewing on behalf of simongdavies March 27, 2026 22:01

Copilot AI reviewed Mar 27, 2026

Comment thread src/js-host-api/examples/mcp-server/server.js Outdated

Comment thread src/js-host-api/examples/mcp-server/README.md Outdated

Comment thread src/js-host-api/examples/mcp-server/server.js Outdated

Comment thread src/js-host-api/examples/mcp-server/server.js

Apply suggestions from code review

dcc0a5c

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>

simongdavies force-pushed the add-mcp-example branch from e0a8f26 to dcc0a5c Compare March 27, 2026 22:51

simongdavies requested a review from Copilot March 27, 2026 22:55

Copilot started reviewing on behalf of simongdavies March 27, 2026 22:55

Copilot AI reviewed Mar 27, 2026

