docs(simulator): updated tool_simulator docs #752

poshinchen wants to merge 1 commit into strands-agents:main
Conversation
Documentation Preview Ready: Your documentation preview has been successfully deployed!
Preview URL: https://d3ehv1nix5p99z.cloudfront.net/pr-cms-752/docs/user-guide/quickstart/overview/
Updated at: 2026-04-10T15:36:53.987Z
> ## Overview
>
> - Simulators enable dynamic, multi-turn evaluation of conversational agents by generating realistic interaction patterns. Unlike static evaluators that assess single outputs, simulators actively participate in conversations, adapting their behavior based on agent responses to create authentic evaluation scenarios.
> + Simulators enable dynamic evaluation of agents by generating realistic interaction patterns. Unlike static evaluators that assess single outputs, simulators actively participate in the evaluation loop — driving multi-turn conversations or generating realistic tool responses — to create authentic evaluation scenarios.
non-blocker: I would like to try to avoid em-dashes if possible. It's basically shouting that it's LLM generated (not a bad thing, but I would just like to lean away from these obvious signals).
> - You want to test agent tool-use patterns without side effects
> - Tools are still under development or unavailable in the test environment
In this example, it's not obvious how the simulated tool would know what the weather is in Seattle. Can you show how you pass context to the simulated tool in this case?
> ## Key Features
>
> - **Decorator-Based Registration**: Register tools with `@tool_simulator.tool()` using familiar function signatures and docstrings
> - **Schema-Validated Responses**: Pydantic output schemas ensure structured, consistent responses from the LLM
> - **Shared State**: Related tools share call history and context via `share_state_id`
> - **Stateful Context**: Initial state descriptions and call history are included in LLM prompts for consistent multi-call sequences
> - **Drop-in Replacement**: Simulated tools plug directly into Strands `Agent` via `get_tool()`
> - **Bounded Call Cache**: FIFO eviction keeps memory usage predictable for long-running evaluations
These are all covered in the sections below, and the page already has a table of contents. I don't think we need this.
> ### Registering a Tool
>
> Define a function with type hints and a docstring, then decorate it with `@tool_simulator.tool()`. Provide an `output_schema` to control the response structure:
>
> ```python
> from typing import Any
> from pydantic import BaseModel, Field
> from strands_evals.simulation.tool_simulator import ToolSimulator
>
> tool_simulator = ToolSimulator()
>
> class OrderStatus(BaseModel):
>     order_id: str = Field(..., description="Order identifier")
>     status: str = Field(..., description="Current order status")
>     estimated_delivery: str = Field(..., description="Estimated delivery date")
>
> @tool_simulator.tool(output_schema=OrderStatus)
> def check_order(order_id: str) -> dict[str, Any]:
>     """Check the current status of a customer order."""
>     pass
> ```
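The `output_schema` idea in the quoted example — validating that a generated response carries exactly the declared fields — can be sketched without the library (a hypothetical stdlib sketch using `dataclasses` in place of Pydantic; `validate_response` is an invented helper, not part of strands_evals):

```python
from dataclasses import dataclass, fields

@dataclass
class OrderStatus:
    order_id: str
    status: str
    estimated_delivery: str

def validate_response(schema, payload: dict):
    """Reject payloads whose keys do not match the schema's fields exactly."""
    expected = {f.name for f in fields(schema)}
    if set(payload) != expected:
        raise ValueError(f"field mismatch: got {set(payload)}, expected {expected}")
    return schema(**payload)

resp = validate_response(
    OrderStatus,
    {"order_id": "A-1001", "status": "shipped", "estimated_delivery": "2026-04-12"},
)
print(resp.status)  # shipped
```

Pydantic does the real work here (type coercion, descriptions, error reporting); the sketch only shows why a schema makes LLM-generated responses safe to consume.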
> ### Attaching to an Agent
Can we combine these two? I think "Attaching to an Agent" is self-explanatory. We don't need a whole section describing it.
> ```python
> reports[0].run_display()
> ```
>
> ## Inspecting State
This, and the sections below, might make more sense under a broader "Advanced Usage" heading or similar.
> ```python
> tool_simulator = ToolSimulator(state_registry=registry)
> ```
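The shared-state idea behind `state_registry` and `share_state_id` can be illustrated in plain Python. This is a hypothetical sketch, not the library's implementation; `make_tool` and `histories` are invented names:

```python
from collections import defaultdict

# Call histories keyed by a shared state id; tools registered with the
# same id append to, and can observe, the same history.
histories: defaultdict = defaultdict(list)

def make_tool(name: str, share_state_id: str):
    def tool_call(**kwargs):
        history = histories[share_state_id]
        history.append((name, kwargs))
        # A real simulator would feed `history` into the LLM prompt so
        # later calls stay consistent with earlier ones.
        return {"calls_so_far": len(history)}
    return tool_call

book = make_tool("book_flight", share_state_id="trip-1")
cancel = make_tool("cancel_flight", share_state_id="trip-1")

book(flight="UA100")
result = cancel(flight="UA100")
print(result)  # {'calls_so_far': 2}
```

The point of the sketch: the second tool "knows" about the first tool's call only because they share a history object, which is the behavior `share_state_id` selects.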
> ## API Reference
Do we have auto-generated evals API docs? If not, we should.
> ```python
> response: str  # Default response when no output_schema is provided
> ```
>
> ## Best Practices
Do these actually help anyone? When I see "Best Practices" on docs pages, I basically assume they are AI generated and provide little value. If they aren't actually helpful and just restate what is said above, we should remove them. The same goes for "Troubleshooting"; that section should cover explicit cases where customers commonly trip up.
> ```python
> memory_exporter = telemetry.in_memory_exporter
> tool_simulator = ToolSimulator()
>
> class HVACResponse(BaseModel):
> ```
Issue: Missing imports for `BaseModel` and `Field` in the "Integration with Experiments" code example. The snippet uses `BaseModel` and `Field` (lines 187-190) but only imports from `strands`, `strands_evals`, and `strands_evals.simulation.tool_simulator`.
Suggestion: Add `from pydantic import BaseModel, Field` to the imports block at the top of this code example to match the pattern used in other code blocks on this page (e.g., the "Registering a Tool" example at line 62).
> This is useful when:
>
> - Real tools require live infrastructure (APIs, databases, hardware)
> - You need deterministic, controllable tool behavior for evaluation
Issue: The term "deterministic" may be misleading here. Since the tool responses are LLM-generated, they are inherently non-deterministic: the same inputs can produce different outputs across runs. The value is "controllable" and "reproducible in character" (e.g., the LLM will return weather-like data), but not deterministic in the strict sense.
Suggestion: Consider rewording to "You need controllable tool behavior for evaluation" (dropping "deterministic"), or clarifying what's meant, e.g., "You need controllable, consistent tool behavior for evaluation (without external dependencies)."
Assessment: Good addition to the evals SDK documentation. The ToolSimulator docs are well structured, follow the established patterns from the user_simulation.mdx page, and provide clear code examples with progressive complexity (basic → shared state → experiment integration → troubleshooting).

Nice comprehensive documentation with the shared state and troubleshooting sections; those will save users a lot of debugging time.
> The `ToolSimulator` enables LLM-powered simulation of tool behavior for controlled agent evaluation. Instead of calling real tools, registered tools are executed by an LLM that generates realistic, schema-validated responses while maintaining state across calls.
>
> This is useful when real tools require live infrastructure, when you need deterministic behavior for evaluation, or when tools are still under development.
Issue: Same "deterministic" wording concern as in tool_simulation.mdx. Since responses are LLM-generated, "deterministic" is misleading.
Suggestion: Consider "...when you need controllable behavior for evaluation..."
Description

Added ToolSimulator related docs.

Related Issues

N/A

Type of Change

Checklist

- [ ] `npm run dev`

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.