Add deep research multi-agent example#288

Open
danielmillerp wants to merge 1 commit into main from dm/deep-research-example

danielmillerp commented Mar 19, 2026

Summary

  • Adds a multi-agent deep research demo with 4 agents: orchestrator + GitHub, Docs, and Slack research subagents
  • Demonstrates key patterns: orchestrator/subagent communication via ACP, shared task ID for unified output, MCP server integration, and conversation compaction for long-running research
  • All sensitive data (API keys, internal Slack channels, deployment infra) has been removed or genericized for public consumption

Test plan

  • Verify no sensitive data in the PR diff (API keys, internal URLs, Slack channel IDs)
  • Review agent structure matches existing examples/demos conventions
  • Test running agents locally with agentex agents run --manifest manifest.yaml

🤖 Generated with Claude Code

Greptile Summary

This PR adds a complete multi-agent deep research demo (examples/demos/deep_research/) consisting of four agents — an orchestrator and three specialized subagents (GitHub, Docs, Slack). It demonstrates several important AgentEx patterns: orchestrator/subagent communication via ACP, unified output through a shared parent task_id, Temporal-safe batched Runner.run() loops with a two-stage conversation compaction strategy, and MCP server integration via StatelessMCPServerProvider.

Key findings:

  • P1 – Orchestrator has no error handling around Runner.run(): The subagent workflows all wrap their runner loops in try/except and always reach _complete_task = True. The orchestrator does not, meaning any exception (API error, max-turns, timeout) will leave _complete_task = False and cause on_task_create to hang indefinitely at await workflow.wait_condition(lambda: self._complete_task, timeout=None).
  • P1 – All four Dockerfiles hardcode the linux_arm64 tctl binary: This breaks builds on amd64 hosts (standard Linux CI, most cloud build environments). The architecture should be detected at build time.
  • P2 – summarization.py is duplicated verbatim across all three subagent packages. Any change (threshold tuning, bug fixes) must be manually applied in three places.
  • P2 – find_last_summary_index is dead code: Defined in all three summarization.py copies but never imported or called by any workflow.

Confidence Score: 3/5

  • Safe to merge for demo/example purposes, but has a workflow-hanging bug in the orchestrator and broken Dockerfiles on amd64 that should be fixed before wider use.
  • Two P1 issues: missing error handling in the orchestrator can permanently hang workflows, and the hardcoded arm64 Docker binary breaks builds on standard amd64 CI. These are straightforward fixes but would prevent the demo from working reliably as written.
  • Files needing the most attention: examples/demos/deep_research/orchestrator/project/workflow.py (missing error handling) and all four Dockerfiles (hardcoded arm64 binary).

Important Files Changed

| Filename | Overview |
| --- | --- |
| examples/demos/deep_research/orchestrator/project/workflow.py | Core orchestrator workflow: dispatches subagents in parallel and awaits their results via Temporal signals. Missing error handling around Runner.run() can cause the workflow to hang indefinitely if an exception is thrown. |
| examples/demos/deep_research/orchestrator/Dockerfile | Hardcodes the arm64 tctl binary, so it will fail on amd64 build hosts. The same issue exists in all four agent Dockerfiles. |
| examples/demos/deep_research/docs_researcher/project/workflow.py | Docs research workflow with batched Runner.run() and two-stage compaction. Properly handles errors and always sets _complete_task. Sends a research_complete event back to the parent orchestrator. |
| examples/demos/deep_research/github_researcher/project/workflow.py | GitHub research workflow using an MCP server via StatelessMCPServerProvider. Mirrors the docs researcher pattern with batched runs and compaction. Uses ModelSettings(parallel_tool_calls=False) per instructions in the system prompt. |
| examples/demos/deep_research/slack_researcher/project/workflow.py | Slack research workflow using the Slack MCP server. Follows the same batched-run pattern as the other subagents, with proper error handling and parent notification. |
| examples/demos/deep_research/docs_researcher/project/summarization.py | Conversation compaction logic that correctly handles both Responses API and Chat Completions formats. Contains dead code (find_last_summary_index is never called) and is duplicated verbatim across all three subagents. |
| examples/demos/deep_research/orchestrator/project/prompts.py | Orchestrator system prompt defining dispatch strategy and output format. Clean separation from workflow logic. |
| examples/demos/deep_research/docs_researcher/project/activities.py | Temporal activity definitions for web_search (Tavily) and fetch_docs_page with a URL allowlist. Correctly decorated with @activity.defn and includes reasonable error handling. |
| examples/demos/deep_research/README.md | Well-written README covering architecture, setup, and the shared task ID pattern. Clear and accurate. |
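
For readers unfamiliar with the URL allowlist noted for fetch_docs_page, the idea can be sketched roughly as below. This is not the PR's code: `ALLOWED_HOSTS`, `is_allowed_url`, and the example hosts are hypothetical names chosen for illustration.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; the PR's actual hosts are not shown in this review.
ALLOWED_HOSTS = {"docs.example.com", "example.org"}


def is_allowed_url(url: str) -> bool:
    """Return True only for https URLs whose host is on the allowlist.

    Subdomains of an allowlisted host are accepted. Look-alike hosts such as
    "docs.example.com.evil.test" are rejected because the suffix match is
    anchored at a dot boundary.
    """
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    host = (parsed.hostname or "").lower()
    return any(
        host == allowed or host.endswith("." + allowed)
        for allowed in ALLOWED_HOSTS
    )
```

Matching on the parsed hostname rather than on a string prefix of the URL is the design choice that matters here, since prefix checks are easy to bypass with crafted hosts.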

Sequence Diagram

sequenceDiagram
    participant User
    participant Orch as Orchestrator<br/>(ResearchOrchestratorWorkflow)
    participant GH as GitHub Researcher<br/>(GitHubResearchWorkflow)
    participant Docs as Docs Researcher<br/>(DocsResearchWorkflow)
    participant Slack as Slack Researcher<br/>(SlackResearchWorkflow)

    User->>Orch: EVENT_SEND (user query)
    Orch->>Orch: Runner.run(agent, max_turns=50)<br/>agent calls dispatch tools in parallel

    par Parallel dispatch
        Orch->>GH: acp.create_task(source_task_id=orch_task_id)<br/>+ EVENT_SEND {query}
        Orch->>Docs: acp.create_task(source_task_id=orch_task_id)<br/>+ EVENT_SEND {query}
        Orch->>Slack: acp.create_task(source_task_id=orch_task_id)<br/>+ EVENT_SEND {query}
    end

    GH-->>User: adk.messages.create(task_id=orch_task_id) [streams progress]
    Docs-->>User: adk.messages.create(task_id=orch_task_id) [streams progress]
    Slack-->>User: adk.messages.create(task_id=orch_task_id) [streams progress]

    GH->>Orch: EVENT_SEND {event_type: research_complete, result: ...}
    Docs->>Orch: EVENT_SEND {event_type: research_complete, result: ...}
    Slack->>Orch: EVENT_SEND {event_type: research_complete, result: ...}

    Note over Orch: workflow.wait_condition satisfied<br/>for each child_task_id

    Orch->>Orch: Runner.run() resumes — synthesizes all results
    Orch-->>User: Final comprehensive answer (streamed via TemporalStreamingHooks)
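
The fan-out/fan-in flow in the diagram can be approximated with plain asyncio. This is a toy simulation under stated assumptions, not the AgentEx or Temporal API: `MESSAGES`, `run_subagent`, and `orchestrate` are hypothetical stand-ins for the message store, the subagent workflows, and the orchestrator.

```python
import asyncio
from collections import defaultdict
from typing import Dict, List

# In-memory stand-in for the message store, keyed by task ID (hypothetical).
MESSAGES: Dict[str, List[str]] = defaultdict(list)


async def run_subagent(name: str, query: str, parent_task_id: str) -> str:
    """Simulated subagent: writes progress to the parent's task, then reports back."""
    MESSAGES[parent_task_id].append(f"[{name}] searching for {query!r}")
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"{name} findings for {query!r}"


async def orchestrate(query: str, task_id: str) -> str:
    """Dispatch all subagents in parallel, then synthesize once all have reported."""
    names = ["github", "docs", "slack"]
    results = await asyncio.gather(
        *(run_subagent(name, query, task_id) for name in names)
    )
    answer = "\n".join(results)
    MESSAGES[task_id].append(answer)
    return answer
```

Because every subagent writes to the parent's task ID, the user sees one unified stream instead of four separate conversations.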

Comments Outside Diff (1)

  1. examples/demos/deep_research/orchestrator/project/workflow.py, lines 2192-2196

    P1 Missing error handling around Runner.run() may hang workflow indefinitely

    The orchestrator's on_task_event_send signal handler has no try/except around Runner.run(). If the runner throws for any reason (e.g., max turns exceeded, API error, timeout), _complete_task will never be set to True, causing on_task_create to hang forever at await workflow.wait_condition(lambda: self._complete_task, timeout=None).

    All three subagent workflows correctly wrap their Runner.run() calls in try/except and ensure _complete_task = True is always reached. The orchestrator should do the same:

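            try:
                # Run the orchestrator agent; a failure here must not skip the flag below.
                result = await Runner.run(agent, self._input_list, hooks=hooks, max_turns=50)
                self._research_result = result.final_output
                self._input_list = result.to_input_list()
            except Exception as e:
                # Record the failure as the result so the caller can see why research stopped.
                logger.error("Orchestrator research error: %s", e)
                self._research_result = f"Research failed: {e}"
            finally:
                # Always release on_task_create, which waits on this flag.
                self._complete_task = True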
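
The failure mode can be reproduced outside Temporal with a small asyncio model. This is a sketch only: `ToyWorkflow`, `failing_run`, and `demo` are hypothetical, and an `asyncio.Event` plays the role of the `_complete_task` flag.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class ToyWorkflow:
    """Plain-asyncio stand-in for a workflow that waits on a completion flag."""

    def __init__(self) -> None:
        self._done = asyncio.Event()  # plays the role of _complete_task
        self.result: Optional[str] = None

    async def handle_event(self, run: Callable[[], Awaitable[str]]) -> None:
        """Run one agent step and always release the waiter, even on failure."""
        try:
            self.result = await run()
        except Exception as exc:
            # Surface the failure as the result instead of losing it.
            self.result = f"Research failed: {exc}"
        finally:
            self._done.set()

    async def wait_until_done(self, timeout: float) -> Optional[str]:
        """Block until handle_event has finished, or raise on timeout."""
        await asyncio.wait_for(self._done.wait(), timeout)
        return self.result


async def failing_run() -> str:
    raise RuntimeError("max turns exceeded")


async def demo() -> Optional[str]:
    workflow = ToyWorkflow()
    await workflow.handle_event(failing_run)
    return await workflow.wait_until_done(timeout=1.0)
```

Without the `finally` clause, an exception in `failing_run` would leave the event unset and `wait_until_done` would only return by timing out, which mirrors the indefinite hang described above when the wait has no timeout.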
Prompt To Fix All With AI

This is a comment left during a code review.
Path: examples/demos/deep_research/orchestrator/Dockerfile
Line: 15-18

Comment:
**Hardcoded `arm64` tctl binary breaks builds on `amd64` hosts**

The Dockerfile downloads the `linux_arm64` Temporal CLI binary unconditionally. This works on Apple Silicon (M-series) Macs but will fail silently or crash at runtime on any `amd64`/`x86_64` build environment (e.g., Linux CI, most cloud build agents).

The same issue exists in all four Dockerfiles:
- `examples/demos/deep_research/orchestrator/Dockerfile:15`
- `examples/demos/deep_research/github_researcher/Dockerfile:15`
- `examples/demos/deep_research/docs_researcher/Dockerfile:16`
- `examples/demos/deep_research/slack_researcher/Dockerfile:15`

Consider detecting the architecture at build time:

```dockerfile
RUN ARCH=$(dpkg --print-architecture) && \
    curl -fsSL https://github.com/temporalio/tctl/releases/download/v1.18.1/tctl_1.18.1_linux_${ARCH}.tar.gz -o /tmp/tctl.tar.gz && \
    tar -xzf /tmp/tctl.tar.gz -C /usr/local/bin && \
    chmod +x /usr/local/bin/tctl && \
    rm /tmp/tctl.tar.gz
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: examples/demos/deep_research/docs_researcher/project/summarization.py
Line: 94-102

Comment:
**`find_last_summary_index` is dead code — defined but never called**

The `find_last_summary_index` function is defined here but is not imported or used in any workflow. The same dead function exists in all three subagent summarization modules:
- `examples/demos/deep_research/docs_researcher/project/summarization.py:94`
- `examples/demos/deep_research/github_researcher/project/summarization.py:94`
- `examples/demos/deep_research/slack_researcher/project/summarization.py:94`

Either remove it or wire it into the summarization logic if it was intended to be used for a "replace from last summary" strategy.
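
If the function is kept, one possible reading of a "replace from last summary" strategy is sketched below. The PR's actual signature and summary format are not shown in this review, so `SUMMARY_PREFIX`, the item shapes, and `items_since_last_summary` are assumptions made purely for illustration.

```python
from typing import Any, Dict, List

# Hypothetical marker; the real summary format is not shown in this review.
SUMMARY_PREFIX = "[CONVERSATION SUMMARY]"


def find_last_summary_index(items: List[Dict[str, Any]]) -> int:
    """Return the index of the most recent summary message, or -1 if none exists."""
    for i in range(len(items) - 1, -1, -1):
        content = items[i].get("content")
        if isinstance(content, str) and content.startswith(SUMMARY_PREFIX):
            return i
    return -1


def items_since_last_summary(items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return only the items added after the last summary.

    A later compaction pass could summarize just this slice instead of
    re-summarizing the whole conversation.
    """
    return items[find_last_summary_index(items) + 1:]
```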

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: examples/demos/deep_research/docs_researcher/project/summarization.py
Line: 1-10

Comment:
**`summarization.py` is duplicated verbatim across all three subagents**

The entire `summarization.py` file — including `COMPACTION_BYTE_THRESHOLD`, `KEEP_RECENT_OUTPUTS`, `estimate_payload_size`, `should_compact`, `compact_tool_outputs`, `new_summarization_agent`, `find_last_summary_index`, and `apply_summary_to_input_list` — is copy-pasted identically into:
- `docs_researcher/project/summarization.py`
- `github_researcher/project/summarization.py`
- `slack_researcher/project/summarization.py`

Any future change (e.g., adjusting `COMPACTION_BYTE_THRESHOLD` or fixing a bug in `compact_tool_outputs`) must be applied in three places. Consider extracting this into a shared package or at minimum leaving a comment noting that changes must be mirrored across all three files.
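
A shared module could look roughly like the sketch below. The function and constant names are the ones listed above, but the bodies, the threshold values, and `PLACEHOLDER` are assumptions, since the PR's implementation is not reproduced in this review. The two item shapes handled are the Responses API `function_call_output` item and the Chat Completions `tool` message.

```python
import json
from typing import Any, Dict, List, Optional

# Hypothetical values; the PR's actual thresholds are not shown in this review.
COMPACTION_BYTE_THRESHOLD = 200_000
KEEP_RECENT_OUTPUTS = 3
PLACEHOLDER = "[tool output removed during compaction]"


def estimate_payload_size(items: List[Dict[str, Any]]) -> int:
    """Approximate the serialized size of the conversation in bytes."""
    return len(json.dumps(items, default=str).encode("utf-8"))


def should_compact(items: List[Dict[str, Any]]) -> bool:
    """Return True once the conversation exceeds the byte threshold."""
    return estimate_payload_size(items) > COMPACTION_BYTE_THRESHOLD


def _output_key(item: Dict[str, Any]) -> Optional[str]:
    """Return the field holding tool output, for either message format."""
    if item.get("type") == "function_call_output":  # Responses API item
        return "output"
    if item.get("role") == "tool":  # Chat Completions message
        return "content"
    return None


def compact_tool_outputs(
    items: List[Dict[str, Any]], keep_recent: int = KEEP_RECENT_OUTPUTS
) -> List[Dict[str, Any]]:
    """Replace all but the most recent tool outputs with a short placeholder.

    Returns a new list; the input items are not mutated.
    """
    tool_indices = [i for i, item in enumerate(items) if _output_key(item)]
    stale = set(tool_indices[:-keep_recent] if keep_recent else tool_indices)
    return [
        {**item, _output_key(item): PLACEHOLDER} if i in stale else item
        for i, item in enumerate(items)
    ]
```

Each subagent would then import from the one shared module, so a threshold change or bug fix lands in a single place.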

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "Add deep research mu..."

Greptile also left 1 inline comment on this PR.

Demonstrates orchestrator + subagent pattern with shared task ID,
Temporal workflows, MCP server integration, and conversation compaction.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines +15 to +18
netcat-openbsd \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/**


P1 Hardcoded arm64 tctl binary breaks builds on amd64 hosts

The Dockerfile downloads the linux_arm64 Temporal CLI binary unconditionally. This works on Apple Silicon (M-series) Macs but will fail silently or crash at runtime on any amd64/x86_64 build environment (e.g., Linux CI, most cloud build agents).

The same issue exists in all four Dockerfiles:

  • examples/demos/deep_research/orchestrator/Dockerfile:15
  • examples/demos/deep_research/github_researcher/Dockerfile:15
  • examples/demos/deep_research/docs_researcher/Dockerfile:16
  • examples/demos/deep_research/slack_researcher/Dockerfile:15

Consider detecting the architecture at build time:

RUN ARCH=$(dpkg --print-architecture) && \
    curl -fsSL https://github.com/temporalio/tctl/releases/download/v1.18.1/tctl_1.18.1_linux_${ARCH}.tar.gz -o /tmp/tctl.tar.gz && \
    tar -xzf /tmp/tctl.tar.gz -C /usr/local/bin && \
    chmod +x /usr/local/bin/tctl && \
    rm /tmp/tctl.tar.gz
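
The snippet works because `dpkg --print-architecture` reports Debian-style names (amd64, arm64), which match the suffix in the asset name, whereas `uname -m` reports x86_64 or aarch64. For a build script outside Docker, the same mapping might be sketched in Python as follows; `tctl_asset_url` and `_ARCH_ALIASES` are hypothetical helpers, and the URL pattern is copied from the snippet above.

```python
import platform
from typing import Optional

# Maps platform.machine() values to the architecture suffix used in the
# release asset name quoted in the review comment.
_ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def tctl_asset_url(version: str = "1.18.1", machine: Optional[str] = None) -> str:
    """Build the download URL for the current (or given) machine architecture."""
    machine = (machine or platform.machine()).lower()
    try:
        arch = _ARCH_ALIASES[machine]
    except KeyError:
        # Fail loudly instead of downloading a binary for the wrong architecture.
        raise ValueError(f"unsupported architecture: {machine}") from None
    return (
        "https://github.com/temporalio/tctl/releases/download/"
        f"v{version}/tctl_{version}_linux_{arch}.tar.gz"
    )
```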
