[Feature] OpenClaw integration #27

@johnlam1968

Problem

We can partially integrate an OpenClaw agent as the "LLM" for OpenRoom.
However, this agent is not aware of the OpenRoom tools.

Proposed Solution

Modify OpenRoom's llmClient.ts to inject OpenClaw-specific tools when OpenClaw is configured as the backend. These tools would translate OpenRoom function calls to OpenClaw tool calls.
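
A minimal sketch of what the injection step could look like. This is not OpenRoom's actual llmClient.ts: `ToolDef`, `ChatRequestBody`, `isOpenClawBackend`, and `injectTools` are hypothetical names, and detecting the backend by an `openclaw:` model prefix is an assumption based on the model id used in the setup below.

```typescript
// Hypothetical types and names: this is not OpenRoom's actual llmClient.ts.
interface ToolDef {
  type: "function";
  function: { name: string; description: string; parameters: object };
}

interface ChatRequestBody {
  model: string;
  messages: { role: string; content: string }[];
  tools?: ToolDef[];
}

// Assumption: a model id starting with "openclaw:" marks the OpenClaw backend.
function isOpenClawBackend(model: string): boolean {
  return model.indexOf("openclaw:") === 0;
}

// Returns a copy of the request body with the extra tools appended,
// skipping any tool whose name is already present.
// Requests for other backends are returned unchanged.
function injectTools(body: ChatRequestBody, extra: ToolDef[]): ChatRequestBody {
  if (!isOpenClawBackend(body.model)) return body;
  const existing = body.tools ?? [];
  const fresh = extra.filter(
    (t) => !existing.some((e) => e.function.name === t.function.name)
  );
  return { ...body, tools: [...existing, ...fresh] };
}
```

Returning a copy rather than mutating the request keeps the change local to one call site, which should make a fork easier to rebase.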

Additional Context - OpenClaw and OpenRoom setup

OpenRoom + OpenClaw Integration Project

Created: 2026-03-24
Last Updated: 2026-03-24
Status: Active Development
Priority: High
Estimated Time: 4-6 hours


Overview

Connect OpenClaw's agentic capabilities to OpenRoom's browser-based desktop environment, enabling natural language control of OpenRoom apps through OpenClaw agents.


Architecture

OpenRoom (by MiniMax AI)

  • Browser-based desktop with AI Agent operating apps via natural language
  • Built-in apps: Music Player (ID 3), Twitter, Chess, FreeCell, Email, Diary, Album, CyberNews
  • Uses function calling (tools) to control apps
  • Open source: github.com/MiniMax-AI/OpenRoom

OpenClaw

  • Exposes OpenAI-compatible Chat Completions API at /v1/chat/completions
  • Supports multiple agents with memory, tools, and session management
  • Also has a Responses API at /v1/responses (OpenAI Responses-style)

Current Setup

OpenClaw Configuration

{
  "gateway": {
    "http": {
      "endpoints": {
        "chatCompletions": { "enabled": true },
        "responses": { "enabled": true }
      }
    }
  }
}

OpenRoom Configuration (on the node host, via an SSH tunnel to the OpenClaw gateway)

  • Base URL: http://localhost:18789
  • API Key: Gateway token
  • Model: openclaw:openroom
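
For reference, these settings would map onto a chat completions request roughly as sketched below. `GatewayConfig`, `HttpRequest`, and `buildChatRequest` are hypothetical names, and sending the gateway token as a Bearer token is an assumption based on how OpenAI-compatible clients usually pass an API key.

```typescript
// Hypothetical helper: none of these names come from OpenRoom or OpenClaw.
interface GatewayConfig {
  baseUrl: string; // e.g. "http://localhost:18789"
  token: string;   // the gateway token, used as the API key
  model: string;   // e.g. "openclaw:openroom"
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request as plain data; it can then be handed to fetch()
// or any other HTTP client.
function buildChatRequest(cfg: GatewayConfig, messages: ChatMessage[]): HttpRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: cfg.baseUrl.replace(/\/+$/, "") + "/v1/chat/completions",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Assumption: the token is sent as a Bearer token.
      Authorization: `Bearer ${cfg.token}`,
    },
    body: JSON.stringify({ model: cfg.model, messages }),
  };
}
```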

Created Agents

  • openroom — Dedicated agent for OpenRoom integration (no channel bindings)
    • Workspace: ~/.openclaw/workspace-openroom/
    • Model: MiniMax-M2.7-highspeed

Testing Results

✅ Working

  1. Basic chat — OpenRoom can chat with OpenClaw agent
  2. Agent awareness — Agent knows what apps are open in OpenRoom
  3. Context awareness — Agent responds to "I opened the music app"
  4. Chat Completions API — Returns responses correctly
  5. Responses API — Also functional at /v1/responses

❌ Not Working

  1. Action execution — Agent outputs action JSON as text, OpenRoom doesn't execute it
  2. Sub-agent spawning — Results route to wrong channel (known bug #29449)
  3. Tool calling — OpenClaw agent lacks OpenRoom's tool definitions

Known Issues

  • Bug #29449: /v1/chat/completions hardcodes messageChannel: "webchat" and ignores the x-openclaw-message-channel header
  • Sub-agent results go to configured channels (Telegram/Aight), not back to HTTP caller

OpenRoom Tool System

OpenRoom expects the LLM to use function calling with these tools:

| Tool | Purpose |
| --- | --- |
| list_apps | Get available apps |
| file_read | Read app's meta.yaml, guide.md, data files |
| file_write | Write data files |
| app_action | Notify app to take action (PLAY, PAUSE, etc.) |
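
As a rough illustration, the four tools could be declared in the OpenAI function-calling format like this. The tool names and purposes come from the table above; the parameter names (`path`, `content`, `app_id`, `action_type`) and the `defineTool` helper are assumptions, since the real schemas live in OpenRoom's source.

```typescript
interface ParamSpec {
  type: "string" | "number";
  description: string;
}

// Shape of an OpenAI-style function tool definition.
interface ToolDefinition {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: {
      type: "object";
      properties: Record<string, ParamSpec>;
      required: string[];
    };
  };
}

// Hypothetical helper to cut down on repetition.
// In this sketch every declared parameter is marked as required.
function defineTool(
  name: string,
  description: string,
  properties: Record<string, ParamSpec> = {}
): ToolDefinition {
  return {
    type: "function",
    function: {
      name,
      description,
      parameters: { type: "object", properties, required: Object.keys(properties) },
    },
  };
}

// Parameter names below are assumptions, not OpenRoom's actual schemas.
const openRoomTools: ToolDefinition[] = [
  defineTool("list_apps", "Get available apps"),
  defineTool("file_read", "Read an app's meta.yaml, guide.md, or data files", {
    path: { type: "string", description: "File path, e.g. apps/{app}/meta.yaml" },
  }),
  defineTool("file_write", "Write a data file", {
    path: { type: "string", description: "File path to write" },
    content: { type: "string", description: "New file content" },
  }),
  defineTool("app_action", "Notify an app to take an action (PLAY, PAUSE, etc.)", {
    app_id: { type: "number", description: "Numeric app ID" },
    action_type: { type: "string", description: "Action name, e.g. PLAY" },
  }),
];
```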

System Prompt Flow:

  1. list_apps → discover apps
  2. file_read("apps/{app}/meta.yaml") → learn actions
  3. file_read("apps/{app}/guide.md") → learn data schema
  4. file_write → modify data
  5. app_action → refresh app
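
The call order above can be written out as data, which is handy for checking that an agent's tool calls follow the expected sequence. In practice the LLM chooses each call itself; this sketch only fixes the order. `PlannedCall` and `planAppUpdate` are hypothetical names, and the argument names are assumptions.

```typescript
// A tool call as name plus arguments. Argument names are assumptions.
interface PlannedCall {
  name: "list_apps" | "file_read" | "file_write" | "app_action";
  args: Record<string, string | number>;
}

// Returns the five-step sequence for changing one app's data and
// then asking the app to refresh.
function planAppUpdate(
  app: string,        // app folder name, e.g. "music"
  appId: number,      // numeric app ID, e.g. 3
  dataPath: string,   // data file to modify (path is hypothetical)
  newContent: string, // new file content
  actionType: string  // action to send afterwards, e.g. "PLAY"
): PlannedCall[] {
  return [
    { name: "list_apps", args: {} },                                  // 1. discover apps
    { name: "file_read", args: { path: `apps/${app}/meta.yaml` } },   // 2. learn actions
    { name: "file_read", args: { path: `apps/${app}/guide.md` } },    // 3. learn data schema
    { name: "file_write", args: { path: dataPath, content: newContent } }, // 4. modify data
    { name: "app_action", args: { app_id: appId, action_type: actionType } }, // 5. refresh app
  ];
}
```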

App IDs:

| ID | App | Actions |
| --- | --- | --- |
| 1 | OS | OPEN_APP, CLOSE_APP, SET_WALLPAPER |
| 3 | Music Player | PLAY, PAUSE, STOP, NEXT, PREV, VOLUME |
| 11 | Email | SEND, DELETE, MARK_READ |
| 4 | Diary | CREATE_ENTRY, EDIT_ENTRY |
| 2 | Twitter | CREATE_POST, DELETE_POST, LIKE |
| 8 | Album | VIEW_PHOTO |
| 14 | CyberNews | REFRESH |
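
A translation layer could use this table to reject unknown app/action pairs before they reach OpenRoom. The sketch below copies the IDs and actions listed above; `APP_ACTIONS` and `isValidAction` are hypothetical names, and the list may not match what each app's meta.yaml actually declares.

```typescript
interface AppInfo {
  name: string;
  actions: readonly string[];
}

// IDs and actions copied from the table above.
const APP_ACTIONS: Record<number, AppInfo> = {
  1: { name: "OS", actions: ["OPEN_APP", "CLOSE_APP", "SET_WALLPAPER"] },
  2: { name: "Twitter", actions: ["CREATE_POST", "DELETE_POST", "LIKE"] },
  3: { name: "Music Player", actions: ["PLAY", "PAUSE", "STOP", "NEXT", "PREV", "VOLUME"] },
  4: { name: "Diary", actions: ["CREATE_ENTRY", "EDIT_ENTRY"] },
  8: { name: "Album", actions: ["VIEW_PHOTO"] },
  11: { name: "Email", actions: ["SEND", "DELETE", "MARK_READ"] },
  14: { name: "CyberNews", actions: ["REFRESH"] },
};

// True only if the app ID is known and the action is listed for it.
function isValidAction(appId: number, action: string): boolean {
  const app = APP_ACTIONS[appId];
  return app !== undefined && app.actions.indexOf(action) !== -1;
}
```

For example, `isValidAction(3, "PLAY")` is true, while `isValidAction(3, "SEND")` and `isValidAction(99, "PLAY")` are false.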

Integration Options

Option 1: Modify OpenRoom's LLM Client (Recommended)

Approach: Fork OpenRoom and inject OpenClaw-specific tool definitions

Changes needed:

  1. Modify apps/webuiapps/src/lib/llmClient.ts to inject tools
  2. Add tool: openroom_action(app_id, action_type, params)
  3. Tool executes OpenRoom actions via internal API
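
A sketch of the receiving side of step 2, assuming the model returns an OpenAI-style tool call whose `arguments` field is a JSON string. `OpenRoomAction` and `toOpenRoomAction` are hypothetical names; forwarding the result to OpenRoom's internal API (step 3) is left out because that API is not described here.

```typescript
// OpenAI-style tool call as it appears on an assistant message.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
}

// Hypothetical internal representation of a validated action.
interface OpenRoomAction {
  appId: number;
  actionType: string;
  params: Record<string, unknown>;
}

// Converts an openroom_action tool call into a typed action.
// Returns null for other tools, malformed JSON, or missing fields.
function toOpenRoomAction(call: ToolCall): OpenRoomAction | null {
  if (call.function.name !== "openroom_action") return null;

  let raw: unknown;
  try {
    raw = JSON.parse(call.function.arguments);
  } catch {
    return null; // the model produced invalid JSON
  }
  if (typeof raw !== "object" || raw === null) return null;

  const args = raw as Record<string, unknown>;
  const appId = args["app_id"];
  const actionType = args["action_type"];
  const params = args["params"];
  if (typeof appId !== "number" || typeof actionType !== "string") return null;

  return {
    appId,
    actionType,
    // params is optional; fall back to an empty object.
    params:
      typeof params === "object" && params !== null
        ? (params as Record<string, unknown>)
        : {},
  };
}
```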

Pros:

  • Clean integration
  • Full tool calling support
  • Maintains OpenRoom's architecture

Cons:

  • Requires maintaining a fork
  • Updates may conflict

Option 2: Add Tools to OpenClaw Agent

Approach: Add OpenRoom action tools to the openroom agent

Challenge: OpenRoom runs in the browser, so the gateway can't push actions to it

Possible solutions:

  • Browser tool: Agent uses browser to control OpenRoom (hacky)
  • WebSocket relay: A local relay that both OpenRoom and the gateway connect to
  • OpenRoom API server: Expose actions via HTTP
  • Hybrid: OpenRoom parses action JSON from responses
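
For the hybrid idea, OpenRoom would need new code that pulls an action object out of the agent's text reply. A minimal sketch follows; the JSON field names (`app_id`, `action_type`) are assumptions about what the agent emits, and `extractActionJson` and `tryParseAction` are hypothetical names.

```typescript
interface ParsedAction {
  app_id: number;
  action_type: string;
}

// Parses one candidate string and checks it has the assumed fields.
function tryParseAction(candidate: string): ParsedAction | null {
  try {
    const value: unknown = JSON.parse(candidate);
    if (typeof value !== "object" || value === null) return null;
    const rec = value as Record<string, unknown>;
    const appId = rec["app_id"];
    const actionType = rec["action_type"];
    if (typeof appId === "number" && typeof actionType === "string") {
      return { app_id: appId, action_type: actionType };
    }
    return null;
  } catch {
    return null;
  }
}

// Scans the text for the first balanced {...} span that parses as an action.
// Braces inside JSON strings are ignored when tracking nesting depth.
function extractActionJson(text: string): ParsedAction | null {
  for (let start = text.indexOf("{"); start !== -1; start = text.indexOf("{", start + 1)) {
    let depth = 0;
    let inString = false;
    let escaped = false;
    for (let i = start; i < text.length; i++) {
      const ch = text[i];
      if (inString) {
        if (escaped) escaped = false;
        else if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') inString = true;
      else if (ch === "{") depth++;
      else if (ch === "}") {
        depth--;
        if (depth === 0) {
          const action = tryParseAction(text.slice(start, i + 1));
          if (action) return action;
          break; // balanced but not an action; try the next "{"
        }
      }
    }
  }
  return null;
}
```

Parsing free text is more fragile than real tool calls, so this would be best treated as a fallback.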

Option 3: System Prompt Engineering

Approach: Improve agent's ability to output action JSON

Problem: OpenRoom doesn't parse JSON from text responses; it only executes tool calls

Option 4: Wait for OpenClaw Bug Fix

A fix for bug #29449 may enable proper channel routing


Files Created

OpenClaw Agent Workspace

~/.openclaw/workspace-openroom/
├── SOUL.md        # Agent personality with action instructions
├── MEMORY.md      # OpenRoom connection notes
├── IDENTITY.md    # Agent identity
└── BOOTSTRAP.md   # Setup instructions

Next Steps

  1. Decide integration approach — Option 1 (fork OpenRoom) vs Option 2 (tools)
  2. Implement action execution — Make agent able to control apps
  3. Test with multiple apps — Music, Twitter, Diary, etc.
  4. Document best practices — How to prompt for app control
  5. Evaluate success — Compare vs native OpenRoom LLM

Implementation Plan

Phase 1: Infrastructure Setup ✅

  • Decision made: Option B (node host)
  • Enable gateway.http.endpoints.chatCompletions.enabled: true in openclaw.json
  • Verify endpoint works: curl -X POST http://127.0.0.1:18789/v1/chat/completions
  • Test with different agents (main, researcher) ✅
  • Document endpoint URL and auth requirements

Phase 2: OpenRoom Configuration ✅

  • Clone/run OpenRoom locally
  • Configure OpenRoom's LLM settings to point to OpenClaw endpoint
  • Test basic chat interaction ✅

Phase 3: Action Integration (Current)

  • Implement action execution (tool calling or JSON parsing)
  • Test with multiple apps

Phase 4: Deep Integration

  • Test OpenClaw agent controlling OpenRoom apps
  • Evaluate success criteria

Resources Needed

  • OpenRoom running locally (port 3000)
  • OpenClaw gateway accessible from OpenRoom
  • Test user for chat interactions
  • Time: ~4-6 hours estimated

Risks & Mitigations

| Risk | Mitigation |
| --- | --- |
| Tool call format mismatch | Create translation layer/middleware |
| Session management complexity | Use stable session via user param |
| Security exposure | Keep endpoint on loopback only |
| OpenRoom updates break integration | Pin versions or fork |
