Add WebSocket voice player for remote audio listening #711

Closed
Alessandro-Improta wants to merge 2 commits into danielmiessler:main from Alessandro-Improta:feat/websocket-voice-player

Alessandro-Improta commented Feb 17, 2026

Problem

Part of the immersive experience that PAI users are looking for is interacting with their PAI instance in natural language. Daniel Miessler understands this and built a voice server into the project that lets the DA speak to its user. However, this only works if the DA runs on the device the user is using.

In the future, the most likely way for a DA to be with us at all times is for it to live on a remote machine that our many devices can access. That means we need to solve the problems that come with a remotely hosted DA.

Solution

We need a universal, easy-to-implement, secure way to deliver audio messages to all sorts of devices. This solution uses WebSocket, so the client device only needs a browser to receive messages. Almost every device that can connect to the internet has a browser, which allows for near-universal compatibility.

PR details

This PR implements WebSocket-first audio dispatch, allowing voice notifications to be heard in a browser when remoting into the server. When WebSocket clients are connected, audio is streamed to the browser; otherwise, the server falls back to local playback.
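
For reviewers who want the dispatch rule in concrete terms, here is a minimal sketch of the "WebSocket first, local playback as fallback" decision. It is not the code from this PR: `AudioSink`, `ConnectionManager`, `dispatchAudio`, and `playLocally` are hypothetical names chosen for illustration, and falling back when every send fails is an assumption.

```typescript
// Hypothetical sketch: names are illustrative, not taken from the PR's code.

/** Minimal shape of a connected client; a real WebSocket would satisfy this. */
interface AudioSink {
  send(data: Uint8Array): void;
}

/** Tracks connected clients and broadcasts audio to all of them. */
class ConnectionManager {
  private clients = new Set<AudioSink>();

  add(client: AudioSink): void {
    this.clients.add(client);
  }

  remove(client: AudioSink): void {
    this.clients.delete(client);
  }

  get count(): number {
    return this.clients.size;
  }

  /** Sends the audio to every client and returns how many sends succeeded. */
  broadcast(audio: Uint8Array): number {
    let delivered = 0;
    for (const client of this.clients) {
      try {
        client.send(audio);
        delivered++;
      } catch {
        // Assumption: a client whose send throws is treated as disconnected.
        this.clients.delete(client);
      }
    }
    return delivered;
  }
}

type Route = "websocket" | "local";

/** Streams to browsers when any are connected; otherwise plays locally. */
function dispatchAudio(
  manager: ConnectionManager,
  audio: Uint8Array,
  playLocally: (audio: Uint8Array) => void,
): Route {
  if (manager.count > 0 && manager.broadcast(audio) > 0) {
    return "websocket";
  }
  playLocally(audio);
  return "local";
}
```

Returning the chosen route makes the decision easy to log at the call site, which is one way the startup and status logging mentioned in this PR could be fed.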

Features:

  • Token-authenticated WebSocket endpoint (/ws) with secure random tokens
  • Browser-based audio player UI (/player) with auto-reconnect
  • WebSocket connection manager tracking active clients
  • Audio dispatch logic: check for WebSocket clients before local playback
  • Support for MP3 audio streaming (ElevenLabs TTS)
  • Exponential backoff reconnection (up to 30s delay)
  • WebSocket status endpoint (/ws/status) for monitoring
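
As an illustration of the reconnection feature listed above, the delay schedule can be sketched as a capped doubling. Only the 30-second ceiling comes from this PR; the 1-second base delay, the doubling factor, and the function name `reconnectDelayMs` are assumptions made for the example.

```typescript
// Hypothetical sketch: only the 30 s cap is stated in the PR description.

/**
 * Returns the delay before reconnect attempt `attempt` (0-based).
 * The delay doubles on each attempt and is capped at `maxMs`.
 */
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 1000, // assumed starting delay
  maxMs: number = 30000, // cap from the feature list: "up to 30s delay"
): number {
  const exponential = baseMs * 2 ** Math.max(0, attempt);
  return Math.min(maxMs, exponential);
}

// Delays for the first seven attempts under these assumptions.
const schedule: number[] = [0, 1, 2, 3, 4, 5, 6].map((n) => reconnectDelayMs(n));
```

In a browser client, a value like this would typically be passed to `setTimeout` before opening a new `WebSocket`, with the attempt counter reset to zero once a connection succeeds.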

Files added:

  • auth.ts: Cryptographically secure token generation and validation
  • ws-manager.ts: WebSocket connection tracking and broadcast methods
  • static/player.html: Browser audio player with Web Audio API
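
The token handling described for auth.ts could look roughly like the sketch below. This is not the PR's implementation: `generateToken` and `isValidToken` are hypothetical names, and the 32-byte length, hex encoding, and constant-time comparison are assumptions about what "cryptographically secure token generation and validation" might involve. It relies only on Node's built-in `node:crypto` and `node:buffer` modules.

```typescript
import { Buffer } from "node:buffer";
import { randomBytes, timingSafeEqual } from "node:crypto";

// Hypothetical sketch: function names, token length, and encoding are assumptions.

/** Generates a random token from the OS CSPRNG, encoded as hex. */
function generateToken(byteLength: number = 32): string {
  return randomBytes(byteLength).toString("hex");
}

/** Compares a client-supplied token with the expected one in constant time. */
function isValidToken(
  expected: string,
  provided: string | null | undefined,
): boolean {
  if (!provided) {
    return false;
  }
  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(provided, "utf8");
  // timingSafeEqual throws when lengths differ, so reject mismatches first.
  if (a.length !== b.length) {
    return false;
  }
  return timingSafeEqual(a, b);
}
```

A server would generate one token at startup and check the value supplied by the client before accepting a connection on /ws; how the client supplies it (query string, header, or first message) is not specified in this description.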

Files modified:

  • server.ts: WebSocket endpoint, audio dispatch logic, startup logging

Fixes #721

Alessandro-Improta and others added 2 commits February 16, 2026 12:33
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
danielmiessler (Owner) commented

Really interesting concept for remote voice playback! This PR was designed for an earlier architecture. The voice system was restructured in v4.0.

If you'd like to revisit this for the current architecture, please see Releases/v4.0.2 for the latest codebase. Remote audio is a compelling use case we'd be interested in supporting. Thanks for the contribution!

Development

Successfully merging this pull request may close these issues.

Voice Server doesn't work for anyone remoting into their PAI instance
