Feature Request: Configurable Speech Rate for TTS Providers #295

@jogi-poikola


Not sure if feature ideas should go in discussions rather than in issues, but here it goes this time.

Problem

The speaking rate for Google Cloud TTS is currently hardcoded to 1.0 (normal speed) in the voice server. Users cannot adjust playback speed without modifying the source code.

Proposed Solution

Add a configurable SPEAKING_RATE environment variable to control TTS playback speed:

# In $PAI_DIR/.env
# Range: 0.25 to 4.0, default 1.0
SPEAKING_RATE=1.7

Implementation:

// server.ts
const SPEAKING_RATE = parseFloat(process.env.SPEAKING_RATE || "1.0");

// In generateSpeechGoogle()
audioConfig: {
  speakingRate: SPEAKING_RATE,
  pitch: 0,
}
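
As written, `parseFloat` returns `NaN` for a malformed value and lets out-of-range values through to the API. A minimal sketch of a more defensive parse follows, assuming the 0.25 to 4.0 range stated above. `parseSpeakingRate` and the constant names are hypothetical and not part of the current server.ts.

```typescript
// Bounds taken from the range quoted in this issue for Google Cloud TTS.
const MIN_RATE = 0.25;
const MAX_RATE = 4.0;
const DEFAULT_RATE = 1.0;

// Hypothetical helper: turns the raw env string into a usable rate.
// Falls back to the default when the value is missing or not a number,
// and clamps anything outside the supported range.
function parseSpeakingRate(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_RATE;
  }
  const rate = Number.parseFloat(raw);
  if (!Number.isFinite(rate)) {
    return DEFAULT_RATE;
  }
  return Math.min(MAX_RATE, Math.max(MIN_RATE, rate));
}

// Intended use in server.ts (assumption):
//   const SPEAKING_RATE = parseSpeakingRate(process.env.SPEAKING_RATE);
```

Clamping rather than rejecting keeps the voice server running on a bad config value. Logging a warning when the value is adjusted would be a reasonable addition.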

Benefits

  • Users can adjust speech speed to personal preference
  • Faster playback (1.5x - 2.0x) reduces notification time
  • Slower playback (0.5x - 0.75x) improves comprehension for complex messages
  • No code changes needed to adjust speed

Provider Support

  • Google Cloud TTS: ✅ Supports 0.25x to 4.0x
  • ElevenLabs: ❌ No rate control via API (uses default speed)
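
Given the support matrix above, the setting should only take effect for providers that accept a rate. One way to sketch this, using the provider capabilities as listed in this issue; the `Provider` type, `RATE_SUPPORTED` table, and `effectiveRate` function are hypothetical names for illustration only:

```typescript
// Providers named in this issue; the identifiers are illustrative.
type Provider = "google" | "elevenlabs";

// Capability table mirroring the Provider Support list above.
const RATE_SUPPORTED: Record<Provider, boolean> = {
  google: true,
  elevenlabs: false,
};

// Returns the configured rate for providers that support it,
// and 1.0 (normal speed) for those that do not.
function effectiveRate(provider: Provider, configuredRate: number): number {
  return RATE_SUPPORTED[provider] ? configuredRate : 1.0;
}
```

With this shape, `effectiveRate("google", 1.7)` yields the configured rate, while `effectiveRate("elevenlabs", 1.7)` falls back to normal speed, so SPEAKING_RATE is silently ignored rather than causing an error.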

Default Value

Recommend keeping default at 1.0 (normal speed) for backward compatibility.
