Voice agents: integrations-first overview + Voice SDK page (hide Flow) #209
ArchieMcM234 wants to merge 1 commit into `main`.
Conversation
Pull request overview
Refactors the Voice agents documentation to emphasize partner integrations as the fastest onboarding path, while preserving and expanding Voice SDK documentation on a dedicated page.
Changes:
- Reworked the Voice agents overview to be integrations-first and added integration cards (Vapi/LiveKit/Pipecat).
- Renamed/expanded the former “Features” doc into a dedicated Voice SDK page and updated the Voice agents sidebar accordingly.
- Adjusted global sidebar ordering to place Voice agents after Text to speech.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| sidebars.ts | Reorders the top-level sidebar list to move "Voice agents" below "Text to speech". |
| docs/voice-agents/overview.mdx | Replaces the overview content with an integrations-first entry point and links to the Voice SDK page. |
| docs/voice-agents/sidebar.ts | Updates the Voice agents sidebar to show only Overview + Voice SDK (hides Flow). |
| docs/voice-agents/voice-sdk.mdx | Introduces a richer Voice SDK landing page (getting started, quickstart, presets, and config guidance). |
Comments suppressed due to low confidence (3)
docs/voice-agents/voice-sdk.mdx:10
`VoiceConfigSerialization` is imported but never used in this MDX file. Removing the unused raw import will avoid unnecessary bundling and keep the page easier to maintain.

docs/voice-agents/voice-sdk.mdx:101
The code sample comment says "interruptable"; the correct spelling is "interruptible".

docs/voice-agents/voice-sdk.mdx:137
The snippet for listing presets references `VoiceAgentConfigPreset.list_presets()` without showing where `VoiceAgentConfigPreset` comes from. Consider adding the relevant import (or fully qualifying it) so the example is copy/paste runnable.
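To make the snippet copy/paste runnable as the comment suggests, the preset class just needs to be in scope. A minimal stand-in sketch (this class body and the preset names are hypothetical illustrations, not the real SDK's implementation):

```python
# Hypothetical stand-in for the SDK's preset registry. In the real docs
# page, VoiceAgentConfigPreset would come from the Voice SDK import.
class VoiceAgentConfigPreset:
    _presets = {
        "conversation": {"turn_detection": True},
        "captions": {"turn_detection": False},
    }

    @classmethod
    def list_presets(cls):
        # Return the names of the available ready-to-use configurations.
        return sorted(cls._presets)

# With the class imported (or defined) above, the doc snippet runs as-is:
print(VoiceAgentConfigPreset.list_presets())  # → ['captions', 'conversation']
```

The point of the review comment is simply that the doc snippet should either include the import line or fully qualify the name, so a reader can paste it and run it without hunting for the module.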
> Here's how to stream microphone audio to the Voice Agent and transcribe finalised segments of speech, with speaker ID:
Could we move this to an external file and import it?
Yes, this slipped through the cracks.
This is fixed in the next PR, which refactors the Voice SDK page into (hopefully) quite a comprehensive page.
Just as a note, it is stacked as a branch off of this branch.
lgavincrl left a comment:
I think the voice agents section should be moved to "Voice Agents, Integrations and SDKs". This way it's a little more 'together': it's an SDK and it encompasses the integration partners.
Otherwise, nice.
> If you're building it yourself, you can also use our Voice SDK. Integrations are built on top of the Voice SDK, which provides features optimized for conversational AI.
>
> ### Voice SDK vs Realtime SDK
>
> If you're building an integration and want to work with us, contact support.
Link to contact support like this:
[contact support](https://support.speechmatics.com).
I'd add this at the bottom of the page as a 'next steps'
> Use the Voice SDK when:
>
> ## Features
>
> - Building conversational AI or voice agents
> - You need automatic turn detection
> - You want speaker-focused transcription
> - You need ready-to-use presets for common scenarios
>
> Speechmatics provides building blocks you can use through integrations and the Voice SDK.
>
> Use the Realtime SDK when:
>
> It includes:
>
> - You need the raw stream of word-by-word transcription data
> - Building custom segmentation logic
> - You want fine-grained control over every event
> - Processing audio files or custom workflows
>
> - **Turn detection**: detect when a speaker has finished talking.
> - **Intelligent segmentation**: group partial transcripts into clean, speaker-attributed segments.
> - **Diarization**: identify and label different speakers.
> - **Speaker focus**: focus on or ignore specific speakers in multi-speaker scenarios.
> - **Preset configurations**: start quickly with ready-to-use settings.
> - **Structured events**: work with clean segments instead of raw word-level events.
Are we diverting from the "when to use x or y" scenario? The presentation of the differing use cases for the Realtime vs Voice SDKs was something that was highlighted as important previously.
> # Voice SDK overview
>
> The Voice SDK builds on our Realtime API to provide additional features optimized for conversational AI, using Python:
>
> Our integration partners can be the quickest way to get a production voice agent up and running.
The fastest way to create a production-ready voice agent is through utilizing our integration partners:
(add link cards here perhaps?)
> - **Preset configurations**: offers ready-to-use settings for conversations, note-taking, and captions.
> - **Simplified event handling**: delivers clean, structured segments instead of raw word-level events.
>
> If you're building it yourself, you can also use our Voice SDK. Integrations are built on top of the Voice SDK, which provides features optimized for conversational AI.
Voice agents overview
Production-ready voice agents with features optimized for conversational AI can be built using the Voice SDK, or through one of our integration partners, which are built on top of the Voice SDK:
(add link cards here perhaps? and add one for the voice sdk)
> - You need automatic turn detection
> - You want speaker-focused transcription
> - You need ready-to-use presets for common scenarios
>
> Speechmatics provides building blocks you can use through integrations and the Voice SDK.
I'd phrase this differently: try to introduce our features, and how they are optimized for conversational AI to enhance voice agents.
> ### Voice SDK vs Realtime SDK
>
> Use the Voice SDK when:
Perhaps also add a "when to use integrations"?
> - Building conversational AI or voice agents
> - You need automatic turn detection
> - You want speaker-focused transcription
> - You need ready-to-use presets for common scenarios
>
> Use the Realtime SDK when:
>
> - You need the raw stream of word-by-word transcription data
> - Building custom segmentation logic
> - You want fine-grained control over every event
> - Processing audio files or custom workflows
>
> ## Getting started
>
> ### 1. Create an API key
Perhaps add this as a 'Voice agents > Quickstart' page; see the realtime STT quickstart for example: https://docs.speechmatics.com/speech-to-text/realtime/quickstart
> ### 3. Quickstart
>
> Here's how to stream microphone audio to the Voice Agent and transcribe finalised segments of speech, with speaker ID:
Is this using very basic configs? Or is this using the presets config? I think it would be good to say which.
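For reference, the shape of the chunked streaming loop the quickstart describes can be sketched with the stdlib only. Here `fake_mic` stands in for the microphone stream and the SDK client call is omitted; everything except the SAMPLE_RATE/CHUNK_SIZE figures from the diff is an assumption:

```python
import io

SAMPLE_RATE = 16000          # 16 kHz mono PCM
CHUNK_SIZE = 160             # samples read per iteration
BYTES_PER_SAMPLE = 2         # 16-bit audio
CHUNK_BYTES = CHUNK_SIZE * BYTES_PER_SAMPLE

# Stand-in for a microphone: one second of silence as raw 16-bit PCM.
fake_mic = io.BytesIO(b"\x00" * SAMPLE_RATE * BYTES_PER_SAMPLE)

sent = 0
while True:
    chunk = fake_mic.read(CHUNK_BYTES)
    if not chunk:
        break
    # In the real quickstart this chunk would be handed to the Voice SDK
    # client, which emits finalised, speaker-attributed segments back.
    sent += 1

print(sent)  # 100 chunks of 10 ms each = 1 second of audio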
> ```python
> # Audio configuration
> SAMPLE_RATE = 16000  # Hz
> CHUNK_SIZE = 160  # Samples per read
> ```
"Samples per read": state this in a really basic way. It needs to be crystal clear, with no ambiguity as to what this config is or what it does.
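Concretely, `CHUNK_SIZE` is how many audio samples are pulled from the microphone on each read. The arithmetic below (plain Python, no SDK involved) shows what that means in time and bytes, assuming 16-bit mono audio:

```python
SAMPLE_RATE = 16000  # samples per second (Hz)
CHUNK_SIZE = 160     # samples pulled from the mic on each read

# Duration of one chunk: samples divided by samples-per-second.
chunk_ms = CHUNK_SIZE / SAMPLE_RATE * 1000
# Size of one chunk for 16-bit (2-byte) mono PCM.
chunk_bytes = CHUNK_SIZE * BYTES_PER_SAMPLE if (BYTES_PER_SAMPLE := 2) else 0

print(chunk_ms)     # 10.0 -> each read delivers 10 ms of audio
print(chunk_bytes)  # 320  -> 320 bytes per read at 16-bit mono
```

Spelling it out that way ("each read is 10 ms of audio") is one way to remove the ambiguity the comment flags.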
> @@ -45,7 +199,8 @@
> Silence duration in seconds to trigger turn end.
>
> Maximum delay before forcing turn end.
>
> `max_delay` (float, default: 0.7)
> Maximum transcription delay for word emission.
> Defaults to 0.7 seconds, but when using turn detection we recommend 1.0s for better accuracy. Turn detection will ensure finalisation latency is not affected.
>
> ### Speaker configuration
>
> `speaker_sensitivity` (float, default: 0.5)
I'd put this on a different page (if you don't use a separate quickstart), or in the tab layout - I don't think it's great to have super lengthy pages.
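The parameters quoted in the diff above could be collected into a small config object. The field names and defaults (`max_delay` 0.7, `speaker_sensitivity` 0.5) come from the diff; the dataclass itself is only an illustrative sketch, not the SDK's actual API:

```python
from dataclasses import dataclass

@dataclass
class TurnDetectionConfig:
    # Maximum transcription delay for word emission, in seconds.
    # Default is 0.7; the diff recommends 1.0 when turn detection is on.
    max_delay: float = 0.7
    # Speaker configuration parameter quoted in the diff (default 0.5).
    speaker_sensitivity: float = 0.5

# The recommended override from the diff: raise max_delay alongside
# turn detection, since turn detection keeps finalisation latency down.
config = TurnDetectionConfig(max_delay=1.0)
print(config.max_delay, config.speaker_sensitivity)  # 1.0 0.5
```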
Summary
This PR refactors the Voice agents docs to make it easier for users to get started via integration partners, while keeping the Voice SDK documentation available in a dedicated page.
What changed
Renamed the `Features` page to `voice-sdk`, and expanded it with:

Navigation / IA
Voice agents sidebar is now:
Notes / follow-ups
Testing
Files changed
docs/sidebars.ts