Voice SDK restructure #210

Open
ArchieMcM234 wants to merge 2 commits into main from voice-sdk-restructure

@ArchieMcM234
Contributor

Note: This PR is stacked on top of deprecate-flow-integrations, which could (and probably should) be merged first.

Summary

Refactors and expands the Voice SDK documentation page with:

  • Reorganized content structure
  • New code examples (quickstart, configuration, event handling, speaker focus, etc.)
  • 10 new Python snippet files for cleaner code embedding

Future consideration

The Voice SDK docs may eventually move to live alongside the SDK repo (like other SDK docs), but
this provides a solid foundation for the integrations pages in the meantime.

@vercel
vercel bot commented Feb 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Ready | Preview, Comment | Feb 24, 2026 1:36pm |

Copilot AI left a comment

Pull request overview

This PR restructures the Voice Agents documentation by introducing a dedicated Voice SDK page with embedded Python snippets, and updating navigation to highlight the new structure.

Changes:

  • Adds a new Voice SDK documentation page with reorganized explanations and many embedded Python examples.
  • Updates the Voice Agents sidebar to point to the new Voice SDK page (and removes the previous “Features” entry).
  • Reworks the Voice Agents overview page to emphasize integrations with LinkCard-based navigation.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| sidebars.ts | Reorders the voiceAgentsSidebar placement in the top-level docs sidebar list. |
| docs/voice-agents/sidebar.ts | Updates Voice Agents nav items to include Overview + Voice SDK (removes Flow sidebar inclusion). |
| docs/voice-agents/overview.mdx | Replaces prior Voice SDK-focused overview with an integrations-first overview using LinkCards. |
| docs/voice-agents/voice-sdk.mdx | New, comprehensive Voice SDK guide with sections on events/segments, presets, configuration, diarization, and serialization. |
| docs/voice-agents/features.mdx | Removes the old “Features” page content. |
| docs/voice-agents/assets/*.py | Adds multiple new Python snippet files used for embedding into the new Voice SDK page; adjusts custom-config.py. |

label: "Features",
id: "voice-agents/voice-sdk",
label: "Voice SDK",
},

Copilot AI Feb 24, 2026

The sidebar no longer includes the existing Flow docs category (docs/voice-agents/flow/* still exists and other pages link to /voice-agents/flow). If Flow is intentionally deprecated, consider adding an explicit deprecation/redirect notice (or a minimal link) so users can still discover the page from navigation; otherwise re-add the Flow sidebar entry to avoid orphaning that section.

Suggested change
},
},
{
type: "link",
href: "/voice-agents/flow",
label: "Flow",
},

Copilot uses AI. Check for mistakes.

### Quickstart

Here's how to stream microphone audio to the Voice Agent and transcribe finalised segments of speech, with speaker ID:

Copilot AI Feb 24, 2026

Inconsistent spelling within the same page: this section uses “finalised” while most of the rest of the doc uses “finalized/finalization”. Pick one variant and use it consistently to avoid confusion (especially for searchability).

Suggested change
Here's how to stream microphone audio to the Voice Agent and transcribe finalised segments of speech, with speaker ID:
Here's how to stream microphone audio to the Voice Agent and transcribe finalized segments of speech, with speaker ID:
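A consistency check like this could be automated. The following is a minimal sketch, not part of this PR: the function name `mixed_spellings` and the list of variant pairs are illustrative assumptions, and the list would need extending to cover whatever variants the docs actually use.

```python
import re
from collections import Counter

# Hypothetical list of British/American variant pairs to watch for.
VARIANT_PAIRS = [
    ("finalised", "finalized"),
    ("finalisation", "finalization"),
    ("customise", "customize"),
]


def mixed_spellings(text):
    """Return the variant pairs for which BOTH spellings occur in text."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return [(uk, us) for uk, us in VARIANT_PAIRS if words[uk] and words[us]]
```

Running it over the contents of a page returns an empty list when only one spelling of each pair is present.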

`SCRIBE` - note-taking

`CAPTIONS` - live captioning

Copilot AI Feb 24, 2026

The preset naming/usage is inconsistent: Quickstart passes a lowercase string ("scribe") while this page lists uppercase preset identifiers (e.g., FAST, EXTERNAL) and later refers to external in backticks. Please clarify whether users should pass string preset names to VoiceAgentClient(preset=...) or use VoiceAgentConfigPreset.*, and keep the naming/casing consistent throughout.

Suggested change
You can use presets in two equivalent ways:
- As lowercase strings, e.g. `VoiceAgentClient(preset="scribe")`
- As enum members, e.g. `VoiceAgentClient(preset=VoiceAgentConfigPreset.SCRIBE)`
For new code, we recommend using `VoiceAgentConfigPreset.*` for consistency.
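To illustrate how string and enum forms can be made interchangeable, here is a minimal sketch using a string-valued enum. The `Preset` class and `resolve_preset` helper are hypothetical stand-ins written for this example; the SDK's actual `VoiceAgentConfigPreset` may be defined and resolved differently, so treat this as an illustration of the idea rather than the SDK's behaviour.

```python
from enum import Enum


class Preset(str, Enum):
    """Hypothetical stand-in for the SDK's preset enum (not the real class)."""

    FAST = "fast"
    SCRIBE = "scribe"
    CAPTIONS = "captions"
    EXTERNAL = "external"


def resolve_preset(value):
    """Accept either an enum member or a string in any casing."""
    if isinstance(value, Preset):
        return value
    # Normalise casing so "SCRIBE", "Scribe" and "scribe" all resolve alike.
    return Preset(value.strip().lower())
```

With a helper of this kind, `resolve_preset("scribe")` and `resolve_preset(Preset.SCRIBE)` yield the same member, and an unknown name raises `ValueError`, which is one way the docs could justify listing uppercase identifiers while the quickstart passes a lowercase string.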

# Connect to the Voice Agent
await client.connect()

# Stream microphone audio (interruptable using keyboard)

Copilot AI Feb 24, 2026

Spelling: “interruptable” should be “interruptible”.

Suggested change
# Stream microphone audio (interruptable using keyboard)
# Stream microphone audio (interruptible using keyboard)

Comment on lines +203 to +214
### Custom configuration

For more control, you can also specify custom configurations or use presets as a starting point and customise with overlays:

<Tabs>
<TabItem value='voice-custom-config' label='Custom configurations'>
Specify configurations in a `VoiceAgentConfig` object:
<CodeBlock language="python">
{pythonVoiceCustomConfig}
</CodeBlock>
</TabItem>
<TabItem value='voice-custom-config-overlays' label='Preset with a custom overlay'>

Copilot AI Feb 24, 2026

The Python snippets referenced on this page use different environment variable names for the API key (e.g., quickstart uses SPEECHMATICS_API_KEY, other snippets use YOUR_API_KEY). Consider standardizing on one name across all examples so readers don’t hit copy/paste failures.
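One way to standardize is for every snippet to read the key through the same small helper. The sketch below assumes `SPEECHMATICS_API_KEY` (the name the quickstart already uses) is the chosen variable; the `load_api_key` helper is hypothetical and not part of the SDK.

```python
import os

# Single source of truth for the variable name used by all examples.
API_KEY_ENV_VAR = "SPEECHMATICS_API_KEY"


def load_api_key(env=None):
    """Return the API key, failing with a clear message if it is not set."""
    env = os.environ if env is None else env
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {API_KEY_ENV_VAR} environment variable before running the examples."
        )
    return key
```

Failing early with the variable name in the message makes a copy/paste mistake obvious, instead of surfacing later as an authentication error.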

Comment on lines +12 to +18
# Focus on specific speakers, exclude everyone else
config = VoiceAgentConfig(
enable_diarization=True,
speaker_config=SpeakerFocusConfig(
focus_speakers=["S1", "S2"],
focus_mode=SpeakerFocusMode.IGNORE
)

Copilot AI Feb 24, 2026

This snippet redefines config multiple times, so if a reader copy/pastes and runs it they’ll only end up with the last configuration. Consider splitting these into separate snippets/sections or using distinct variable names so each example is unambiguous.
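A sketch of the distinct-variable-name approach follows. So that it runs on its own, the classes below are simplified stand-ins that mirror only the fields visible in the snippet above (`enable_diarization`, `speaker_config`, `focus_speakers`, `focus_mode`); they are assumptions for illustration, and in the real snippet these names would come from the SDK instead.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


# Stand-ins for the SDK types, reduced to the fields shown in the snippet.
class SpeakerFocusMode(Enum):
    IGNORE = "ignore"


@dataclass
class SpeakerFocusConfig:
    focus_speakers: List[str]
    focus_mode: SpeakerFocusMode


@dataclass
class VoiceAgentConfig:
    enable_diarization: bool = False
    speaker_config: Optional[SpeakerFocusConfig] = None


# Example 1: diarization only, no speaker focus.
diarization_only_config = VoiceAgentConfig(enable_diarization=True)

# Example 2: focus on specific speakers (values copied from the snippet).
speaker_focus_config = VoiceAgentConfig(
    enable_diarization=True,
    speaker_config=SpeakerFocusConfig(
        focus_speakers=["S1", "S2"],
        focus_mode=SpeakerFocusMode.IGNORE,
    ),
)
```

Because each example binds its own name, a reader who pastes the whole file still has every configuration available, rather than only the last assignment to `config`.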
