Pull request overview
This PR restructures the Voice Agents documentation by introducing a dedicated Voice SDK page with embedded Python snippets, and updating navigation to highlight the new structure.
Changes:
- Adds a new Voice SDK documentation page with reorganized explanations and many embedded Python examples.
- Updates the Voice Agents sidebar to point to the new Voice SDK page (and removes the previous “Features” entry).
- Reworks the Voice Agents overview page to emphasize integrations with LinkCard-based navigation.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| sidebars.ts | Reorders the voiceAgentsSidebar placement in the top-level docs sidebar list. |
| docs/voice-agents/sidebar.ts | Updates Voice Agents nav items to include Overview + Voice SDK (removes Flow sidebar inclusion). |
| docs/voice-agents/overview.mdx | Replaces prior Voice SDK-focused overview with an integrations-first overview using LinkCards. |
| docs/voice-agents/voice-sdk.mdx | New, comprehensive Voice SDK guide with sections on events/segments, presets, configuration, diarization, and serialization. |
| docs/voice-agents/features.mdx | Removes the old “Features” page content. |
| docs/voice-agents/assets/*.py | Adds multiple new Python snippet files used for embedding into the new Voice SDK page; adjusts custom-config.py. |
| label: "Features", | ||
| id: "voice-agents/voice-sdk", | ||
| label: "Voice SDK", | ||
| }, |
The sidebar no longer includes the existing Flow docs category (docs/voice-agents/flow/* still exists and other pages link to /voice-agents/flow). If Flow is intentionally deprecated, consider adding an explicit deprecation/redirect notice (or a minimal link) so users can still discover the page from navigation; otherwise re-add the Flow sidebar entry to avoid orphaning that section.
| }, | |
| }, | |
| { | |
| type: "link", | |
| href: "/voice-agents/flow", | |
| label: "Flow", | |
| }, |
|
|
||
| ### Quickstart | ||
|
|
||
| Here's how to stream microphone audio to the Voice Agent and transcribe finalised segments of speech, with speaker ID: |
Inconsistent spelling within the same page: this section uses “finalised” while most of the rest of the doc uses “finalized/finalization”. Pick one variant and use it consistently to avoid confusion (especially for searchability).
| Here's how to stream microphone audio to the Voice Agent and transcribe finalised segments of speech, with speaker ID: | |
| Here's how to stream microphone audio to the Voice Agent and transcribe finalized segments of speech, with speaker ID: |
| `SCRIBE` - note-taking | ||
|
|
||
| `CAPTIONS` - live captioning | ||
|
|
The preset naming/usage is inconsistent: Quickstart passes a lowercase string ("scribe") while this page lists uppercase preset identifiers (e.g., FAST, EXTERNAL) and later refers to external in backticks. Please clarify whether users should pass string preset names to VoiceAgentClient(preset=...) or use VoiceAgentConfigPreset.*, and keep the naming/casing consistent throughout.
| You can use presets in two equivalent ways: | |
| - As lowercase strings, e.g. `VoiceAgentClient(preset="scribe")` | |
| - As enum members, e.g. `VoiceAgentClient(preset=VoiceAgentConfigPreset.SCRIBE)` | |
| For new code, we recommend using `VoiceAgentConfigPreset.*` for consistency. |
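To illustrate how the two forms in the suggestion above could stay equivalent, here is a minimal sketch. `Preset` and `resolve_preset` are hypothetical stand-ins written for this comment, not the SDK's actual `VoiceAgentConfigPreset` implementation, and the member list only covers the preset names mentioned in this review.

```python
from enum import Enum


class Preset(Enum):
    """Hypothetical stand-in for VoiceAgentConfigPreset (assumption, not the SDK class)."""

    FAST = "fast"
    SCRIBE = "scribe"
    CAPTIONS = "captions"
    EXTERNAL = "external"


def resolve_preset(preset):
    """Accept an enum member or a case-insensitive string and return the enum member."""
    if isinstance(preset, Preset):
        return preset
    try:
        # Normalise casing so "scribe", "SCRIBE" and " Scribe " all resolve the same way.
        return Preset(preset.strip().lower())
    except ValueError:
        valid = ", ".join(p.value for p in Preset)
        raise ValueError(
            f"Unknown preset {preset!r}; expected one of: {valid}"
        ) from None
```

If the SDK normalises input along these lines, the docs can show either form, as long as each page states once that both are accepted.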
| # Connect to the Voice Agent | ||
| await client.connect() | ||
|
|
||
| # Stream microphone audio (interruptable using keyboard) |
Spelling: “interruptable” should be “interruptible”.
| # Stream microphone audio (interruptable using keyboard) | |
| # Stream microphone audio (interruptible using keyboard) |
| ### Custom configuration | ||
|
|
||
| For more control, you can also specify custom configurations or use presets as a starting point and customise with overlays: | ||
|
|
||
| <Tabs> | ||
| <TabItem value='voice-custom-config' label='Custom configurations'> | ||
| Specify configurations in a `VoiceAgentConfig` object: | ||
| <CodeBlock language="python"> | ||
| {pythonVoiceCustomConfig} | ||
| </CodeBlock> | ||
| </TabItem> | ||
| <TabItem value='voice-custom-config-overlays' label='Preset with a custom overlay'> |
The Python snippets referenced on this page use different environment variable names for the API key (e.g., quickstart uses SPEECHMATICS_API_KEY, other snippets use YOUR_API_KEY). Consider standardizing on one name across all examples so readers don’t hit copy/paste failures.
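One possible way to standardize the snippets is a shared helper like the sketch below. It assumes `SPEECHMATICS_API_KEY` (the name the quickstart already uses) is the one to keep; `load_api_key` and `API_KEY_ENV_VAR` are hypothetical names introduced for illustration only.

```python
import os

# Assumption: the quickstart's variable name is the one to standardize on.
API_KEY_ENV_VAR = "SPEECHMATICS_API_KEY"


def load_api_key(env=None):
    """Return the API key from the environment, or fail with a clear message."""
    # Allow a mapping to be passed in so the helper is easy to exercise in tests.
    env = os.environ if env is None else env
    api_key = env.get(API_KEY_ENV_VAR, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {API_KEY_ENV_VAR} environment variable before running this example."
        )
    return api_key
```

With every snippet reading the key through the same name, a reader who exports the variable once can run any example without editing it.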
| # Focus on specific speakers, exclude everyone else | ||
| config = VoiceAgentConfig( | ||
| enable_diarization=True, | ||
| speaker_config=SpeakerFocusConfig( | ||
| focus_speakers=["S1", "S2"], | ||
| focus_mode=SpeakerFocusMode.IGNORE | ||
| ) |
This snippet redefines config multiple times, so if a reader copy/pastes and runs it they’ll only end up with the last configuration. Consider splitting these into separate snippets/sections or using distinct variable names so each example is unambiguous.
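A rough sketch of the distinct-variable-names option follows. The three classes are simplified stand-ins so the example runs on its own; they mirror only the names and fields visible in the reviewed snippet and are not the SDK's real definitions. The variable names are likewise just suggestions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class SpeakerFocusMode(Enum):
    """Stand-in enum; IGNORE is the only member shown in the reviewed snippet."""

    IGNORE = "ignore"


@dataclass
class SpeakerFocusConfig:
    """Stand-in for the SDK class, limited to the fields used in the snippet."""

    focus_speakers: List[str] = field(default_factory=list)
    focus_mode: Optional[SpeakerFocusMode] = None


@dataclass
class VoiceAgentConfig:
    """Stand-in for the SDK class, limited to the fields used in the snippet."""

    enable_diarization: bool = False
    speaker_config: Optional[SpeakerFocusConfig] = None


# One descriptive name per example, so pasting the whole file keeps every configuration.
diarization_only_config = VoiceAgentConfig(enable_diarization=True)

focused_speakers_config = VoiceAgentConfig(
    enable_diarization=True,
    speaker_config=SpeakerFocusConfig(
        focus_speakers=["S1", "S2"],
        focus_mode=SpeakerFocusMode.IGNORE,
    ),
)
```

Descriptive names also make it obvious which configuration a later line such as a client constructor call is meant to receive.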
Note: This PR is stacked on top of deprecate-flow-integrations, which could (and probably should) be merged first.
Summary
Refactors and expands the Voice SDK documentation page with:
Future consideration
The Voice SDK docs may eventually move to live alongside the SDK repo (like other SDK docs), but
this provides a solid foundation for the integrations pages in the meantime.