docs: feature-first restructure, tested SDK + API examples, and agent skills #83

Open
LordElf wants to merge 6 commits into main from feat/docs-update-and-sdk-skill

Conversation

LordElf commented Jun 5, 2026

Summary

A feature-first rebuild of the Fish Audio docs: one canonical page per feature, Python + JavaScript + curl examples that are live-tested against the API, cookbooks grouped by feature, and an install-first AI-agent-skill page.

Information architecture

  • Overview menu: Get Started (Overview · Get API Key · Quickstart · Changelog) + a flat Core Features group (Text to Speech, with Emotion/Fine-grained nested · Speech to Text · Voice Cloning · Realtime Streaming · Manage Voices) + Platform + Models & Pricing.
  • Resources (SDK setup · Cookbook-by-feature · Best Practices · Integrations · Self-Hosting) and API Reference tabs.
  • One home per feature: the deep core-features/* and sdk-guide/python|javascript/* content was folded into the feature pages, then the old pages were redirected + deleted — no broken links, legacy URLs still resolve.

Tested code (the headline)

Every example runs against the live API via two harnesses under tests/:

  • tests/cookbooks/ (pytest) — extracts each published Python block, runs it, asserts the audio/transcript, and deletes any voices it creates. 24/24.
  • tests/js/ (node) — the same for JavaScript. 14/14.
  • Both skip cleanly without FISH_API_KEY, so they're CI-safe.
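
The extract-then-run idea behind both harnesses can be sketched in a few lines. This is a minimal illustration, not the code under tests/: the names `extract_python_blocks` and `should_skip` are hypothetical, and the regex assumes blocks are opened with three backticks followed by `python`.

```python
import os
import re

TICKS = "`" * 3  # built at runtime so this example nests cleanly inside a fence

# A block opens with three backticks + "python" (an optional title may follow)
# and closes at the next line that starts with three backticks.
_BLOCK = re.compile(
    rf"^{TICKS}python[^\n]*\n(.*?)^{TICKS}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_python_blocks(mdx_text: str) -> list:
    """Return the body of every python fenced block, in document order."""
    return [match.group(1) for match in _BLOCK.finditer(mdx_text)]


def should_skip() -> bool:
    """True when FISH_API_KEY is unset or blank, so live tests can be skipped."""
    return not os.environ.get("FISH_API_KEY", "").strip()
```

In a pytest suite, `should_skip()` would typically feed `pytest.mark.skipif`, which keeps a key-less CI run green instead of failing.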

Live testing surfaced and fixed real bugs in the docs: ASR duration is in seconds (not ms); sample_rate is set only through TTSConfig; ASR needs a multipart request with ignore_timestamps=false; the async stream_websocket call must not be awaited; the latency default is balanced; opus_bitrate is in bps (not kbps); and the model header is required.

New pages & content

  • Get Your API Key, a transport-agnostic Errors page (status codes, retries, Python + JS handlers), and an install-first AI Coding Agents page (npx skills add docs.fish.audio installs the fish-audio-sdk + fish-audio-api skills).
  • JavaScript examples on every feature page and cookbook (fish-audio npm). WebSocket-realtime JS is held — convertRealtime is broken in 0.1.0.
  • 10 cookbooks by feature (SRT/VTT captions, telephony 8 kHz, clone-and-wait, library search, voice-agent loop, …).

Validation

Benchmarked against a saved ElevenLabs corpus across two review passes; the structure now matches ElevenLabs' "one home per capability" model. Removed hallucinated code from the old AI-agents page (FishAudioClient and client.tts.create, neither of which exists).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added comprehensive feature guides for Text-to-Speech, Speech-to-Text, Voice Cloning, Realtime Streaming, and Voice Management.
    • Added 11 new cookbook recipes with Python and JavaScript examples for common workflows (batch transcription, voice cloning, realtime LLM streaming, caption generation, etc.).
    • Added platform overview and capabilities documentation.
  • Documentation

    • Reorganized SDK documentation with detailed reference pages for installation, authentication, error handling, and API usage.
    • Updated bitrate parameters from relative to absolute values (24→24000 bps).
    • Clarified ASR duration field units and model state documentation.
    • Added redirects for legacy documentation paths.
  • Tests

    • Added Python and JavaScript test harnesses validating cookbook examples against live API.
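
For callers who still hold the old kbps-style values, the bitrate change (24 to 24000 bps) amounts to a small conversion. The sketch below is an illustration only: `opus_kbps_to_bps` is a hypothetical helper, and the accepted values are taken from the enum described in this PR (24/32/48/64 kbps, with -1000 meaning auto).

```python
AUTO_BITRATE = -1000          # "auto" sentinel; passed through unchanged
VALID_KBPS = (24, 32, 48, 64)  # legacy values, as listed in this PR


def opus_kbps_to_bps(value: int) -> int:
    """Convert a legacy kbps opus_bitrate to the absolute bps value."""
    if value == AUTO_BITRATE:
        return value
    if value not in VALID_KBPS:
        raise ValueError(f"unexpected legacy opus_bitrate: {value!r}")
    return value * 1000
```

Rejecting anything outside the legacy set is deliberate: a value such as 24000 is probably already in bps, and multiplying it again would silently produce a nonsense bitrate.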

LordElf and others added 6 commits June 1, 2026 23:20
Add an Agent Skill for the official Fish Audio SDKs — Python (fish-audio-sdk,
imported as fishaudio) and JavaScript/TypeScript (fish-audio) — parallel to the
existing raw-API fish-audio-api skill. SKILL.md plus references/ covering
install + auth, sync/async Python and the TS client, TTS, voice cloning,
speech-to-text, realtime WebSocket TTS, and the real exception/retry/timeout
behavior. Examples are verified against SDK source (Python 1.3.0, JS 0.1.0):
correct method names, s1/s2-pro models, ASR units, and exceptions that are
actually raised (no fictional max_retries/ValidationError).

Advertise the new skill in coding-agents.mdx and agent-quickstart.mdx.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Phase 1 correctness pass on the hand-written SDK docs, aligned to the real
implemented surface (Python fish-audio-sdk 1.3.0, JS fish-audio 0.1.0).

JavaScript:
- fix undefined `client` references (-> fishAudio) and the invalid
  `backend: "s2-pro"` named-arg syntax in convertRealtime
- replace e?.status / message string-matching error handling with the real
  typed errors (FishAudioError.statusCode, UnprocessableEntityError,
  FishAudioTimeoutError)
- document built-in retries, per-call requestOptions, and withRawResponse;
  correct Backends to its 6 literals; convertRealtime accepts Iterable|AsyncIterable

Python:
- remove the never-raised ValidationError from examples (422 surfaces as APIError)
- label ASR segment start/end as seconds (duration stays ms) + add a units note
- complete the "all TTSConfig parameters" example with the 5 missing fields

API reference:
- asyncapi opus_bitrate kbps -> bps (matches openapi.json / the API)
- fish-audio-api skill opus_bitrate kbps -> bps
- get-user-package page title 'Get User Premium' -> 'Get User Package'

Existing MDX is not reformatted (repo's committed MDX is not prettier-clean and
prettier mangles fenced code blocks); edits are surgical.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The published npm `fish-audio` 0.1.0 is out of sync with its git source and the
Python SDK (tested via tsc against the installed package): default backend is
`s1` not `s2-pro`, `s2-pro` is absent from the `Backends` type, `error.body` is
typed `unknown`, and the CJS build is broken. Revert the JS doc edits from the
previous commit until the JS SDK is republished / the team decides direction.
Python SDK + API-reference fixes (validated: Python surface test 32/32) are kept.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ElevenLabs-style SDK information architecture, Python-first (JS held pending the
0.1.0 sync):

- developer-guide/sdk-guide/quickstart — unified install -> auth -> first audio
  (Python + a minimal, type-checked JS snippet)
- developer-guide/sdk-guide/python/errors — dedicated Errors & Retries page
  (real exception hierarchy, retry pattern, timeouts/httpx caveat)
- developer-guide/sdk-guide/cookbook/* — task-focused recipes:
  stream-to-file, instant-voice-cloning, realtime LLM-tokens -> speech
- docs.json — Developer SDKs group reorganized: Quickstart, Python SDK (+Errors),
  JavaScript SDK, Cookbook

All Python examples validated against fish-audio-sdk 1.3.0 (15 blocks compile,
17/17 symbol checks); JS quickstart snippet type-checks against fish-audio 0.1.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ence)

Clearer split — Overview (what & why) -> Developer Guide (build with code) ->
API Reference (spec):

- New Overview tab: introduction/quickstart/changelog, a new Capabilities page,
  a new Platform (web app) overview page, product guides, and models & pricing.
- Developer Guide (was "Docs"): core features, a single feature-organized SDK
  group (Quickstart, Auth, TTS, Voice Cloning, STT, WebSocket, Errors, Cookbook)
  instead of split Python/JavaScript page trees, best practices, integrations,
  self-hosting, tutorials, resources.
- API Reference: cleaned into REST API + SDK Reference (Python + JS).
- Removed empty/duplicate groups (Advanced Features, second Best Practices,
  Safety & Ethics).

SDK pages are feature-organized so each becomes language-selectable; Python is
filled now, the JavaScript tab lands once fish-audio (npm) is synced (see
sdk-docs-comparison/TS-SDK-DISCREPANCIES.md). No JS SDK content changed.

All 68 nav pages verified present.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Restructure into one Overview menu (Get Started + a flat Core Features
group) plus Resources and API Reference, with one canonical page per
feature. Fold the deep core-features/sdk-guide content into the feature
pages and redirect the orphaned pages. Add Python, JavaScript, and curl
examples on every feature page and cookbook — all live-tested.

- Feature pages (Text to Speech, Speech to Text, Voice Cloning, Realtime
  Streaming, Manage Voices): use cases, quick start, implementation
  details, and directional cards (web app / API ref / cookbooks / AI agent).
- Cookbooks: 10 recipes grouped by feature; every code block run live
  against the API (Python 24/24, JS 14/14) via the tests/cookbooks and
  tests/js harnesses (extract the published code, run, assert, clean up).
- JavaScript SDK examples added (fish-audio npm); WebSocket realtime held
  (convertRealtime is broken in 0.1.0).
- New pages: Get Your API Key, transport-agnostic Errors, and an
  install-first AI Coding Agents page (npx skills add docs.fish.audio).
- Redirect + delete superseded pages: core-features/*, sdk-guide/python/*,
  sdk-guide/javascript/*, products/*, and the duplicate introduction.
- Correctness fixes surfaced by live testing: duration in seconds,
  sample_rate via TTSConfig, ASR multipart/timestamps, async
  stream_websocket (no await), latency default (balanced), opus_bitrate,
  required model header.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings June 5, 2026 22:23

coderabbitai Bot commented Jun 5, 2026

Walkthrough

This PR comprehensively restructures Fish Audio documentation from a developer-guide-focused organization to a feature-first model. It introduces official SDK skill manifests with cross-language reference docs, consolidates scattered guides into cohesive feature pages, adds 11 cookbook recipes demonstrating real-world workflows, and implements a full test harness validating code examples against the live API. Navigation is reorganized with 70+ redirect mappings from legacy paths to new destinations.

Changes

SDK Contracts & Reference Documentation

Layer / File(s) Summary
SDK skill manifests and error/installation references
.mintlify/skills/fish-audio-sdk/SKILL.md, .mintlify/skills/fish-audio-sdk/references/errors.md, installation.md, speech-to-text.md, text-to-speech.md, voice-cloning.md, websocket.md
Defines the fish-audio-sdk skill scope (Python fishaudio / JavaScript fish-audio), documents comprehensive exception hierarchies, authentication patterns, and feature references with code examples for both SDKs. Clarifies that Python has no automatic retries while JavaScript auto-retries on 408/429/5xx; 422 validation errors surface as APIError in Python and UnprocessableEntityError in JavaScript.
Python SDK error handling and type documentation
developer-guide/sdk-guide/python/errors.mdx, api-reference/sdk/python/overview.mdx, api-reference/sdk/python/types.mdx
Adds error handling reference with exception hierarchy, timeout configuration, and manual retry patterns. Updates Voice model states to "created", "training", "trained", "failed". Corrects ASRResponse duration from milliseconds to seconds. Updates async streaming example and error handler imports in overview.

Feature Documentation Pages

Layer / File(s) Summary
Core feature pages (TTS, ASR, voice cloning)
features/text-to-speech.mdx, features/speech-to-text.mdx, features/voice-cloning.mdx
Introduces user-focused feature pages with quick-start examples, model/format selection guidance, voice reuse patterns, and implementation details. Covers model selection (s2-pro vs s1), output formats, latency tuning, instant cloning via reference audio, and state transitions for persistent voices.
Advanced features (realtime streaming, voice management)
features/realtime-streaming.mdx, features/manage-voices.mdx
Documents low-latency streaming via HTTP and WebSocket with LLM token integration, and voice model lifecycle management including listing with pagination, metadata updates, and visibility states.
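
The state transitions mentioned for persistent voices suggest a simple poll-until-ready loop. The following is a hedged sketch rather than SDK code: `wait_until_trained` and its `get_state` callable are hypothetical, and the state names ("created", "training", "trained", "failed") are the ones this PR documents for the Voice model.

```python
import time
from typing import Callable


def wait_until_trained(
    get_state: Callable[[], str],
    *,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it reports "trained", fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state == "trained":
            return state
        if state == "failed":
            raise RuntimeError("voice training failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"voice still in state {state!r} after {timeout}s")
        sleep(interval)  # "created" or "training": wait, then ask again
```

Passing `get_state` as a callable keeps the loop independent of any particular client; in practice it would wrap whichever call fetches the voice and returns its state.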

API Contract & Reference Updates

Layer / File(s) Summary
TTS bitrate schema updates
.mintlify/skills/fish-audio-api/SKILL.md, api-reference/asyncapi.yml
Updates opus_bitrate enum from relative values (-24/-32/-48/-64) to absolute bps values (24000/32000/48000/64000), clarifying units and keeping -1000 as auto-mode.
Error response and retry documentation
api-reference/errors.mdx, api-reference/introduction.mdx
Adds comprehensive error documentation with JSON shape definition, HTTP status table, retry guidance (exponential backoff for 429 and 5xx), and SDK-specific exception handling examples for Python (RateLimitError, APIError, NotFoundError) and JavaScript (UnprocessableEntityError, statusCode branching).
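
The retry guidance summarized here (exponential backoff for 429 and 5xx) can be illustrated generically. This is a sketch under stated assumptions: `HTTPStatusError` is a hypothetical stand-in for whatever exception carries the HTTP status, and `call_with_backoff` is not part of either SDK.

```python
import random
import time


class HTTPStatusError(Exception):
    """Hypothetical stand-in for an error type that carries the HTTP status."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(status: int) -> bool:
    """Retry only rate limits (429) and server errors (5xx)."""
    return status == 429 or 500 <= status <= 599


def call_with_backoff(fn, *, attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call fn(), retrying retryable failures with capped exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except HTTPStatusError as exc:
            out_of_attempts = attempt == attempts - 1
            if out_of_attempts or not is_retryable(exc.status):
                raise
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Jitter spreads out clients that were rate-limited together.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("attempts must be at least 1")
```

Client errors such as 400, 401, 404, and 422 are re-raised immediately, since repeating the same request would not change the outcome.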

Navigation Restructure & Redirects

Layer / File(s) Summary
docs.json navigation and redirect restructuring
docs.json
Replaces "Docs" tab with "Overview" tab, reorganizes Core Features grouping with emotion/fine-grained-control subpages, adds Integrations and Tutorials sections, introduces SDK Reference group with Python/JavaScript entries, and creates 70+ redirect mappings from legacy /developer-guide/... and /developer-guide/core-features/... paths to consolidated /features/..., /overview/..., and /api-reference/sdk/... destinations.

Getting Started & Developer Experience

Layer / File(s) Summary
Getting started pages (API key setup, SDK quickstart)
developer-guide/getting-started/api-key.mdx, developer-guide/sdk-guide/quickstart.mdx
Introduces "Get Your API Key" with step-by-step account and environment setup. Adds comprehensive SDK Quickstart with pip/npm install, FISH_API_KEY configuration, and first TTS example for Python and JavaScript.
Cookbook recipes (11 pages covering common tasks)
developer-guide/sdk-guide/cookbook/batch-transcribe-with-language-hint.mdx, clone-and-wait-until-ready.mdx, discover-library-voice.mdx, instant-voice-cloning.mdx, oneshot-vs-persistent-cloning.mdx, realtime-llm-to-speech.mdx, streaming-to-file.mdx, telephony-8khz-audio.mdx, transcribe-to-captions.mdx, voice-agent-loop.mdx
Adds 11 cookbook recipes with Python/JavaScript examples: batch ASR with language hints, voice cloning workflows (instant and persistent with polling), realtime LLM token-to-speech, file streaming, telephony 8 kHz configuration, SRT/VTT caption generation, voice library discovery, and voice agent loop (ASR → LLM → TTS streaming).
Developer guides reorganization
developer-guide/getting-started/quickstart.mdx, developer-guide/resources/agent-quickstart.mdx, developer-guide/resources/coding-agents.mdx, developer-guide/tutorials/tutorials.mdx, and removed pages
Updates quickstart to use environment-based auth and adds error troubleshooting. Restructures agent-quickstart with dual skill installation options. Shifts coding-agents to skill-first approach with condensed MCP section. Updates tutorial links. Removes legacy developer-guide pages (text-to-speech, speech-to-text, voice-cloning, websocket, creating-models) now consolidated into feature pages.
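
As an illustration of the caption recipe, ASR segments with start/end in seconds can be rendered as SRT cues. This is a minimal sketch, not the cookbook's code: the `Segment` dataclass is a hypothetical stand-in that mirrors the `text`, `start`, and `end` fields described for ASR segments.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for an ASR segment; start/end are in seconds."""
    text: str
    start: float
    end: float


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        span = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        cues.append(f"{index}\n{span}\n{seg.text.strip()}\n")
    return "\n".join(cues)
```

WebVTT output differs mainly in its `WEBVTT` header line and in using a period instead of a comma before the milliseconds.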

Test Infrastructure for Documentation Recipes

Layer / File(s) Summary
Python cookbook test harness
tests/cookbooks/conftest.py, tests/cookbooks/harness.py, tests/cookbooks/extract.py, tests/cookbooks/specs.py, tests/cookbooks/test_cookbooks.py, tests/cookbooks/requirements.txt, tests/cookbooks/README.md, tests/.gitignore
Implements pytest fixtures for API key resolution, client shims, sample audio, and voice tracking/cleanup. Adds MDX code block extraction and audio sniffing. Defines spec-driven test cases with substitution support. Runs parameterized tests validating file creation, audio format, and variable state against live API.
JavaScript test runner
tests/js/run.mjs, tests/js/specs.mjs, tests/js/package.json
Implements Node.js ES module test runner extracting JavaScript code blocks from MDX, applying substitutions, executing in isolated directories with sample WAV generation, validating output files via audio sniffing and SRT cue checking, and performing voice cleanup.
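
"Audio sniffing" in both runners presumably means checking the leading bytes of an output file against known container signatures. The sketch below shows one way to do that; `sniff_audio` is a hypothetical name, and the set of formats checked is an assumption rather than a copy of the harness.

```python
from typing import Optional


def sniff_audio(data: bytes) -> Optional[str]:
    """Guess the audio container from leading magic bytes, or return None."""
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:4] == b"OggS":
        return "ogg"  # Opus audio is normally carried in an Ogg container
    if data[:3] == b"ID3":
        return "mp3"  # MP3 file that starts with an ID3v2 tag
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3"  # MPEG audio frame sync: 11 consecutive set bits
    return None
```

The frame-sync check is loose (other MPEG-style streams share the same prefix), so it suits asserting that an endpoint returned audio rather than an error body, not strict format validation.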

Overview & Platform Pages

Layer / File(s) Summary
Capabilities and platform overview
overview/capabilities.mdx, overview/platform.mdx
Introduces capabilities overview listing core features (TTS, ASR, voice cloning, realtime, voice management), web-app-only tools (voice changer, story studio, music), model descriptions, and a platform page describing web app interface (Create audio, Voices, Projects, Library, Account & billing) with links to SDK/API.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • fishaudio/docs#70: Updates the fish-audio-api skill documentation, including the opus_bitrate field enum change from relative to absolute bps values (24–64 kbps range).
  • fishaudio/docs#77: Modifies developer-guide/core-features/fine-grained-control.mdx and adds language subpages, overlapping with the same fine-grained-control documentation area.
  • fishaudio/docs#81: Updates developer-guide/core-features/creating-models.mdx (voice model documentation) that is removed in this PR.

Suggested labels

documentation, python, javascript

Suggested reviewers

  • twangodev

🐰 Hop hop! A grand restructure of the warren,
Feature nests and tests all newly charred-in!
SDKs now speak in skill, recipes proliferate,
Documentation paths consolidated—simply first-rate! 🐇✨

mintlify Bot commented Jun 5, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| hanabiaiinc | 🟢 Ready | View Preview | Jun 5, 2026, 10:26 PM |

Copilot AI left a comment

Pull request overview

This PR reorganizes the Fish Audio documentation into feature-first pages and adds end-to-end test harnesses that execute published Python and JavaScript examples against the live API, alongside new/updated agent skills and expanded error-handling guidance.

Changes:

  • Rebuilds docs IA around one canonical page per core feature, with updated navigation + redirects.
  • Adds live-tested example runners under tests/ for Python cookbooks (pytest) and JavaScript snippets (node).
  • Introduces/updates feature pages, cookbook recipes, and agent-skill references (SDK + raw API), plus a new API “Errors” page.

Reviewed changes

Copilot reviewed 76 out of 76 changed files in this pull request and generated 7 comments.

File Description
tests/js/specs.mjs Auto-generated spec list mapping docs pages to runnable JS blocks.
tests/js/run.mjs Node runner that extracts fenced JS blocks from MDX and executes them.
tests/js/package.json JS test package definition for the node-based docs runner.
tests/cookbooks/test_cookbooks.py Pytest runner that executes extracted Python cookbook blocks and asserts outputs.
tests/cookbooks/specs.py Central per-recipe spec configuration for Python cookbook coverage.
tests/cookbooks/requirements.txt Python deps for cookbook e2e testing.
tests/cookbooks/README.md How-to for running/adding cookbook tests.
tests/cookbooks/harness.py Shared Python harness (key resolution, httpx client setup, sniff/consume helpers).
tests/cookbooks/extract.py MDX fenced code-block extractor for Python snippets.
tests/cookbooks/conftest.py Pytest fixtures for API key setup, generated sample wav, temp cwd, cleanup tracking.
tests/.gitignore Ignores pytest caches + JS node_modules and run artifacts.
snippets/support.mdx Updates support links to point at new feature page for realtime streaming.
overview/platform.mdx New “Platform (Web App)” orientation page for no-code usage.
overview/capabilities.mdx New overview landing page listing core features and platform capabilities.
features/voice-cloning.mdx New canonical Voice Cloning feature page with Python/curl/JS examples.
features/text-to-speech.mdx New canonical TTS feature page with formats, tuning, and MessagePack notes.
features/speech-to-text.mdx New canonical STT feature page with timestamps guidance and SDK/API examples.
features/realtime-streaming.mdx New canonical realtime streaming page (HTTP streaming + WebSocket token streaming).
features/manage-voices.mdx New canonical voice management page with list/get/update/delete examples.
docs.json New navigation structure + redirect mapping to preserve legacy URLs.
developer-guide/tutorials/tutorials.mdx Updates links to point at new feature-first pages.
developer-guide/sdk-guide/quickstart.mdx New SDK quickstart (Python + early-release JS notes).
developer-guide/sdk-guide/python/websocket.mdx Deleted legacy page (content moved into feature-first structure).
developer-guide/sdk-guide/python/voice-cloning.mdx Deleted legacy page (content moved into feature-first structure).
developer-guide/sdk-guide/python/speech-to-text.mdx Deleted legacy page (content moved into feature-first structure).
developer-guide/sdk-guide/python/errors.mdx New Python SDK error/retry/timeout guidance.
developer-guide/sdk-guide/python/authentication.mdx Updates next-step links to new feature pages.
developer-guide/sdk-guide/javascript/websocket.mdx Deleted legacy JS websocket page (content moved/withheld).
developer-guide/sdk-guide/javascript/voice-cloning.mdx Deleted legacy JS voice cloning page (content moved into features/cookbooks).
developer-guide/sdk-guide/javascript/text-to-speech.mdx Deleted legacy JS TTS page (content moved into features/cookbooks).
developer-guide/sdk-guide/javascript/speech-to-text.mdx Deleted legacy JS STT page (content moved into features/cookbooks).
developer-guide/sdk-guide/javascript/installation.mdx Deleted legacy JS installation page (superseded by new quickstart/reference).
developer-guide/sdk-guide/javascript/authentication.mdx Deleted legacy JS auth page (superseded by API key page).
developer-guide/sdk-guide/cookbook/voice-agent-loop.mdx New cookbook recipe chaining ASR → LLM → TTS.
developer-guide/sdk-guide/cookbook/transcribe-to-captions.mdx New cookbook recipe for writing SRT/VTT from ASR segments.
developer-guide/sdk-guide/cookbook/telephony-8khz-audio.mdx New cookbook recipe for 8kHz WAV/PCM generation.
developer-guide/sdk-guide/cookbook/streaming-to-file.mdx New cookbook recipe for streaming TTS to disk without buffering.
developer-guide/sdk-guide/cookbook/realtime-llm-to-speech.mdx New cookbook recipe for token streaming to WebSocket TTS.
developer-guide/sdk-guide/cookbook/oneshot-vs-persistent-cloning.mdx New cookbook guide comparing inline references vs saved voice models.
developer-guide/sdk-guide/cookbook/instant-voice-cloning.mdx New cookbook recipe for inline reference-audio cloning.
developer-guide/sdk-guide/cookbook/discover-library-voice.mdx New cookbook recipe for finding and using a public voice as reference_id.
developer-guide/sdk-guide/cookbook/clone-and-wait-until-ready.mdx New cookbook recipe for training a persistent voice and polling readiness.
developer-guide/sdk-guide/cookbook/batch-transcribe-with-language-hint.mdx New cookbook recipe for batch ASR with explicit language.
developer-guide/resources/agent-quickstart.mdx Updates agent-skill installation instructions (API vs SDK skills).
developer-guide/products/voice-cloning.mdx Deleted placeholder product guide (replaced by feature pages).
developer-guide/products/tts.mdx Deleted placeholder product guide (replaced by feature pages).
developer-guide/products/story-studio.mdx Deleted placeholder product guide (replaced by platform guide).
developer-guide/getting-started/quickstart.mdx Updates quickstart to read FISH_API_KEY + adds troubleshooting and model header fix.
developer-guide/getting-started/introduction.mdx Deleted old overview page (replaced by overview/capabilities).
developer-guide/getting-started/api-key.mdx New “Get Your API Key” page with first request examples.
developer-guide/core-features/speech-to-text.mdx Deleted legacy core-features STT page (replaced by feature page).
developer-guide/core-features/fine-grained-control.mdx Adds sidebar title + “try it live” card linking to API playground.
developer-guide/core-features/emotions.mdx Adds “try it live” card + updates link to new TTS feature page.
developer-guide/core-features/creating-models.mdx Deleted legacy “creating models” page (replaced by voice-cloning feature page).
archive/python-sdk-legacy/text-to-speech.mdx Updates legacy-page links to new feature pages.
archive/python-sdk-legacy/migration-guide.mdx Updates legacy migration guide links to new feature pages.
api-reference/sdk/python/types.mdx Updates state/duration docs (voice states and ASR duration unit).
api-reference/sdk/python/overview.mdx Updates links and fixes exception guidance + async websocket example.
api-reference/introduction.mdx Adds Errors section + updates realtime streaming links to new feature page.
api-reference/errors.mdx New API errors page with status table + retry guidance + SDK handling examples.
api-reference/endpoint/wallet/get-user-package.mdx Renames the endpoint title (minor metadata update).
api-reference/endpoint/openapi-v1/text-to-speech.mdx Updates MessagePack “direct upload” link to new feature page anchor.
api-reference/emotion-reference.mdx Updates “see also” links to new TTS feature page.
api-reference/asyncapi.yml Updates opus bitrate enum values (unit change reflected in schema).
.mintlify/skills/fish-audio-sdk/SKILL.md Adds/updates SDK-focused agent skill index and guidance.
.mintlify/skills/fish-audio-sdk/references/websocket.md New realtime WebSocket SDK reference for agents.
.mintlify/skills/fish-audio-sdk/references/voice-cloning.md New voice cloning SDK reference for agents.
.mintlify/skills/fish-audio-sdk/references/text-to-speech.md New TTS SDK reference for agents.
.mintlify/skills/fish-audio-sdk/references/speech-to-text.md New STT SDK reference for agents.
.mintlify/skills/fish-audio-sdk/references/installation.md New installation/auth reference for agents.
.mintlify/skills/fish-audio-sdk/references/errors.md New SDK error/retry reference for agents.
.mintlify/skills/fish-audio-api/SKILL.md Updates raw-API skill docs (opus bitrate unit/value update).

Comment on lines +19 to +41
_WORKSPACE_ENV = "/Users/shawnlai/project/fish-audio/.env"
_LOCAL_KEYFILE = "/tmp/claude/fishdoctest/fishkey"

_created_voice_ids = []


def resolve_key():
    k = os.environ.get("FISH_API_KEY")
    if k:
        return k.strip()
    if os.path.isfile(_LOCAL_KEYFILE):
        v = Path(_LOCAL_KEYFILE).read_text().strip()
        if v:
            return v
    try:
        from dotenv import dotenv_values
        for p in (_WORKSPACE_ENV, str(Path.cwd() / ".env")):
            v = dotenv_values(p).get("FISH_API_KEY")
            if v:
                return v.strip()
    except Exception:
        pass
    return None
Comment thread tests/js/run.mjs
writeFileSync(join(dir, "run.mjs"), code);
let ok = true, err = "";
try {
execFileSync(process.execPath, ["run.mjs"], { cwd: dir, env: { ...process.env }, stdio: "pipe", timeout: 180000 });
Comment on lines +118 to +119
- Python defines a `ValidationError` class but **never raises it** — don't catch it expecting validation failures; a 422 surfaces as `APIError`. The JS SDK throws `UnprocessableEntityError` on 422.
- ASR segment `start` / `end` are in **seconds**, but `duration` is in **milliseconds**. See [speech-to-text](references/speech-to-text.md).
Comment on lines +31 to +40
result.text # str — full transcript
result.duration # float — total audio duration in MILLISECONDS
result.segments # list[ASRSegment]
# each segment:
seg.text # str
seg.start # float — seconds
seg.end # float — seconds
```

> **Unit gotcha (verified in source):** segment `start` / `end` are in **seconds**, but `duration` is in **milliseconds**. Don't assume they share a unit.
audio2 = client.tts.convert(text="Second line.", config=config)
```

`TTSConfig` fields (with defaults): `format="mp3"`, `sample_rate=None`, `mp3_bitrate=128` (`64|128|192`), `opus_bitrate=32` (kbps: `-1000|24|32|48|64`, `-1000`=auto), `normalize=True`, `chunk_length=200`, `latency="balanced"`, `reference_id=None`, `references=[]`, `prosody=None`, `top_p=0.7`, `temperature=0.7`, `max_new_tokens=1024`, `repetition_penalty=1.2`, `min_chunk_length=50`, `condition_on_previous_chunks=True`, `early_stop_threshold=1.0`.
Comment thread api-reference/errors.mdx
Comment on lines +78 to +97
import {
  UnauthorizedError, // 401
  TooEarlyError, // 429
  NotFoundError, // 404
  BadRequestError, // 400
  UnprocessableEntityError, // 422
  FishAudioError, // base — has .statusCode and .body
} from "fish-audio";

try {
  await client.textToSpeech.convert({ text: "Hello!" }, "s2-pro");
} catch (err) {
  if (err instanceof UnauthorizedError) {
    // invalid or missing key
  } else if (err instanceof TooEarlyError) {
    // back off and retry
  } else if (err instanceof FishAudioError) {
    console.error(err.statusCode, err.body);
  }
}
Comment on lines +3 to 4
title: 'Get User Package'
description: 'Get current user premium information'

coderabbitai Bot left a comment

Actionable comments posted: 19

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs.json (1)

375-388: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Broken legacy redirects still target deprecated core-feature paths.

Line 375 and Line 379 redirect to /developer-guide/core-features/text-to-speech and /developer-guide/core-features/speech-to-text, but this PR’s feature-first structure uses /features/text-to-speech and /features/speech-to-text. These old destinations risk dead-end redirects.

Suggested fix
-      "destination": "/developer-guide/core-features/text-to-speech"
+      "destination": "/features/text-to-speech"
...
-      "destination": "/developer-guide/core-features/speech-to-text"
+      "destination": "/features/speech-to-text"
...
-      "destination": "/developer-guide/core-features/creating-models"
+      "destination": "/features/voice-cloning"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs.json` around lines 375 - 388, The redirect entries in docs.json where
the objects with "source": "/resources/best-practices/text-to-speech" and
"source": "/resources/best-practices/speech-to-text" still point to legacy
destinations "/developer-guide/core-features/text-to-speech" and
"/developer-guide/core-features/speech-to-text"; update those two "destination"
values to the new feature-first paths "/features/text-to-speech" and
"/features/speech-to-text" so the redirects resolve to the current structure
(edit the destination fields in the corresponding objects).
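
A check like the one this finding describes could be automated. The sketch below is an assumption-laden illustration: `stale_redirects` is a hypothetical helper, it assumes docs.json holds a top-level "redirects" list of source/destination objects, and the removed paths are the three destinations named in the suggested fix.

```python
import json

# Pages this PR deletes, per the suggested fix above (illustrative list).
REMOVED_PATHS = frozenset({
    "/developer-guide/core-features/text-to-speech",
    "/developer-guide/core-features/speech-to-text",
    "/developer-guide/core-features/creating-models",
})


def stale_redirects(docs_json_text: str, removed_paths=REMOVED_PATHS) -> list:
    """Return (source, destination) pairs whose destination is a dead end.

    A destination is flagged when it is a removed page or is itself the
    source of another redirect (a redirect chain).
    """
    redirects = json.loads(docs_json_text).get("redirects", [])
    sources = {entry["source"] for entry in redirects}
    return [
        (entry["source"], entry["destination"])
        for entry in redirects
        if entry["destination"] in removed_paths
        or entry["destination"] in sources
    ]
```

Run against the repository's docs.json, an empty result would mean every redirect lands on a page that still exists under the new structure.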
🧹 Nitpick comments (3)
features/realtime-streaming.mdx (1)

38-224: ⚡ Quick win

Add prerequisites before the first streaming walkthrough.

The page provides executable flows but does not first state required setup (API key env var, SDK install, and optional audio playback/output dependencies). A brief prerequisites section will make first-run success more reliable.
As per coding guidelines, “Include prerequisites at the start of procedural content.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@features/realtime-streaming.mdx` around lines 38 - 224, Add a short
"Prerequisites" section before the first streaming walkthrough that lists
required setup: set FISH_API_KEY environment variable, install the SDK for each
language shown (reference FishAudio, AsyncFishAudio, FishAudioClient), and
optional playback/output dependencies (e.g., ffmpeg or simpleaudio) for examples
that call play or write to files; mention that the Python examples use FishAudio
and fishaudio.utils.play and the JS examples use FishAudioClient and
createWriteStream, and note the WebSocket/streaming methods (client.tts.stream,
client.tts.stream_websocket) require network access and the correct model
header. Ensure the section is concise and placed immediately before the "Stream
text you already have" walkthrough.
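The prerequisites could also be backed by a short preflight check that readers run before the walkthrough. This is a sketch under assumptions, not part of the SDK: `check_prerequisites` is a hypothetical helper, and ffmpeg is used only because the comment above names it as an example of an optional playback dependency.

```python
import os
import shutil


def check_prerequisites(env=None, optional_tools=("ffmpeg",)):
    """Return (errors, warnings) describing missing setup."""
    env = os.environ if env is None else env
    errors, warnings = [], []
    # The API key is required by every example on the page.
    if not env.get("FISH_API_KEY"):
        errors.append("FISH_API_KEY is not set")
    # Playback and conversion tools are optional, so only warn about them.
    for tool in optional_tools:
        if shutil.which(tool) is None:
            warnings.append(f"optional tool not found on PATH: {tool}")
    return errors, warnings


errors, warnings = check_prerequisites()
for message in errors + warnings:
    print(message)
```

Separating errors from warnings keeps a missing optional tool from blocking examples that only write to a file.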
features/manage-voices.mdx (1)

38-113: ⚡ Quick win

Add a prerequisites block before the first runnable examples.

This page is procedural and starts directly with execution examples (API key, package install, and sample voice ID assumptions are implicit). Add a short prerequisites section near the top to reduce setup failures.
As per coding guidelines, “Include prerequisites at the start of procedural content.”

Also applies to: 120-134

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@features/manage-voices.mdx` around lines 38 - 113, Add a short
"Prerequisites" section before the "List your voices" runnable examples that
states required setup: how to set FISH_API_KEY (env var), how to install the SDK
if applicable (e.g., pip/npm install fish-audio), and that the examples use a
placeholder voice ID ("YOUR_VOICE_ID"); update the Python, JS, and bash examples
near the "List your voices" and "Get, update, and delete" sections to
reference that prerequisites block so readers know to set up the API key,
install the client, and replace the sample voice id.
developer-guide/sdk-guide/cookbook/discover-library-voice.mdx (1)

28-32: ⚡ Quick win

Clarify the fallback placeholder or update the comment.

The comment states "fall back to a known id if the search is empty," but <voice-id> is a placeholder that will cause an API error if used. When page.items is empty, users copying this code will encounter a failure. Consider either:

  1. Updating the comment to clarify that the placeholder must be replaced: "Pick the first result; replace <voice-id> with a real voice ID if needed"
  2. Providing an actual fallback ID as an example
  3. Simplifying the logic to make the placeholder usage more obvious
♻️ Clearer alternative implementation

Python:

-# Pick the first result; fall back to a known id if the search is empty
-reference_id = "<voice-id>"
-for voice in page.items:
-    print(voice.id, voice.title, voice.languages)
-    reference_id = reference_id if reference_id != "<voice-id>" else voice.id
+# Pick the first result; you'll need a real voice ID if the search is empty
+reference_id = page.items[0].id if page.items else "<voice-id>"
+for voice in page.items:
+    print(voice.id, voice.title, voice.languages)

JavaScript:

-// Pick the first result; fall back to a known id if the search is empty
-let referenceId = "<voice-id>";
-for (const voice of page.items) {
-  console.log(voice._id, voice.title, voice.languages);
-  referenceId = referenceId !== "<voice-id>" ? referenceId : voice._id;
-}
+// Pick the first result; you'll need a real voice ID if the search is empty
+let referenceId = page.items.length > 0 ? page.items[0]._id : "<voice-id>";
+for (const voice of page.items) {
+  console.log(voice._id, voice.title, voice.languages);
+}

Also applies to: 52-56

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@developer-guide/sdk-guide/cookbook/discover-library-voice.mdx` around lines
28 - 32, The snippet uses a placeholder reference_id = "<voice-id>" that will
cause API errors if left unchanged; update the code around reference_id and the
loop over page.items so that when page.items is empty you either (a) require the
user to replace the placeholder by clarifying the comment (“replace <voice-id>
with a real voice ID”), (b) provide a real example fallback ID as the initial
reference_id, or (c) explicitly handle the empty result (e.g., set reference_id
= None and raise/return a clear error) — modify the comment and the reference_id
initialization and the post-loop handling to reflect one of these options and
ensure any downstream use of reference_id checks for a valid value before
calling the API.
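Option (c) above might look like the following minimal sketch. The `Voice` dataclass is a stand-in for whatever object the SDK returns in page.items, and `pick_reference_id` is a hypothetical helper, so treat both as assumptions rather than SDK API.

```python
from dataclasses import dataclass, field


@dataclass
class Voice:
    """Stand-in for an SDK voice object; only the fields the snippet prints."""
    id: str
    title: str
    languages: list = field(default_factory=list)


def pick_reference_id(items):
    """Return the first voice's id, or raise if the search returned nothing."""
    if not items:
        raise LookupError("search returned no voices; pass a real voice ID instead")
    return items[0].id


# Sample data in place of a live search result.
items = [Voice("abc123", "Narrator", ["en"])]
for voice in items:
    print(voice.id, voice.title, voice.languages)
reference_id = pick_reference_id(items)
```

Raising on an empty result surfaces the problem at the search step, instead of as an API error when the placeholder is sent downstream.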
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api-reference/emotion-reference.mdx`:
- Line 135: The internal link text "Text-to-Speech Best Practices" in
emotion-reference.mdx currently uses an absolute path
"/features/text-to-speech"; update that href to a relative path (e.g.,
"../features/text-to-speech") so the internal link follows the docs guideline of
using relative paths for internal links.

In `@api-reference/endpoint/openapi-v1/text-to-speech.mdx`:
- Line 14: Replace the absolute internal link
"/features/text-to-speech#direct-api-messagepack" with a relative path from this
file (api-reference/endpoint/openapi-v1/text-to-speech.mdx) — use
"../../../features/text-to-speech#direct-api-messagepack" in the markdown link
so the reference remains correct across route/base-path changes; update the link
target in the sentence that starts "To upload audio clips directly..."
accordingly.

In `@api-reference/endpoint/wallet/get-user-package.mdx`:
- Around line 3-4: Update the YAML frontmatter description to accurately reflect
the renamed page "Get User Package": change the current description value ('Get
current user premium information') to a concise, accurate summary like 'Get
current user package information' or 'Retrieve current user package details' so
the frontmatter description and title ("Get User Package") are consistent for
SEO/navigation; edit the description field in the frontmatter near the
title/description entries.

In `@api-reference/sdk/python/overview.mdx`:
- Line 141: Replace the absolute internal docs URLs in the "Learn more" links
with repo-standard relative paths: locate the "Learn more" anchor(s) whose href
is "https://docs.fish.audio/features/text-to-speech" and change it to the
corresponding relative path (for example "/features/text-to-speech" or the
appropriate relative MDX path used across the repo), and apply the same
replacement for the other occurrences of absolute "https://docs.fish.audio/..."
in this file so all internal links use relative paths.

In `@developer-guide/getting-started/api-key.mdx`:
- Around line 7-35: Add a short "Prerequisites" section and place it before the
"## 1. Create an account and key" heading (i.e., before the <Steps> block); list
minimal requirements such as an account (or signup link), a modern
browser/terminal, and required tools for examples (e.g., Node or Python if SDK
examples are used), and explicitly mention that the FISH_API_KEY environment
variable must be set (export FISH_API_KEY="your_api_key_here") and kept secret;
update any mention of examples to note that SDKs/readme expect FISH_API_KEY to
be present.

In `@developer-guide/resources/agent-quickstart.mdx`:
- Around line 78-80: Replace the absolute docs URLs with relative internal
routes: change the three links shown (link texts "JavaScript Authentication",
"Python SDK Overview", "JavaScript Installation") from
"https://docs.fish.audio/..." to their corresponding relative paths (for example
"/developer-guide/getting-started/api-key.md",
"/api-reference/sdk/python/overview.md",
"/api-reference/sdk/javascript/api-reference.md"); also apply the same
conversion to the other occurrences noted (lines referenced in the comment
ranges) so all internal MDX links use relative paths instead of absolute URLs.

In `@developer-guide/resources/coding-agents.mdx`:
- Around line 15-21: Add a short "Prerequisites" section immediately before the
"## Install the skill" heading that lists required runtime/tools and supported
agents (e.g., Node.js and npm availability, minimum Node version, and which
agents like Claude Code and Cursor are supported), and then keep the existing
install command (`npx skills add https://docs.fish.audio`) unchanged; ensure the
new section is a brief bullet or sentence block so it appears at the start of
the procedural flow as required by the coding guidelines.
- Line 53: Replace the absolute docs URLs in
developer-guide/resources/coding-agents.mdx with relative paths: change any
occurrences of "https://docs.fish.audio/.well-known/agent-skills/index.json" and
"https://docs.fish.audio/.well-known/agent-skills/<name>/SKILL.md" to
"/.well-known/agent-skills/index.json" and
"/.well-known/agent-skills/<name>/SKILL.md" respectively so internal links are
environment-agnostic and follow the project guideline to use relative paths.

In `@developer-guide/sdk-guide/cookbook/batch-transcribe-with-language-hint.mdx`:
- Line 7: The import and internal links use root-absolute paths (e.g. "import
Prerequisites from \"/snippets/prerequisites.mdx\"") which must be converted to
relative paths; update the import to the correct relative path (for example
"./snippets/prerequisites.mdx" or the appropriate "../" prefix based on this
file's folder) and change any internal markdown links that start with "/" to
equivalent relative links, ensuring all references (including the other
occurrences noted around the Prerequisites import and the links later in the
file) use relative internal paths.

In `@developer-guide/sdk-guide/cookbook/clone-and-wait-until-ready.mdx`:
- Line 7: Replace root-absolute internal references with relative ones: update
the import statement that brings in Prerequisites (the import of Prerequisites)
and the other root-absolute links referenced around the file (the occurrences
noted) so they use relative paths instead of starting with a leading slash;
ensure each internal import or link follows the repo docs convention (relative
linking from the current document) so the imports resolve correctly within the
docs tree.

In `@developer-guide/sdk-guide/cookbook/voice-agent-loop.mdx`:
- Line 7: The import and internal links use root-absolute paths (e.g., the
import statement "import Prerequisites from \"/snippets/prerequisites.mdx\"" and
several other internal links at the referenced ranges) — change these to
relative paths instead (for example "./snippets/prerequisites.mdx" or the
correct relative path from this document) so all internal imports/links use
relative routing; update every occurrence mentioned (around lines 15, 106,
110-113, 117, 121-123) to follow the "Use relative paths for internal links"
guideline and verify links still resolve in the docs build.

In `@developer-guide/sdk-guide/quickstart.mdx`:
- Around line 18-37: Move the <Prerequisites /> block to the top of the
procedure so prerequisites appear before "## 1. Install": place the
<Prerequisites /> component immediately above the "## 1. Install" heading (or
replace the existing order so the Prerequisites section is the first item),
update surrounding prose if needed so numbering/flow remains correct, and ensure
any references to environment variables (FISH_API_KEY) remain under the
Authentication section after the install step.

In `@features/speech-to-text.mdx`:
- Around line 38-42: Insert a short "Prerequisites" section immediately above
the "Quick start" header in features/speech-to-text.mdx that lists required
items: setting the API key (how to provide it via environment variable or
config) and installing the SDK (npm/yarn install command or pip if appropriate),
plus any runtime requirements (Node version or browser support) so readers have
the API key and the SDK installed before following the Quick start steps.
- Line 76: The doc line stating "duration is in seconds" conflicts with the SDK
reference that documents duration as milliseconds; update the
features/speech-to-text.mdx description of the duration field to explicitly
match the SDK (use milliseconds) or clearly differentiate API vs SDK units
(e.g., "duration (API: seconds / SDK: milliseconds)"), and then propagate that
change to any examples/tests or sample code that use duration so they use the
corrected unit; ensure you update the mention of the duration field and any
related examples or segments parsing logic to avoid unit mismatch.

In `@features/text-to-speech.mdx`:
- Around line 39-43: Add a new "Prerequisites" section immediately before the
"## Quick start" heading that lists required items: obtaining and setting the
API key (how/where it's provided), required package installs (e.g., SDK or CLI
packages and exact install commands), and runtime assumptions (supported
Node/Python versions, browser vs server limitations). Reference the existing "##
Quick start" heading to locate where to insert the section and keep the style
consistent with other docs (use a second-level heading and bullet list).

In `@features/voice-cloning.mdx`:
- Around line 38-43: Insert a concise "Prerequisites" block immediately before
the "Quick start" heading that lists required items: obtaining and setting the
API key, expected audio sample formats and length/quality requirements (e.g.,
sample rate, mono/stereo, minimum duration), and SDK/CLI install instructions
(e.g., pip/npm install or link to SDK). Ensure the block is brief, uses
bullet-style lines, and references the "Quick start" section so readers know
it's required before sending samples or choosing an implementation.

In `@overview/capabilities.mdx`:
- Around line 72-74: The SDK card copy currently reads "The Python library for
your application" which inaccurately excludes JavaScript; update the Card
component (title "Build with the SDK") copy to neutral wording such as "Official
Python and JavaScript SDKs" or similar inclusive text so it reflects both SDK
paths introduced in the docs restructure.

In `@tests/cookbooks/harness.py`:
- Around line 19-20: Replace the hardcoded absolute paths _WORKSPACE_ENV and
_LOCAL_KEYFILE in tests/cookbooks/harness.py with configurable lookups: read
them from environment variables (e.g., os.environ.get('WORKSPACE_ENV') and
os.environ.get('LOCAL_KEYFILE')) with safe repo-relative defaults (or a relative
.env/keyfile location under the repo's test resources) so machine-specific
secrets are not committed; update any code that references
_WORKSPACE_ENV/_LOCAL_KEYFILE to use these variables and ensure the defaults are
non-sensitive and documented for local test setup.
- Around line 33-40: The try/except around dotenv parsing (the block using
dotenv_values, _WORKSPACE_ENV, Path.cwd() and FISH_API_KEY) currently swallows
all errors; replace the bare "except Exception: pass" with targeted exception
handling (e.g., catch ImportError, OSError/FileNotFoundError, and ValueError)
and emit a clear warning or logger.warning that includes context (which path was
attempted and the exception message); for truly unexpected exceptions either
re-raise or log them as errors. Apply the same change to the voice-deletion
teardown code (the cleanup block referenced at lines 77-80), replacing its broad
silent except with explicit catches and warnings so credential parsing and
cleanup failures surface in CI/local runs.
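Both harness.py findings can be addressed in one place. The sketch below is an assumption about how that might look, not the harness's actual code: the environment variable names follow the suggestion above, the default paths are made up and resolved from the current working directory, and `load_api_key` is a hypothetical helper.

```python
import logging
import os
from pathlib import Path

logger = logging.getLogger("cookbooks.harness")

# Hypothetical, non-sensitive defaults resolved from the working directory.
REPO_ROOT = Path.cwd()
WORKSPACE_ENV = Path(os.environ.get("WORKSPACE_ENV", REPO_ROOT / ".env"))
LOCAL_KEYFILE = Path(os.environ.get("LOCAL_KEYFILE", REPO_ROOT / "tests" / ".fish_api_key"))


def load_api_key(env_path=WORKSPACE_ENV, keyfile=LOCAL_KEYFILE):
    """Return FISH_API_KEY from the environment, a dotenv file, or a key file."""
    key = os.environ.get("FISH_API_KEY")
    if key:
        return key
    try:
        from dotenv import dotenv_values  # optional dependency (python-dotenv)
        if env_path.is_file():
            key = dotenv_values(env_path).get("FISH_API_KEY")
    except ImportError as exc:
        logger.warning("python-dotenv not installed; skipped %s (%s)", env_path, exc)
    except (OSError, ValueError) as exc:
        logger.warning("could not parse %s: %s", env_path, exc)
    if key:
        return key
    try:
        # An empty key file is treated the same as a missing one.
        return keyfile.read_text(encoding="utf-8").strip() or None
    except OSError as exc:
        logger.warning("could not read key file %s: %s", keyfile, exc)
    return None
```

Returning None when no key is found keeps the existing skip-without-key behaviour possible, while the warnings record which path was attempted and why it failed.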

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8c654264-8656-4596-9372-64b6ad86b1e5

📥 Commits

Reviewing files that changed from the base of the PR and between 9184a08 and c4d801d.

📒 Files selected for processing (76)
  • .mintlify/skills/fish-audio-api/SKILL.md
  • .mintlify/skills/fish-audio-sdk/SKILL.md
  • .mintlify/skills/fish-audio-sdk/references/errors.md
  • .mintlify/skills/fish-audio-sdk/references/installation.md
  • .mintlify/skills/fish-audio-sdk/references/speech-to-text.md
  • .mintlify/skills/fish-audio-sdk/references/text-to-speech.md
  • .mintlify/skills/fish-audio-sdk/references/voice-cloning.md
  • .mintlify/skills/fish-audio-sdk/references/websocket.md
  • api-reference/asyncapi.yml
  • api-reference/emotion-reference.mdx
  • api-reference/endpoint/openapi-v1/text-to-speech.mdx
  • api-reference/endpoint/wallet/get-user-package.mdx
  • api-reference/errors.mdx
  • api-reference/introduction.mdx
  • api-reference/sdk/python/overview.mdx
  • api-reference/sdk/python/types.mdx
  • archive/python-sdk-legacy/migration-guide.mdx
  • archive/python-sdk-legacy/text-to-speech.mdx
  • developer-guide/core-features/creating-models.mdx
  • developer-guide/core-features/emotions.mdx
  • developer-guide/core-features/fine-grained-control.mdx
  • developer-guide/core-features/speech-to-text.mdx
  • developer-guide/core-features/text-to-speech.mdx
  • developer-guide/getting-started/api-key.mdx
  • developer-guide/getting-started/introduction.mdx
  • developer-guide/getting-started/quickstart.mdx
  • developer-guide/products/story-studio.mdx
  • developer-guide/products/tts.mdx
  • developer-guide/products/voice-cloning.mdx
  • developer-guide/resources/agent-quickstart.mdx
  • developer-guide/resources/coding-agents.mdx
  • developer-guide/sdk-guide/cookbook/batch-transcribe-with-language-hint.mdx
  • developer-guide/sdk-guide/cookbook/clone-and-wait-until-ready.mdx
  • developer-guide/sdk-guide/cookbook/discover-library-voice.mdx
  • developer-guide/sdk-guide/cookbook/instant-voice-cloning.mdx
  • developer-guide/sdk-guide/cookbook/oneshot-vs-persistent-cloning.mdx
  • developer-guide/sdk-guide/cookbook/realtime-llm-to-speech.mdx
  • developer-guide/sdk-guide/cookbook/streaming-to-file.mdx
  • developer-guide/sdk-guide/cookbook/telephony-8khz-audio.mdx
  • developer-guide/sdk-guide/cookbook/transcribe-to-captions.mdx
  • developer-guide/sdk-guide/cookbook/voice-agent-loop.mdx
  • developer-guide/sdk-guide/javascript/authentication.mdx
  • developer-guide/sdk-guide/javascript/installation.mdx
  • developer-guide/sdk-guide/javascript/speech-to-text.mdx
  • developer-guide/sdk-guide/javascript/text-to-speech.mdx
  • developer-guide/sdk-guide/javascript/voice-cloning.mdx
  • developer-guide/sdk-guide/javascript/websocket.mdx
  • developer-guide/sdk-guide/python/authentication.mdx
  • developer-guide/sdk-guide/python/errors.mdx
  • developer-guide/sdk-guide/python/overview.mdx
  • developer-guide/sdk-guide/python/speech-to-text.mdx
  • developer-guide/sdk-guide/python/text-to-speech.mdx
  • developer-guide/sdk-guide/python/voice-cloning.mdx
  • developer-guide/sdk-guide/python/websocket.mdx
  • developer-guide/sdk-guide/quickstart.mdx
  • developer-guide/tutorials/tutorials.mdx
  • docs.json
  • features/manage-voices.mdx
  • features/realtime-streaming.mdx
  • features/speech-to-text.mdx
  • features/text-to-speech.mdx
  • features/voice-cloning.mdx
  • overview/capabilities.mdx
  • overview/platform.mdx
  • snippets/support.mdx
  • tests/.gitignore
  • tests/cookbooks/README.md
  • tests/cookbooks/conftest.py
  • tests/cookbooks/extract.py
  • tests/cookbooks/harness.py
  • tests/cookbooks/requirements.txt
  • tests/cookbooks/specs.py
  • tests/cookbooks/test_cookbooks.py
  • tests/js/package.json
  • tests/js/run.mjs
  • tests/js/specs.mjs
💤 Files with no reviewable changes (18)
  • developer-guide/sdk-guide/javascript/text-to-speech.mdx
  • developer-guide/sdk-guide/python/websocket.mdx
  • developer-guide/sdk-guide/python/overview.mdx
  • developer-guide/core-features/creating-models.mdx
  • developer-guide/products/voice-cloning.mdx
  • developer-guide/sdk-guide/javascript/voice-cloning.mdx
  • developer-guide/products/tts.mdx
  • developer-guide/sdk-guide/javascript/installation.mdx
  • developer-guide/sdk-guide/python/voice-cloning.mdx
  • developer-guide/sdk-guide/python/text-to-speech.mdx
  • developer-guide/products/story-studio.mdx
  • developer-guide/sdk-guide/javascript/speech-to-text.mdx
  • developer-guide/core-features/speech-to-text.mdx
  • developer-guide/getting-started/introduction.mdx
  • developer-guide/core-features/text-to-speech.mdx
  • developer-guide/sdk-guide/python/speech-to-text.mdx
  • developer-guide/sdk-guide/javascript/websocket.mdx
  • developer-guide/sdk-guide/javascript/authentication.mdx


- [Emotion Control Guide](/developer-guide/core-features/emotions) - Technical implementation
- [Text-to-Speech Best Practices](/developer-guide/core-features/text-to-speech)
- [Text-to-Speech Best Practices](/features/text-to-speech)

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use a relative internal link here.

Line 135 uses an absolute internal path; switch it to a relative link to match docs standards and avoid path-coupling.

Suggested fix
-- [Text-to-Speech Best Practices](/features/text-to-speech)
+- [Text-to-Speech Best Practices](../features/text-to-speech)

As per coding guidelines: “Use relative paths for internal links” and “Do not use absolute URLs for internal links”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api-reference/emotion-reference.mdx` at line 135, The internal link text
"Text-to-Speech Best Practices" in emotion-reference.mdx currently uses an
absolute path "/features/text-to-speech"; update that href to a relative path
(e.g., "../features/text-to-speech") so the internal link follows the docs
guideline of using relative paths for internal links.

For best results, upload reference audio using the [create model](/api-reference/endpoint/model/create-model) before using this one. This improves speech quality and reduces latency.

To upload audio clips directly, without pre-uploading, serialize the request body with MessagePack as per the [instructions](/developer-guide/core-features/text-to-speech#direct-api-usage).
To upload audio clips directly, without pre-uploading, serialize the request body with MessagePack as per the [instructions](/features/text-to-speech#direct-api-messagepack).

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Convert this internal link to a relative path.

Line 14 uses an absolute docs path; make it relative for consistency and portability across route/base-path changes.

Suggested fix
-To upload audio clips directly, without pre-uploading, serialize the request body with MessagePack as per the [instructions](/features/text-to-speech#direct-api-messagepack).
+To upload audio clips directly, without pre-uploading, serialize the request body with MessagePack as per the [instructions](../../../features/text-to-speech#direct-api-messagepack).

As per coding guidelines: “Use relative paths for internal links” and “Do not use absolute URLs for internal links”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api-reference/endpoint/openapi-v1/text-to-speech.mdx` at line 14, Replace the
absolute internal link "/features/text-to-speech#direct-api-messagepack" with a
relative path from this file
(api-reference/endpoint/openapi-v1/text-to-speech.mdx) — use
"../../../features/text-to-speech#direct-api-messagepack" in the markdown link
so the reference remains correct across route/base-path changes; update the link
target in the sentence that starts "To upload audio clips directly..."
accordingly.
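The relative targets suggested in this comment and the previous one can be derived mechanically from the file's location. A minimal sketch, assuming links are written relative to the containing file's directory; `relative_link` is a hypothetical helper, not part of any docs tooling.

```python
import posixpath


def relative_link(from_file: str, target: str) -> str:
    """Convert a root-absolute docs path to one relative to from_file's folder."""
    # Keep any #fragment aside so only the path part is rewritten.
    path, sep, fragment = target.partition("#")
    start = posixpath.dirname(from_file) or "."
    rel = posixpath.relpath(path.lstrip("/"), start=start)
    return rel + sep + fragment


print(relative_link("api-reference/emotion-reference.mdx", "/features/text-to-speech"))
print(relative_link(
    "api-reference/endpoint/openapi-v1/text-to-speech.mdx",
    "/features/text-to-speech#direct-api-messagepack",
))
```

For the two files discussed here this yields ../features/text-to-speech and ../../../features/text-to-speech#direct-api-messagepack, matching the suggested fixes.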

Comment on lines +3 to 4
title: 'Get User Package'
description: 'Get current user premium information'

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align frontmatter description with the renamed endpoint.

After renaming the page to “Get User Package,” the description still says “premium information,” which is inconsistent for SEO/navigation metadata.

Suggested fix
-description: 'Get current user premium information'
+description: 'Get current user package information'

As per coding guidelines: “Prioritize accuracy and usability of information” and “Include description in YAML frontmatter: Concise summary for SEO/navigation”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api-reference/endpoint/wallet/get-user-package.mdx` around lines 3 - 4,
Update the YAML frontmatter description to accurately reflect the renamed page
"Get User Package": change the current description value ('Get current user
premium information') to a concise, accurate summary like 'Get current user
package information' or 'Retrieve current user package details' so the
frontmatter description and title ("Get User Package") are consistent for
SEO/navigation; edit the description field in the frontmatter near the
title/description entries.

```

[Learn more](https://docs.fish.audio/developer-guide/sdk-guide/python/text-to-speech)
[Learn more](https://docs.fish.audio/features/text-to-speech)

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Switch internal “Learn more” links to relative paths.

These links point to https://docs.fish.audio/... instead of relative internal paths, which breaks the MDX linking standard used in this repo.

Suggested edit
-[Learn more](https://docs.fish.audio/features/text-to-speech)
+[Learn more](/features/text-to-speech)

-[Learn more](https://docs.fish.audio/features/speech-to-text)
+[Learn more](/features/speech-to-text)

-[Learn more](https://docs.fish.audio/features/realtime-streaming)
+[Learn more](/features/realtime-streaming)

-[Learn more](https://docs.fish.audio/features/voice-cloning)
+[Learn more](/features/voice-cloning)

As per coding guidelines, “Use relative paths for internal links” and “Do not use absolute URLs for internal links.”

Also applies to: 157-157, 197-197, 235-235

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api-reference/sdk/python/overview.mdx` at line 141, Replace the absolute
internal docs URLs in the "Learn more" links with repo-standard relative paths:
locate the "Learn more" anchor(s) whose href is
"https://docs.fish.audio/features/text-to-speech" and change it to the
corresponding relative path (for example "/features/text-to-speech" or the
appropriate relative MDX path used across the repo), and apply the same
replacement for the other occurrences of absolute "https://docs.fish.audio/..."
in this file so all internal links use relative paths.

Comment on lines +7 to +35
Everything you build with Fish Audio — the API, the Python library, JavaScript — authenticates with a single **API key**. Here's how to get one and make your first call in a couple of minutes.

## 1. Create an account and key

<Steps>
<Step title="Sign up">
Go to [fish.audio/auth/signup](https://fish.audio/auth/signup), create an account, and verify your email.
</Step>
<Step title="Open the API Keys page">
Sign in and open [fish.audio/app/api-keys](https://fish.audio/app/api-keys).
</Step>
<Step title="Create a key">
Click **Create New Key**, give it a descriptive name (and an expiration if you want), then **copy the key and store it securely** — treat it like a password.

<Warning>Never commit your API key to version control or share it publicly.</Warning>
</Step>
</Steps>

## 2. Store it as an environment variable

The SDKs and the examples throughout these docs read your key from `FISH_API_KEY`:

```bash
export FISH_API_KEY="your_api_key_here"
```

This keeps the key out of your code and lets you use different keys for development and production.

## 3. Make your first request

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a prerequisites section before the step flow.

This is procedural content, but it starts directly with setup steps instead of listing prerequisites first.

As per coding guidelines, “Include prerequisites at the start of procedural content.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@developer-guide/getting-started/api-key.mdx` around lines 7 - 35, Add a short
"Prerequisites" section and place it before the "## 1. Create an account and
key" heading (i.e., before the <Steps> block); list minimal requirements such as
an account (or signup link), a modern browser/terminal, and required tools for
examples (e.g., Node or Python if SDK examples are used), and explicitly mention
that the FISH_API_KEY environment variable must be set (export
FISH_API_KEY="your_api_key_here") and kept secret; update any mention of
examples to note that SDKs/readme expect FISH_API_KEY to be present.

Comment on lines +39 to +43
## Quick start

Send text, get back audio. Choose your implementation:

<CodeGroup>

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a prerequisites section before the Quick start steps.

Please add prerequisites (API key, package install, runtime assumptions) right before this procedure. As per coding guidelines, "Include prerequisites at the start of procedural content".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@features/text-to-speech.mdx` around lines 39 - 43, Add a new "Prerequisites"
section immediately before the "## Quick start" heading that lists required
items: obtaining and setting the API key (how/where it's provided), required
package installs (e.g., SDK or CLI packages and exact install commands), and
runtime assumptions (supported Node/Python versions, browser vs server
limitations). Reference the existing "## Quick start" heading to locate where to
insert the section and keep the style consistent with other docs (use a
second-level heading and bullet list).

Comment on lines +38 to +43
## Quick start

Send one or more audio samples, get back a voice model. Choose your implementation:

<CodeGroup>
```python Python

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add prerequisites before this procedural flow.

Please add a concise prerequisites block before “Quick start” (API key, sample file expectations, SDK install). As per coding guidelines, "Include prerequisites at the start of procedural content".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@features/voice-cloning.mdx` around lines 38 - 43, Insert a concise
"Prerequisites" block immediately before the "Quick start" heading that lists
required items: obtaining and setting the API key, expected audio sample formats
and length/quality requirements (e.g., sample rate, mono/stereo, minimum
duration), and SDK/CLI install instructions (e.g., pip/npm install or link to
SDK). Ensure the block is brief, uses bullet-style lines, and references the
"Quick start" section so readers know it's required before sending samples or
choosing an implementation.

Comment thread overview/capabilities.mdx
Comment on lines +72 to +74
<Card title="Build with the SDK" icon="code" href="/developer-guide/sdk-guide/quickstart">
The Python library for your application.
</Card>

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

SDK card copy is narrower than current SDK support.

Line 73 implies only Python support, while this docs restructure introduces both Python and JavaScript SDK paths. Consider neutral wording (for example: “Official Python and JavaScript SDKs”).
As per coding guidelines, “Prioritize accuracy and usability of information.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@overview/capabilities.mdx` around lines 72 - 74, The SDK card copy currently
reads "The Python library for your application" which inaccurately excludes
JavaScript; update the Card component (title "Build with the SDK") copy to
neutral wording such as "Official Python and JavaScript SDKs" or similar
inclusive text so it reflects both SDK paths introduced in the docs restructure.

Comment on lines +19 to +20
_WORKSPACE_ENV = "/Users/shawnlai/project/fish-audio/.env"
_LOCAL_KEYFILE = "/tmp/claude/fishdoctest/fishkey"

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove hardcoded machine-specific secret paths.

Embedding a user-specific absolute .env path and fixed temp keyfile path makes the harness non-portable and leaks local environment details in-repo. Prefer env-driven/configurable paths (or only repo-relative lookup) for key discovery.

🧰 Tools
🪛 Ruff (0.15.15)

[error] 20-20: Probable insecure usage of temporary file or directory: "/tmp/claude/fishdoctest/fishkey"

(S108)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/cookbooks/harness.py` around lines 19 - 20, Replace the hardcoded
absolute paths _WORKSPACE_ENV and _LOCAL_KEYFILE in tests/cookbooks/harness.py
with configurable lookups: read them from environment variables (e.g.,
os.environ.get('WORKSPACE_ENV') and os.environ.get('LOCAL_KEYFILE')) with safe
repo-relative defaults (or a relative .env/keyfile location under the repo's
test resources) so machine-specific secrets are not committed; update any code
that references _WORKSPACE_ENV/_LOCAL_KEYFILE to use these variables and ensure
the defaults are non-sensitive and documented for local test setup.
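
One possible shape for the configurable lookup is sketched below. The environment variable names `WORKSPACE_ENV` and `LOCAL_KEYFILE` follow the examples in the prompt above, while the helper name `resolve_key_paths` and the choice of a repo-relative `.env` default are assumptions for illustration, not the harness's actual code.

```python
import os
from pathlib import Path
from typing import Optional, Tuple


def resolve_key_paths(repo_root: Path) -> Tuple[Path, Optional[Path]]:
    """Return (.env path, optional keyfile path) without machine-specific defaults."""
    # Use the override when it is set and non-empty, else a repo-relative .env.
    env_file = Path(os.environ.get("WORKSPACE_ENV") or repo_root / ".env")
    # No default keyfile: the harness only reads one if the caller opts in.
    keyfile_override = os.environ.get("LOCAL_KEYFILE")
    keyfile = Path(keyfile_override) if keyfile_override else None
    return env_file, keyfile


if __name__ == "__main__":
    env_file, keyfile = resolve_key_paths(Path.cwd())
    print(f"env file: {env_file}, keyfile: {keyfile}")
```

Leaving the keyfile unset by default avoids committing any `/tmp` path, which is what the Ruff S108 finding flags.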

Comment on lines +33 to +40
try:
    from dotenv import dotenv_values
    for p in (_WORKSPACE_ENV, str(Path.cwd() / ".env")):
        v = dotenv_values(p).get("FISH_API_KEY")
        if v:
            return v.strip()
except Exception:
    pass

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t silently swallow key-resolution and cleanup failures.

`except Exception: pass` hides real failures in both API-key loading and voice-deletion teardown. At minimum, catch expected exceptions and emit a warning with context so CI/local runs can surface cleanup drift and credential parsing issues.

Also applies to: 77-80

🧰 Tools
🪛 Ruff (0.15.15)

[error] 39-40: try-except-pass detected, consider logging the exception

(S110)


[warning] 39-39: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/cookbooks/harness.py` around lines 33 - 40, The try/except around
dotenv parsing (the block using dotenv_values, _WORKSPACE_ENV, Path.cwd() and
FISH_API_KEY) currently swallows all errors; replace the bare "except Exception:
pass" with targeted exception handling (e.g., catch ImportError,
OSError/FileNotFoundError, and ValueError) and emit a clear warning or
logger.warning that includes context (which path was attempted and the exception
message); for truly unexpected exceptions either re-raise or log them as errors.
Apply the same change to the voice-deletion teardown code (the cleanup block
referenced at lines 77-80), replacing its broad silent except with explicit
catches and warnings so credential parsing and cleanup failures surface in
CI/local runs.
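
A rough sketch of the key-loading half of this suggestion follows, assuming the harness is free to use the standard `warnings` module. The function name `load_api_key` and its `candidates` parameter are hypothetical; `dotenv_values` is the python-dotenv call already used in the quoted block.

```python
import warnings
from pathlib import Path
from typing import Iterable, Optional


def load_api_key(candidates: Iterable[Path]) -> Optional[str]:
    """Return FISH_API_KEY from the first .env file that defines it, else None."""
    try:
        from dotenv import dotenv_values
    except ImportError:
        # python-dotenv is optional; say so instead of failing silently.
        warnings.warn("python-dotenv is not installed; skipping .env lookup")
        return None

    for path in candidates:
        try:
            value = dotenv_values(path).get("FISH_API_KEY")
        except (OSError, ValueError) as exc:
            # Report which file failed and why, then try the next candidate.
            warnings.warn(f"could not read FISH_API_KEY from {path}: {exc}")
            continue
        if value:
            return value.strip()
    return None


if __name__ == "__main__":
    print(load_api_key([Path.cwd() / ".env"]) is not None)
```

Any exception outside the listed types propagates, so an unexpected failure stops the run instead of being hidden. The same catch-and-warn pattern could be applied to the voice-deletion teardown.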

