This guide helps you migrate from the legacy `fish_audio_sdk` (Session-based API) to the new `fishaudio` (client-based API) available in `fish-audio-sdk` v1.0+.

## Quick Migration

Reinstall the package:

```bash
pip uninstall fish-audio-sdk
pip install fish-audio-sdk
```

The package name stays the same, but the import changes from `fish_audio_sdk` to `fishaudio`. You can keep using the legacy `fish_audio_sdk` import if you prefer, but it will not receive new features or updates.

Because the version number moves from `v2025.6.3` to `v1.0.0`, which sorts lower under standard version ordering, you may need to pin the version explicitly with `fish-audio-sdk==1.0.0`.

Update the imports:

```python
# Before
from fish_audio_sdk import Session, TTSRequest, ASRRequest

# After
from fishaudio import FishAudio
from fishaudio.types import TTSConfig, ReferenceAudio
```

Update client initialization:

```python
# Before
session = Session("your_api_key")

# After
client = FishAudio(api_key="your_api_key")

# Or use environment variable
client = FishAudio()  # Reads from FISH_API_KEY
```

See the quick reference below for common operations.
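Because the no-argument form reads the key from `FISH_API_KEY`, it can help to fail early with a clear message when that variable is missing. The following is a minimal sketch that does not touch the SDK at all; `require_api_key` is a hypothetical helper name, not part of `fishaudio`.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the value of FISH_API_KEY, or raise if it is unset or blank.

    Hypothetical helper for illustration; the SDK itself is not involved.
    """
    key = env.get("FISH_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FISH_API_KEY is not set; export it or pass api_key explicitly"
        )
    return key
```

Calling `require_api_key()` once at startup, before constructing the client, surfaces a missing key as a configuration error rather than as a failed request later on.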
## Key Changes at a Glance

| Legacy                   | New                             | Notes                         |
| ------------------------ | ------------------------------- | ----------------------------- |
| `Session()`              | `FishAudio()`                   | Client-based architecture     |
| `session.tts()`          | `client.tts.convert()`          | Returns complete audio bytes  |
| `session.asr()`          | `client.asr.transcribe()`       | Clearer method name           |
| `session.create_model()` | `client.voices.create()`        | "Model" → "Voice" terminology |
| `session.list_models()`  | `client.voices.list()`          | Resource namespacing          |
| `TTSRequest(...)`        | Direct parameters               | No request objects            |
| `WebSocketSession`       | `client.tts.stream_websocket()` | Integrated into client        |
| `HttpCodeErr`            | Specific exceptions             | Better error handling         |

## Text-to-Speech Migration

```python Legacy
from fish_audio_sdk import Session, TTSRequest

session = Session("your_api_key")

# Basic TTS - returns chunks
audio = b""
for chunk in session.tts(TTSRequest(text="Hello, world!")):
    audio += chunk

with open("output.mp3", "wb") as f:
    f.write(audio)
```

```python New
from fishaudio import FishAudio
from fishaudio.utils import save

client = FishAudio()

# Basic TTS - returns complete audio
audio = client.tts.convert(text="Hello, world!")
save(audio, "output.mp3")
```

The new SDK's `convert()` returns complete audio bytes instead of chunks. Use `stream()` for chunk-by-chunk transfer or `stream_websocket()` for real-time streaming.
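If existing code consumes chunks but sometimes needs the whole payload, as the legacy loop did, joining the chunks yields the same bytes. This is a minimal sketch that uses a stand-in generator instead of the SDK; `collect_audio` and `fake_stream` are hypothetical names, and it assumes the chunked source yields `bytes` objects.

```python
from typing import Iterable, Iterator


def collect_audio(chunks: Iterable[bytes]) -> bytes:
    """Join an iterable of audio chunks into a single bytes object."""
    return b"".join(chunks)


def fake_stream() -> Iterator[bytes]:
    """Stand-in for a chunked audio source; a real one would come from the SDK."""
    yield b"ID3"
    yield b"\x00\x01"


audio = collect_audio(fake_stream())
print(len(audio))  # prints 5
```

Using `b"".join(...)` avoids the repeated reallocation of `audio += chunk` in a loop, which matters for long outputs.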
## Voice Cloning Migration

```python Legacy
from fish_audio_sdk import Session, TTSRequest, ReferenceAudio

session = Session("your_api_key")

# Instant cloning
with open("voice.wav", "rb") as f:
    request = TTSRequest(
        text="Cloned voice",
        references=[ReferenceAudio(
            audio=f.read(),
            text="Reference transcript"
        )]
    )
    audio = b"".join(session.tts(request))

# Create voice model
# (voice_data is assumed to be defined elsewhere)
model = session.create_model(
    title="My Voice",
    voices=[voice_data],
    texts=["Sample text"]
)
```

```python New
from fishaudio import FishAudio
from fishaudio.types import ReferenceAudio

client = FishAudio()

# Instant cloning
with open("voice.wav", "rb") as f:
    audio = client.tts.convert(
        text="Cloned voice",
        references=[ReferenceAudio(
            audio=f.read(),
            text="Reference transcript"
        )]
    )

# Create voice model
# (voice_data is assumed to be defined elsewhere)
voice = client.voices.create(
    title="My Voice",
    voices=[voice_data],
    texts=["Sample text"]
)
```

## Speech-to-Text Migration

```python Legacy
from fish_audio_sdk import Session, ASRRequest

session = Session("your_api_key")

with open("audio.mp3", "rb") as f:
    response = session.asr(ASRRequest(
        audio=f.read(),
        language="en"
    ))

print(response.text)

# Timestamps in SECONDS
for segment in response.segments:
    print(f"[{segment.start}s - {segment.end}s]")
```

```python New
from fishaudio import FishAudio

client = FishAudio()

with open("audio.mp3", "rb") as f:
    result = client.asr.transcribe(
        audio=f.read(),
        language="en"
    )

print(result.text)

# Timestamps in MILLISECONDS
for segment in result.segments:
    print(f"[{segment.start}ms - {segment.end}ms]")
```

ASR timestamps changed from seconds to milliseconds. Divide by 1000 to convert: `seconds = segment.start / 1000`

## WebSocket Streaming Migration

```python Legacy
from fish_audio_sdk import WebSocketSession, TTSRequest

ws_session = WebSocketSession("your_api_key")

def text_stream():
    yield "Hello, "
    yield "streaming!"

with ws_session:
    for chunk in ws_session.tts(TTSRequest(text=""), text_stream()):
        # Process audio chunks
        pass
```

```python New
from fishaudio import FishAudio

client = FishAudio()

def text_chunks():
    yield "Hello, "
    yield "streaming!"

# No empty text required, no context manager needed
audio_stream = client.tts.stream_websocket(text_chunks())
for chunk in audio_stream:
    # Process audio chunks
    pass
```

## Error Handling Migration

```python Legacy
from fish_audio_sdk.exceptions import HttpCodeErr

try:
    audio = session.tts(request)
except HttpCodeErr as e:
    if e.status_code == 429:
        print("Rate limited")
    elif e.status_code == 401:
        print("Auth failed")
```

```python New
from fishaudio.exceptions import (
    RateLimitError,
    AuthenticationError,
    FishAudioError
)

try:
    audio = client.tts.convert(text="...")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except AuthenticationError:
    print("Auth failed")
except FishAudioError as e:
    print(f"General error: {e}")
```

## Async Support

The new SDK has full async support with `AsyncFishAudio`:

```python
import asyncio
from fishaudio import AsyncFishAudio

async def main():
    client = AsyncFishAudio()

    # All methods work with await
    audio = await client.tts.convert(text="Async speech")
    # audio_bytes is assumed to be audio data loaded elsewhere
    result = await client.asr.transcribe(audio=audio_bytes)
    voices = await client.voices.list()

asyncio.run(main())
```

## Breaking Changes Summary

**Before:** Iterator of chunks

```python
audio = b""
for chunk in session.tts(request):
    audio += chunk
```

**After:** Complete audio bytes

```python
audio = client.tts.convert(text="...")
```

Use `stream()` or `stream_websocket()` if you need chunks.

**Before:**

```python
request = TTSRequest(text="...", format="mp3")
audio = session.tts(request)
```

**After:**

```python
audio = client.tts.convert(text="...", format="mp3")
```

Pass parameters directly to methods.
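When a codebase builds request objects in many places, one way to bridge the gap during migration is to flatten each object into keyword arguments. The sketch below is illustrative only: `LegacyTTSRequest` is a hypothetical stand-in defined here (not the real `TTSRequest`), and `to_kwargs` is a hypothetical helper, so field names should be checked against the SDK before relying on it.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class LegacyTTSRequest:
    """Hypothetical stand-in for a legacy request object (not the real TTSRequest)."""

    text: str
    format: Optional[str] = None


def to_kwargs(request: LegacyTTSRequest) -> Dict[str, Any]:
    """Flatten request fields into keyword arguments, dropping unset (None) ones."""
    return {name: value for name, value in asdict(request).items() if value is not None}


kwargs = to_kwargs(LegacyTTSRequest(text="Hello", format="mp3"))
print(kwargs)  # prints {'text': 'Hello', 'format': 'mp3'}
```

The resulting dictionary could then be unpacked into a direct-parameter call such as `client.tts.convert(**kwargs)`, assuming the parameter names match.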
**Before:** `segment.start` in seconds (e.g., 1.5)

**After:** `segment.start` in milliseconds (e.g., 1500)

Convert: `seconds = segment.start / 1000`

* `session.create_model()` → `client.voices.create()`
* `session.list_models()` → `client.voices.list()`
* `session.get_model()` → `client.voices.get()`

Plus new methods: `client.voices.update()` and `client.voices.delete()`

## Common Issues

Upgrade the package:

```bash
pip install --upgrade fish-audio-sdk
python -c "import fishaudio; print(fishaudio.__version__)"
```

The new `convert()` returns complete audio. Use `stream()` for chunks:

```python
audio_stream = client.tts.stream(text="...")
for chunk in audio_stream:
    process_chunk(chunk)
```

Remove the empty text. Just pass your generator:

```python
# Before
ws_session.tts(TTSRequest(text=""), text_stream())

# After
client.tts.stream_websocket(text_stream())
```

The new SDK uses milliseconds instead of seconds:

```python
seconds = segment.start / 1000
```

## Next Steps

* Complete guide for the new SDK
* Detailed API documentation
* TTS features and examples
* Clone voices and manage models

## Need Help?

* [GitHub Repository](https://github.com/fishaudio/fish-audio-python) - Report issues or request features
* [Discord Community](https://discord.gg/fishaudio) - Get help from the community
* [PyPI Package](https://pypi.org/project/fish-audio-sdk/) - Package information