## Prerequisites

Sign up for a free Fish Audio account to get started with our API.

1. Go to [fish.audio/auth/signup](https://fish.audio/auth/signup)
2. Fill in your details to create an account, then complete the steps to verify it.
3. Log in to your account and navigate to the [API section](https://fish.audio/app/api-keys)

Once you have an account, you'll need an API key to authenticate your requests.

1. Log in to your [Fish Audio Dashboard](https://fish.audio/app/api-keys/)
2. Navigate to the API Keys section
3. Click "Create New Key", give it a descriptive name, and set an expiration if desired
4. Copy your key and store it securely

Keep your API key secret! Never commit it to version control or share it publicly.

## Overview

Voice cloning allows you to generate speech that matches a specific voice using reference audio. Fish Audio supports two approaches:

* Using pre-trained voice models (`reference_id`)
* Providing reference audio directly in your request (`references`)

Use `reference_id` when you'll reuse a voice multiple times; it's faster and more efficient. Use `references` for one-off voice cloning or for testing different voices without creating models.
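As a rough illustration of the choice between the two approaches, the sketch below fills in either `reference_id` or `references` when building a request. The `ReferenceSample`, `VoiceSource`, and `CloneRequest` types and the `buildCloneRequest` helper are hypothetical stand-ins defined locally for this example, not part of the Fish Audio SDK; only the field names `text`, `reference_id`, and `references` come from this guide.

```typescript
// Hypothetical local types for illustration; only the field names `text`,
// `reference_id`, and `references` are taken from this guide.
interface ReferenceSample {
  audio: Uint8Array; // raw bytes of the reference recording
  text: string; // transcript of what is spoken in `audio`
}

// A voice is either a saved model or a set of inline samples.
type VoiceSource =
  | { kind: "model"; referenceId: string }
  | { kind: "inline"; samples: ReferenceSample[] };

interface CloneRequest {
  text: string;
  reference_id?: string;
  references?: ReferenceSample[];
}

// Build a request that sets exactly one of the two voice fields.
function buildCloneRequest(text: string, voice: VoiceSource): CloneRequest {
  if (voice.kind === "model") {
    return { text, reference_id: voice.referenceId };
  }
  if (voice.samples.length === 0) {
    throw new Error("Inline cloning needs at least one reference sample");
  }
  return { text, references: voice.samples };
}

// Usage: a reusable saved voice versus a one-off inline sample.
const savedVoiceRequest = buildCloneRequest("Hello, world!", {
  kind: "model",
  referenceId: "your_model_id",
});
const oneOffRequest = buildCloneRequest("Hello, world!", {
  kind: "inline",
  samples: [{ audio: new Uint8Array([0]), text: "Text spoken in the sample" }],
});
console.log(Object.keys(savedVoiceRequest), Object.keys(oneOffRequest));
```

Modelling the voice as a tagged union keeps a request from carrying both fields at once, which is one way to make the reuse-versus-one-off decision explicit in calling code.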
## Using Reference Audio

Clone a voice by providing reference audio directly:

```typescript theme={null}
import { FishAudioClient } from "fish-audio";
import type { TTSRequest, ReferenceAudio } from "fish-audio";
import { readFile } from "fs/promises";

const fishAudio = new FishAudioClient();

// Load the reference sample and wrap it in a File for upload
const audioBuffer = await readFile("voice_sample.wav");
const referenceFile = new File([audioBuffer], "voice_sample.wav");

const referenceAudio: ReferenceAudio = {
  audio: referenceFile,
  // Transcript of what is spoken in voice_sample.wav
  text: "Text spoken in the reference audio",
};

const request: TTSRequest = {
  text: "Hello, world!",
  references: [referenceAudio],
};

const audio = await fishAudio.textToSpeech.convert(request);
```

## Multiple References

Improve voice quality by providing multiple reference samples:

```typescript theme={null}
import { FishAudioClient } from "fish-audio";
import type { TTSRequest, ReferenceAudio } from "fish-audio";
import { readFile } from "fs/promises";

const fishAudio = new FishAudioClient();

// Load sample_0.wav, sample_1.wav, and sample_2.wav with their transcripts
const references: ReferenceAudio[] = [];
for (const i of [0, 1, 2]) {
  const buf = await readFile(`sample_${i}.wav`);
  references.push({
    audio: new File([buf], `sample_${i}.wav`),
    text: `Text from sample ${i}`,
  });
}

const request: TTSRequest = {
  text: "Better voice quality with multiple references",
  references,
};

const audio = await fishAudio.textToSpeech.convert(request);
```

## Creating Voice Models

For repeated use, create a persistent voice model:

```typescript theme={null}
import { FishAudioClient } from "fish-audio";
import { createReadStream } from "fs";

const fishAudio = new FishAudioClient();

// Create a voice model from samples
const response = await fishAudio.voices.ivc.create({
  title: "My Custom Voice",
  voices: [
    createReadStream("voice_0.wav"),
    createReadStream("voice_1.wav"),
    createReadStream("voice_2.wav"),
  ],
  cover_image: createReadStream("cover.png"),
});
console.log("Created model:", response._id);

// Use the model
const audio = await fishAudio.textToSpeech.convert({
  text: "Using my saved voice model",
  reference_id: response._id,
});
```

## Best Practices

### Audio Quality

For best results, reference audio should:

* Be 10-30 seconds long per sample
* Have clear speech without background noise
* Match the language you'll generate
* Include varied intonation and emotion

### Sample Text

The `text` parameter in `ReferenceAudio` should:

* Match exactly what's spoken in the audio
* Include punctuation for proper prosody
* Be in the same language as generation

### Performance Tips

1. **Pre-upload models** for frequently used voices
2. **Use 2-3 reference samples** for optimal quality
3. **Keep samples under 30 seconds** each
4. **Normalize audio levels** before uploading

## Audio Format Requirements

Supported formats for reference audio:

* WAV (recommended)
* MP3
* M4A
* Other common audio formats

Sample rate and channels:

* 16 kHz minimum
* 44.1 kHz recommended
* Mono or stereo (stereo is converted to mono)

## Example: Voice Bank

Build a library of cloned voices:

```typescript theme={null}
import { FishAudioClient } from "fish-audio";

const fishAudio = new FishAudioClient();

// Map each voice model's title to its ID
async function createVoiceBank() {
  const voiceBank: Record<string, string> = {};
  const models = await fishAudio.voices.search();
  for (const m of models.items ?? []) voiceBank[m.title] = m._id as string;
  return voiceBank;
}

// Look up a voice by name and generate speech with it
async function generateWithVoice(text: string, voiceName: string) {
  const bank = await createVoiceBank();
  const modelId = bank[voiceName];
  if (!modelId) throw new Error(`Voice '${voiceName}' not found`);
  return fishAudio.textToSpeech.convert({ text, reference_id: modelId });
}
```

## Combining with Emotions

Add emotions to cloned voices:

```typescript theme={null}
// `fishAudio` and `referenceAudio` are defined as in the earlier examples

// With a saved model
await fishAudio.textToSpeech.convert({
  text: "(happy) This is exciting news! (calm) Let me explain the details.",
  reference_id: "your_model_id",
});

// Or with direct references
await fishAudio.textToSpeech.convert({
  text: "(excited) Amazing discovery!",
  references: [referenceAudio],
});
```

## Error Handling

Common issues and solutions:

```typescript theme={null}
// `fishAudio` and `referenceAudio` are defined as in the earlier examples
try {
  await fishAudio.textToSpeech.convert({
    text: "Test speech",
    references: [referenceAudio],
  });
} catch (e: any) {
  const msg = String(e?.message || e);
  if (msg.includes("Invalid audio format"))
    console.error("Check audio format - use WAV or MP3");
  else if (msg.includes("Audio too short"))
    console.error("Reference audio should be at least 10 seconds");
  else throw e; // Re-throw anything we don't recognize
}
```
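
Some of these errors can be caught before a request is sent. The sketch below checks sample metadata against the guidance in Best Practices and Audio Format Requirements (10-30 seconds per sample, at least 16 kHz, a non-empty transcript). The `SampleInfo` shape and the `checkSamples` helper are hypothetical and not part of the Fish Audio SDK; the sketch assumes you already know each sample's duration and sample rate, for example from an audio library, and it treats the thresholds as guidelines rather than limits the API is confirmed to enforce.

```typescript
// Hypothetical metadata shape; the caller is assumed to supply these values.
interface SampleInfo {
  name: string;
  durationSeconds: number;
  sampleRateHz: number;
  transcript: string; // text spoken in the sample
}

// Return a list of human-readable problems; an empty list means the samples
// follow the guidance in this guide.
function checkSamples(samples: SampleInfo[]): string[] {
  const problems: string[] = [];
  if (samples.length === 0) {
    problems.push("No reference samples provided");
  }
  for (const s of samples) {
    if (s.durationSeconds < 10) {
      problems.push(`${s.name}: shorter than 10 seconds`);
    }
    if (s.durationSeconds > 30) {
      problems.push(`${s.name}: longer than 30 seconds`);
    }
    if (s.sampleRateHz < 16000) {
      problems.push(`${s.name}: sample rate below 16 kHz`);
    }
    if (s.transcript.trim() === "") {
      problems.push(`${s.name}: missing transcript text`);
    }
  }
  return problems;
}

// Usage: report problems instead of sending a request that is likely to fail.
const issues = checkSamples([
  { name: "sample_0.wav", durationSeconds: 12, sampleRateHz: 44100, transcript: "Hello there." },
  { name: "sample_1.wav", durationSeconds: 4, sampleRateHz: 8000, transcript: "" },
]);
for (const issue of issues) console.warn(issue);
```

Returning every problem at once, rather than throwing on the first one, lets a caller fix a whole batch of samples in a single pass.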