> ## Documentation Index
> Fetch the complete documentation index at: https://hanabiaiinc-auto-go-api-docs.mintlify.site/llms.txt
> Use this file to discover all available pages before exploring further.

# Types

<a id="fishaudio.types.voices" />

# fishaudio.types.voices

Voice and model management types.

<a id="fishaudio.types.voices.Sample" />

## Sample Objects

```python theme={null}
class Sample(BaseModel)
```

A sample audio for a voice model.

**Attributes**:

* `title` - Title/name of the audio sample
* `text` - Transcription of the spoken content in the sample
* `task_id` - Unique identifier for the sample task
* `audio` - URL or path to the audio file

<a id="fishaudio.types.voices.Author" />

## Author Objects

```python theme={null}
class Author(BaseModel)
```

Voice model author information.

**Attributes**:

* `id` - Unique author identifier
* `nickname` - Author's display name
* `avatar` - URL to author's avatar image

<a id="fishaudio.types.voices.Voice" />

## Voice Objects

```python theme={null}
class Voice(BaseModel)
```

A voice model.

Represents a TTS voice that can be used for synthesis.

**Attributes**:

* `id` - Unique voice model identifier (use as reference\_id in TTS)
* `type` - Model type. Options: "svc" (singing voice conversion), "tts" (text-to-speech)
* `title` - Voice model title/name
* `description` - Detailed description of the voice model
* `cover_image` - URL to the voice model's cover image
* `train_mode` - Training mode used. Options: "fast"
* `state` - Current model state (e.g., "ready", "training", "failed")
* `tags` - List of tags for categorization (e.g., \["male", "english", "young"])
* `samples` - List of audio samples demonstrating the voice
* `created_at` - Timestamp when the model was created
* `updated_at` - Timestamp when the model was last updated
* `languages` - List of supported language codes (e.g., \["en", "zh"])
* `visibility` - Model visibility. Options: "public", "private", "unlisted"
* `lock_visibility` - Whether visibility setting is locked
* `like_count` - Number of likes the model has received
* `mark_count` - Number of bookmarks/favorites
* `shared_count` - Number of times the model has been shared
* `task_count` - Number of times the model has been used for generation
* `liked` - Whether the current user has liked this model. Default: False
* `marked` - Whether the current user has bookmarked this model. Default: False
* `author` - Information about the voice model's creator

<a id="fishaudio.types.tts" />

# fishaudio.types.tts

TTS-related types.

<a id="fishaudio.types.tts.ReferenceAudio" />

## ReferenceAudio Objects

```python theme={null}
class ReferenceAudio(BaseModel)
```

Reference audio for voice cloning/style.

**Attributes**:

* `audio` - Audio file bytes for the reference sample
* `text` - Transcription of what is spoken in the reference audio. Should match exactly
  what's spoken and include punctuation for proper prosody.

<a id="fishaudio.types.tts.Prosody" />

## Prosody Objects

```python theme={null}
class Prosody(BaseModel)
```

Speech prosody settings (speed and volume).

**Attributes**:

* `speed` - Speech speed multiplier. Range: 0.5-2.0. Default: 1.0.
* `Examples` - 1.5 = 50% faster, 0.8 = 20% slower
* `volume` - Volume adjustment in decibels. Range: -20.0 to 20.0. Default: 0.0 (no change).
  Positive values increase volume, negative values decrease it.

<a id="fishaudio.types.tts.Prosody.from_speed_override" />

#### from\_speed\_override

```python theme={null}
@classmethod
def from_speed_override(cls,
                        speed: float,
                        base: Optional["Prosody"] = None) -> "Prosody"
```

Create Prosody with speed override, preserving volume from base.

**Arguments**:

* `speed` - Speed value to use
* `base` - Base prosody to preserve volume from (if any)

**Returns**:

New Prosody instance with overridden speed

<a id="fishaudio.types.tts.TTSConfig" />

## TTSConfig Objects

```python theme={null}
class TTSConfig(BaseModel)
```

TTS generation configuration.

Reusable configuration for text-to-speech requests. Create once, use multiple times.
All parameters have sensible defaults.

**Attributes**:

* `format` - Audio output format. Options: "mp3", "wav", "pcm", "opus". Default: "mp3"
* `sample_rate` - Audio sample rate in Hz. If None, uses format-specific default.
* `mp3_bitrate` - MP3 bitrate in kbps. Options: 64, 128, 192. Default: 128
* `opus_bitrate` - Opus bitrate in kbps. Options: -1000, 24, 32, 48, 64. Default: 32
* `normalize` - Whether to normalize/clean the input text. Default: True
* `chunk_length` - Characters per generation chunk. Range: 100-300. Default: 200.
  Lower values = faster initial response, higher values = better quality
* `latency` - Generation mode. Options: "normal" (higher quality), "balanced" (faster). Default: "balanced"
* `reference_id` - Voice model ID from fish.audio (e.g., "802e3bc2b27e49c2995d23ef70e6ac89").
  Find IDs in voice URLs or via voices.list()
* `references` - List of reference audio samples for instant voice cloning. Default: \[]
* `prosody` - Speech speed and volume settings. Default: None (uses natural prosody)
* `top_p` - Nucleus sampling parameter for token selection. Range: 0.0-1.0. Default: 0.7
* `temperature` - Randomness in generation. Range: 0.0-1.0. Default: 0.7.
  Higher = more varied, lower = more consistent

<a id="fishaudio.types.tts.TTSRequest" />

## TTSRequest Objects

```python theme={null}
class TTSRequest(BaseModel)
```

Request parameters for text-to-speech generation.

This model is used internally for WebSocket streaming.
For the HTTP API, parameters are passed directly to methods.

**Attributes**:

* `text` - Text to synthesize into speech
* `chunk_length` - Characters per generation chunk. Range: 100-300. Default: 200
* `format` - Audio output format. Options: "mp3", "wav", "pcm", "opus". Default: "mp3"
* `sample_rate` - Audio sample rate in Hz. If None, uses format-specific default
* `mp3_bitrate` - MP3 bitrate in kbps. Options: 64, 128, 192. Default: 128
* `opus_bitrate` - Opus bitrate in kbps. Options: -1000, 24, 32, 48, 64. Default: 32
* `references` - List of reference audio samples for voice cloning. Default: \[]
* `reference_id` - Voice model ID for using a specific voice. Default: None
* `normalize` - Whether to normalize/clean the input text. Default: True
* `latency` - Generation mode. Options: "normal", "balanced". Default: "balanced"
* `prosody` - Speech speed and volume settings. Default: None
* `top_p` - Nucleus sampling for token selection. Range: 0.0-1.0. Default: 0.7
* `temperature` - Randomness in generation. Range: 0.0-1.0. Default: 0.7

<a id="fishaudio.types.tts.StartEvent" />

## StartEvent Objects

```python theme={null}
class StartEvent(BaseModel)
```

WebSocket start event to initiate TTS streaming.

**Attributes**:

* `event` - Event type identifier, always "start"
* `request` - TTS configuration for the streaming session

<a id="fishaudio.types.tts.TextEvent" />

## TextEvent Objects

```python theme={null}
class TextEvent(BaseModel)
```

WebSocket event to send a text chunk for synthesis.

**Attributes**:

* `event` - Event type identifier, always "text"
* `text` - Text chunk to synthesize

<a id="fishaudio.types.tts.FlushEvent" />

## FlushEvent Objects

```python theme={null}
class FlushEvent(BaseModel)
```

WebSocket event to force immediate audio generation from buffered text.

Use this to ensure all buffered text is synthesized without waiting for more input.

**Attributes**:

* `event` - Event type identifier, always "flush"

<a id="fishaudio.types.tts.CloseEvent" />

## CloseEvent Objects

```python theme={null}
class CloseEvent(BaseModel)
```

WebSocket event to end the streaming session.

**Attributes**:

* `event` - Event type identifier, always "stop"

<a id="fishaudio.types.account" />

# fishaudio.types.account

Account-related types (credits, packages, etc.).

<a id="fishaudio.types.account.Credits" />

## Credits Objects

```python theme={null}
class Credits(BaseModel)
```

User's API credit balance.

**Attributes**:

* `id` - Unique credits record identifier
* `user_id` - User identifier
* `credit` - Current credit balance (decimal for precise accounting)
* `created_at` - Timestamp when the credits record was created
* `updated_at` - Timestamp when the credits were last updated
* `has_phone_sha256` - Whether the user has a verified phone number. Optional
* `has_free_credit` - Whether the user has received free credits. Optional

<a id="fishaudio.types.account.Package" />

## Package Objects

```python theme={null}
class Package(BaseModel)
```

User's prepaid package information.

**Attributes**:

* `id` - Unique package identifier
* `user_id` - User identifier
* `type` - Package type identifier
* `total` - Total units in the package
* `balance` - Remaining units in the package
* `created_at` - Timestamp when the package was purchased
* `updated_at` - Timestamp when the package was last updated
* `finished_at` - Timestamp when the package was fully consumed. None if still active

<a id="fishaudio.types.asr" />

# fishaudio.types.asr

ASR (Automatic Speech Recognition) related types.

<a id="fishaudio.types.asr.ASRSegment" />

## ASRSegment Objects

```python theme={null}
class ASRSegment(BaseModel)
```

A timestamped segment of transcribed text.

**Attributes**:

* `text` - The transcribed text for this segment
* `start` - Segment start time in seconds
* `end` - Segment end time in seconds

<a id="fishaudio.types.asr.ASRResponse" />

## ASRResponse Objects

```python theme={null}
class ASRResponse(BaseModel)
```

Response from speech-to-text transcription.

**Attributes**:

* `text` - Complete transcription of the entire audio
* `duration` - Total audio duration in milliseconds
* `segments` - List of timestamped text segments. Empty if include\_timestamps=False

<a id="fishaudio.types.asr.ASRResponse.duration" />

#### duration

Duration in milliseconds

<a id="fishaudio.types.shared" />

# fishaudio.types.shared

Shared types used across the SDK.

<a id="fishaudio.types.shared.PaginatedResponse" />

## PaginatedResponse Objects

```python theme={null}
class PaginatedResponse(BaseModel, Generic[T])
```

Generic paginated response.

**Attributes**:

* `total` - Total number of items across all pages
* `items` - List of items on the current page
