# Docker Deployment

> Deploy Fish Audio models using Docker containers

Fish Audio provides Docker images for both WebUI and API server deployments. You can use pre-built images from Docker Hub or build custom images locally.

## Prerequisites

Before deploying with Docker, ensure you have:

* **Docker** and **Docker Compose** installed
* **NVIDIA Container Toolkit** (the NVIDIA Docker runtime, required for GPU support)
* At least **12 GB of GPU memory** for CUDA inference
* Downloaded model weights (see [Running Inference](/developer-guide/self-hosting/running-inference#download-weights))
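
The checklist above can be turned into a quick preflight script. This is a minimal sketch rather than part of Fish Audio's tooling: the helper names `require_cmd`, `vram_ok`, and `preflight` are hypothetical, and the 12000 MiB threshold is only an approximation of the 12 GB guideline.

```bash
# Preflight sketch: helper names and the MiB threshold are illustrative assumptions.

# Succeeds if the named command is on PATH; prints a message otherwise.
require_cmd() {
    command -v "$1" >/dev/null 2>&1 || { echo "missing: $1" >&2; return 1; }
}

# Succeeds if a total-memory reading in MiB meets the threshold (default ~12 GB).
vram_ok() {
    [ "$1" -ge "${2:-12000}" ]
}

# Run the checks that apply to this host and report the result.
preflight() {
    require_cmd docker || return 1
    docker compose version >/dev/null 2>&1 || { echo "missing: docker compose" >&2; return 1; }
    if command -v nvidia-smi >/dev/null 2>&1; then
        # Total memory of the first GPU, in MiB, without header or units
        mib=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)
        vram_ok "$mib" || { echo "GPU has only ${mib} MiB" >&2; return 1; }
    fi
    echo "preflight OK"
}
```

Calling `preflight` before any of the commands on this page catches a missing Docker or Compose installation early. On a host without `nvidia-smi` the GPU check is skipped, which matches a CPU-only deployment.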

## Pre-built Images

Fish Audio provides ready-to-use Docker images on Docker Hub:

| Image                                      | Description             | Best For                         |
| ------------------------------------------ | ----------------------- | -------------------------------- |
| `fishaudio/fish-speech:latest-webui-cuda`  | WebUI with CUDA support | Interactive development with GPU |
| `fishaudio/fish-speech:latest-webui-cpu`   | WebUI CPU-only          | Testing without GPU              |
| `fishaudio/fish-speech:latest-server-cuda` | API server with CUDA    | Production deployments with GPU  |
| `fishaudio/fish-speech:latest-server-cpu`  | API server CPU-only     | Low-traffic CPU deployments      |

<Note>
  For production use, we recommend using specific version tags instead of `latest` to ensure consistency across deployments.
</Note>
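
If you do not yet know which version tag to use, one way to freeze a deployment is to pin the image by its registry digest. The sketch below uses the standard `docker image inspect` command; the helper name `image_digest` is hypothetical, and it assumes the image has already been pulled.

```bash
# Print the registry digest reference (repo@sha256:...) of a pulled image.
# Assumes the image was pulled from a registry, so RepoDigests is populated.
image_digest() {
    docker image inspect --format '{{index .RepoDigests 0}}' "$1"
}

# Usage, after `docker pull fishaudio/fish-speech:latest-server-cuda`:
#   PINNED=$(image_digest fishaudio/fish-speech:latest-server-cuda)
#   echo "$PINNED"   # use this reference in place of the tag in `docker run`
```

A digest reference always resolves to the same image content, so every host that runs it gets identical bits even after `latest` moves on.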

## Quick Start with Docker Run

The fastest way to get started is using `docker run`:

### WebUI Deployment

```bash theme={null}
# Create directories for model weights and reference audio
mkdir -p checkpoints references

# Start WebUI with CUDA support (recommended)
docker run -d \
    --name fish-speech-webui \
    --gpus all \
    -p 7860:7860 \
    -v ./checkpoints:/app/checkpoints \
    -v ./references:/app/references \
    -e COMPILE=1 \
    fishaudio/fish-speech:latest-webui-cuda

# For CPU-only deployment
docker run -d \
    --name fish-speech-webui-cpu \
    -p 7860:7860 \
    -v ./checkpoints:/app/checkpoints \
    -v ./references:/app/references \
    fishaudio/fish-speech:latest-webui-cpu
```

Access the WebUI at `http://localhost:7860`.

### API Server Deployment

```bash theme={null}
# Start API server with CUDA support
docker run -d \
    --name fish-speech-server \
    --gpus all \
    -p 8080:8080 \
    -v ./checkpoints:/app/checkpoints \
    -v ./references:/app/references \
    -e COMPILE=1 \
    fishaudio/fish-speech:latest-server-cuda

# For CPU-only deployment
docker run -d \
    --name fish-speech-server-cpu \
    -p 8080:8080 \
    -v ./checkpoints:/app/checkpoints \
    -v ./references:/app/references \
    fishaudio/fish-speech:latest-server-cpu
```

Access the API documentation at `http://localhost:8080`.

<Tip>
  Set the `COMPILE=1` environment variable for \~10x faster inference on CUDA deployments. This uses `torch.compile` to optimize the model, so expect a longer warm-up before the first result while compilation runs.
</Tip>
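
Because a detached container returns immediately while the model is still loading, a script that calls the server right after `docker run` may fail. The following sketch polls a URL until it answers. The helper name `wait_for_http`, the 2-second interval, and the 300-second default are assumptions, and it presumes `curl` is installed and that the root URL responds once the service is up.

```bash
# Poll a URL until it answers or the time budget (in seconds) runs out.
wait_for_http() {
    url=$1
    remaining=${2:-300}
    while [ "$remaining" -gt 0 ]; do
        # -f: treat HTTP errors as failure, -s: quiet, -o: discard the body
        if curl -fs -o /dev/null "$url"; then
            echo "ready: $url"
            return 0
        fi
        sleep 2
        remaining=$((remaining - 2))
    done
    echo "timed out waiting for $url" >&2
    return 1
}
```

For example, `wait_for_http http://localhost:8080 600` blocks until the API server responds or ten minutes pass, which leaves room for the `COMPILE=1` warm-up.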

## Docker Compose Deployment

For development or customization, Docker Compose provides easier configuration management:

### Setup

```bash theme={null}
# Clone the repository
git clone https://github.com/fishaudio/fish-speech.git
cd fish-speech
```

### Start Services

```bash theme={null}
# Start WebUI with CUDA
docker compose --profile webui up

# Start WebUI with compile optimization
COMPILE=1 docker compose --profile webui up

# Start API server
docker compose --profile server up

# Start API server with compile optimization
COMPILE=1 docker compose --profile server up

# For CPU-only deployment
BACKEND=cpu docker compose --profile webui up
```

<Tip>
  Run containers in detached mode by adding the `-d` flag: `docker compose --profile webui up -d`
</Tip>

### Environment Variables

Customize deployment using environment variables or a `.env` file:

```bash theme={null}
# .env file example
BACKEND=cuda        # or cpu
COMPILE=1           # Enable compile optimization
GRADIO_PORT=7860    # WebUI port
API_PORT=8080       # API server port
UV_VERSION=0.8.15   # UV package manager version
```
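
A small check can catch a mistyped `.env` before Compose starts. This is a sketch under stated assumptions: `env_get` and `check_env` are hypothetical helpers, and the rules they enforce are only the ones this page states (`BACKEND` is `cuda` or `cpu`, and `COMPILE` is CUDA only).

```bash
# Read KEY from a .env-style file, dropping inline comments and trailing spaces.
env_get() {
    sed -n "s/^$1=\([^#]*\).*/\1/p" "${2:-.env}" | sed 's/[[:space:]]*$//' | tail -n 1
}

# Reject combinations that this page rules out.
check_env() {
    file=${1:-.env}
    backend=$(env_get BACKEND "$file")
    compile=$(env_get COMPILE "$file")
    case "$backend" in
        cuda|cpu|"") ;;   # an unset BACKEND is left to the Compose defaults
        *) echo "BACKEND must be cuda or cpu, got '$backend'" >&2; return 1 ;;
    esac
    if [ "$backend" = "cpu" ] && [ "$compile" = "1" ]; then
        echo "COMPILE=1 is documented as CUDA only" >&2
        return 1
    fi
}
```

A typical use is `check_env && docker compose --profile webui up -d`. To see the values Compose actually resolved, `docker compose --profile webui config` prints the rendered configuration.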

## Manual Docker Build

For advanced users who need custom configurations:

### Build WebUI Image

```bash theme={null}
# Build with CUDA support
docker build \
    --platform linux/amd64 \
    -f docker/Dockerfile \
    --build-arg BACKEND=cuda \
    --build-arg CUDA_VER=12.6.0 \
    --build-arg UV_EXTRA=cu126 \
    --target webui \
    -t fish-speech-webui:cuda .

# Build CPU-only for multiple platforms (requires Docker Buildx and an image
# store or registry that accepts multi-platform images)
docker build \
    --platform linux/amd64,linux/arm64 \
    -f docker/Dockerfile \
    --build-arg BACKEND=cpu \
    --target webui \
    -t fish-speech-webui:cpu .
```

### Build API Server Image

```bash theme={null}
# Build with CUDA support
docker build \
    --platform linux/amd64 \
    -f docker/Dockerfile \
    --build-arg BACKEND=cuda \
    --build-arg CUDA_VER=12.6.0 \
    --build-arg UV_EXTRA=cu126 \
    --target server \
    -t fish-speech-server:cuda .
```

### Build Development Image

```bash theme={null}
# Build development image with all tools
docker build \
    --platform linux/amd64 \
    -f docker/Dockerfile \
    --build-arg BACKEND=cuda \
    --target dev \
    -t fish-speech-dev:cuda .
```

### Build Arguments

| Argument     | Options                   | Default  | Description         |
| ------------ | ------------------------- | -------- | ------------------- |
| `BACKEND`    | `cuda`, `cpu`             | `cuda`   | Compute backend     |
| `CUDA_VER`   | `12.6.0`, etc.            | `12.6.0` | CUDA version        |
| `UV_EXTRA`   | `cu126`, `cu128`, `cu129` | `cu126`  | UV extra for CUDA   |
| `UBUNTU_VER` | `24.04`, etc.             | `24.04`  | Ubuntu base version |
| `PY_VER`     | `3.12`, etc.              | `3.12`   | Python version      |
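
The defaults in the table suggest that `UV_EXTRA` tracks the major and minor parts of `CUDA_VER` (`12.6.0` pairs with `cu126`). Assuming that pattern holds, a helper can keep the two arguments in sync. The function name `uv_extra_for_cuda` is hypothetical, and it only accepts the extras listed above.

```bash
# Derive the UV extra from a CUDA version, for example 12.8.1 -> cu128.
uv_extra_for_cuda() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # first component of the remainder
    extra="cu${major}${minor}"
    case "$extra" in
        cu126|cu128|cu129) echo "$extra" ;;
        *) echo "no UV extra listed for CUDA $1" >&2; return 1 ;;
    esac
}
```

In a build command this becomes `--build-arg CUDA_VER="$CUDA_VER" --build-arg UV_EXTRA="$(uv_extra_for_cuda "$CUDA_VER")"`. Whether a given `CUDA_VER` is buildable still depends on the Dockerfile and the base images it pulls.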

## Volume Mounts

Both the `docker run` and Docker Compose methods require these volume mounts:

| Host Path       | Container Path     | Purpose                                 |
| --------------- | ------------------ | --------------------------------------- |
| `./checkpoints` | `/app/checkpoints` | Model weights directory                 |
| `./references`  | `/app/references`  | Reference audio files for voice cloning |

<Warning>
  Ensure model weights are downloaded and placed in the `./checkpoints` directory before starting containers. See [Running Inference](/developer-guide/self-hosting/running-inference#download-weights) for download instructions.
</Warning>
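
The check below is one way to confirm the weights are in place before a container starts. It is a sketch: `weights_present` is a hypothetical helper, and the paths it looks for are the defaults listed under Model Configuration (`checkpoints/openaudio-s1-mini` and its `codec.pth`), so adjust them if you use a different model.

```bash
# Succeeds only if the default model directory and decoder file both exist.
weights_present() {
    dir=${1:-./checkpoints}
    [ -d "$dir/openaudio-s1-mini" ] && [ -f "$dir/openaudio-s1-mini/codec.pth" ]
}
```

For example, `weights_present || echo "download the model weights first"` gives a clear message instead of a container that exits on startup.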

## Environment Variables Reference

### WebUI Configuration

| Variable             | Default   | Description                  |
| -------------------- | --------- | ---------------------------- |
| `GRADIO_SERVER_NAME` | `0.0.0.0` | WebUI server host            |
| `GRADIO_SERVER_PORT` | `7860`    | WebUI server port            |
| `GRADIO_SHARE`       | `false`   | Enable Gradio public sharing |

### API Server Configuration

| Variable          | Default   | Description     |
| ----------------- | --------- | --------------- |
| `API_SERVER_NAME` | `0.0.0.0` | API server host |
| `API_SERVER_PORT` | `8080`    | API server port |

### Model Configuration

| Variable                  | Default                                   | Description                |
| ------------------------- | ----------------------------------------- | -------------------------- |
| `LLAMA_CHECKPOINT_PATH`   | `checkpoints/openaudio-s1-mini`           | Path to model weights      |
| `DECODER_CHECKPOINT_PATH` | `checkpoints/openaudio-s1-mini/codec.pth` | Path to decoder weights    |
| `DECODER_CONFIG_NAME`     | `modded_dac_vq`                           | Decoder configuration name |

### Performance Optimization

| Variable  | Default | Description                                        |
| --------- | ------- | -------------------------------------------------- |
| `COMPILE` | `0`     | Enable torch.compile for \~10x speedup (CUDA only) |
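
With `docker run`, several of these variables can be passed at once through Docker's `--env-file` flag instead of repeated `-e` options. The sketch below writes such a file. The file name `fish-speech.env` and the port `9090` are arbitrary examples, and it assumes the image reads these variables at startup as the tables describe.

```bash
# Write overrides to a file that `docker run --env-file` can read.
# The values are examples only; the keys come from the tables above.
cat > fish-speech.env <<'EOF'
# One KEY=value per line; values are taken literally, so do not quote them
API_SERVER_PORT=9090
LLAMA_CHECKPOINT_PATH=checkpoints/openaudio-s1-mini
DECODER_CHECKPOINT_PATH=checkpoints/openaudio-s1-mini/codec.pth
COMPILE=1
EOF
```

Add `--env-file fish-speech.env` to the `docker run` command and change the port mapping to match, for example `-p 9090:9090`.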

## Container Management

### View Logs

```bash theme={null}
# Docker run
docker logs fish-speech-webui

# Docker Compose
docker compose logs webui
```

### Stop Containers

```bash theme={null}
# Docker run
docker stop fish-speech-webui

# Docker Compose
docker compose down
```

### Update Images

```bash theme={null}
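# Pull latest images
docker pull fishaudio/fish-speech:latest-webui-cuda

# Docker run: remove the old container, then repeat the original `docker run` command
docker stop fish-speech-webui && docker rm fish-speech-webui

# Docker Compose: recreate containers with the new image
docker compose --profile webui up -d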
```

## GPU Support

### Prerequisites

Install NVIDIA Container Toolkit:

```bash theme={null}
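# Ubuntu/Debian: add NVIDIA's signed apt repository (apt-key is deprecated)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
    sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit

# Register the NVIDIA runtime with Docker, then restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker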
```

### Verify GPU Access

```bash theme={null}
docker run --rm --gpus all nvidia/cuda:12.6.0-base-ubuntu24.04 nvidia-smi
```

<Warning>
  GPU support requires the NVIDIA Container Toolkit. For CPU-only deployment, remove the `--gpus all` flag and use the CPU images.
</Warning>

## Troubleshooting

### Container Won't Start

Check logs for errors:

```bash theme={null}
docker logs fish-speech-webui
```

Common issues:

* Missing model weights in `./checkpoints`
* Port already in use (change port mapping)
* Insufficient GPU memory

### GPU Not Detected

Verify that the NVIDIA Container Toolkit is installed and that containers can see the GPU:

```bash theme={null}
docker run --rm --gpus all nvidia/cuda:12.6.0-base-ubuntu24.04 nvidia-smi
```

### Performance Issues

1. Enable compile optimization: `COMPILE=1`
2. Ensure GPU is being used (check with `nvidia-smi`)
3. Verify sufficient GPU memory is available
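
For the last step, `nvidia-smi` can report memory figures in a script-friendly form. The sketch below turns its used and total readings into free MiB per GPU. The helper name `free_mib` is hypothetical, and it assumes the comma-separated output produced by the query shown in the usage comment.

```bash
# Convert "used, total" MiB readings (one GPU per line) into free MiB per GPU.
free_mib() {
    awk -F', *' '{ print $2 - $1 }'
}

# Usage on a GPU host:
#   nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits | free_mib
```

If the free figure is well below the 12 GB guideline, another process is probably holding GPU memory, and stopping it is likely to help more than changing container settings.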

## Next Steps

* **[Run inference](/developer-guide/self-hosting/running-inference)** - Learn how to generate speech
* **[Download models](https://huggingface.co/fishaudio)** - Get pre-trained weights
* **[API documentation](/api-reference/introduction)** - Integrate with your applications
