
cloudtts — MOSS-TTS + Chatterbox TTS

Two open TTS engines in one image: MOSS-TTS (OpenMOSS) and Chatterbox TTS. Each engine runs in its own conda environment to avoid dependency conflicts.

Image tag

docker.io/manvarharsh/cloudtts:cuda12

What’s in this image

  • Base: nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04
  • MOSS-TTS in a Python 3.12 conda environment (with Gradio UI)
  • Chatterbox TTS in a Python 3.11 conda environment (REST API)
  • Flash Attention
  • OpenSSH server
  • No JupyterHub — interact via the TTS UIs or SSH

Default ports

Port   Service
22     SSH
7860   MOSS-TTS Gradio UI
8000   Chatterbox TTS API
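
To confirm which of the services listed above are actually listening once a container is up, a plain TCP connection test is usually enough. The sketch below uses only the Python standard library; the helper names `port_open` and `check_services` are illustrative and not part of the image, and the default host assumes the ports are published on the local machine.

```python
import socket

# Default ports as listed in this document.
DEFAULT_PORTS = {22: "SSH", 7860: "MOSS-TTS Gradio UI", 8000: "Chatterbox TTS API"}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str = "127.0.0.1", ports: dict = DEFAULT_PORTS) -> dict:
    """Map each service name to whether its port accepts TCP connections."""
    return {name: port_open(host, port) for port, name in ports.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name}: {'reachable' if up else 'not reachable'}")
```

A successful connection only shows that something is listening on the port; it does not verify that the TTS model has finished loading.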

Use cases

  • Voice cloning and zero-shot TTS via MOSS-TTS
  • Programmatic TTS via Chatterbox’s REST API
  • Comparing TTS engines side-by-side on the same hardware
  • Audio dataset generation for fine-tuning

Environment variables

Variable              Description
ENABLE_SSH            Enable SSH server
ENABLE_MOSS_TTS       Start MOSS-TTS Gradio UI on port 7860
ENABLE_CHATTERBOX     Start Chatterbox TTS server on port 8000
MOSS_TTS_PORT         Override MOSS-TTS port
CHATTERBOX_PORT       Override Chatterbox port
MOSS_TTS_EXTRA_ARGS   Extra CLI args for MOSS-TTS
SSH_PUBLIC_KEY        Public key for SSH
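
As a rough illustration of how these variables might be passed at launch, the sketch below assembles a `docker run` command line in Python. Several things here are assumptions rather than documented behavior: the value `"true"` for the `ENABLE_*` flags (the accepted values are not stated in this document), the idea that `CHATTERBOX_PORT` changes the port the service listens on inside the container, and the helper name `docker_run_command`. The `--gpus all` flag assumes the NVIDIA Container Toolkit is installed on the host.

```python
import shlex

IMAGE = "docker.io/manvarharsh/cloudtts:cuda12"


def docker_run_command(env: dict, ports: dict) -> list:
    """Build a `docker run` argv list.

    env   maps variable name -> value
    ports maps host port -> container port
    """
    cmd = ["docker", "run", "--rm", "--gpus", "all"]
    for host_port, container_port in ports.items():
        cmd += ["-p", f"{host_port}:{container_port}"]
    for name, value in env.items():
        cmd += ["-e", f"{name}={value}"]
    cmd.append(IMAGE)
    return cmd


# ASSUMPTION: "true" is a guess at the value the ENABLE_* flags accept.
env = {
    "ENABLE_MOSS_TTS": "true",
    "ENABLE_CHATTERBOX": "true",
    "CHATTERBOX_PORT": "9000",  # hypothetical override of the default 8000
}
# ASSUMPTION: the override changes the in-container port, so publish 9000.
ports = {7860: 7860, 9000: 9000}

print(shlex.join(docker_run_command(env, ports)))
```

Building the command as a list keeps quoting correct if a value such as `MOSS_TTS_EXTRA_ARGS` contains spaces; the list can be passed directly to `subprocess.run`.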

Persistence

Mount a persistent volume at /data. Voice samples go in /data/voices/ and generated audio is written to /data/output/.
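
One way to prepare a host directory for this mount is sketched below. It creates the `voices` and `output` subdirectories up front and returns the matching `-v` argument for `docker run`. Whether the image creates these subdirectories on its own is not stated here, so pre-creating them is a precaution; the helper name `prepare_data_dir` and the example path are hypothetical.

```python
from pathlib import Path


def prepare_data_dir(host_dir: str) -> list:
    """Create voices/ and output/ under host_dir and return a docker -v argument."""
    root = Path(host_dir).expanduser().resolve()
    for sub in ("voices", "output"):
        # exist_ok=True makes the call safe to repeat on an existing layout.
        (root / sub).mkdir(parents=True, exist_ok=True)
    # Bind-mount the host directory at /data inside the container.
    return ["-v", f"{root}:/data"]
```

For example, `prepare_data_dir("~/cloudtts-data")` returns a two-element list that can be spliced into a `docker run` argument list before the image name.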
