WSS wss://api.fish.audio/v1/tts/live
Messages
bearerAuth
type:http

API key authentication using Bearer token.

Get your API key from https://fish.audio/app/api-keys

Pass the token in the Authorization header: Authorization: Bearer YOUR_API_KEY

headers
type:object
model
type:string

TTS model to use for this session
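
As an illustration, the two headers described above could be assembled as follows. This is a minimal sketch: the helper name `build_headers` and the placeholder values `YOUR_API_KEY` and `YOUR_MODEL` are hypothetical, and it assumes the `model` header is sent on the same handshake request as `Authorization`.

```python
def build_headers(api_key: str, model: str) -> dict[str, str]:
    """Return the headers for the WebSocket handshake (hypothetical helper)."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    return {
        # Bearer token scheme, as described under bearerAuth.
        "Authorization": f"Bearer {api_key}",
        # TTS model to use for this session.
        "model": model,
    }


# Placeholder values; substitute a real key and model name.
headers = build_headers("YOUR_API_KEY", "YOUR_MODEL")
print(sorted(headers))
```

How the dictionary is attached to the connection depends on the WebSocket client library in use.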

Start TTS Session
type:object

Initiates a TTS streaming session with configuration.

This must be the first message sent after connecting. It contains all the configuration for voice, audio format, and generation parameters.
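
A start message might be built like the sketch below. Only `chunk_length` is named on this page; the `event` discriminator, the nested `request` object, and the `reference_id` and `format` keys are assumptions about the message schema, and `make_start_event` is a hypothetical helper.

```python
from typing import Any


def make_start_event(voice_id: str, audio_format: str, chunk_length: int) -> dict[str, Any]:
    """Build the first message of a session (assumed field layout)."""
    if chunk_length <= 0:
        raise ValueError("chunk_length must be positive")
    return {
        "event": "start",  # assumed discriminator value
        "request": {  # assumed container for the session configuration
            "reference_id": voice_id,  # assumed key for the voice
            "format": audio_format,  # assumed key for the audio format
            "chunk_length": chunk_length,  # parameter named in the docs
        },
    }


# Illustrative values only.
start = make_start_event("YOUR_VOICE_ID", "mp3", 200)
print(start["event"])
```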

Send Text Chunk
type:object

Sends a chunk of text for synthesis.

You can send multiple TextEvent messages in sequence. The server will buffer and synthesize text according to the chunk_length parameter from StartEvent.

Flush Buffered Text
type:object

Forces immediate synthesis of all buffered text.

Use this when you want audio generated immediately without waiting for more text or for the buffer to fill up. Useful for ensuring low latency in interactive applications.
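
One way to combine text chunks with flushes is to flush whenever a chunk ends a sentence, so audio for that sentence is not held back waiting for more text. The sketch below assumes messages shaped as `{"event": "text", "text": ...}` and `{"event": "flush"}`; those shapes and the helper `text_messages` are hypothetical, and the sentence-boundary rule is just one possible policy.

```python
from typing import Any, Iterable, Iterator


def text_messages(pieces: Iterable[str], flush_after: str = ".!?") -> Iterator[dict[str, Any]]:
    """Yield one text message per piece, plus a flush after sentence ends."""
    for piece in pieces:
        if not piece:
            continue  # nothing to synthesize
        yield {"event": "text", "text": piece}
        last = piece.rstrip()[-1:]  # last non-space character, or "" if none
        if last and last in flush_after:
            yield {"event": "flush"}  # ask for immediate synthesis


for message in text_messages(["Hello, ", "world. ", "More text"]):
    print(message)
```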

End TTS Session
type:object

Signals the end of the text stream.

After sending this event, the server will finish synthesizing any remaining buffered text and send a FinishEvent before closing the connection.
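
The ordering rules above (start first, then text and flush messages, then the end event) can be enforced on the client with a small guard. This is a sketch under assumptions: the event names `start`, `text`, `flush` and `stop` are hypothetical labels for the four client messages, and rejecting messages after the end event is a client-side choice rather than documented server behavior.

```python
class SessionOrder:
    """Tracks which client events are valid next (hypothetical helper)."""

    def __init__(self) -> None:
        self.state = "new"  # new -> started -> stopped

    def check(self, event: str) -> None:
        """Raise RuntimeError if `event` is not allowed in the current state."""
        if self.state == "new":
            if event != "start":
                raise RuntimeError("the first message must be the start event")
            self.state = "started"
        elif self.state == "started":
            if event not in ("text", "flush", "stop"):
                raise RuntimeError(f"unexpected event after start: {event!r}")
            if event == "stop":
                self.state = "stopped"
        else:
            raise RuntimeError("nothing should be sent after the end event")


order = SessionOrder()
for name in ["start", "text", "flush", "text", "stop"]:
    order.check(name)
print(order.state)
```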

Audio Chunk
type:object

Contains generated audio bytes.

You will receive multiple AudioEvent messages as audio is generated. Each message contains a chunk of audio in the format you specified. Concatenate all chunks to get the complete audio.
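
Concatenating the chunks can be as simple as appending each payload to a buffer in arrival order. The sketch assumes already-decoded messages of the form `{"event": "audio", "audio": <bytes>}`; the key names and the helper `join_audio` are hypothetical.

```python
from typing import Any, Iterable, Mapping


def join_audio(messages: Iterable[Mapping[str, Any]]) -> bytes:
    """Concatenate the payloads of all audio messages, in order."""
    buffer = bytearray()
    for message in messages:
        if message.get("event") == "audio":  # skip non-audio messages
            buffer.extend(message["audio"])
    return bytes(buffer)


received = [
    {"event": "audio", "audio": b"\x01\x02"},
    {"event": "audio", "audio": b"\x03"},
]
print(len(join_audio(received)))
```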

Session Complete
type:object

Signals that the TTS session has completed.

  • If reason='stop', synthesis completed successfully
  • If reason='error', an error occurred and the client should handle it gracefully

The WebSocket connection will close after this event.
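
A client might interpret the final message along these lines. The `reason` values come from the list above; the `event` key, its `finish` value, and the helper `finish_ok` are assumptions made for illustration.

```python
from typing import Any, Mapping


def finish_ok(message: Mapping[str, Any]) -> bool:
    """Return True if the session ended with reason 'stop', False on 'error'."""
    if message.get("event") != "finish":  # assumed discriminator value
        raise ValueError("not a finish message")
    reason = message.get("reason")
    if reason == "stop":
        return True  # synthesis completed successfully
    if reason == "error":
        return False  # caller decides how to recover
    raise ValueError(f"unexpected finish reason: {reason!r}")


print(finish_ok({"event": "finish", "reason": "stop"}))
```

When the result is False, the audio received so far may be incomplete, so a client could discard it or retry the request, depending on the application.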

The WebSocket TTS endpoint enables bidirectional streaming for low-latency text-to-speech generation with MessagePack serialization.
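
To make the MessagePack framing concrete, here is a dependency-free sketch of an encoder for the small subset of types such messages would need. In practice a maintained library such as `msgpack` would normally be used instead; the hand-written `pack` function exists only to keep the example self-contained. The message shape `{"event": "text", "text": ...}` is an assumption, as is sending each encoded message as one binary WebSocket frame.

```python
import struct


def pack(obj) -> bytes:
    """Encode None, bool, unsigned ints, str, bytes and small dicts as MessagePack."""
    if obj is None:
        return b"\xc0"  # nil
    if obj is True:
        return b"\xc3"
    if obj is False:
        return b"\xc2"
    if isinstance(obj, int):
        if 0 <= obj <= 0x7F:
            return bytes([obj])  # positive fixint
        if 0 <= obj <= 0xFF:
            return b"\xcc" + struct.pack(">B", obj)  # uint 8
        if 0 <= obj <= 0xFFFF:
            return b"\xcd" + struct.pack(">H", obj)  # uint 16
        if 0 <= obj <= 0xFFFFFFFF:
            return b"\xce" + struct.pack(">I", obj)  # uint 32
        raise ValueError("integer out of range for this sketch")
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        if len(data) <= 31:
            return bytes([0xA0 | len(data)]) + data  # fixstr
        if len(data) <= 0xFF:
            return b"\xd9" + struct.pack(">B", len(data)) + data  # str 8
        if len(data) <= 0xFFFF:
            return b"\xda" + struct.pack(">H", len(data)) + data  # str 16
        raise ValueError("string too long for this sketch")
    if isinstance(obj, (bytes, bytearray)):
        data = bytes(obj)
        if len(data) <= 0xFF:
            return b"\xc4" + struct.pack(">B", len(data)) + data  # bin 8
        if len(data) <= 0xFFFF:
            return b"\xc5" + struct.pack(">H", len(data)) + data  # bin 16
        return b"\xc6" + struct.pack(">I", len(data)) + data  # bin 32
    if isinstance(obj, dict):
        if len(obj) > 15:
            raise ValueError("only maps with up to 15 entries are supported")
        out = bytes([0x80 | len(obj)])  # fixmap header
        for key, value in obj.items():
            out += pack(key) + pack(value)
        return out
    raise TypeError(f"unsupported type: {type(obj).__name__}")


frame = pack({"event": "text", "text": "hi"})  # assumed message shape
print(frame.hex())
```

Floats, negative integers, arrays and larger maps are deliberately left out, so this encoder is not a substitute for a full MessagePack implementation.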