
Prerequisites

Sign up for a free Fish Audio account to get started with our API.
  1. Go to fish.audio/auth/signup
  2. Fill in your details to create an account, then complete the verification steps
  3. Log in to your account and navigate to the API section
Once you have an account, you’ll need an API key to authenticate your requests.
  1. Log in to your Fish Audio Dashboard
  2. Navigate to the API Keys section
  3. Click “Create New Key”, give it a descriptive name, and set an expiration if desired
  4. Copy your key and store it securely
Keep your API key secret! Never commit it to version control or share it publicly.
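One common way to keep the key out of source files is to read it from an environment variable at startup. The sketch below is a minimal helper for that; the variable name `FISH_API_KEY` and the function `load_api_key` are assumptions for illustration, so check the SDK documentation for the name your client version actually reads.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ,
                 name: str = "FISH_API_KEY") -> str:
    """Return the API key stored in `env[name]`, or fail with a clear message.

    `FISH_API_KEY` is an assumed variable name, not one confirmed by this guide.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {name} environment variable before creating a client."
        )
    return key
```

The examples in this guide call `FishAudio()` with no arguments, which suggests the client picks the key up from the environment on its own; a check like this simply fails early with a readable error when the variable is missing.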

Basic Transcription

Transcribe audio files to text with automatic language detection using asr.transcribe():
from fishaudio import FishAudio

client = FishAudio()

# Transcribe audio
with open("audio.mp3", "rb") as f:
    result = client.asr.transcribe(audio=f.read())

print(f"Transcription: {result.text}")
print(f"Duration: {result.duration}ms")
The returned ASRResponse object holds the full transcription (result.text), the audio duration (result.duration), and per-segment details (result.segments).

Language Specification

Specify the language for more accurate transcription:
from fishaudio import FishAudio

client = FishAudio()

# Specify language code
with open("chinese_audio.mp3", "rb") as f:
    result = client.asr.transcribe(
        audio=f.read(),
        language="zh"  # Chinese
    )

print(result.text)
Auto-detection works well for most cases, but specifying the language can improve accuracy, especially for languages with similar phonetics.
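If some of your files have a known language and others do not, a small wrapper can pass `language` only when you supply one and otherwise leave detection to the service. This is a sketch: `transcribe_file` is a hypothetical helper, and it relies only on the `client.asr.transcribe(audio=..., language=...)` call shown above.

```python
from pathlib import Path
from typing import Any, Optional


def transcribe_file(client: Any, path: str, language: Optional[str] = None) -> Any:
    """Transcribe the file at `path`, passing `language` only when one is given."""
    kwargs = {"audio": Path(path).read_bytes()}
    if language is not None:
        # Leaving the key out entirely keeps automatic detection in charge.
        kwargs["language"] = language
    return client.asr.transcribe(**kwargs)
```

For example, `transcribe_file(client, "chinese_audio.mp3", language="zh")` pins the language, while `transcribe_file(client, "audio.mp3")` falls back to auto-detection.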

Segment Timestamps

Access word-level or phrase-level timestamps through result.segments:
from fishaudio import FishAudio

client = FishAudio()

# Transcribe with segments
with open("audio.mp3", "rb") as f:
    result = client.asr.transcribe(audio=f.read())

# Access full text
print(f"Full text: {result.text}")

# Iterate through segments
for segment in result.segments:
    print(f"[{segment.start}ms - {segment.end}ms]: {segment.text}")
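Segment timestamps are often used to build subtitles. The sketch below turns a list of segments into SRT text. It assumes that `start` and `end` are in milliseconds, as the print statement above suggests; the `Segment` dataclass is a hypothetical stand-in so the example runs without the SDK, and any object with `text`, `start`, and `end` attributes works in its place.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """Hypothetical stand-in for the SDK's segment objects."""
    text: str
    start: float  # assumed to be milliseconds
    end: float    # assumed to be milliseconds


def format_timestamp(ms: float) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(ms))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def segments_to_srt(segments: Iterable) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)
```

With a real response you would call `segments_to_srt(result.segments)` and write the returned string to a `.srt` file. If your SDK version reports timestamps in seconds, multiply by 1000 before formatting.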

Next Steps