ElevenLabs TTS API


Complete audio AI platform. Text-to-speech, multi-voice dialogue, sound effects, vocal isolation, and speech-to-text.


Documentation

Text *

Text to convert to speech

Voice *

Voice ID to use

Model

TTS model

Output Format

Audio output format

Language Code

ISO 639-1 language code (auto for auto-detect)

Seed

Seed for deterministic output (0-4294967295)

Voice Stability

Voice stability (0-1). Lower = more emotional range

Similarity Boost

How closely to match the original voice (0-1)

Style

Amplifies speaker style (0-1). Increases latency

Speed

Speech speed multiplier (1.0 = normal)

Speaker Boost

Boost similarity to original voice. Slightly increases latency
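The fields above can be collected into a request body before sending. The following is a minimal sketch, not the service's documented schema: the function name and the key names text, voice_id, and use_speaker_boost are assumptions for illustration, while the numeric ranges come from the field hints above.

```python
from typing import Any, Optional


def _unit_range(name: str, value: float) -> float:
    """Check that a voice setting lies in the 0-1 range given in the form hints."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return value


def build_tts_payload(
    text: str,
    voice_id: str,
    model_id: Optional[str] = None,
    output_format: Optional[str] = None,
    language_code: Optional[str] = None,  # ISO 639-1 code, or "auto"
    seed: Optional[int] = None,
    stability: Optional[float] = None,
    similarity_boost: Optional[float] = None,
    style: Optional[float] = None,
    speed: Optional[float] = None,
    speaker_boost: Optional[bool] = None,
) -> dict[str, Any]:
    """Assemble a request body; key names are hypothetical, not a confirmed schema."""
    if not text:
        raise ValueError("text is required")
    if not voice_id:
        raise ValueError("voice_id is required")

    payload: dict[str, Any] = {"text": text, "voice_id": voice_id}
    if model_id is not None:
        payload["model_id"] = model_id
    if output_format is not None:
        payload["output_format"] = output_format
    if language_code is not None:
        payload["language_code"] = language_code
    if seed is not None:
        if not 0 <= seed <= 4294967295:
            raise ValueError("seed must be between 0 and 4294967295")
        payload["seed"] = seed

    # Only include the voice settings the caller actually set.
    settings: dict[str, Any] = {}
    if stability is not None:
        settings["stability"] = _unit_range("stability", stability)
    if similarity_boost is not None:
        settings["similarity_boost"] = _unit_range("similarity_boost", similarity_boost)
    if style is not None:
        settings["style"] = _unit_range("style", style)
    if speed is not None:
        if speed <= 0:
            raise ValueError("speed must be a positive multiplier")
        settings["speed"] = speed
    if speaker_boost is not None:
        settings["use_speaker_boost"] = speaker_boost  # assumed key name
    if settings:
        payload["voice_settings"] = settings
    return payload
```

Validating ranges on the client side catches an out-of-range seed or stability value before a metered request is made.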


Features

What ElevenLabs TTS API offers

Text input plus a required ElevenLabs voice ID chosen from the live voice list

Model id choices: Flash v2.5, Turbo v2.5, Multilingual v2, and V3 in the playground

Wide output_format list: MP3, Opus, PCM, WAV, and telephony ulaw or alaw

Optional language_code (ISO 639-1), or auto when you want language detection

Optional seed (0 to 4294967295) for repeatable renders

Voice settings object: stability, similarity_boost, style, speed, speaker boost

Asynchronous generation followed by status polling, through endpoints on Unifically

Metered per character; Flash v2.5 and Turbo v2.5 list at $0.000092 per character in public pricing

Use cases

Built for


Narration for ads, explainers, and product tours

Audiobook and long-form chapter audio from manuscript text

IVR and telephony when you pick ulaw or alaw output formats

Game and app dialogue lines with stable seeds for retakes

Localized releases using multilingual or V3 models where supported

Assistive playback of UI strings with consistent voice settings

FAQ

About ElevenLabs TTS API

It wraps ElevenLabs text-to-speech. You post the text and a voice ID, choose a model_id and output_format, optionally pass voice_settings, then poll until the audio is ready.
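The poll-until-ready step can be sketched as a small loop. This is an illustration under stated assumptions: the status values "completed" and "failed" and the response fields are hypothetical, and fetch_status stands in for whatever HTTP call retrieves the task status.

```python
import time
from typing import Any, Callable, Mapping


def wait_for_audio(
    fetch_status: Callable[[], Mapping[str, Any]],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Mapping[str, Any]:
    """Call fetch_status repeatedly until the task finishes, fails, or times out.

    fetch_status is a caller-supplied function returning the latest status
    document. The "status" key and its values are assumed for illustration.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        state = status.get("status")
        if state == "completed":
            return status
        if state == "failed":
            raise RuntimeError(f"generation failed: {status.get('error')}")
        if clock() >= deadline:
            raise TimeoutError("audio was not ready before the timeout")
        sleep(interval_s)
```

Passing sleep and clock as parameters keeps the loop easy to exercise with a fake status source, without real waiting.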

The available model ids are eleven_flash_v2_5 (~75 ms latency, 32 languages), eleven_turbo_v2_5 (~250 ms latency, 32 languages), eleven_multilingual_v2 (29 languages), and eleven_v3 (70+ languages, the most expressive).
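One way to use these figures is a small lookup that filters model ids by a latency budget. This is a sketch only: the numbers are copied from the answer above, latency is recorded as unknown where the page gives none, and the helper name is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelInfo:
    model_id: str
    languages: int              # language count as listed on the page
    latency_ms: Optional[int]   # None where the page states no figure


MODELS = (
    ModelInfo("eleven_flash_v2_5", 32, 75),
    ModelInfo("eleven_turbo_v2_5", 32, 250),
    ModelInfo("eleven_multilingual_v2", 29, None),
    ModelInfo("eleven_v3", 70, None),  # listed as "70 plus" languages
)


def models_within_latency(budget_ms: int) -> list[str]:
    """Return model ids whose listed latency fits the budget, fastest first.

    Models without a listed latency are skipped rather than guessed at.
    """
    known = [m for m in MODELS if m.latency_ms is not None]
    fitting = [m for m in known if m.latency_ms <= budget_ms]
    return [m.model_id for m in sorted(fitting, key=lambda m: m.latency_ms)]
```

For example, a 100 ms budget leaves only eleven_flash_v2_5, while a 300 ms budget also admits eleven_turbo_v2_5.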

The site pricing file lists Flash v2.5 and Turbo v2.5 at $0.000092 per character. It does not publish separate character prices for Multilingual v2 or V3, so check live usage after runs that use those model ids.
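A rough cost estimate follows from the per-character price. The sketch below assumes the billed character count equals the Python string length, which may differ from how the service actually counts; it returns None for models with no published price instead of guessing.

```python
from decimal import Decimal
from typing import Optional

# Per-character prices in USD, as listed for these two model ids.
PRICE_PER_CHAR = {
    "eleven_flash_v2_5": Decimal("0.000092"),
    "eleven_turbo_v2_5": Decimal("0.000092"),
}


def estimate_cost(text: str, model_id: str) -> Optional[Decimal]:
    """Estimate the cost of one request, or None if no price is published.

    Assumes one billed character per element of len(text).
    """
    price = PRICE_PER_CHAR.get(model_id)
    if price is None:
        return None
    return price * len(text)
```

Decimal is used here instead of float so that small per-character prices add up without rounding drift.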

Set output_format to the codec, sample rate, and bitrate you need. Higher-bitrate MP3 or WAV gives better fidelity at the cost of larger files. Telephony formats target 8 kHz voice circuits.
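If the output_format values follow a codec_samplerate_bitrate naming pattern such as mp3_44100_128 or ulaw_8000 (an assumption here; check the format list in the playground), they can be split into their parts like this. The helper and type names are hypothetical.

```python
from typing import NamedTuple, Optional


class AudioFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None when the name carries no bitrate


def parse_output_format(value: str) -> AudioFormat:
    """Split an assumed 'codec_rate[_bitrate]' string into its parts."""
    parts = value.split("_")
    if len(parts) not in (2, 3) or not parts[0]:
        raise ValueError(f"unrecognized output_format: {value!r}")
    if not all(p.isdigit() for p in parts[1:]):
        raise ValueError(f"unrecognized output_format: {value!r}")
    bitrate = int(parts[2]) if len(parts) == 3 else None
    return AudioFormat(parts[0], int(parts[1]), bitrate)


def is_telephony(fmt: AudioFormat) -> bool:
    """True for ulaw or alaw at 8 kHz, the telephony case described above."""
    return fmt.codec in ("ulaw", "alaw") and fmt.sample_rate_hz == 8000
```

A check like is_telephony can guard an IVR pipeline against accidentally requesting a wideband format.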