
ElevenLabs TTS API

Text to Speech, Sound Effects, Music, Speech to Text

Complete audio AI platform. Text-to-speech, multi-voice dialogue, sound effects, vocal isolation, and speech-to-text.


Features

What ElevenLabs TTS API offers

Text input with required ElevenLabs voice id from the live voice list
Model id choices: Flash v2.5, Turbo v2.5, Multilingual v2, and V3 in the playground
Wide output_format list: MP3, Opus, PCM, WAV, and telephony ulaw or alaw
Optional language_code (ISO 639-1), or auto when you want detection
Optional seed for repeatable renders in the supported range
Voice settings object: stability, similarity_boost, style, speed, speaker boost
Async generate then status polling endpoints on Unifically
Metered per character; Flash v2.5 and Turbo v2.5 list at $0.000092 per character in public pricing
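
The request fields listed above can be sketched as a small payload builder. This is a minimal illustration, not the documented request schema: the key names voice_id and use_speaker_boost, the flat JSON layout, and the default output_format string are assumptions, so check the live API reference before relying on them.

```python
from typing import Any, Dict, Optional


def build_tts_payload(
    text: str,
    voice_id: str,
    model_id: str = "eleven_flash_v2_5",
    output_format: str = "mp3_44100_128",  # assumed format string
    language_code: Optional[str] = None,
    seed: Optional[int] = None,
    voice_settings: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a request body, leaving out optional fields that were not set."""
    if not text:
        raise ValueError("text is required")
    if not voice_id:
        raise ValueError("voice_id is required")

    payload: Dict[str, Any] = {
        "text": text,
        "voice_id": voice_id,  # assumed key name
        "model_id": model_id,
        "output_format": output_format,
    }
    if language_code is not None:
        payload["language_code"] = language_code
    if seed is not None:
        payload["seed"] = seed
    if voice_settings:
        # Copy so later changes by the caller do not alter the payload.
        payload["voice_settings"] = dict(voice_settings)
    return payload


example = build_tts_payload(
    "Welcome to the product tour.",
    voice_id="YOUR_VOICE_ID",  # placeholder, take a real id from the voice list
    seed=7,
    voice_settings={
        "stability": 0.5,
        "similarity_boost": 0.75,
        "style": 0.0,
        "speed": 1.0,
        "use_speaker_boost": True,  # assumed key name for speaker boost
    },
)
```

Omitting unset optional fields keeps the body small and leaves the service defaults in effect for anything you did not choose explicitly.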

Use cases

Built for

Primary

Narration for ads, explainers, and product tours

#2

Audiobook and long form chapter audio from manuscript text

#3

IVR and telephony when you pick ulaw or alaw output formats

#4

Game and app dialogue lines with stable seeds for retakes

#5

Localized releases using multilingual or V3 models where supported

#6

Assistive playback of UI strings with consistent voice settings

FAQ

About ElevenLabs TTS API

It wraps ElevenLabs text to speech. You post text and voice id, choose model_id and output_format, optionally voice_settings, then poll until audio is ready.
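
The generate-then-poll flow might look like the sketch below. It deliberately leaves out the HTTP calls: fetch_status is a hypothetical callable you would implement against the real status endpoint, and the status values "completed" and "failed" are assumptions, not documented names.

```python
import time
from typing import Any, Callable, Mapping


def poll_until_done(
    fetch_status: Callable[[], Mapping[str, Any]],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Mapping[str, Any]:
    """Call fetch_status repeatedly until the task finishes, fails, or times out."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        state = status.get("status")
        if state == "completed":  # assumed terminal success value
            return status
        if state == "failed":  # assumed terminal failure value
            raise RuntimeError(f"generation failed: {status}")
        if clock() >= deadline:
            raise TimeoutError("audio was not ready before the timeout")
        sleep(interval_s)
```

Passing sleep and clock as parameters lets you test the loop without real waiting, and a fixed interval is the simplest choice; a backoff schedule would be a reasonable variation for long renders.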

The available model ids are eleven_flash_v2_5 (~75 ms, 32 languages), eleven_turbo_v2_5 (~250 ms, 32 languages), eleven_multilingual_v2 (29 languages), and eleven_v3 (70+ languages, most expressive).

The site pricing file lists Flash v2.5 and Turbo v2.5 at $0.000092 per character. It does not publish separate character prices for Multilingual v2 or V3, so check live usage after runs that use those model ids.
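
For budgeting, the listed Flash v2.5 and Turbo v2.5 rate can be turned into a rough estimate. This sketch assumes every character in the string is billed, including spaces and punctuation; the actual metering rules may differ, so treat the result as an approximation.

```python
from decimal import Decimal

# Listed per-character price for Flash v2.5 and Turbo v2.5.
PRICE_PER_CHAR = Decimal("0.000092")


def estimate_cost(text: str, price_per_char: Decimal = PRICE_PER_CHAR) -> Decimal:
    """Estimate cost in dollars, assuming each character is billed once."""
    return price_per_char * len(text)


# A 1,000-character script comes to about $0.092 at the listed rate.
print(estimate_cost("a" * 1000))
```

Decimal is used instead of float so that small per-character prices add up without rounding drift.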

Set output_format to the codec, sample rate, and bitrate you need. Higher-bitrate MP3 or WAV gives better fidelity at the cost of larger files. Telephony formats target 8 kHz voice circuits.
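
The size side of that trade-off is simple arithmetic, sketched below. The helper names are illustrative, the figures ignore container headers, and the specific bitrates and sample rates are example values rather than a list of what the API offers.

```python
def compressed_bytes(bitrate_kbps: int, seconds: int) -> int:
    """Approximate size of constant-bitrate audio such as MP3 or Opus."""
    return bitrate_kbps * 1000 * seconds // 8


def pcm_bytes(sample_rate_hz: int, bits_per_sample: int, seconds: int, channels: int = 1) -> int:
    """Size of uncompressed samples; also covers 8-bit ulaw and alaw."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8


one_minute = 60
sizes = {
    "mp3 at 128 kbps": compressed_bytes(128, one_minute),        # 960,000 bytes
    "16-bit PCM at 44.1 kHz": pcm_bytes(44100, 16, one_minute),  # 5,292,000 bytes
    "8-bit ulaw at 8 kHz": pcm_bytes(8000, 8, one_minute),       # 480,000 bytes
}
```

Telephony formats come out smallest here because they carry one 8-bit sample 8,000 times per second, which is 64 kbit/s.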