Browse and try 69 model families from 13 providers. Open any card to use sub-models in the playground.

Per-variant pricing for every sub-model is on the pricing page.

Coming soon: 30s single-pass videos, 50 mixed references, native 4K 10-bit, region editing.

SeeDance 2.5

ByteDance

Video

SeeDance 2.5 pricing

1M context, 128k output, 88.8% Terminal-Bench 2.1, 80 Coding Agent Index, ultra subagent mode.

GPT 5.6 Sol

OpenAI

Text

GPT 5.6 Sol pricing

1M context, 128k output, 77.4 Coding Agent Index, beats Fable 5 on Agents' Last Exam.

GPT 5.6 Terra

OpenAI

Text

GPT 5.6 Terra pricing

Fastest GPT 5.6, 1M context, beats Fable 5 on Agents' Last Exam, ahead of Opus 4.8 on coding.

GPT 5.6 Luna

OpenAI

Text

GPT 5.6 Luna pricing

Up to 10 reference images, 8 aspect ratios, 1K/1.5K/2K output, one image per call.

SeeDream 5.0 Pro

ByteDance

Image

SeeDream 5.0 Pro pricing

1M context, 128k output, text and image input, adaptive thinking, top Arena Text and Vision rank.

Claude Fable 5

Anthropic

Text

Claude Fable 5 pricing

200k context, 64k output, 77.2% SWE-bench Verified, 61.4% OSWorld, extended thinking.

Claude Sonnet 4.5

Anthropic

Text

Claude Sonnet 4.5 pricing

200k context, 64k output, 80.9% SWE-bench Verified, effort controls from low to high.

Claude Opus 4.5

Anthropic

Text

Claude Opus 4.5 pricing

200k context, 64k output, 73.3% SWE-bench Verified, 50.7% OSWorld, extended thinking.

Claude Haiku 4.5

Anthropic

Text

Claude Haiku 4.5 pricing

Agentic coding, 1M context, 128k output, image input, 85.2% SWE-bench Verified.

Claude Sonnet 5

Anthropic

Text

Claude Sonnet 5 pricing

Sub-2s generation, 14 aspect ratios, 1K output, up to 14 reference images.

Nano Banana 2 Lite

Google

Image

Nano Banana 2 Lite pricing

Up to 9 ref images, 720p or 1080p, 3–15s, native audio and lip-sync in one pass.

HappyHorse 1.1

Alibaba

Video

HappyHorse 1.1 pricing

T2V or I2V, 3–15s at 720p Standard or 1080p Pro, no multi-shot or native audio.

Kling 3.0 Turbo

Kling

Video

Kling 3.0 Turbo pricing

1M context, 128k output, 88.6% SWE-bench Verified, 83.4% OSWorld, adaptive thinking.

Claude Opus 4.8

Anthropic

Text

Claude Opus 4.8 pricing

Text, image, and character references with voice presets; 4–10s videos plus conversational editing.

Gemini Omni Flash

Google

Video

Gemini Omni Flash pricing

Agentic coding, 79.8% SWE-Bench Multilingual, 69.3% Terminal-Bench 2.0, tuned for long agent runs.

Composer 2.5

Cursor

Text

Composer 2.5 pricing

Low-latency default variant of Composer 2.5, same intelligence, 79.8% SWE-bench Multilingual.

Composer 2.5 Fast

Cursor

Text

Composer 2.5 Fast pricing

T2V, I2V, or R2V up to 9 reference images, 720p or 1080p, 3–15s clips with joint audio-video.

HappyHorse 1.0

Alibaba

Video

HappyHorse 1.0 pricing

1M context, 128k output, 82.7% Terminal-Bench 2.0, 78.7% OSWorld, 85% ARC-AGI-2.

GPT 5.5

OpenAI

Text

GPT 5.5 pricing

T2I or reference editing, 5 aspect ratios, 1K, 2K, or 4K resolution control.

GPT Image 2

OpenAI

Image

GPT Image 2 pricing

1M context, 128k output, 87.6% SWE-bench Verified, 78% OSWorld, 2576px image input.

Claude Opus 4.7

Anthropic

Text

Claude Opus 4.7 pricing

Up to 9 image/video/audio references, T2V or first/last frame, 4–15s, up to 4K on Pro.

SeeDance 2.0

ByteDance

Video

SeeDance 2.0 pricing

T2I or editing up to 2K, thinking mode for better quality, up to 9 reference images.

Wan 2.7

Alibaba

Image

Wan 2.7 pricing

T2V, I2V, or R2V with first/last frame, video continuation, lip-sync, 2–15s at 720p or 1080p.

Wan 2.7

Alibaba

Video

Wan 2.7 pricing

400k context, 128k output, 54.4% SWE-Bench Pro, 72.1% OSWorld, over 2x faster than GPT-5 mini.

GPT 5.4 Mini

OpenAI

Text

GPT 5.4 Mini pricing

400k context, 128k output, 52.4% SWE-Bench Pro, 82.8% GPQA Diamond, effort levels none to xhigh.

GPT 5.4 Nano

OpenAI

Text

GPT 5.4 Nano pricing

Native computer use, 75% OSWorld-Verified, up to 1M context, 128k output, tool search for agents.

GPT 5.4

OpenAI

Text

GPT 5.4 pricing

1M context, 128k output, 79.6% SWE-bench Verified, 72.5% OSWorld, adaptive thinking.

Claude Sonnet 4.6

Anthropic

Text

Claude Sonnet 4.6 pricing

1M context, 128k output, effort controls, top Arena Document and Search rank.

Claude Opus 4.6

Anthropic

Text

Claude Opus 4.6 pricing

Multi-shot 2–6 scenes per call, 3–15s single-shot, 4K mode, with Audio 2.0 lip-sync.

Kling 3.0

Kling

Video

Kling 3.0 pricing

Motion transfer onto a character with high facial consistency, 720p Std or 1080p Pro.

Kling 3.0 Motion Control

Kling

Video

Kling 3.0 Motion Control pricing

T2I or editing with up to 10 reference images, 9 aspect ratios incl. auto, and 1K to 4K.

Kling 3.0 Omni

Kling

Image

Kling 3.0 Omni pricing

Up to 7 references plus elements, video reference/transform, multi-shot 2–6, 4K mode.

Kling 3.0 Omni

Kling

Video

Kling 3.0 Omni pricing

Edit videos with reference (style guide) or transform mode, plus image and video elements.

Kling 3.0 Omni Edit

Kling

Video

Kling 3.0 Omni Edit pricing

Up to 6 reference images, 8 aspect ratios, 2K or 4K, returns 4 images per call.

SeeDream 5.0 Lite

ByteDance

Image

SeeDream 5.0 Lite pricing

T2I with Chinese and English text rendering, 7 aspect ratios. No editing.

Z-Image Turbo

Alibaba

Image

Z-Image Turbo pricing

T2V, I2V, or R2V up to 5 references, 720p or 1080p, 2–15s, with auto audio and multi-shot.

Wan 2.6

Alibaba

Video

Wan 2.6 pricing

T2I, editing, or style transfer with 1–4 reference images and 7 aspect ratios.

Wan 2.6

Alibaba

Image

Wan 2.6 pricing

Transfer motion from a reference video onto a character image, 720p Std or 1080p Pro.

Kling 2.6 Motion Control

Kling

Video

Kling 2.6 Motion Control pricing

T2V or I2V at 720p Standard or 1080p Pro, 5 or 10s, with optional end frame and audio.

Kling 2.6

Kling

Video

Kling 2.6 pricing

Flex, Pro, and Max variants, 7 aspect ratios up to 4MP, reference image support.

Flux.2

Black Forest Labs

Image

Flux.2 pricing

9B Flux.2 Klein, 7 aspect ratios up to 4MP, reference image support.

Flux.2 Klein 9B

Black Forest Labs

Image

Flux.2 Klein 9B pricing

4B Flux.2 Klein, smallest in the line, 7 aspect ratios up to 4MP, reference image support.

Flux.2 Klein 4B

Black Forest Labs

Image

Flux.2 Klein 4B pricing

T2I plus multi-image fusion editing, 5 aspect ratios, smart prompt rewriting on by default.

Qwen Image 2.0

Alibaba

Image

Qwen Image 2.0 pricing

Reasoning video gen with up to 7 references, image-only elements, single-shot 3–10s, 720p/1080p.

Kling O1

Kling

Video

Kling O1 pricing

Reasoning image gen with up to 10 reference images, 9 aspect ratios, and 1K or 2K output.

Kling O1

Kling

Image

Kling O1 pricing

Reasoning video editing with reference or transform mode, up to 4 reference images.

Kling O1 Edit

Kling

Video

Kling O1 Edit pricing

Gemini 3 image gen with 1K, 2K, or 4K, 10 aspect ratios, and reference image support.

Nano Banana Pro

Google

Image

Nano Banana Pro pricing

T2V or I2V with optional end frame, 5 or 10s at 720p Standard or 1080p Pro.

Kling 2.5 Turbo

Kling

Video

Kling 2.5 Turbo pricing

TTS, multi-voice dialogue, sound effects, vocal isolation, and speech-to-text.

ElevenLabs

Speech
Sound Effects
Text

ElevenLabs pricing

Up to 6 reference images, 8 aspect ratios, 2K or 4K, returns 4 images per call.

SeeDream 4.5

ByteDance

Image

SeeDream 4.5 pricing

4 model variants, 720p–4K, lip-synced dialogue and SFX in one call.

Veo 3.1

Google

Video

Veo 3.1 pricing

Gemini image gen with 1K, 2K, or 4K resolution control and 10 aspect ratios.

Nano Banana 2

Google

Image

Nano Banana 2 pricing

T2I or editing with 1–3 reference images, 7 aspect ratios, prompt extend on by default.

Wan 2.5

Alibaba

Image

Wan 2.5 pricing

T2V or I2V at 480p, 720p, or 1080p, 5 or 10s, with auto audio or custom audio sync.

Wan 2.5

Alibaba

Video

Wan 2.5 pricing

T2I plus reference editing, 5 aspect ratios, smart prompt rewriting, negative prompts.

Qwen Image

Alibaba

Image

Qwen Image pricing

Up to 6 reference images, 8 aspect ratios, 2K or 4K, returns 4 images per call.

SeeDream 4.0

ByteDance

Image

SeeDream 4.0 pricing

T2V or first/last frame at 720p or 1080p, with 5, 10, or 12s clips.

SeeDance 1.5 Pro

ByteDance

Video

SeeDance 1.5 Pro pricing

3 variants (2.0, 2.3, 2.3 Fast), end-frame on 2.0, 6 or 10s at 768p–1080p (512p I2V on 2.0).

MiniMax Hailuo

MiniMax

Video

MiniMax Hailuo pricing

Text-to-image only, 5 aspect ratios, negative prompts and seeds for reproducible output.

Wan 2.2

Alibaba

Image

Wan 2.2 pricing

Silent T2V or I2V, fixed 5s clips at 480p or 1080p, with negative prompts and seeds.

Wan 2.2

Alibaba

Video

Wan 2.2 pricing

Gemini 2.5 Flash image gen with 10 aspect ratios and reference image support.

Nano Banana

Google

Image

Nano Banana pricing

1.25x–4x video upscale, optional FPS retiming, 4K cap, source FPS preserved by default.

Topaz Video Upscale

Topaz Labs

Video

Topaz Video Upscale pricing

AI image upscale 1x–16x with denoise, sharpen, and face enhancement controls.

Topaz Image Upscale

Topaz Labs

Image

Topaz Image Upscale pricing

T2V or I2V at Pro 1080p only, 5 or 10s, with optional sound effects and music.

Kling 2.1 Master

Kling

Video

Kling 2.1 Master pricing

1.5 Video: I2V, 1–15s at up to 1080p. Grok Imagine Video: speed-optimized T2V or I2V, 1–10s at 720p.

Grok Imagine

xAI

Video
Image

Grok Imagine pricing

I2V with optional end frame, 5 or 10s, Standard 720p or Pro 1080p, optional sound effects.

Kling 2.1

Kling

Video

Kling 2.1 pricing

1.0 for 480p/720p T2V or I2V; Pro for first/last frame up to 1080p; Pro Fast for faster T2V/I2V up to 1080p.

SeeDance 1.0

ByteDance

Video

SeeDance 1.0 pricing

Generate music, extend songs, add vocals, isolate stems, and write lyrics.

Suno

Music
Sound Effects

Suno pricing

AI Model Directory — Video, Image & Audio Generation