Veo 3.1 API: Pricing, Specs, and How to Use It
Veo 3.1 is Google's video generation model with native audio, 720p/1080p/4K output, and Lite/Fast/Quality variants. Here is how to call it and what it costs.
Veo 3.1 is Google's video generation model. It produces clips with native audio at 720p, 1080p, or up to 4K (preview), and comes in three quality variants: Lite, Fast, and Quality. This guide covers what the model does, how it is priced on Unifically against the official Google APIs, and how to call each variant.
TL;DR: Veo 3.1 generates 4-, 6-, or 8-second clips at 24 FPS in 16:9 or 9:16 with synchronized audio. On Unifically, pricing starts at $0.075 per video for Lite Relaxed and tops out at $0.60 for Quality, with separate Extend and Upscale endpoints. Google's direct Vertex / Gemini API charges per second ($0.15/s Fast, $0.40/s Standard), so a single 8-second Quality clip costs $0.60 through Unifically against roughly $3.20 at Google's Standard rate. Live rates: pricing page.
Update (May 10, 2026): Google deprecated the Fast Relaxed variant. It is no longer available on Unifically. Use Fast for the same output on the standard queue, or Lite Relaxed for the cheapest draft path.
What is Veo 3.1?
Veo 3.1 is a text-to-video and image-to-video model from Google that generates MP4 clips with synchronized native audio. It launched on October 15, 2025. Google extended it in the Gemini API and Vertex AI in early 2026 with Lite / Fast / Quality variants, Scene Extension, and 4K upscaling.
Each generation accepts a prompt, optional start and end frames, and optional reference images on the Fast variant. The model returns a 24 FPS MP4 in 16:9 or 9:16 with audio in the same file.
What's new in Veo 3.1
Veo 3.1 produces synchronized audio in the same call as the video. Sound effects, ambience, and dialogue arrive already locked to on-screen action. That removes a separate sound design pass for short clips and makes Veo 3.1 a strong default for ad cutdowns, social verticals, and storyboard-to-pitch workflows where audio is the bottleneck.
The 4K preview variant and the dedicated Upscale endpoints also let teams keep raw generation costs low (generate at the cheapest variant, upscale only the clips that survive review), which matters once you generate at production volumes.
What you can do with Veo 3.1
Three generation variants
- Veo 3.1 Lite. Cheapest variant; standard frame-mode generation.
- Veo 3.1 Fast. Faster generation; adds reference-image mode (up to three images) and optional voice presets.
- Veo 3.1 Quality. Google's top-end output; frame-mode only, highest quality per clip.
Lite has a Lite Relaxed counterpart that runs at lower priority for a lower listed price.
Inputs
- Text-to-video. A prompt is enough on every variant.
- Image-to-video (frame mode). Supply a start frame and an optional end frame to bracket the motion. Available on Lite, Lite Relaxed, and Quality.
- Reference images. Fast accepts up to three reference images to steer subject and style, with optional voice presets.
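The input rules above (frame mode on Lite, Lite Relaxed, and Quality; up to three reference images on Fast) can be written as a small pre-flight check. This is a minimal sketch, not Unifically's own validation: the variant keys and the field names `startFrame`, `endFrame`, and `referenceImages` are hypothetical, and only the rules themselves come from this guide.

```javascript
// Hypothetical variant keys; the rules mirror the lists in this guide.
const FRAME_MODE_VARIANTS = new Set(['lite', 'lite-relaxed', 'quality']);
const REFERENCE_MODE_VARIANTS = new Set(['fast']);
const MAX_REFERENCE_IMAGES = 3;

// Returns a list of problems; an empty list means the inputs fit the variant.
function checkInputs(variant, { startFrame, endFrame, referenceImages = [] } = {}) {
  const problems = [];
  // The end frame is described as optional, the start frame as the anchor.
  if (endFrame && !startFrame) {
    problems.push('an end frame needs a start frame');
  }
  if ((startFrame || endFrame) && !FRAME_MODE_VARIANTS.has(variant)) {
    problems.push(`frame mode is not listed for ${variant}`);
  }
  if (referenceImages.length > 0 && !REFERENCE_MODE_VARIANTS.has(variant)) {
    problems.push(`reference images are not listed for ${variant}`);
  }
  if (referenceImages.length > MAX_REFERENCE_IMAGES) {
    problems.push(`at most ${MAX_REFERENCE_IMAGES} reference images are accepted`);
  }
  return problems;
}

console.log(checkInputs('quality', { referenceImages: ['ref-1.png'] }));
```

Running a check like this before the POST avoids paying for a request that the chosen variant cannot honor.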
Extend and Upscale
- Veo 3.1 Extend continues a completed clip from its task ID using a chosen base model (Lite through Quality). Useful for narrative cuts longer than a single 8-second generation.
- Veo 3.1 Upscale 1080p and Veo 3.1 Upscale 4K take a finished task ID and return an upscaled MP4. Use them as a final pass on clips that survive review.
Output
- Resolutions: 720p, 1080p, and 4K (4K is preview on Google's side).
- Durations: 4, 6, or 8 seconds per clip; chain with Extend for longer narratives.
- Aspect ratios: 16:9 and 9:16.
- Frame rate: 24 FPS.
- Format: MP4 with native audio.
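The output limits above translate directly into a request check. In this sketch `aspect_ratio` matches the field used in the request example in this guide, while `duration` and `resolution` are assumed field names; the allowed values come from the list above.

```javascript
const ALLOWED = {
  duration: [4, 6, 8],                  // seconds per clip
  aspect_ratio: ['16:9', '9:16'],
  resolution: ['720p', '1080p', '4k'],  // 4K is preview on Google's side
};

// Checks only the options that are present; returns a list of problems.
function checkOutputOptions(options = {}) {
  const problems = [];
  for (const [field, allowed] of Object.entries(ALLOWED)) {
    const value = options[field];
    if (value !== undefined && !allowed.includes(value)) {
      problems.push(`${field} must be one of ${allowed.join(', ')}; got ${value}`);
    }
  }
  return problems;
}

console.log(checkOutputOptions({ duration: 5, aspect_ratio: '16:9' }));
```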
Veo 3.1 pricing and how it compares
All Unifically prices are per video, billed per generation. Google's direct API bills per second, so the Google rows below are normalised to an 8-second clip. Numbers below were accurate at the time of writing; check the pricing page for live rates.
| Source | Variant | Price (8s clip) | Audio | Notes |
|---|---|---|---|---|
| Unifically | Veo 3.1 Lite Relaxed | $0.075 | included | Lower priority queue |
| Unifically | Veo 3.1 Lite | $0.15 | included | Standard Lite |
| Unifically | Veo 3.1 Fast | $0.30 | included | Adds reference images, voice presets |
| Unifically | Veo 3.1 Quality | $0.60 | included | Highest quality |
| Vertex AI / Gemini API | Veo 3.1 Fast | ~$1.20 (8s × $0.15/s) | included | Standard public pricing |
| Vertex AI / Gemini API | Veo 3.1 Standard | ~$3.20 (8s × $0.40/s) | included | Standard public pricing |
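The normalisation in the table is plain multiplication, and it is worth seeing how the gap changes with clip length. The sketch below copies the listed prices into integer mills (thousandths of a dollar) to avoid floating-point drift; the object keys are made up for illustration, and it assumes the flat per-video price does not vary with duration.

```javascript
// Prices in mills (thousandths of a dollar), copied from the table above.
const UNIFICALLY_PER_VIDEO = { 'lite-relaxed': 75, lite: 150, fast: 300, quality: 600 };
const GOOGLE_PER_SECOND = { fast: 150, standard: 400 };

// Per-second billing: cost scales with clip length.
function googleClipCost(tier, seconds) {
  return GOOGLE_PER_SECOND[tier] * seconds;
}

function formatUsd(mills) {
  return `$${(mills / 1000).toFixed(3)}`;
}

for (const seconds of [4, 6, 8]) {
  const direct = googleClipCost('fast', seconds);
  const flat = UNIFICALLY_PER_VIDEO.fast;
  console.log(
    `${seconds}s Fast: direct ${formatUsd(direct)}, per-video ${formatUsd(flat)}, ` +
    `ratio ${(direct / flat).toFixed(1)}x`
  );
}
```

Under these assumptions the Fast ratio is 4x at 8 seconds and 2x at 4 seconds, so the per-video price is most favorable on the longest clips.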
Two add-on endpoints sit alongside generation:
| Endpoint | Price (per video) | Use |
|---|---|---|
| Veo 3.1 Extend | $0.15 to $0.60 | Continue a prior clip (price follows the chosen base model) |
| Veo 3.1 Upscale 1080p | $0.05 | Upscale an existing task to 1080p |
| Veo 3.1 Upscale 4K | $0.50 | Upscale an existing task to 4K |
Source for Google direct rates: Vertex AI / Gemini API public per-second pricing.
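For a longer narrative, the per-video prices add up across the generate, Extend, and Upscale calls. The following sketch totals one such pipeline in integer mills. It assumes each Extend call is billed at the generate price of its base model (the table only gives the $0.15 to $0.60 range), and the keys are hypothetical.

```javascript
// Prices in mills (thousandths of a dollar), taken from the tables above.
const GENERATE = { 'lite-relaxed': 75, lite: 150, fast: 300, quality: 600 };
// Assumption: Extend is billed at the generate price of the chosen base model.
const EXTEND = { lite: 150, fast: 300, quality: 600 };
const UPSCALE = { '1080p': 50, '4k': 500 };

function priceOf(table, key) {
  if (!(key in table)) throw new Error(`no listed price for "${key}"`);
  return table[key];
}

// Total for one generation, `extendCount` Extend calls, and an optional upscale.
function pipelineCostMills({ variant, extendModel = variant, extendCount = 0, upscale = null }) {
  let total = priceOf(GENERATE, variant);
  if (extendCount > 0) total += extendCount * priceOf(EXTEND, extendModel);
  if (upscale) total += priceOf(UPSCALE, upscale);
  return total;
}

const totalMills = pipelineCostMills({ variant: 'fast', extendCount: 2, upscale: '4k' });
console.log(`$${(totalMills / 1000).toFixed(2)}`);
```

With one Fast generation, two Fast extensions, and a 4K upscale, the total under these assumptions is 300 + 600 + 500 mills, or $1.40.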
How to use Veo 3.1 on Unifically
The API is async: POST a generation, then poll a task endpoint until the MP4 is ready.
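// Requires Node 18+ (global fetch) and an ES module, because of top-level await.
const API = 'https://api.unifically.com';
const headers = {
  Authorization: `Bearer ${process.env.UNIFICALLY_API_KEY}`,
  'Content-Type': 'application/json',
};

// Submit the generation job; the response carries the task ID to poll.
const start = await fetch(`${API}/veo-3.1-fast/generate`, {
  method: 'POST',
  headers,
  body: JSON.stringify({
    prompt: 'Aerial shot of a coastal cliff at golden hour, gulls circling overhead, waves crashing below',
    aspect_ratio: '16:9',
  }),
}).then((r) => r.json());

// Stop early if the request was rejected, instead of polling an undefined task.
if (!start.task_id) throw new Error(`No task_id in response: ${JSON.stringify(start)}`);

// Poll every 3 seconds until the task completes or fails.
while (true) {
  await new Promise((r) => setTimeout(r, 3000));
  const task = await fetch(`${API}/v1/tasks/${start.task_id}`, { headers }).then((r) => r.json());
  if (task.status === 'completed') {
    console.log(task.video_url);
    break;
  }
  if (task.status === 'failed') throw new Error(task.error);
}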
Swap the generate path for the variant you want: /veo-3.1-lite/generate or /veo-3.1-quality/generate. The add-on endpoints follow the same pattern (/veo-3.1-extend/generate, /veo-3.1-upscale-1080p/generate, and /veo-3.1-upscale-4k/generate) but operate on a finished task ID rather than a fresh prompt. The polling endpoint is the same for every Veo task.
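Because every Veo task shares one polling endpoint, the loop can be factored into a reusable helper with an upper bound on attempts, so a stuck task does not poll forever. This is a sketch under assumptions: `pollUntilDone` and its options are hypothetical names, and it relies only on the `completed` and `failed` statuses shown in the example above, treating any other status as still in progress.

```javascript
// Poll `fetchTask` until it reports a terminal status or attempts run out.
// `fetchTask` is any async function that resolves to { status, video_url?, error? }.
async function pollUntilDone(fetchTask, { intervalMs = 3000, maxAttempts = 100 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const task = await fetchTask();
    if (task.status === 'completed') return task;
    if (task.status === 'failed') throw new Error(task.error || 'generation failed');
    // Any other status is treated as "still working"; wait before the next poll.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`task still not finished after ${maxAttempts} polls`);
}
```

In the example above, the call would look like `pollUntilDone(() => fetch(`${API}/v1/tasks/${start.task_id}`, { headers }).then((r) => r.json()))`. Injecting the fetch function keeps the helper independent of the endpoint and easy to exercise with a stub.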
Things to know
- Generating at Quality before iterating. Lock the prompt and framing on Lite Relaxed first. A failed Quality clip costs 8x as much as a failed Lite Relaxed clip ($0.60 vs $0.075).
- Using reference-image mode on the wrong variant. Reference images are Fast only. Lite, Lite Relaxed, and Quality use start / end frame mode.
- Forgetting Extend needs a finished task ID. Extend is a separate endpoint that takes the prior task ID plus the base model. You cannot extend a clip that has not finished generating.
- Upscaling everything by default. Upscale 4K is $0.50 per clip. Use it as a finishing pass on the clips that survived review, not on every generation.
- Treating 4K as production-locked. Google still labels 4K as preview on Vertex. Expect occasional capacity or quality variance vs 1080p output.
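The first and fourth points compound, which a quick calculation makes concrete. The sketch below compares two workflows using the listed prices in integer mills (thousandths of a dollar); the function names and the choice of ten iterations are illustrative.

```javascript
// Prices in mills (thousandths of a dollar), from the pricing tables in this guide.
const LITE_RELAXED = 75;
const QUALITY = 600;
const UPSCALE_4K = 500;

// Iterate on Lite Relaxed, render the keeper once at Quality, upscale only that clip.
function draftFirstMills(draftCount) {
  return draftCount * LITE_RELAXED + QUALITY + UPSCALE_4K;
}

// Iterate at Quality and upscale every attempt to 4K.
function qualityEverywhereMills(attemptCount) {
  return attemptCount * (QUALITY + UPSCALE_4K);
}

const usd = (mills) => `$${(mills / 1000).toFixed(2)}`;
console.log('draft first:       ', usd(draftFirstMills(10)));
console.log('quality everywhere:', usd(qualityEverywhereMills(10)));
```

Ten Lite Relaxed drafts plus one upscaled Quality render come to 1,850 mills ($1.85), against 11,000 mills ($11.00) for ten upscaled Quality attempts.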
Frequently asked questions
What is Veo 3.1?
Veo 3.1 is Google's text-to-video and image-to-video model. It generates 4-, 6-, or 8-second MP4 clips at 24 FPS in 720p, 1080p, or 4K (preview) with synchronized native audio, in 16:9 or 9:16 aspect ratios.
How much does Veo 3.1 cost on Unifically?
Generation runs from $0.075 per video for Lite Relaxed up to $0.60 per video for Quality, with Fast at $0.30. Extend is $0.15 to $0.60. Upscale 1080p is $0.05 and Upscale 4K is $0.50. Check the pricing page for current rates.
Does Veo 3.1 generate audio?
Yes. All three Veo 3.1 variants generate synchronized audio (speech, sound effects, and ambience) inside the same MP4 as the video. There is no separate audio call.
What is the longest Veo 3.1 clip I can generate?
A single Veo 3.1 generation returns 4, 6, or 8 seconds. To go longer, use Veo 3.1 Extend with the prior task ID and a chosen base model to continue the clip.
Which variant should I start with?
Start on Lite Relaxed at $0.075 to lock the prompt and framing. Move to Fast at $0.30 once you need reference images or voice presets. Use Quality at $0.60 for the final clip.
Related reading
- Veo 3.1 model page: live playground and full parameter reference.
- Veo 3.1 vs SeeDance 2.0: head-to-head comparison and pricing.
- Kling 2.6 and MiniMax Hailuo: other video APIs on Unifically.
- Nano Banana 2: pair with Veo 3.1 for image-to-video pipelines.



