Wan 2.2
What is Wan 2.2?
Wan 2.2 is an earlier-generation Wan video model from Alibaba. Two modes are exposed on Unifically: text-to-video from a prompt, and image-to-video that takes a `start_image_url` for the first frame. Output is silent and each clip is a fixed 5 seconds. Resolution is 480p or 1080p. Controls include a negative prompt, a watermark toggle, optional intelligent prompt rewriting, and a seed for reproducibility. It is the right choice when your pipeline is already wired up to the Wan 2.2 parameter shape and you do not need audio, multi-shot, or reference-to-video.
Key features of Wan 2.2
Four features cover what the model actually exposes day to day.

Text-to-video and image-to-video
Run text-only, or hand in a start frame to drive image-to-video. Both modes share one model ID and one parameter shape; switch between them with the `mode` field.
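A minimal sketch of switching modes from one helper. It assumes the image-to-video mode value is `"i2v"` (this page only shows `"t2v"`) and that `start_image_url` sits inside `input` next to `prompt`. `build_payload` is a hypothetical helper name, not part of any SDK.

```python
from typing import Optional

MODEL_ID = "alibaba/wan-2.2-video"  # model ID from the API example on this page


def build_payload(prompt: str, start_image_url: Optional[str] = None) -> dict:
    """Build a /v1/tasks request body for either mode (hypothetical helper)."""
    input_fields = {"prompt": prompt}
    if start_image_url is None:
        input_fields["mode"] = "t2v"
    else:
        # ASSUMPTION: "i2v" is the image-to-video mode value; check the parameter docs.
        input_fields["mode"] = "i2v"
        input_fields["start_image_url"] = start_image_url
    return {"model": MODEL_ID, "input": input_fields}
```

Calling `build_payload("a cat")` yields a text-to-video body, while passing a second argument yields an image-to-video body with the same model ID.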

Fixed 5-second clips at 480p or 1080p
No duration slider. Predictable cost per clip and predictable wall-clock per render. Pick 480p for drafts and 1080p for delivery.
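One way to encode the draft-versus-delivery rule in a pipeline is sketched below. The stage names and helper names are hypothetical; only the two resolution values and the 5-second clip length come from this page.

```python
SUPPORTED_RESOLUTIONS = ("480p", "1080p")  # values listed on this page for Wan 2.2
CLIP_SECONDS = 5  # fixed clip length stated on this page

# Hypothetical stage names mapped to the resolutions suggested above.
STAGE_TO_RESOLUTION = {"draft": "480p", "delivery": "1080p"}


def resolution_for(stage: str) -> str:
    """Return the resolution to request for a pipeline stage."""
    if stage not in STAGE_TO_RESOLUTION:
        raise ValueError(f"unknown stage: {stage!r}")
    return STAGE_TO_RESOLUTION[stage]


def check_resolution(resolution: str) -> str:
    """Reject resolutions this page does not list for Wan 2.2."""
    if resolution not in SUPPORTED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    return resolution


def total_footage_seconds(clip_count: int) -> int:
    """Footage length for a batch; every clip is the same fixed length."""
    return clip_count * CLIP_SECONDS
```

Validating on the client side is a design choice, not a requirement: it fails fast before a task is submitted.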

Negative prompts for brand-safe output
Suppress unwanted content with `negative_prompt`. Useful for guarding against logos, faces, and other content categories that have to stay out.
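A small sketch of attaching a negative prompt built from a list of banned terms. It assumes `negative_prompt` is a free-text field inside `input`; joining terms with commas is a common convention, not something this page documents. `with_negative_prompt` is a hypothetical helper.

```python
from typing import Iterable


def with_negative_prompt(input_fields: dict, banned_terms: Iterable[str]) -> dict:
    """Return a copy of input_fields with negative_prompt set from banned terms."""
    terms = [term.strip() for term in banned_terms if term.strip()]
    updated = dict(input_fields)  # leave the caller's dict untouched
    if terms:
        # ASSUMPTION: comma-separated free text is accepted by the field.
        updated["negative_prompt"] = ", ".join(terms)
    return updated
```

Keeping the banned terms in one shared list makes it easier to apply the same brand-safety rules to every request.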

Seed control for reproducibility
Pass an integer seed, and the same prompt and input image return the same clip. Useful for A/B variants and regression tests in CI/CD pipelines.
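The sketch below shows one way to prepare seeded A/B variants and to key regression baselines on the exact inputs. It assumes the field is named `seed` and sits inside `input`; both helper names are hypothetical.

```python
import hashlib
import json


def seed_variants(input_fields: dict, seeds) -> list:
    """One input dict per seed; every other field is identical."""
    # ASSUMPTION: the seed field is called "seed" and lives inside "input".
    return [{**input_fields, "seed": int(seed)} for seed in seeds]


def request_fingerprint(input_fields: dict) -> str:
    """Stable hash of the inputs, independent of key order."""
    canonical = json.dumps(input_fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A regression test could store a reference clip under its fingerprint and compare later renders of the same inputs against it. Whether repeated renders are byte-identical is not stated here, so a checksum comparison should be treated as an assumption to verify.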
Best for
Simple motion from a still
Image-to-video with a clear start frame. Predictable behaviour from the same input.
Prompt-only short clips
Text-to-video at 480p or 1080p for short demos and feed previews.
Stable legacy pipelines
Mature parameter set for older Wan integrations that do not need 2.5 audio or 2.6 reference-to-video.
Negative-prompt control
Suppress unwanted content for brand-safe output without a separate moderation step.
Reproducible regression tests
Seed plus the same prompt and input image returns the same clip.
Limitations
No audio. No multi-shot. No reference-to-video. Duration is fixed at 5 seconds. Resolution is 480p or 1080p only (Wan 2.2 Fast adds 720p in image-to-video). If you need any of those, jump to Wan 2.5 (audio, 5 or 10 seconds), Wan 2.6 (multi-shot, reference-to-video, up to 15 seconds), or Wan 2.7 (last-frame, video continuation, lip-sync).
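The upgrade paths above can be written down as a lookup. This is only a sketch of the list in this section: the feature keys are hypothetical labels, and it does not claim anything about which features the newer versions share.

```python
# Feature-to-version mapping copied from the limitations text above.
FEATURE_TO_VERSION = {
    "audio": "Wan 2.5",
    "multi_shot": "Wan 2.6",
    "reference_to_video": "Wan 2.6",
    "last_frame": "Wan 2.7",
    "video_continuation": "Wan 2.7",
    "lip_sync": "Wan 2.7",
}


def suggest_versions(needs=(), duration_seconds=5):
    """Return the Wan versions this page names for the requested features."""
    if not 5 <= duration_seconds <= 15:
        raise ValueError("this page only describes clips of 5 to 15 seconds")
    versions = set()
    for need in needs:
        if need not in FEATURE_TO_VERSION:
            raise ValueError(f"unknown feature: {need!r}")
        versions.add(FEATURE_TO_VERSION[need])
    if duration_seconds == 10:
        versions.add("Wan 2.5")  # listed as 5 or 10 seconds
    elif duration_seconds > 5:
        versions.add("Wan 2.6")  # listed as up to 15 seconds
    # With no extra needs, Wan 2.2 itself is enough.
    return sorted(versions) or ["Wan 2.2"]
```

When the result names more than one version, check each model's own page before choosing, since this section does not say which features carry over between versions.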
API examples
Call Wan 2.2 from any language by POSTing to /v1/tasks. Full parameter docs live at docs.unifically.com/models/video/alibaba/wan-2.2-video.
curl -X POST https://api.unifically.com/v1/tasks \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "alibaba/wan-2.2-video",
    "input": {
      "prompt": "A street musician plays guitar in a vintage subway station",
      "mode": "t2v",
      "resolution": "1080p"
    }
  }'
A successful submission returns a `task_id`. Poll `GET /v1/tasks/<task_id>` or set a `callback_url` on the request to receive the finished result.
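For readers working in Python, the submit-and-poll flow might look like the sketch below, using only the standard library. The endpoint, headers, and model ID mirror the curl example. The response shape is an assumption: this page does not say which status values a task reports, so the `status` field and the `"completed"` and `"failed"` values are placeholders to replace with the documented ones. Sending the bearer token on the GET request is also assumed.

```python
import json
import time
import urllib.request

API_BASE = "https://api.unifically.com/v1"
# ASSUMPTION: status values are not given on this page; adjust to the real ones.
TERMINAL_STATUSES = {"completed", "failed"}


def build_submit_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build the same POST /v1/tasks request as the curl example."""
    return urllib.request.Request(
        f"{API_BASE}/tasks",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def fetch_task(task_id: str, api_key: str) -> dict:
    """GET /v1/tasks/<task_id> and decode the JSON body."""
    request = urllib.request.Request(
        f"{API_BASE}/tasks/{task_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_task(task_id, fetch, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch(task_id) until the task reaches a terminal status or times out."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_id)
        # ASSUMPTION: the task JSON has a top-level "status" field.
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not finished after {timeout} seconds")
        sleep(interval)
```

In real use, send the request with `urllib.request.urlopen(build_submit_request(payload, api_key))`, read the `task_id` from the decoded response, then call `wait_for_task(task_id, lambda tid: fetch_task(tid, api_key))`. Passing `fetch` in as a parameter keeps the polling loop testable without network access.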
FAQs
People also ask
What is Wan 2.2?
Wan 2.2 is an Alibaba video model that supports text-to-video and image-to-video at 480p or 1080p. Each clip is a fixed 5 seconds. Output is silent. Controls include negative prompts, optional intelligent prompt rewriting, an optional watermark, and a seed for reproducibility.
Does Wan 2.2 need an input image?
No. Text-to-video runs from a prompt only. Image-to-video requires a `start_image_url` for the first frame.
How long are Wan 2.2 clips?
Each call produces a single 5-second clip. There is no duration slider on this model.
Does Wan 2.2 generate audio?
No. Wan 2.2 outputs silent video. Audio generation arrived in Wan 2.5.
When should I move to a newer Wan version?
Move to Wan 2.5 when you need 5 or 10 second clips with optional generated or custom audio. Move to Wan 2.6 or 2.7 when you need multi-shot, reference-to-video, or longer durations up to 15 seconds.
Can I reproduce the same clip?
Yes. Pass an integer seed alongside the prompt and any input image. Re-sending the same prompt, image, and seed returns the same render.