Kling 2.1
What is Kling 2.1?
Kling 2.1 is Kuaishou's image-to-video model and an early member of the Kling 2.x line. Its strength is animating a supplied still: give it a starting frame and it returns a 5 or 10 second clip whose motion respects the input composition. There is no text-only mode. If a prompt is the only input you have, use Kling 2.6, Kling 2.5 Turbo, or Kling 2.1 Master instead. The 2.1 endpoint lets you choose Standard (720p) or Pro (1080p) output, pass an optional end frame to anchor the closing composition, and turn on a sound effects track for ambient sound or music.
Key features of Kling 2.1
Four features cover what teams build with Kling 2.1 in production.
Image-to-video specialist
The endpoint takes a start frame as a required input and turns it into a 5 or 10 second clip. The composition stays anchored to the still, which makes 2.1 the right tool for product photography, brand portraits, and key art that has to read clearly once it moves.
Optional end frame for bookending
Pass a closing image alongside the start frame and the model interpolates the camera move and motion between them. Useful when you have both the open and the close locked and want the middle handled in one call.
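A bookended request might look like the sketch below. The field name end_frame_url is an assumption modeled on start_frame_url from the API example and is not confirmed by this page, so check the parameter docs for the real name. The image URLs are placeholders. The script only prints the request body; the send command is left commented out.

```shell
#!/bin/sh
# Sketch: request body for a clip anchored by a start and an end frame.
# ASSUMPTION: "end_frame_url" is a hypothetical field name modeled on
# "start_frame_url"; confirm the real name in the parameter docs.
START_FRAME="https://example.com/open.jpg"    # placeholder URL
END_FRAME="https://example.com/close.jpg"     # placeholder URL

PAYLOAD=$(cat <<EOF
{
  "model": "kuaishou/kling-2.1-video",
  "input": {
    "prompt": "Slow push-in from the opening shot to the closing shot",
    "start_frame_url": "$START_FRAME",
    "end_frame_url": "$END_FRAME",
    "duration": 5,
    "mode": "pro"
  }
}
EOF
)

# Print the body so it can be inspected before sending.
printf '%s\n' "$PAYLOAD"

# To submit it, set API_KEY and run:
# curl -X POST https://api.unifically.com/v1/tasks \
#   -H "Content-Type: application/json" \
#   -H "Authorization: Bearer $API_KEY" \
#   -d "$PAYLOAD"
```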
Standard or Pro output modes
Standard renders at 720p for fast iteration and cheaper previews. Pro renders at 1080p for paid placements and hero spots. Same model, different ceiling on resolution.
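One way to keep the two modes and the two clip lengths straight is a small helper that builds the request body and rejects anything else. This is a minimal sketch: build_kling_payload is a hypothetical helper name, and the mode string "standard" is an assumption, since the API example on this page only shows "pro". The helper does no JSON escaping, so it expects a plain URL without quotes or backslashes.

```shell
#!/bin/sh
# Sketch: build a Kling 2.1 request body and reject unsupported values.
# ASSUMPTION: "standard" is the mode string for 720p output; only "pro"
# appears in the documented example. The helper name is hypothetical.
build_kling_payload() {
  start_url=$1
  duration=$2
  mode=$3

  # Clips are 5 or 10 seconds long.
  case "$duration" in
    5|10) ;;
    *) echo "duration must be 5 or 10" >&2; return 1 ;;
  esac

  # Standard targets 720p, Pro targets 1080p.
  case "$mode" in
    standard|pro) ;;
    *) echo "mode must be standard or pro" >&2; return 1 ;;
  esac

  # The prompt is optional, so this sketch leaves it out.
  printf '{"model":"kuaishou/kling-2.1-video","input":{"start_frame_url":"%s","duration":%s,"mode":"%s"}}\n' \
    "$start_url" "$duration" "$mode"
}

# A 720p preview first, then the 1080p final from the same still.
build_kling_payload "https://example.com/packshot.jpg" 5 standard
build_kling_payload "https://example.com/packshot.jpg" 5 pro
```

The printed body can be passed to curl with -d when posting to /v1/tasks.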
Sound effects track in the same call
The sound_effects object adds ambient sound, background music, or an ASMR-style audio pass to the result. Skip the field to keep the output silent.
Best for
Product packshot animation
Animate a flat product photo into a 5 or 10 second loop. Reliable for ecommerce video, PDP hero clips, and store-listing previews.
Banner and retail screen loops
Short cycles for digital signage and display ad placements where the input still already nailed the composition.
Frame-bookended hero shots
An optional end frame anchors the closing composition. A good fit when the open and close are locked and you want the model to handle the middle.
Cost-controlled 1080p delivery
Pro mode at 1080p covers paid placements without the per-second cost of Kling 3.0.
Single-still campaigns
Brand stills that need to become motion without a fresh shoot. Add ambient audio or music in the same call.
Use cases
Animate a packshot for a product detail page by uploading the studio photo, choosing 5 seconds at Pro, and sending the loop straight to the PDP.
Build a digital signage spot from a single brand still by passing the still as the start frame, the closing logo card as the end frame, and 10 seconds at Standard for the in-store screen.
Turn a campaign hero image into a paid social asset by running it at Pro with a short prompt that describes the camera move, then layering ambient sound through the sound_effects object so the final cut needs no extra audio pass.
API examples
Call Kling 2.1 from any language by POSTing to /v1/tasks. Full parameter docs live at docs.unifically.com/models/video/kling/kling-2.1.
curl -X POST https://api.unifically.com/v1/tasks \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "kuaishou/kling-2.1-video",
    "input": {
      "prompt": "A butterfly landing on a flower in slow motion",
      "start_frame_url": "https://example.com/butterfly.jpg",
      "duration": 5,
      "mode": "pro"
    }
  }'
A successful submission returns a task_id. Poll GET /v1/tasks/<task_id>, or set a callback_url on the request, to receive the finished result.
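The polling path could be scripted roughly as follows. This page does not show the response format, so the sketch assumes the JSON carries a "status" field with the terminal values "completed" and "failed"; those names are hypothetical and should be checked against the parameter docs. It also assumes the GET request takes the same Bearer header as the POST. fetch_task and wait_for_task are illustrative helper names.

```shell
#!/bin/sh
# Sketch: poll a task until it reaches a terminal state.
# ASSUMPTIONS: the response has a "status" field, and "completed" and
# "failed" are its terminal values. Both are hypothetical names.

fetch_task() {
  # GET /v1/tasks/<task_id>; the auth header is assumed to match the POST.
  curl -s "https://api.unifically.com/v1/tasks/$1" \
    -H "Authorization: Bearer $API_KEY"
}

wait_for_task() {
  task_id=$1
  max_attempts=${2:-60}   # give up after this many polls
  delay=${3:-5}           # seconds to wait between polls
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    body=$(fetch_task "$task_id")
    # Plain substring match so the sketch needs no JSON parser.
    case "$body" in
      *'"status":"completed"'*|*'"status": "completed"'*)
        printf '%s\n' "$body"; return 0 ;;
      *'"status":"failed"'*|*'"status": "failed"'*)
        printf '%s\n' "$body" >&2; return 1 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $task_id" >&2
  return 2
}

# Usage: API_KEY=YOUR_API_KEY wait_for_task "<task_id>"
```

Polling on a fixed delay with a capped number of attempts keeps the loop bounded. For long queues, a callback_url avoids polling altogether.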
FAQs
People also ask
Kling 2.1 is Kuaishou's image-to-video model. You give it a starting frame and it animates the still for 5 or 10 seconds. Optional fields include a prompt for motion guidance, an end frame, an output mode of Standard (720p) or Pro (1080p), and a sound effects track.
No, Kling 2.1 does not generate video from a prompt alone. The 2.1 endpoint requires a start frame. For prompt-only generation, use Kling 2.6, Kling 2.5 Turbo, or Kling 2.1 Master.
Compared with 2.1, Kling 2.6 adds prompt-only text-to-video, optional end frames in the text path, and native audio with voice references. 2.1 stays focused on animating a supplied still and keeps the parameter surface small.
Use 2.1 when the brief is "animate this still" and you already have a clean reference image. Use 2.6 when you want to start from a prompt instead of a frame.
Pro renders at 1080p, Standard renders at 720p. Same model, different output resolution. Pick Pro for paid placements and hero spots; Standard for internal review and fast iteration.