Kling Motion Control 3.0 API


Transfer motion from reference videos onto characters with high facial consistency. Upgraded from v2.6.


Character Image *

Reference image with character (head, shoulders, and torso must be clearly visible)

Accepted formats: PNG, JPG, WEBP, GIF. Maximum file size: 100MB.

Reference Video *

Reference video containing the motion to transfer

Accepted formats: MP4, WEBM, MOV. Maximum file size: 100MB.

Prompt

Optional text guidance

Keep audio

Preserve audio from the motion reference video

Character Orientation

Orientation mode for the output video: Video or Image

Mode

Output quality mode: Standard (720p) or Pro (1080p)
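The upload limits listed for the two required files can be checked locally before sending anything. The following is a minimal sketch, not part of the official API: the function name `check_upload` is hypothetical, "100MB" is assumed to mean 100 * 1024 * 1024 bytes, and `.jpeg` is assumed to be accepted wherever JPG is.

```python
from pathlib import PurePath

# Formats as listed on the upload fields; ".jpeg" is an assumption.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".gif"}
VIDEO_EXTS = {".mp4", ".webm", ".mov"}
# Assumption: "100MB" is read as 100 * 1024 * 1024 bytes.
MAX_BYTES = 100 * 1024 * 1024


def check_upload(filename: str, size_bytes: int, kind: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable.

    kind is "image" (character image) or "video" (motion reference).
    """
    allowed = {"image": IMAGE_EXTS, "video": VIDEO_EXTS}[kind]
    problems = []
    ext = PurePath(filename).suffix.lower()
    if ext not in allowed:
        problems.append(f"unsupported {kind} format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is larger than {MAX_BYTES} bytes")
    return problems
```

For example, `check_upload("dance.avi", 1024, "video")` reports an unsupported format, while a 1 KB `hero.png` passes as a character image.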


Features

What Kling Motion Control 3.0 API offers

Motion transfer from a reference video onto a character image with stronger face consistency than Motion Control 2.6

Reference clips up to 30 seconds in the playground

Optional prompt text for scene guidance

Keep audio from the motion reference clip when you need sound in the result

Character orientation: Video (suited to complex motion) or Image (suited to camera-style motion)

Standard (720p) or Pro (1080p) output

REST API with JSON request and response bodies
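As an illustration of how the controls above might map onto a JSON request, here is a minimal Python sketch. The endpoint URL, the bearer-token header, every field name (`image_url`, `video_url`, `prompt`, `keep_audio`, `character_orientation`, `mode`), and the default values are assumptions made for this example; the actual schema is defined in the API documentation. The function only builds the request object and does not send it.

```python
import json
import urllib.request

# Hypothetical endpoint, for illustration only.
API_URL = "https://api.example.com/v1/kling-motion-control-3.0"


def build_request(image_url, video_url, api_key, prompt=None,
                  keep_audio=False, character_orientation="video",
                  mode="standard"):
    """Build (but do not send) a POST request with a JSON body.

    All field names and defaults are assumed, not taken from the real API.
    """
    if character_orientation not in ("video", "image"):
        raise ValueError("character_orientation must be 'video' or 'image'")
    if mode not in ("standard", "pro"):
        raise ValueError("mode must be 'standard' or 'pro'")

    payload = {
        "image_url": image_url,    # character image (required)
        "video_url": video_url,    # motion reference video (required)
        "keep_audio": keep_audio,  # preserve audio from the reference video
        "character_orientation": character_orientation,
        "mode": mode,              # standard = 720p, pro = 1080p
    }
    if prompt:                     # optional text guidance
        payload["prompt"] = prompt

    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here because the real URL and authentication scheme are not given on this page.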

Use cases

Built for

Dance and performance - Copy choreography onto a new character plate

Short social skits - Replace the actor while keeping timing from the reference clip

Character IP - Put a mascot or illustrated character through recorded motion

Presentations - Explainers where a static brand figure should follow a recorded gesture

Education - Show a motion path once, then apply it to another subject for teaching clips

FAQ

About Kling Motion Control 3.0 API

It produces a new video where a supplied character image performs the motion from a supplied reference video. Version 3.0 is tuned for higher facial consistency relative to Motion Control 2.6.

Motion Control 3.0 and Motion Control 2.6 both take a motion clip and a character image. The 3.0 endpoint targets better face stability and keeps the same practical controls: keep audio, orientation, and Standard or Pro output.

Video orientation favors full-body motion such as dance. Image orientation favors shots where camera motion or framing drives the effect more than complex body articulation.

Standard maps to 720p output and Pro to 1080p. Exact pricing is listed on the model pricing page for your account.
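Client code that needs to know the expected output height can encode this mapping directly. A small sketch, assuming the mode is passed around as the lowercase strings "standard" and "pro" (the identifiers are hypothetical, not confirmed API values):

```python
# Mode-to-resolution mapping as stated above: Standard is 720p, Pro is 1080p.
# The lowercase keys are an assumption for illustration.
MODE_TO_HEIGHT = {"standard": 720, "pro": 1080}


def output_height(mode: str) -> int:
    """Return the vertical resolution in pixels for an output quality mode."""
    key = mode.strip().lower()
    if key not in MODE_TO_HEIGHT:
        raise ValueError(f"unknown mode: {mode!r}")
    return MODE_TO_HEIGHT[key]
```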

Use a clear view of the head, shoulders, and torso so the model can anchor the face and upper body during transfer.