Vidu AI
Vidu AI is a video generation model developed by Shengshu Technology in collaboration with Tsinghua University. It is aimed at creators in industries such as film, animation, and advertising who need to produce video content quickly.
Evolution of Vidu AI Models
Since its inception, Vidu AI has undergone significant advancements:
- Vidu 1.0: Built on the Universal Vision Transformer (U-ViT) architecture, Vidu 1.0 generates 16-second, 1080p videos from simple text prompts.
- Vidu 2.0: The latest version speeds up generation, producing a clip in under 10 seconds, and reduces costs to as low as 4 cents per second. It also offers improved style and subject consistency, smoother transitions, and batch generation capabilities.
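
To put the quoted price floor in concrete terms, the sketch below estimates a lower bound on generation cost. It assumes the 4-cent figure applies to each second of generated video and that pricing is flat and linear; actual pricing may vary by plan, resolution, or mode. The function names are illustrative and are not part of any Vidu API.

```python
# Assumption: "4 cents per second" is a flat rate per second of generated video.
# This is the quoted minimum, so results are lower bounds, not actual quotes.
ASSUMED_MIN_RATE_USD_PER_SECOND = 0.04


def estimate_min_cost_usd(duration_seconds, rate=ASSUMED_MIN_RATE_USD_PER_SECOND):
    """Return a lower-bound cost in USD for a single clip of the given length."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return round(duration_seconds * rate, 2)


def estimate_min_batch_cost_usd(durations_seconds, rate=ASSUMED_MIN_RATE_USD_PER_SECOND):
    """Return a lower-bound cost in USD for a batch of clips."""
    return round(sum(d * rate for d in durations_seconds), 2)


if __name__ == "__main__":
    # One 8-second clip, then a batch of three clips (4 s, 8 s, 8 s).
    print(estimate_min_cost_usd(8))
    print(estimate_min_batch_cost_usd([4, 8, 8]))
```

Under these assumptions the cost scales with total video length, so a batch costs the same as a single clip of the combined duration.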
Text-to-Video Conversion
Vidu AI's text-to-video feature turns written descriptions into high-quality video. The user enters a text prompt, and the model interprets it and generates a corresponding video, selecting visuals and animations that fit the described narrative.

Reference-to-Video Generation
The reference-to-video feature creates videos that stay consistent in style, tone, and structure with a provided sample. The user uploads a reference video, and Vidu AI analyzes its key elements, such as visuals, pacing, and transitions, then generates a new video that matches the original's aesthetic.

Image-to-Video Transformation
Beyond text prompts, Vidu AI can animate static images and turn them into video content. The user uploads an image, and the model applies animations, transitions, and effects to produce a polished video that adds depth and motion to the original picture.
