Vidu: China’s competitor to OpenAI’s Sora in AI video generation

Tsinghua University and Shengshu Technology introduce Vidu, a Chinese AI tool rivaling OpenAI’s Sora in video generation, incorporating Chinese cultural elements.

Tsinghua University, in collaboration with Shengshu Technology, has unveiled ‘Vidu,’ a new AI tool that generates short videos from text prompts and is positioned as a competitor to OpenAI’s Sora.

Vidu is designed to generate 1080p videos lasting up to 16 seconds, whereas Sora produces videos up to 60 seconds long. Unlike Sora, Vidu is tailored specifically to incorporate elements of Chinese culture into its outputs, which positions it as a cultural venture as well as a technological innovation.

During the announcement at the Zhongguancun Forum in Beijing, Zhu Jun, Chief Scientist at Shengshu Technology and Deputy Dean at Tsinghua’s Institute for AI, highlighted Vidu’s unique features. According to Beijing News, he described Vidu as ‘imaginative,’ capable of simulating the physical world and producing 16-second videos with consistent characters, scenes, and timelines, and able to comprehend and integrate Chinese elements. The launch event showcased Vidu’s ability to turn text inputs into engaging video clips with vivid details.

This initiative is part of a broader strategy by China to develop cutting-edge technologies, particularly generative AI, to compete on the global stage. The announcement of Vidu comes at a time when AI-generated media is gaining significant attention worldwide, with applications ranging from entertainment to educational content creation. The ability of tools like Vidu to rapidly produce high-quality video content opens up new possibilities for content creators and industries looking to use AI for creative expression and communication.

Industry experts, however, point to insufficient computing power as a significant barrier for Chinese firms such as Vidu’s developers. For comparison, OpenAI’s Sora requires eight Nvidia A100 GPUs running for over three hours to produce a one-minute video clip. “Sora demands extensive computing resources for inferencing,” says Li Yangwei, a Beijing-based technical consultant specializing in intelligent computing. Further complicating the situation, the US has tightened export controls on advanced chips, including Nvidia’s A100 and H100 GPUs. These components are crucial for training AI systems but can no longer be shipped to China. The restrictions pose additional challenges to the development and deployment of advanced AI technologies like Vidu, and reflect the geopolitical tensions that shape the global technology landscape.
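To put the quoted Sora figures in perspective, the short Python sketch below converts them into GPU-hours. It assumes exactly eight GPUs and treats "over three hours" as a lower bound of three hours; the function and variable names are illustrative only, and the article gives no comparable figures for Vidu.

```python
# Figures as quoted in the article; treat them as rough, unverified estimates.
GPUS = 8            # Nvidia A100 GPUs reportedly needed by Sora
HOURS = 3.0         # "over three hours", so this is a lower bound (assumption)
CLIP_SECONDS = 60   # length of the one-minute clip


def gpu_hours(gpus: int, hours: float) -> float:
    """Total compute as the number of GPUs multiplied by wall-clock hours."""
    return gpus * hours


total = gpu_hours(GPUS, HOURS)
per_second = total / CLIP_SECONDS

print(f"At least {total:.0f} GPU-hours per one-minute clip")
print(f"At least {per_second:.1f} GPU-hours per second of video")
```

On these assumptions, a single one-minute clip costs at least 24 A100 GPU-hours, which helps explain why access to such chips matters so much for inference as well as training.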