05/06 2025
In early 2024, OpenAI's announcement of its Sora video generation model sent ripples through China's large model industry. Shortly afterward, Chinese players turned their attention to large video models: Kuaishou unveiled its Keling video generation model and ByteDance released its Doubao model, marking the start of a domestic AI video generation race.
Numerous companies have rushed into the market, eager to seize the initiative in this burgeoning sector, often at the expense of technological maturity and application standards.
The proliferation of AI-generated videos on social platforms is a case in point. Quan Hongchan's family was maliciously impersonated to attract traffic and sell goods, and celebrities such as Liu Xiaoqing and Zhang Xinyu have also reported AI-generated videos impersonating them on video accounts. These forged videos infringe on the rights and interests of the people they depict and, driven by commercial interests, carry false marketing that dupes unsuspecting consumers.
The field of AI video generation stands on the cusp of an application explosion. Players like Kuaishou and ByteDance aim to expand their products' reach through technological iteration, yet they also find themselves ensnared in the quagmire of disorderly competition.
The AI Video Track: Immature Yet Promising
While the AI video generation sector appears vibrant, it is still in its nascent stages of development.
Major players are investing heavily to push model performance indicators ever higher, from shorter generation times to better image quality, and the track appears to be entering a "technological breakthrough period." Even so, imperfections persist in how AI video technology is applied.
Kuaishou, for instance, recently released the Keling 2.0 video generation model and the Ketu 2.0 image generation model. Since its initial open beta on June 6, 2024, Keling AI has undergone over 20 rapid iterations, excelling in dimensions such as dynamic quality, semantic response, and image aesthetics.
Similarly, ByteDance's Seed team launched Seedream 3.0, a text-to-image model. According to the third-party ranking by Artificial Analysis, Seedream 3.0's overall performance rivals GPT-4o, the state-of-the-art (SOTA) text-to-image model, placing it in the global first tier.
However, these advancements also expose myriad issues within the AI video generation field.
On one hand, some companies overly prioritize the speed of technological iteration to distinguish themselves in the competition, neglecting the practical application scenarios of their products, resulting in numerous flaws in the real-world application of AI video generation technology.
For example, when a user inputs "Dunhuang Flying Apsaras," the AI might generate a visibly distorted image. When asked to produce science popularization videos, it often renders objects that hang in mid-air in violation of physical laws. MIT Media Lab tests reveal that current mainstream AI video models maintain complex narrative coherence with an accuracy rate of less than 63%, which severely limits the commercial application of AI video generation technology.
On the other hand, as the technology advances rapidly, copyright issues have increasingly come to the fore, becoming a significant "minefield" in the realm of AI video generation.
When AI integrates multiple copyrighted materials to create new content, the existing principle of "fair use" may no longer apply, exposing enterprises to considerable legal risk. From Getty Images suing Stability AI for using its copyrighted images to train models, to the Japan Cartoonists Association banning the use of AI to generate storyboards, to the world's first AI video copyright case, with a judgment of up to $230 million, copyright issues remain unresolved.
More importantly, despite significant progress in image quality and generation speed, current AI video generation technology still lags in the logical coherence of video content, the depth of emotional expression, and the understanding of complex scenes, failing to fully meet the market's demand for high-quality and diverse video content.
This underscores the lengthy and arduous journey for Kuaishou to establish Keling AI as the world's top-earning video generation AI application.
ByteDance and Kuaishou: A Race to Dominate
Currently, the AI video generation field is in turmoil, with ByteDance and Kuaishou, as industry giants, vigorously competing to establish their dominance. Their strategies differ, presenting a distinct competitive landscape.
Kuaishou puts commercialization first, building a diversified monetization model, and has achieved remarkable success in the AI video generation sector.
Technologically, Kuaishou's Keling AI demonstrates robust capabilities, providing a solid foundation for its commercial expansion. According to rankings released by Artificial Analysis, a globally renowned AI benchmarking organization, Kuaishou's Keling 1.6pro (high-quality mode) topped the image-to-video track with an Arena Elo score of 1000, surpassing strong competitors like Google Veo 2 and Pika Art.
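To put an arena Elo score in perspective, the sketch below converts a rating gap into an expected head-to-head preference rate. It assumes the standard Elo formula with a 400-point scale; the article does not describe Artificial Analysis's exact methodology, and the rival score of 950 is a hypothetical value chosen only for illustration.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of pairwise comparisons that A wins against B (standard Elo)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# 1000 is the score cited in the article; 950 is a hypothetical rival score.
leader_score = 1000.0
hypothetical_rival_score = 950.0

win_rate = elo_expected_score(leader_score, hypothetical_rival_score)
print(f"Expected preference rate for the leader: {win_rate:.1%}")
```

Under these assumptions, a 50-point lead corresponds to the leader being preferred in roughly 57% of pairwise comparisons, which shows that a first-place ranking can still reflect a fairly narrow margin.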
In terms of commercialization, Kuaishou leads the pack. Keling AI employs a diversified monetization model of "consumer subscriptions + enterprise API services + customized scenario solutions," both catering to individual users' creative needs and providing potent video generation tools for enterprise customers.
It is reported that the global user base of the Keling video large model has surpassed 22 million, attracting over 15,000 developers and enterprise customers worldwide to apply the Keling API in various industry scenarios. As of the end of February 2025, the cumulative operating revenue of Keling AI exceeded 100 million yuan.
ByteDance, on the other hand, showcases different ambitions and strategies, with its AI video generation layout focusing more on the exploration of AGI (Artificial General Intelligence).
Earlier this year, ByteDance's Doubao large model team internally established an AGI long-term research team codenamed "Seed Edge," encouraging project members to delve into longer-term, uncertain, and ambitious AGI research topics. The goal is to explore novel AGI methodologies and foster cross-modal and cross-team collaboration.
ByteDance's recently released Seaweed-7B model offers capabilities such as integrated audio-visual generation, long-shot generation, and real-time generation. The model can produce high-quality video content in just 25 seconds, with an inference speed 62 times faster than similar models. Its performance is comparable to Wan 2.1-14B, and in the text-to-video task, its Elo score ranks second only to the top model, Veo 2.
In contrast to Kuaishou's commercialization-first approach, ByteDance places greater emphasis on deep cultivation and breakthroughs at the technical level, aiming to achieve long-term development and overall leadership in the AI video generation field by tackling the ultimate challenge of AGI. Kuaishou must further invest in technological research and development while maintaining its commercialization advantages, enhancing the quality and efficiency of AI video generation, to avoid being overshadowed by ByteDance.
Bigger Goals, Greater Challenges
Amidst the surging wave of AI, Kuaishou aims to reshape its business landscape with the power of AI, with the AI video generation track being a crucial part of its grand vision.
According to Kuaishou's financial results, total revenue in 2024 increased by 11.8% year-on-year to 126.9 billion yuan, and adjusted net profit surged by 72.5% to 17.7 billion yuan. Revenue from both online marketing services and other services grew by more than 20% year-on-year. Behind these impressive results, AI has emerged as a pivotal growth engine.
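As a rough check on what these growth rates imply, the sketch below backs out the prior-year figures and the adjusted net margin from the numbers quoted above. The helper name and the derived 2023 values are illustrative, computed only from the article's rounded figures rather than taken from Kuaishou's filings.

```python
def prior_year_value(current: float, growth_pct: float) -> float:
    """Back out last year's value from this year's value and its year-on-year growth rate."""
    return current / (1.0 + growth_pct / 100.0)


# Figures quoted in the article, in billions of yuan.
revenue_2024, revenue_growth_pct = 126.9, 11.8
profit_2024, profit_growth_pct = 17.7, 72.5

# Implied 2023 values (derived, not reported in the article).
revenue_2023 = prior_year_value(revenue_2024, revenue_growth_pct)
profit_2023 = prior_year_value(profit_2024, profit_growth_pct)

# Adjusted net margin = adjusted net profit / total revenue.
margin_2024 = profit_2024 / revenue_2024 * 100
margin_2023 = profit_2023 / revenue_2023 * 100

print(f"Implied 2023 revenue: {revenue_2023:.1f} billion yuan")
print(f"Implied 2023 adjusted net profit: {profit_2023:.1f} billion yuan")
print(f"Adjusted net margin: {margin_2023:.1f}% -> {margin_2024:.1f}%")
```

On these figures, the implied 2023 base is about 113.5 billion yuan in revenue and about 10.3 billion yuan in adjusted net profit, so the adjusted net margin rose from roughly 9.0% to 13.9%. Profit grew far faster than revenue.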
Kuaishou's AI aspirations are ambitious, and the challenges it faces are comprehensive and formidable.
On one hand, Kuaishou intends to keep deepening the AI-driven transformation of its existing business ecosystem, unlocking new growth avenues for its online marketing services and e-commerce businesses.
Theoretically, AI can precisely analyze user needs and provide more effective marketing solutions for merchants. However, in practice, numerous hurdles exist. In content production, AI can only play an auxiliary role and cannot fundamentally address the issue of uneven content quality. In terms of operating costs, while the introduction of AI technology may reduce costs in certain aspects, it also incurs new expenses such as technology usage fees and data security maintenance costs.
On the other hand, Kuaishou is fully upgrading its AI monetization model, represented by Keling, and is attempting to position Keling AI as the AI infrastructure for the $120 billion global video creation market. The aim is a striking transformation from "China's second-largest short video platform" to "leader in the AI video ecosystem."
This is undoubtedly a high-stakes endeavor. Becoming infrastructure requires technical versatility, stability, and widespread market recognition. Kuaishou must invest substantial resources in technological research and development, ecosystem construction, and market promotion. Currently, technical standards within the industry are not yet unified, and the market landscape is fragmented. In promoting its own technology as an industry standard, Kuaishou will inevitably encounter resistance from various quarters.
Opportunities and challenges coexist. If Kuaishou fails to make profound adjustments in technological research and development, business strategy, and industry responsibility, its vision of becoming the "leader in the AI video ecosystem" may prove to be a mirage, and the company could ultimately be eliminated amid fierce market competition and industry transformation.