"King of Competition" Doubao is on the Table: Who Feels the Pressure?

01/03 2025

"The inability to stop price wars among big model vendors reflects anxiety about the future. In this arms race of big models, Doubao aims to perform a 'miracle through sheer force'." @TechNews Original

The price war in the big model sector has raged on for a year...

Just before the new year, Alibaba Cloud announced its third round of price cuts for big models in 2024, with prices for Tongyi Qianwen's visual understanding models dropping by over 80% across the board.

Similarly, at Volcano Engine's recent Force conference, besides the heavy promotion of Doubao, the most noteworthy development was another price drop. The input price for Doubao's visual understanding model is now 0.003 yuan per thousand tokens, meaning 1 yuan covers the processing of 284 images at 720p resolution.
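The "284 images for 1 yuan" figure can be sanity-checked with simple arithmetic. The sketch below is only a back-of-the-envelope check: it assumes a flat input price, ignores output tokens, and derives the per-image token count from the two quoted figures rather than from any official specification.

```python
# Back-of-the-envelope check of the "284 images for 1 yuan" claim.
# Both constants are figures quoted in the article; the per-image token
# count is only implied by them and is not an official number.
INPUT_PRICE_YUAN_PER_1K_TOKENS = 0.003
IMAGES_PER_YUAN = 284


def tokens_per_yuan(price_per_1k_tokens: float) -> float:
    """Number of input tokens that 1 yuan buys at the given price."""
    return 1000 / price_per_1k_tokens


def implied_tokens_per_image(price_per_1k_tokens: float, images_per_yuan: int) -> float:
    """Average tokens per image implied by the quoted images-per-yuan figure."""
    return tokens_per_yuan(price_per_1k_tokens) / images_per_yuan


budget = tokens_per_yuan(INPUT_PRICE_YUAN_PER_1K_TOKENS)
per_image = implied_tokens_per_image(INPUT_PRICE_YUAN_PER_1K_TOKENS, IMAGES_PER_YUAN)
print(f"Tokens per yuan: {budget:,.0f}")
print(f"Implied tokens per 720p image: {per_image:,.0f}")
```

Under these assumptions, 1 yuan buys roughly 333,000 input tokens, which works out to about 1,170 tokens per image.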

Previously, in May last year, the inference input price for Doubao's general model Pro-32k version was 0.0008 yuan per thousand tokens, a small fraction of a cent. This forced Alibaba Cloud to initiate a new round of price cuts for its three core Tongyi Qianwen models, with reductions of up to 90%. Baidu Intelligent Cloud was even more aggressive, announcing that two flagship products under its Wenxin large model line, ERNIE Speed and ERNIE Lite, would be fully free to use.

According to Tan Dai, President of Volcano Engine, "The market needs full competition, and cost reduction is a result of technological optimization. Only the best can survive." Clearly, in this arms race of big models, Doubao aims to perform a 'miracle through sheer force'.

However, amid ByteDance's aggressive push, doubts persist: Is Doubao's pricing really cheap enough? Why are big models engaging in price wars? Will pricing continue to be a focus for companies securing orders in the future?

01 Exaggerated Discounts? Full of Tricks

To understand the tactics of big model vendors, it's essential to understand their business models. According to a review by "Yuanchuan Technology Review," the services currently provided by various vendors can be broadly divided into three types:

The first is basic model inference services, which involve providing answers based on input information. Simply put, it's the process of "actually using" the model. Each vendor has different model standards for this part.

The second is model fine-tuning, where vendors charge according to customer needs based on token usage (tokens in the training text × number of training iterations). Billing is postpaid: the customer is charged for actual usage once training is complete.
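The fine-tuning billing rule described above can be sketched as a one-line formula. This is an illustration only: the function name, the example workload, and the price of 10 yuan per million tokens are hypothetical placeholders, not figures quoted by any vendor.

```python
def fine_tuning_cost(training_tokens: int, iterations: int,
                     price_yuan_per_million: float) -> float:
    """Postpaid fine-tuning fee: billed tokens = training tokens x iterations."""
    billed_tokens = training_tokens * iterations
    return billed_tokens / 1_000_000 * price_yuan_per_million


# Hypothetical job: a 2-million-token training set run for 3 iterations
# at an assumed (not quoted) price of 10 yuan per million tokens.
cost = fine_tuning_cost(training_tokens=2_000_000, iterations=3,
                        price_yuan_per_million=10.0)
print(f"Billed after training: {cost:.2f} yuan")
```

Because the number of iterations multiplies the bill, doubling the training passes doubles the fee even when the training text itself does not change.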

The third is model deployment, where a customer exclusively occupies a portion of computing resources, making them a major customer. Pricing is based on the computing resources consumed or the number of tokens used for model inference.

These three service types represent progressively deeper levels of big model adoption. The intense price cuts by tech companies are primarily focused on the first type of basic service, namely, the inference fees for standard model versions. This pricing is further divided into "input" and "output" components. Simply put, input refers to the content of the user's question, while output is the model's response.

When a big model is invoked, billing is usually based on the number of input and output tokens, each priced separately. This split leaves plenty of room for pricing tactics.

For example, Doubao's general model, Doubao Pro-32k, has an input price of "0.8 yuan per million tokens," which, according to official claims, is 99.3% cheaper than the industry average. Some mainstream models have also cut prices. Among Alibaba Cloud's three main Tongyi Qianwen models, Qwen-Turbo's price dropped by 85% to 0.3 yuan per million tokens, while the input prices of Qwen-Plus and Qwen-Max dropped by 80% and 50% to 0.8 yuan and 20 yuan per million tokens, respectively.
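Prices in this article are quoted both per thousand and per million tokens, which makes them easy to misread. A minimal conversion helper (the function name is an assumption for illustration) puts the two Doubao input prices quoted per thousand tokens on a per-million basis:

```python
def per_thousand_to_per_million(price_per_1k_tokens: float) -> float:
    """Convert a price per thousand tokens to a price per million tokens."""
    return price_per_1k_tokens * 1000


# Input prices quoted per thousand tokens earlier in the article.
quoted_per_1k = {
    "Doubao Pro-32k (input)": 0.0008,
    "Doubao visual understanding model (input)": 0.003,
}

for name, price in quoted_per_1k.items():
    converted = per_thousand_to_per_million(price)
    print(f"{name}: {converted:.1f} yuan per million tokens")
```

The 0.0008 yuan per thousand tokens announced in May and the "0.8 yuan per million tokens" quoted here are therefore the same price.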

However, output prices tell a different story. Doubao Pro-32k's output price of 2 yuan per million tokens is on par with competitors like Qwen-Plus and DeepSeek-V2, and even higher than some others like Qwen-Turbo and GLM-4-9B.
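To see why a headline cut to the input price alone can matter less than it appears, consider how a single request is billed. The sketch below takes 0.8 yuan (input) and 2 yuan (output) per million tokens for Doubao Pro-32k, as quoted above; the request size of 1,000 input and 1,000 output tokens is a hypothetical workload chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Per-million-token prices in yuan."""
    input_price: float
    output_price: float


def request_cost(pricing: TokenPricing, input_tokens: int, output_tokens: int):
    """Return (input_cost, output_cost) in yuan for one request."""
    input_cost = input_tokens / 1_000_000 * pricing.input_price
    output_cost = output_tokens / 1_000_000 * pricing.output_price
    return input_cost, output_cost


def output_share(pricing: TokenPricing, input_tokens: int, output_tokens: int) -> float:
    """Fraction of the total bill that comes from output tokens."""
    input_cost, output_cost = request_cost(pricing, input_tokens, output_tokens)
    return output_cost / (input_cost + output_cost)


# Prices as quoted in the article for Doubao Pro-32k.
DOUBAO_PRO_32K = TokenPricing(input_price=0.8, output_price=2.0)

# Hypothetical request: 1,000 tokens in, 1,000 tokens out.
share = output_share(DOUBAO_PRO_32K, input_tokens=1_000, output_tokens=1_000)
print(f"Output tokens account for {share:.0%} of the bill")
```

For this workload, output tokens make up roughly 71% of the bill, so the part of the price that was not cut dominates the cost. Workloads with long prompts and short answers would shift the balance back toward input.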

Looking at Doubao's latest visual understanding model, Doubao-vision-pro-32k, the input price is 3 yuan per million tokens, roughly 0.4 USD, while the output price is 9 yuan, approximately 1.23 USD. According to Doubao, this price is 85% cheaper than the industry average.

However, when compared to direct competitors: Alibaba's multimodal model Qwen-VL series matches its price after recent reductions; the multimodal Gemini 1.5 Flash model quotes 0.075 USD per million input tokens and 0.3 USD per million output tokens, with additional discounts for smaller contexts (less than 128k); GPT-4o mini charges 0.15 USD for input and 0.6 USD for output.
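The comparison is easier to follow as ratios. The sketch below uses only the approximate USD list prices quoted above and says nothing about relative model quality; the dictionary layout and function name are illustrative assumptions.

```python
# USD per million tokens as (input, output), as quoted in the article.
PRICES_USD = {
    "Doubao-vision-pro-32k": (0.40, 1.23),
    "Gemini 1.5 Flash": (0.075, 0.30),
    "GPT-4o mini": (0.15, 0.60),
}


def price_ratio(model: str, baseline: str):
    """Return (input_ratio, output_ratio) of model price to baseline price."""
    model_in, model_out = PRICES_USD[model]
    base_in, base_out = PRICES_USD[baseline]
    return model_in / base_in, model_out / base_out


for baseline in ("Gemini 1.5 Flash", "GPT-4o mini"):
    in_ratio, out_ratio = price_ratio("Doubao-vision-pro-32k", baseline)
    print(f"vs {baseline}: input {in_ratio:.1f}x, output {out_ratio:.1f}x")
```

On these list prices, Doubao's visual model costs roughly four to five times as much as Gemini 1.5 Flash and about two to nearly three times as much as GPT-4o mini, despite the claim of being 85% below the industry average.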

Doubao is not alone: other domestic vendors use similar pricing "tricks." For example, the ERNIE-Speed-8K model that Baidu announced as free carries a fee of 5 yuan per million tokens when actually deployed. Similarly, Alibaba's Qwen-Max, like ByteDance's Doubao general model Pro-32k, cut only the input price.

It's worth noting that reductions in standard model inference prices can indeed reduce costs for small and medium developers. However, for slightly more advanced usage involving model fine-tuning and deployment, these services have not been the main focus of price wars and have not seen significant price reductions.

In simple terms, the most significant price cuts by vendors have been on lightweight pre-configured models. In contrast, the actual price reductions for more powerful "super-sized" models are not as exaggerated. For instance, the fine-tuned Doubao-pro series models are priced at 50 yuan per million tokens, higher than the flagship models from vendors like Alibaba and Tencent.

The intense price wars initiated by major vendors resemble a free-to-play online game: various tactics and gameplay elements draw players in, but to become stronger, one must pay. Even so, these large companies have poured significant resources into the effort. So why do vendors focus so much on pricing?

02 To Succeed, the Heat Must Not Subside

Among big model vendors, ByteDance is certainly not among the fastest starters. Earlier this year, ByteDance CEO Liang Rubo mentioned the word "sluggish" in an internal speech, pointing out the company's lower sensitivity to big models compared to startups.

"It wasn't until 2023 that we started discussing GPT, whereas industry-leading big model startups were founded between 2018 and 2021," he said.

As a latecomer, ByteDance needs to compete aggressively, and it has been doing just that since the middle of this year, generating wave after wave of buzz.

Beyond the discounts aimed at enterprise (B-end) customers described above, Doubao has also made a full-fledged push into the consumer (C-end) market.

For the C-end market, Doubao can be seen both online and in public places. According to "Lianxian Insight" citing AppGrowing statistics, as of November 15, among the ten domestic AI-native applications, Kimi and Doubao were the two most heavily invested in advertising, with investments of 540 million yuan and 400 million yuan, respectively.

Over a longer timeframe, Doubao's advertising investment has been even more intense. According to AppGrowing statistics, from April to May 2024, Doubao's advertising investment was estimated at 15-17.5 million yuan. In early June, Doubao launched another large-scale advertising campaign with an investment of up to 124 million yuan.

In addition to advertising, Doubao also benefits from Douyin's user base. ByteDance has essentially blocked all AI applications other than Doubao from advertising on Douyin. The goal is clear: to thoroughly address the "user anxiety" surrounding big model applications.

However, reality often diverges from expectations. According to "Intelligent Emergence" reports, ByteDance internally reflected that user activity on Doubao is not high. Users are active on Doubao only 2-3 days a week, sending around 5-6 messages per day at about 2 minutes each, for an average usage time of only about 10 minutes. These figures have not shown significant growth over the past year.

In simple terms, although advertising spending with little regard for cost has made Doubao the AI software with the largest number of domestic users, it still does not qualify as a killer app.

ByteDance management believes that AI conversation products like Doubao may only be an "intermediate state" of AI products. Internally, ByteDance judges that a paid subscription model is unlikely to work in China. Additionally, the low session duration and message frequency result in limited potential advertising space, creating an invisible ceiling for such products.

Therefore, in the long run, products with lower barriers to entry and a more "multimodal" form have greater potential for real-world adoption. CapCut and Dreamina (Jimeng) may be suitable entry points, which is why Doubao focused part of its efforts on video models at this conference.

From a user perspective, according to "Caijing Magazine" reports, most users pay for products and services that provide value, and that value goes beyond solving specific problems such as improving work efficiency or providing emotional support. The market also rewards a type of value that "aligns with policy directions." More importantly, the ability to identify specific customers and deliver to them is crucial, and it tests AI companies' capabilities beyond technology and products. Often, this ability helps AI companies grow more than technical prowess does.

The AI market in China differs from that in the United States, making it difficult to penetrate the market through platform-based software sales. Instead, commercialization often relies on securing individual projects and engineering contracts, the sources of which are often related to a company's popularity.

"When a mature enterprise is deploying big models, it's unlikely to consider an immature product or company. Disregarding cost, well-known brands are often the first choice, not only due to technical trust but also trust in overall service and quality," a technology enterprise manager told "TechNews." "After all, small companies still carry risks. It's like buying a car; if the car company goes bankrupt while you're driving it, that would be a significant loss."

Startups create buzzworthy news mainly to secure funding and survive. In contrast, Doubao, which already has a powerful backer, aims to use its popularity to attract and retain more customers. However, it's an unspoken fact in the industry that regardless of who you are or how advanced your technology is, maintaining popularity is crucial. After all, even the best wine can go unnoticed in a secluded alley.

03 Elimination Round, Potentially Ending Price Wars

In fact, not just Doubao but all second-tier and lower-tier big model vendors are currently spending money to attract traffic and retain users. Behind this relentless show of competition lies a furious pace of product development and research, signaling that the elimination round for big model service providers, one aimed at "bursting the bubble", has resumed.

The year 2024 has already seen an elimination round, leaving only about 10% of big models in the finals, making the industry structure more rational.

However, this is not the end but the beginning. In the view of "TechNews," the focus of the new elimination round will no longer be price but technology.

Currently, tech companies are gradually realizing that simply launching a free application does not directly benefit the company. The C-end user base is hard to grow, and customer acquisition costs have significantly increased. It's more important to directly reach B-end customers willing to pay, such as those in the financial, government, and automotive industries.

However, when a large number of companies concentrate on a particular industry, a protracted price war often ensues as each company strives to establish a benchmark customer to pave the way for future market expansion. A simple and brutal price war can cause some companies to exit voluntarily or involuntarily, with prices returning to normal once the market stabilizes.

The contradiction lies in the fact that everyone wants to enter "lucrative" fields. In a prolonged price war, technological cost becomes the key to victory. Simply put, for the same solution at the same quoted price, the company with the lower technological cost loses less on every deal and can survive longer.
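The survival logic can be illustrated with a toy calculation. Every number below is hypothetical and chosen only to show the mechanism: two vendors quote the same price and serve the same volume, but differ in what it costs them to serve a million tokens.

```python
def months_of_runway(cash_yuan: float, monthly_volume_millions: float,
                     price_per_million: float, cost_per_million: float) -> float:
    """Months a vendor can keep selling below cost before its cash runs out."""
    loss_per_million = cost_per_million - price_per_million
    if loss_per_million <= 0:
        # Selling at or above cost: the price war does not drain cash.
        return float("inf")
    monthly_loss = loss_per_million * monthly_volume_millions
    return cash_yuan / monthly_loss


# Hypothetical scenario: same cash, same volume, same quoted price.
CASH = 10_000_000          # yuan set aside for the price war (assumed)
VOLUME = 1_000_000         # millions of tokens served per month (assumed)
PRICE = 1.0                # yuan per million tokens quoted by both (assumed)

low_cost_vendor = months_of_runway(CASH, VOLUME, PRICE, cost_per_million=1.5)
high_cost_vendor = months_of_runway(CASH, VOLUME, PRICE, cost_per_million=2.0)
print(f"Low-cost vendor lasts {low_cost_vendor:.0f} months")
print(f"High-cost vendor lasts {high_cost_vendor:.0f} months")
```

In this toy scenario, halving the loss per million tokens doubles the time a vendor can stay in the fight, which is why cost efficiency rather than the list price itself decides who is left standing.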
