04/03 2026

Source | Bohu Finance (bohuFN)
Since the beginning of the year, a 'lobster craze' has swept the nation, with open-source AI agents like OpenClaw quickly gaining popularity. Users worldwide are busy 'employing' digital workers to handle tasks, leading to an exponential surge in Token consumption.
In response, NVIDIA CEO Jensen Huang suggested paying salaries in Tokens, while Alibaba and Tencent have started offering Tokens as employee benefits. While this may sound fantastical, with Token prices soaring, the prophecy of 'computing power as compensation' is nearing reality.
Recently, Alibaba Cloud, Baidu Cloud, and Tencent Cloud have raised prices for AI computing power, storage, and related products, with increases exceeding 30% in some cases. In January, Amazon AWS and Google Cloud had already initiated a round of price hikes, breaking the cloud computing industry's tradition of 'only decreasing, never increasing' pricing.
The pricing rationale among cloud giants is remarkably consistent: demand for computing power continues to rise, while costs for core hardware and related infrastructure have surged significantly. Simply put, the supply of computing power can no longer keep pace with consumption.
The National Data Bureau noted that China's daily Token invocation volume reached 100 billion in early 2024, soaring past 140 trillion by March of this year, a more than thousandfold increase in roughly two years. A 'Token revolution' is taking shape.
As competition in the AI era shifts from 'model competition' to 'computing power competition,' tech giants are accelerating their strategic realignments. Those who can 'burn' Tokens more efficiently will gain pricing power in future commerce.
01 Cloud Providers Collectively Raise Prices
Why have Tokens suddenly become so valuable? First, let's clarify what a Token is.
A Token is the smallest 'computational unit' when AI processes information. When we submit a sentence, code snippet, or image to AI, it is broken down into individual Tokens, which large models then interpret, predict, and generate.
In simpler terms, Tokens can be likened to 'kilowatt-hours' in a power plant—the more electricity (large model usage) consumed, the higher the cost (Token consumption).
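To make the 'smallest computational unit' idea concrete, here is a minimal sketch in Python. It uses a naive word-and-punctuation splitter purely for illustration; production models use subword tokenizers such as BPE, so real token counts will differ:

```python
import re

def naive_tokenize(text: str) -> list[str]:
    # Real LLMs use subword tokenizers (e.g. BPE); splitting on
    # words and punctuation here is only a rough illustration
    # of how input text becomes a sequence of billable units.
    return re.findall(r"\w+|[^\w\s]", text)

prompt = "Agents burn Tokens fast: every call is metered."
tokens = naive_tokenize(prompt)
print(len(tokens), tokens[:5])
```

Every submitted prompt and every generated reply is metered in units like these, which is why usage maps so directly onto cost, much like kilowatt-hours.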
However, Tokens were initially inexpensive or even free.
By late 2022, ChatGPT had opened the door to artificial general intelligence (AGI) with its large language model. Over the following year, domestic and foreign tech giants rushed to develop their own general-purpose large models, bringing Token consumption into the spotlight.
Based on parameter and Token counts, Guosheng Securities estimated that a single GPT-3 training run costs approximately $1.4 million, while larger LLMs run from $2 million to $12 million per training run.

Despite these costs, in 2024, Alibaba, ByteDance, Baidu, and other giants not only offered free C-end services but also ignited a brutal price war in the B-end market, slashing API call prices from 'cents' to 'millicents.'
During the initial surge of large models, the industry consensus was that computing power would become increasingly affordable, even evolving into an internet infrastructure akin to broadband today.
Building on this assumption, giants adhered to internet-era thinking, treating model capabilities as entry-point resources. They aimed to attract developers and enterprises with ultra-low prices, seeking first-mover advantages in commercialization.
But the narrative didn't unfold as expected. The 'Hundred-Model War' ended abruptly after just a year, as giants realized that generative dialogue offered limited commercial value. Large models needed to be deployed in vertical scenarios to unlock greater competitiveness.
However, as AI shifted from 'training' to 'inference,' every conversation, generation, and deduction required new computations, meaning Token demand no longer grew linearly but exponentially.
The emergence of programming agents like Claude Code and OpenClaw further intensified Token demand. These agents can work around the clock, with each generating hundreds or thousands of sub-agents to handle different tasks.
Developers report that transitioning from chatbots to agents amplifies computational consumption per task by 30–100 times, potentially exceeding 1,000 times in extreme scenarios.
For example, a student using AI to write a 7,500-word paper might consume around 10,000 Tokens if no revisions are needed. By that measure, a pure text conversation burning a million Tokens a day already counts as heavy usage.
Yet an agent performing even a simple task can trigger the consumption of millions of Tokens. Some users report burning 50 million Tokens running OpenClaw for half a day, while others claim to spend thousands of dollars a month programming with it.
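A back-of-the-envelope sketch makes the gap between chat-scale and agent-scale usage concrete. The per-million-Token price below is a hypothetical placeholder, not any provider's actual rate:

```python
def monthly_cost(tokens_per_day: int, price_per_million: float, days: int = 30) -> float:
    """Estimate monthly spend from daily Token burn.

    price_per_million is an assumed USD price per 1M Tokens,
    not a quote from any real provider.
    """
    return tokens_per_day / 1_000_000 * price_per_million * days

# A chat user at ~1M Tokens/day vs. an agent at ~100M Tokens/day,
# both at a hypothetical $2 per million Tokens:
print(monthly_cost(1_000_000, 2.0))    # chat-scale usage
print(monthly_cost(100_000_000, 2.0))  # agent-scale usage
```

At these assumed rates, a 100x jump in daily Token burn turns a tens-of-dollars monthly bill into a thousands-of-dollars one, consistent with the user reports above.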

As demand for agents spreads to the broader public, users have learned to shop around, favoring cheaper domestic models that now dominate global developer communities and rank among the biggest beneficiaries of this Token boom.
Recently, multiple media outlets cited insiders claiming Kimi's cumulative revenue over the past 20 days exceeded its total income for all of 2025, while the newly listed MiniMax and Zhipu saw their stock prices soar to new highs.
However, with users flooding in domestically and abroad, Kimi frequently displays 'insufficient computing power during peak hours' warnings, while MiniMax directly imposed traffic limits...

Ultimately, these smaller domestic model companies lack their own GPUs and must buy computing power at prices set by cloud providers like Alibaba Cloud, Tencent Cloud, and Volcano Engine, which are therefore the ones positioned to profit from the 'lobster craze.'
Yet computing power remains costly, with GPU markets in short supply. When Token revenue growth fails to outpace data center construction costs, price hikes become inevitable.
02 The Race to Become 'Token Factories'
Over the past year, surging Token demand has multiplied cloud providers' revenues several times over.
According to multiple reports, Volcano Engine's revenue topped RMB 20 billion in 2025, up from over RMB 12 billion in 2024. Alibaba's Q3 FY2026 earnings showed cloud revenue growth accelerating to 36%, its fastest pace in three years, with AI-related product revenue posting triple-digit growth for the tenth consecutive quarter.
But we must consider both revenue and investment. Alibaba's 2025 capital expenditures reached RMB 123.8 billion, while the Financial Times reported ByteDance's capital spending at around RMB 150 billion. How long will it take to recoup these costs through 'computing power sales' alone?
Tech giants are unwilling to remain mere 'shovel sellers.' Since Tokens are the 'hard currency' of future AI competition, they aspire to control the entire Token lifecycle—production, scheduling, and monetization—becoming the 'utilities' of the intelligence era.
At NVIDIA's GTC 2026 conference, Jensen Huang outlined a new vision: transforming NVIDIA from a 'chip company' into an 'AI infrastructure and factory company.' Instead of merely selling chips, NVIDIA aims to become a 'Token factory.'
NVIDIA plans to acquire Groq to bolster inference capabilities, while its newly released Vera CPU enables real-time scheduling and coordination of agentic AI systems. Combined with CUDA's ecosystem moat and partnerships with power giants to build next-generation AI factories, all Tokens, from creation to consumption, would flow through NVIDIA's 'Token factories.'
Under Huang's tiered pricing strategy, Token prices will directly correlate with computing power output—greater capability and speed command higher prices, with NVIDIA controlling the roadmap and pricing power.
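The article does not spell out the tier structure, but the idea that 'greater capability and speed command higher prices' can be sketched as a simple lookup. The tiers and prices below are illustrative assumptions, not NVIDIA's actual pricing:

```python
# Hypothetical tiers: (min tokens/sec throughput, USD per 1M Tokens).
# These numbers are illustrative assumptions, not NVIDIA's pricing.
TIERS = [
    (1000, 8.0),  # premium: fastest inference, highest price
    (200, 4.0),   # standard
    (0, 1.5),     # economy: batch / off-peak workloads
]

def price_per_million(tokens_per_sec: float) -> float:
    # A faster (more capable) tier commands a higher per-Token price.
    for min_speed, price in TIERS:
        if tokens_per_sec >= min_speed:
            return price
    return TIERS[-1][1]

print(price_per_million(1500))  # lands in the premium tier
print(price_per_million(50))    # lands in the economy tier
```

The commercial point is that whoever defines the tier boundaries and the price at each tier effectively holds the roadmap and the pricing power.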

Alibaba shares a similar strategic direction but aims beyond being a Token supplier and infrastructure operator—it seeks to explore new businesses to drive greater Token consumption.
Recently, Alibaba established the Alibaba Token Hub (ATH) business group, led directly by Wu Yongming, with the core objectives of 'creating, delivering, and applying Tokens' to further strengthen AI business synergies.
Alibaba then launched 'Wukong,' an enterprise-grade AI-native work platform targeting the B-end market. Initially covering ten scenarios, including e-commerce, design, manufacturing, and finance, Wukong functions as a digital workforce that can embed itself into enterprise workflows.
Chen Hang, founder of DingTalk, who recently rejoined Alibaba, noted that the value of Token economics lies not in replacing humans with machines but in converting vast human resource costs into digital productivity amplified 10–100 times.
This aligns with the earlier goal of improving operational efficiency through SaaS (Software as a Service). However, SaaS mainly keeps operations running, whereas Wukong, built on MaaS (Model as a Service), aims to drive quantifiable business growth. Once enterprises anchor their computing power and data within Alibaba's ecosystem, migration costs rise, forming a core moat.

ByteDance's Volcano Engine harbors similar ambitions, prioritizing MaaS to drive breakthroughs across its cloud business.
Volcano Engine's advantage lies in ByteDance's national-scale apps like Douyin and Jianying, which allow image and video generation models to be validated and iterated rapidly in scenarios with hundreds of millions of users, creating a closed loop of 'computing power, models, and scenarios.'
For instance, the Seedance 2.0 model quickly gained traction with its breakthrough capabilities, attracting a massive user base and achieving deployment at scale in professional scenarios like animated dramas, directly driving a surge in Volcano Engine's Token consumption.
According to LatePost, ByteDance's Volcano Engine processes over 100 trillion Token calls daily, making it the third cloud provider in the world to surpass this milestone.
Other domestic cloud providers are hardly idle, either. Alibaba CEO Wu Yongming stated that the newly formed ATH business group aims to capture 80% of China's AI cloud market growth in 2026.
Currently, Alibaba Cloud leads China's AI cloud market with a 35.8% share, followed by Volcano Engine at 14.8%. The battle for Tokens is imminent.

03 Tokens as 'Circulating Currency'
Despite differing approaches, tech giants' strategic layouts converge on one truth:
AI-era competition extends beyond model capabilities: the giants are vying for control over the entire system of Token production, distribution, and application. Whoever controls Token pricing will dominate the most valuable productive assets of the AI era.
Previously, tech giants competed for user attention; in the future, they will focus on where Tokens originate, flow, and how they are consumed.
First, producing high-quality Tokens at lower costs. While data, algorithms, and scale were once the internet industry's 'moats,' giants must now embrace 'heavy assets,' prioritizing computing infrastructure as a new strategic frontier.
China's affordable and stable electricity supply, coupled with breakthroughs in chips and model architectures by tech firms, provides inherent advantages for global computing power expansion. This round of collective cloud provider price hikes signals their intent to capture profit margins in global markets.
Second, seizing Token distribution rights. Giants control large model technologies, converting underlying computing power into standardized, callable Token services, further lowering barriers for users to access model capabilities.
For example, Alibaba's 'Wukong' targets enterprise work platforms, while WeChat introduced plugins supporting OpenClaw integration. Once users deploy these tools, continuous Token consumption follows, as giants compete for new 'Token entry points.'
Finally, stimulating Token consumption. Possessing Tokens alone doesn't close the commercial loop; embedding model capabilities into real business scenarios drives sustainable Token consumption.
Long-term success for 'Token factories' hinges on 'consumption-output' conversion efficiency. The goal isn't selling one-time computing power but binding Token consumption to customer value, creating stable, sustainable commercial revenue.
Yet rules evolve. AI's development has far outpaced human expectations, and no one could have predicted OpenClaw's sudden rise years ago.
Thus, whether tech giants, vertical model companies, or agent tool developers, all must continue advancing in their respective domains while addressing gaps.
For giants to transform AI from an abstract concept into a tangible business, the competition will no longer hinge solely on parameter races or Token consumption. Ultimately, it will follow a fundamental logic: AI that ordinary people can use effectively is the most valuable AI.
The true commercial entry point to the AI era may feel near—or still distant.
The cover image and illustrations belong to their respective copyright holders. If any copyright owner believes their work should not be publicly displayed or used without compensation, please contact us promptly, and our platform will rectify the issue immediately.