Doubao Blocked vs. Silicon Valley Alliance: Who's Undermining China's Trillion-Dollar AIoT Market?


12/19 2025

This marks my 396th column article.

On December 1, Doubao Mobile made its debut. ByteDance had finally played its ace in AI hardware. But the game had barely begun before the table was overturned.

The following day, users attempted to use Doubao's AI agent to operate WeChat. Tencent's backend swiftly raised an alarm: accounts were flagged for "abnormal login environments" and were forced offline, with some accounts temporarily suspended. Alibaba followed suit: on platforms like Taobao, Xianyu, Damai, and others, Doubao's automated operations frequently triggered human-machine verification, leading to crashes and forced logouts. Banks were even more stringent: the Agricultural Bank of China and China Construction Bank completely blocked agent logins and payment channels, citing "risky environments." This signified a collective "immune rejection" by China's internet sector.

On December 5, the Doubao team announced restrictions on agent operations in areas such as scoring, incentives, financial payments, and certain gaming scenarios. In simpler terms, it was a strategic retreat. From launch to compromise, less than five days had elapsed.

Four days later, a starkly contrasting signal emerged from across the Pacific.

On December 9 (local time), Anthropic announced that it was donating the Model Context Protocol (MCP) to the Agentic AI Foundation (AAIF), newly formed under the Linux Foundation. This meant MCP was no longer a proprietary asset of a single company but would become a neutral, open standard. Anthropic made a strategic choice: relinquish exclusivity in exchange for industry consensus.

On one hand, there was containment and blockade; on the other, open collaboration. Together, these events unveiled the core contradiction in current AI development.

Doubao's GUI agent approach represents a form of unauthorized "digital parasitism." It circumvents app-built barriers by simulating human clicks to access services. In the short term, this seems to bypass interface limitations; however, fundamentally, it's a crude violation of platform data sovereignty. From WeChat's or Taobao's perspective, this isn't technological innovation but traffic hijacking—an undeclared attack.

AI agents demand "most-favored-nation treatment" for data access, while platforms perceive this as "forced entry."

The positions of the two sides are irreconcilable; conflict is inevitable.

More critically, this "visual recognition + simulated clicks" approach is a dead end. Without foundational protocol support, AI agents can only function as "hackers," clashing with apps' anti-scraping and anti-cheat mechanisms. Phones, with substantial computing power and frequent OTA updates, might barely keep up with app changes. For the broader class of AIoT devices (smart glasses, speakers, appliances), the situation is catastrophic. Imagine that your smart fridge relies on simulated clicks to order takeout. One day, Meituan updates its UI and moves a button by a few pixels. The fridge goes "blind." This isn't hypothetical; it's the inevitable fate of the GUI agent route. An AIoT ecosystem built on "cracking" and "simulation" would be fatally fragile. Clearly, this path is unsustainable.
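The fragility described above can be sketched in a few lines of Python. This is a toy model, not how Doubao or any real GUI agent is implemented: the `Button` class, the two layouts, and the recorded tap coordinates are all hypothetical, chosen only to show how a tap memorized against one layout misses after a small UI shift, while a call by service name is unaffected.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Button:
    """A rectangular on-screen button (hypothetical toy model)."""
    label: str
    x: int
    y: int
    width: int = 80
    height: int = 40

    def contains(self, px: int, py: int) -> bool:
        """True if the point (px, py) falls inside this button's rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def tap(screen: List[Button], px: int, py: int) -> Optional[str]:
    """Return the label of the button under the tap, or None if nothing is hit."""
    for button in screen:
        if button.contains(px, py):
            return button.label
    return None


# Hypothetical layouts: the update shifts the "order" button 30 pixels right.
LAYOUT_V1 = [Button("order", x=100, y=500)]
LAYOUT_V2 = [Button("order", x=130, y=500)]

# Coordinates the agent memorized while the app still used LAYOUT_V1.
RECORDED_TAP = (120, 510)

# A named interface does not depend on where anything is drawn.
SERVICES: Dict[str, Callable[[], str]] = {"order": lambda: "order placed"}


if __name__ == "__main__":
    print(tap(LAYOUT_V1, *RECORDED_TAP))  # hits the button before the update
    print(tap(LAYOUT_V2, *RECORDED_TAP))  # misses after the update
    print(SERVICES["order"]())            # same result under either layout
```

Real GUI agents locate buttons by visual recognition rather than fixed coordinates, which tolerates small shifts better, but the dependency on an undocumented, changeable surface is the same.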

Silicon Valley's Alliance: Ending Chaos with Protocols

While domestic firms grapple with the legality of "simulated clicks," Silicon Valley has shifted its strategy.

On December 9, the Linux Foundation announced the Agentic AI Foundation (AAIF). The member list is noteworthy: while AWS, Google, and Meta were expected, OpenAI and Anthropic, fierce rivals in large models, now sit at the same table.

This isn't a routine industry alliance but a reshaping of interest landscapes.

The "handshake" wasn't driven by idealism but by pragmatic cost-benefit analysis: in the agent era, the intelligence advantage of any single model is plateauing, and interoperability is the real bottleneck. If every AI must develop custom interfaces for thousands of SaaS apps, or resort to brute-forcing frontends as Doubao did, the industry's marginal costs become unsustainable.

The giants realized: the ecosystem value unlocked by interoperability far exceeds monopoly gains from closed systems. Instead of building moats, they'd expand the pie together.

The first fruit of this consensus is Anthropic's donated MCP (Model Context Protocol).

MCP addresses a fundamental issue: how do large models connect to external data? Previously, integrating models with local files, databases, or Slack required custom code for each source—tedious, costly, and unstable. MCP standardizes these connections: one interface for all data sources. Models and data are decoupled.
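To make "one interface for all data sources" concrete, here is a minimal sketch of the message shape MCP uses. MCP is built on JSON-RPC 2.0, and a server exposes its capabilities through methods such as `tools/list` and `tools/call`. Everything else below is an assumption for illustration: the `read_file` and `query_db` tools, their stub data, and the in-process `handle` dispatcher. It is not the official SDK, and a real server also performs an initialization handshake and runs over a transport such as stdio or HTTP.

```python
import json
from typing import Any, Dict


def read_file(path: str) -> str:
    """Stub file source (hypothetical data)."""
    return {"notes.txt": "buy milk"}.get(path, "")


def query_db(table: str) -> str:
    """Stub database source (hypothetical data), returned as JSON text."""
    return json.dumps({"orders": [{"id": 1, "item": "milk"}]}.get(table, []))


# Two very different sources described the same way: name, description, schema.
TOOLS: Dict[str, Dict[str, Any]] = {
    "read_file": {
        "handler": read_file,
        "description": "Read a local text file",
        "inputSchema": {"type": "object",
                        "properties": {"path": {"type": "string"}},
                        "required": ["path"]},
    },
    "query_db": {
        "handler": query_db,
        "description": "Return all rows of a table",
        "inputSchema": {"type": "object",
                        "properties": {"table": {"type": "string"}},
                        "required": ["table"]},
    },
}


def error(req_id: Any, code: int, message: str) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 error response."""
    return {"jsonrpc": "2.0", "id": req_id,
            "error": {"code": code, "message": message}}


def handle(request: Dict[str, Any]) -> Dict[str, Any]:
    """Dispatch one JSON-RPC request to the matching tool."""
    req_id = request.get("id")
    method = request.get("method")
    if method == "tools/list":
        result = {"tools": [{"name": name,
                             "description": tool["description"],
                             "inputSchema": tool["inputSchema"]}
                            for name, tool in TOOLS.items()]}
    elif method == "tools/call":
        params = request.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return error(req_id, -32602, "Unknown tool")
        text = tool["handler"](**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return error(req_id, -32601, "Method not found")
    return {"jsonrpc": "2.0", "id": req_id, "result": result}


if __name__ == "__main__":
    call = {"jsonrpc": "2.0", "id": 2, "method": "tools/call",
            "params": {"name": "read_file", "arguments": {"path": "notes.txt"}}}
    print(json.dumps(handle(call), indent=2))
```

The model side only needs to know how to send `tools/list` and `tools/call`; adding a third data source means adding one entry to `TOOLS`, not writing a new integration.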

The AAIF's "founding projects" extend beyond MCP: OpenAI donated AGENTS.md, and Block contributed goose, an open-source framework for building agents and workflows.

If MCP is the USB-C port of AI, AGENTS.md is the user manual for AI: a plain Markdown file that tells an agent how a project is set up, which commands to run, and which conventions to follow. Combined with A2A (Agent2Agent), the open protocol Google originated for communication between agents, developers gain a complete toolchain from connection to cognition to execution.

The intent is clear: upgrade agent interactions from "guerrilla warfare" to "organized forces."

Doubao's GUI agents rely on visual recognition and simulated clicks, essentially tinkering with app surfaces—fragile, inefficient, and legally risky. MCP-based interactions use API pipelines to access core data directly, with clear paths and accountability.

Silicon Valley is drafting not just technical specs but AI's foundational communication protocols. Just as TCP/IP defined internet data transmission, MCP aims to define a universal language for AI to understand and operate the external world.

The Deadlock Behind 70% Penetration

According to the "AI+" action plan, China's national timeline is set: by 2027, penetration of next-gen smart terminals and agents will exceed 70%; by 2030, it will surpass 90%. These are not aspirations but mandatory targets.

The question is: how?

If Xiaomi's air conditioner can't follow Baidu's commands or Huawei's phone can't access Alibaba's services, "penetration" becomes a cluster of isolated islands: not adoption but internal friction.

The reality is: hardware makers build walls to lock users into device ecosystems; internet giants dig moats to hoard data. Each side fortifies its bastion, fragmenting the ecosystem.

Fragmentation is the biggest obstacle to AIoT scale.

Worse, this internal divide faces external pressure. The U.S. has unified via AAIF; if China fails to offer an equivalent standard, it risks two traps.

Trap 1: Directly adopting MCP.

This seems convenient, but with data sovereignty and U.S.-China tech decoupling intensifying, surrendering control over foundational interaction protocols is dangerous. Protocol standards are never neutral; they dictate data flow, access rights, and exclusions.

Trap 2: Isolated competition.

If Alibaba, Tencent, and Huawei each create proprietary protocols, developers face redundant work, inflating R&D costs and slowing iterations—ultimately delaying industry-wide adoption.

Between "being defined by others" and "self-inflicted chaos," China's AIoT options are narrowing.

The Window to Break Free is Closing

A standards vacuum doesn't last forever. China must define its own rules or be defined by others.

The path is clear: China needs its own agent interconnection protocol (tentatively CN-MCP). The biggest hurdle isn't technology but leadership. If Baidu leads, Tencent won't follow; if Huawei sets the standard, Xiaomi may reject it. Any giant-led standard will be seen as "self-serving" and lack industry trust.

The only viable route is a national industry alliance or neutral open-source foundation to leverage credibility and break down barriers.

Even if leadership is resolved, CN-MCP can't mirror the U.S. model. China's ecosystem differs: the U.S. internet is Web- and SaaS-driven and relatively open, so AI agents can retrieve data directly through web pages and APIs. China's services are concentrated in super-apps like WeChat, Douyin, and Meituan, which are black boxes of mini-programs and native apps that are inaccessible from outside.

Thus, CN-MCP must solve not just "connection" but "service atomization." Instead of relying on simulated clicks to operate apps (a proven dead end), it must push super-apps to decompose internal functions into standardized, externally callable interfaces. Meituan's ordering, Ctrip's services, WeChat's chat, 12306's ticketing—all should become atomic services directly usable by AIoT devices.
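"Service atomization" can be pictured as a registry of small, named functions with declared parameters that any device may call. The sketch below is purely illustrative: CN-MCP does not exist, and the `ServiceRegistry` class, the service names (`food.place_order`, `rail.query_tickets`), and the stub handlers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class AtomicService:
    """One externally callable function carved out of a super-app."""
    name: str
    required: Tuple[str, ...]
    handler: Callable[..., Dict[str, Any]]


class ServiceRegistry:
    """Hypothetical directory of atomic services shared by all devices."""

    def __init__(self) -> None:
        self._services: Dict[str, AtomicService] = {}

    def register(self, service: AtomicService) -> None:
        if service.name in self._services:
            raise ValueError(f"duplicate service: {service.name}")
        self._services[service.name] = service

    def call(self, name: str, **args: Any) -> Dict[str, Any]:
        """Validate the declared parameters, then invoke the service."""
        service = self._services.get(name)
        if service is None:
            raise KeyError(f"unknown service: {name}")
        missing = [p for p in service.required if p not in args]
        if missing:
            raise ValueError(f"missing parameters: {missing}")
        return service.handler(**args)


# Stub handlers standing in for real backends (hypothetical behavior).
def place_order(item: str, quantity: int) -> Dict[str, Any]:
    return {"status": "accepted", "item": item, "quantity": quantity}


def query_tickets(origin: str, destination: str) -> Dict[str, Any]:
    return {"status": "ok", "route": f"{origin}->{destination}"}


registry = ServiceRegistry()
registry.register(AtomicService("food.place_order", ("item", "quantity"), place_order))
registry.register(AtomicService("rail.query_tickets", ("origin", "destination"), query_tickets))


if __name__ == "__main__":
    # A fridge with no screen and no installed app can still place an order.
    print(registry.call("food.place_order", item="milk", quantity=2))
```

The device never touches the app's interface; it only needs the service name and its declared parameters, which is what makes the call stable across UI redesigns.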

This requires systemic change.

Governments must elevate agent interconnection standards to "new infrastructure" status. This isn't optional but the digital economy's foundational plumbing. Without unified protocols, AIoT scale remains a pipe dream.

Internet giants must also recognize: in the mobile era, closure might retain traffic; in the AI era, closure means marginalization. If your services can't be read or called by agents, you'll vanish in the IoT world. Opening interfaces to make apps AIoT's foundational infrastructure is the only path to survival.

In the AI era, closure isn't a moat—it's digging your own grave.

Epilogue

Doubao Mobile's fate isn't a product failure but a path failure.

The wall it hit—giant blockades, missing interfaces, ecosystem fragmentation—isn't an accident but an inevitable reaction to the current order. Without universal protocols, any attempt to breach walls is treated as an invasion.

But that wall is crumbling.

GUI agents that "see" the screen and simulate clicks are a transitional solution: the only workable path during the window when old interfaces haven't collapsed and new protocols aren't yet established. But they're not the endgame. The true future is universal protocols replacing proprietary interfaces, with services flowing like water and electricity through standardized pipelines.

What will AIoT devices look like then? No need to preinstall dozens of apps to hog computing power and memory—just a built-in universal protocol. Hardware returns to sensing and interaction; services are called on-demand, instantly delivered.

The question remains: who defines these protocols?

The internet era connected people; the agent era connects things and services. Whoever sets the connection standards will shape the next decade's underlying rules. This standards battle is one China cannot afford to sit out.

