Agent 2025: Navigating the Narrow Path and Broad Horizons of AI

08/11 2025

As the "First Year of AI Agents" arrives, is the sector's prosperity merely an illusion?

Content/Lin Shu

Editor/Yong'e

Proofreader/Mangfu

The release of GPT-5, boasting enhanced capabilities in complex reasoning, multimodal fusion, and autonomous agent functionalities, has sparked debates. Some herald the dawn of the "AI Agent era," while others remain cautious, viewing GPT-5 as a "reshuffle for AI Agent entrepreneurs."

As Zhu Xiaohu aptly puts it, current AI agent startups "are akin to individual webmasters in the early days of the internet," embodying a grassroots spirit but facing brutal competition. The withdrawal of Manus from the domestic market has further fueled discussion in the sector; the product was once hailed as "national-level," with beta invitation codes fetching up to $10,000 each.

In fact, Manus' predicament mirrors that of many current Agent products.

2025 has been labeled the "First Year of AI Agents," and the sector has indeed witnessed explosive growth, with star products like Coze Space, GenSpark, Xinxiang, and Xinliu emerging. However, these products still grapple with challenges in technology, commercialization, and product-market fit (PMF).

Specifically, Agent products incur high development and operational costs but face low user willingness to pay, and their commercialization models have yet to mature. A 2025 report by Galaxy Securities puts the average customer acquisition cost (CAC) in the AI Agent industry at $50 per user, while the average user lifetime value (LTV) is only $20 to $30. In other words, most products spend more to acquire a user than that user ever pays back, and have yet to turn a profit.
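The gap between the two figures can be made concrete with a few lines of arithmetic. The sketch below uses only the CAC and LTV numbers quoted from the report; the function and variable names are illustrative, and the only outside fact it relies on is that a product breaks even on acquisition when LTV equals CAC (a ratio of 1.0).

```python
# Figures quoted in the text from the Galaxy Securities report (USD per user).
cac = 50.0
ltv_low, ltv_high = 20.0, 30.0


def ltv_to_cac(ltv: float, cac: float) -> float:
    """Lifetime value earned per dollar of acquisition spend (1.0 = break-even)."""
    return ltv / cac


ratio_low = ltv_to_cac(ltv_low, cac)
ratio_high = ltv_to_cac(ltv_high, cac)

# Loss per acquired user, before any serving or development cost is counted.
loss_best_case = cac - ltv_high
loss_worst_case = cac - ltv_low

print(f"LTV/CAC ratio: {ratio_low:.1f} to {ratio_high:.1f}")
print(f"Loss per acquired user: ${loss_best_case:.0f} to ${loss_worst_case:.0f}")
```

On these numbers every acquired user returns only 40 to 60 cents per acquisition dollar, which is why growth alone does not translate into profit.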

Moreover, many Agent products fall short of user expectations and suffer from severe functional homogenization, which leads to high churn and makes long-term user loyalty hard to build. So, amid the fanfare of the first year, is the AI Agent sector's prosperity merely a facade?

Part 1: Structural Dilemmas Amidst Prosperity: The "Single-Point Dilemma" and "Organizational Gap" of AI Agents

Overall, the current AI Agent market lacks products that can truly endure across cycles and demonstrate that "Agents understand execution better than humans."

The root lies in two deep-seated structural issues: first, current Agent products generally offer only "single-point" empowerment; second, many enterprises focus on building so-called "general" functions.

Most current Agent products tend to focus on optimizing single tasks or specific scenarios (e.g., information retrieval, report generation, task automation) but lack the ability to coordinate and integrate multiple links in the enterprise production chain. This "single-point" empowerment model hinders Agents from playing a "hub" role in complex, cross-departmental business processes.

This phenomenon stems from both technical shortcomings and organizational lags.

Technically, some Agent applications are not fully mature and often experience lag, failure, or excessive time consumption when executing tasks involving complex logic, multiple steps, or multiple tools.

Take Manus as an example. Many users found that when a task involves multiple tools (such as files, emails, Notion, and cloud storage), Manus often gets stuck during execution, passes incorrect results between steps, or takes over an hour. This points to two likely causes: either such Agent applications lack an explicit memory mechanism, so state information is frequently lost or stale information is reused across multi-turn conversations, or the interfaces of the various tools lack a unified protocol and are invoked through prompts alone.

Products like Coze Space, when tasked with "creating charts based on data," often fall short on both completion and quality, failing to reach even a passable standard.

This shows that a considerable number of current Agents amount to little more than a single layer of prompts calling APIs, with no structured, unified data interface and no corresponding reasoning chain.
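To illustrate what an explicit memory mechanism and a unified tool interface could look like, here is a minimal sketch. It is not how Manus or Coze Space are implemented; every name in it (`ToolResult`, `TaskState`, `run_plan`, and the two toy tools) is hypothetical. The idea is simply that each tool returns the same structured result type, and each step's output is stored under a name so later steps read it from state instead of relying on the prompt history.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class ToolResult:
    """Uniform return type shared by every tool (the 'unified interface')."""
    ok: bool
    data: Any = None
    error: str = ""


@dataclass
class TaskState:
    """Explicit memory: each step's result is stored under its step name."""
    results: Dict[str, ToolResult] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


Tool = Callable[[Dict[str, Any]], ToolResult]
Step = Tuple[str, str, Callable[[TaskState], Dict[str, Any]]]


def run_plan(plan: List[Step], tools: Dict[str, Tool], state: TaskState) -> TaskState:
    """Run steps in order; arguments are built from stored state, not from chat history."""
    for step_name, tool_name, build_args in plan:
        result = tools[tool_name](build_args(state))
        state.results[step_name] = result
        state.log.append(f"{step_name}:{tool_name}:{'ok' if result.ok else 'failed'}")
        if not result.ok:
            break  # stop instead of passing a bad result to the next step
    return state


# Two toy tools standing in for real integrations (files, email, and so on).
def read_file(args: Dict[str, Any]) -> ToolResult:
    return ToolResult(ok=True, data=[3, 1, 2])


def summarize(args: Dict[str, Any]) -> ToolResult:
    values = args.get("values")
    if not values:
        return ToolResult(ok=False, error="no input values")
    return ToolResult(ok=True, data={"max": max(values), "count": len(values)})


tools = {"read_file": read_file, "summarize": summarize}
plan = [
    ("load", "read_file", lambda s: {"path": "sales.csv"}),
    ("stats", "summarize", lambda s: {"values": s.results["load"].data}),
]
state = run_plan(plan, tools, TaskState())
print(state.log)
print(state.results["stats"].data)
```

Because failures are recorded and halt the plan, an incorrect intermediate result is surfaced rather than silently passed downstream, which is the failure mode users reported.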

Organizationally, many enterprises have not actually completed the transformation to "human-machine collaboration" suitable for the AI era.

A clear example: in the first half of this year, a considerable number of enterprises adopted the coding Agent Cursor, only to report that it did not significantly improve efficiency.

The reason is that in the actual operation of enterprises, a piece of code often goes through multiple stages from creation to actual use, including requirement clarification, task breakdown, code development, review, testing, and cross-departmental coordination.

The current issue is not that Agents write code slowly but that enterprises have not "embedded" Agents into their processes. The entire "software delivery pipeline" is still human-led, approval-based, and serialized.

As a result, while AI may save 20% of development time, 60% of the bottlenecks in the process lie not in the coding stage but in organizational processes and human factors. Faced with outdated "human-governed" processes, the efficiency gains brought by Agents are largely diluted.
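A back-of-the-envelope calculation shows how small the end-to-end gain becomes. The sketch below takes the 20% saving from the text and assumes, purely for illustration, that coding accounts for 40% of end-to-end delivery time. That 40% share is one possible reading of the 60% figure, not something the article states, and the function name is hypothetical.

```python
def end_to_end_saving(coding_share: float, coding_saving: float) -> float:
    """Fraction of total delivery time saved when only the coding stage speeds up."""
    return coding_share * coding_saving


coding_share = 0.40   # ASSUMPTION: coding is 40% of total delivery time
coding_saving = 0.20  # from the text: AI saves 20% of development time

saving = end_to_end_saving(coding_share, coding_saving)
speedup = 1.0 / (1.0 - saving)

print(f"End-to-end time saved: {saving:.0%}")
print(f"Overall speedup: {speedup:.2f}x")
```

Under these assumptions the delivery pipeline as a whole gets only about 8% faster, which is easily lost in approval queues and hand-offs.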

Part 2: The Division of Agents: The Empty Hype of General Agents and the Challenges of Vertical Deep Diving

Among the various Agents emerging this year, many star products such as Manus, GenSpark, and Coze Space have chosen the path of "general Agents."

After all, compared to vertical Agents, the concept of a "general" Agent sounds more alluring and leaves more room for imagination. For investors, the narrative of "building an AI operating system" is far more compelling than "developing an HR reimbursement assistant." Early users are also more easily drawn to the vision of an "all-rounder" Agent, since general Agents seem more advanced and versatile and are better at creating a FOMO effect.

However, there is a significant gap between reality and vision. The current form of general Agent technology is more akin to a moderately intelligent virtual assistant, struggling to handle core functions such as system scheduling and permission management.

For individual users, general Agents are currently in an awkward position. They mostly handle everyday matters such as ordering takeout, booking hotels, and life planning, none of which is an acute pain point. These needs are usually not urgent and have vague evaluation criteria.

In these scenarios, users care more about "mindset" and "experience" than about pure "efficiency." When ordering takeout, for example, people often care more about which restaurant to order from than about how fast the order is placed.

In contrast, some AI Agents that focus on "specialized, narrow, and deep" vertical fields and solve specific enterprise pain points have achieved considerable success this year.

For example, in the financial industry, Muffintech, as an insurance customer service Agent, can automatically handle common customer service inquiries (such as policy status) with a 98% accuracy rate and shorten claims approval time to one day, saving insurance companies $5 million annually.

In the legal industry, Harvey, which specializes in document drafting, addresses pain points such as time-consuming manual research (an average of 20 hours per case) and high error rates in drafting. It automatically analyzes legal cases and regulations and generates research reports with a 90% accuracy rate, bringing tangible efficiency improvements to law firms.

Although these vertical Agents may appear unsophisticated and technically simple, not just any enterprise can replicate them. Several barriers stand in the way.

Vertical fields require a large amount of industry data with high collection thresholds, and models must be fine-tuned or retrained for specific scenarios.

For example, Agents in the manufacturing industry need to process sensor data, while legal Agents need to generate logically sound documents. These tasks require extremely high accuracy.

This requires model teams to be proficient in AI technology and also familiar with the industry itself. Such cross-disciplinary talent is scarce and costly to recruit.

At the same time, vertical Agents need to integrate seamlessly with the enterprise's existing industry-standard systems (such as SAP and Salesforce) to achieve data sharing and process optimization.

However, many enterprises have data silos, and cross-system integration requires custom APIs. Building them demands experience in system architecture design and industry software integration, which sets a high technical bar.

The high requirements for technology and industry knowledge make it difficult for most small and medium-sized enterprises to create competitive vertical Agents.

At this stage, large companies including BAT (Baidu, Alibaba, Tencent) and ByteDance excel at building platforms and demos, such as Alibaba's DingTalk + Quark and Baidu's Qianfan App Builder. However, there are few cases in which a complex vertical business has been transformed end to end. Most are still small-scale pilots or simple aids. Many companies have also run numerous POCs (proofs of concept), but few have gone into large-scale use.

According to a ThoughtWorks report, insufficient business collaboration and high operating costs keep up to 88% of AI POCs from reaching large-scale deployment. The study found that for every 33 AI POC projects a company launches, only 4 enter production.

The reason is that internet giants are more adept at "general capabilities + traffic and platforms," while truly tackling the dirty work, customization, compliance, and implementation in vertical industries requires offline deep diving and industry know-how accumulation, which do not align well with their business attributes, evaluation systems, and commercial incentives.

Part 3: Crossing the Market Gap: The Choice of Going Overseas and Verification of Local Value

In addition to the two structural issues mentioned earlier, Agent products have faced a lingering commercialization challenge since their inception: the deep divide between the domestic and international markets.

For most domestic AI enterprises, the "compliance" requirement makes their development highly dependent on domestic model capabilities, but there is still a generational gap between domestic models and top US models.

Compared to domestic models, advanced foreign models such as Claude Opus 4 often maintain more stable logical consistency and lower error rates in complex reasoning chains, especially cross-domain and multi-condition derivations.

Moreover, they support context lengths of millions of characters and remain highly stable when generating long, strictly formatted, structured output such as documents, code, and JSON. Current domestic models find these levels difficult to reach.

At the same time, limited by the overall level of digitization and by consumer habits in China, the willingness to pay of both business (B-end) and consumer (C-end) users is not ideal at this stage. This makes it even harder for consumer-grade AI applications, especially those from startups, to have their value fully recognized by the market and to achieve commercialization.

Under this premise, domestic AI application entrepreneurs need to work harder to bridge the gap between model capabilities and market expectations. This means the team needs deeper combined capabilities in scenario design, data engineering, and model understanding, together with a solid grasp of the market and the business.

Under the pressure of "high investment and low value," it is understandable that Agent products like Manus choose to go overseas as a strategy.

According to overseas AI entrepreneurs, the overseas market values AI products more generously: 10,000 daily active users can support a valuation of $100 million. In other words, each daily active user is valued at roughly $10,000, or about 70,000 RMB.
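The per-user figure follows directly from the two numbers quoted. The short check below uses only the valuation, the user count, and the 70,000 RMB figure from the text; the exchange rate is not stated in the article, so the code derives the rate the article's numbers imply rather than assuming one. Variable names are illustrative.

```python
valuation_usd = 100_000_000   # from the text: $100 million valuation
daily_active_users = 10_000   # from the text: 10,000 daily active users
rmb_per_dau_quoted = 70_000   # from the text: 70,000 RMB per daily active user

usd_per_dau = valuation_usd / daily_active_users
implied_rmb_per_usd = rmb_per_dau_quoted / usd_per_dau

print(f"Valuation per daily active user: ${usd_per_dau:,.0f}")
print(f"Implied exchange rate: {implied_rmb_per_usd:.1f} RMB per USD")
```

The quoted RMB figure is therefore consistent with a rate of about 7 RMB to the dollar.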

Nevertheless, going overseas is not the ultimate solution. No Agent product can escape the competition in model capabilities.

With giants like OpenAI and Anthropic beginning to deploy their own Agent products in 2025 and adopting a "model supply cutoff" strategy to protect their competitive advantage, the edge held by "shell" Agents that merely wrap third-party models will collapse quickly.

For example, not long ago the well-known overseas AI programming application Windsurf had its access to Claude cut off entirely, which exposed how vulnerable enterprises without self-developed models (Manus included) are in the face of the giants.

Therefore, going overseas should be a phased strategy of "surviving and practicing capabilities" rather than a final destination of "never returning."

In the domestic market, the patience of capital for the Agent sector will not last long. Despite the constant slogans and frequent promotion of benchmark cases by vertical large model and application vendors, it remains unclear how much real economic benefit Agents can create in the future.

However, it is certain that current Agents have demonstrated their value in cost reduction and efficiency enhancement in scenarios with clear processes and fixed rules, such as customer service, marketing, and data analysis. With these commercial scenarios as a foundation, Agents will not be a complete "bubble" this year.

To achieve greater commercial breakthroughs in the future, Agents need to play a truly transformative role in high-value vertical fields such as finance and healthcare. This will require several factors to work together, including technological evolution, organizational adaptation, and collaboration across the industrial ecosystem.
