Microsoft Enters AI Browser Market, Yet Faces Mediocrity Challenge

07/30 2025

Edge can read web pages, but that's about its current limit.

Author | Xue Xingxing

Editor | Jiang Jiao

Cover | Edge Screenshot

After two years of incremental AI upgrades, Microsoft has introduced Copilot mode to the Edge browser, officially venturing into the AI browser market.

Unlike previous iterations, where AI was limited to a chat sidebar, the new Copilot mode enables AI to read and comprehend web content. This includes interpreting technical documentation or providing an outline for YouTube videos.

It also consolidates all open web pages, aiding decision-making by creating comparison tables, especially useful when toggling between multiple product or hotel pages. Additionally, Microsoft has added voice functionality, allowing users to understand web pages and converse with AI in real-time through voice commands.

While these features may not differ drastically from those of existing AI browsers, Microsoft CEO Satya Nadella praised them, stating, "This is our first step in redefining the browser for the AI era." He said his favorite feature, multi-tab RAG, lets him quickly analyze the papers Microsoft published in Nature over the past year, an example that doubles as a subtle plug for the new features.

Currently, this feature is in an experimental phase, with Microsoft promising to progressively integrate new capabilities into Copilot mode. This phase also implies that Copilot is "free for a limited time." Since Copilot relies on OpenAI's foundational models, users can leverage many paid ChatGPT features for free.

For instance, users can access ChatGPT's paid DeepResearch feature within Copilot mode. Nadella also teased a task delegation feature that lets users hand tasks to Copilot while they browse. An AI entrepreneur called it a free alternative to ChatGPT Agent, which is currently available only to Plus and higher-tier subscribers at a minimum of $20 per month.

This is Edge's primary allure compared to other AI browsers. Before OpenAI unveils its own browser, users can experience a stripped-down ChatGPT in Edge, potentially giving Sam Altman another reason to distance himself from Microsoft.

Edge can read web pages, but that's about its current functionality.

Much like when Microsoft first announced Edge's AI transition in 2023, the product update on Microsoft's official website continues to embellish Copilot mode with grandiose concepts, such as "We are witnessing a turning point in how we interact with the web" and "This is our next step in exploring more powerful web browsing methods."

Upon enabling Copilot mode, users may notice that the homepage interface transforms into a Copilot dialog box, described by Microsoft as "integrating conversation, search, and web navigation." Users can activate Bing search or converse directly with Copilot by typing keywords.

According to Microsoft's product documentation, Copilot understands user intent, helps clarify information, and eases the burden of juggling many tabs. While browsing, users can ask Copilot questions from the left side of the address bar. Copilot also anticipates user actions and offers suggestions based on past behavior.

Take the technical document on context engineering for AI Agents that Manus published on its official website. When the page is open, Copilot comprehends the content and can interpret the document or produce an outline. It also seems to comprehend video content, summarizing the key points of YouTube videos and generating video abstracts.

Copilot summarizing web page content

Copilot summarizing YouTube video

For e-commerce websites, Copilot offers AI summaries based on product detail pages, historical price trends, and comparisons with other websites. However, this feature is limited to overseas shopping sites like Amazon and Shein and is not supported on domestic platforms like Taobao and JD.com.

Copilot summarizing product pages

These capabilities are fundamental to AI browsers. Similar functions are available in domestic browsers like Quark, Doubao, and even QQ Browser. Quark's AI summary converts web pages into reading mode, Doubao analyzes and organizes Bilibili videos, and QQ Browser offers a dual-screen feature combining web search and model dialogue. In fact, installing an AI plugin on Chrome can provide a similar experience, with the added option of choosing the model.

What sets Copilot mode apart is its proactive capabilities, such as AI grouping of tabs by theme and the multi-tab RAG Nadella mentioned, which lets AI read the content of every open tab. Whether reading papers, comparing hotels, or browsing news, users can quickly activate this feature, turning the browser into a tool for comparison, decision-making, and task completion.
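Microsoft has not published how multi-tab RAG works inside Edge, so the following is only a minimal sketch of the general retrieval-augmented generation idea, under stated assumptions: each open tab is reduced to plain text, tabs are ranked against the question with a simple bag-of-words cosine similarity (real systems typically use learned embeddings), and the best matches are packed into a prompt for a language model. The names `TABS`, `retrieve`, and `build_prompt` are hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(tabs, question, top_k=2):
    """Return titles of the top_k tabs most similar to the question."""
    q = Counter(tokenize(question))
    scored = [(cosine(Counter(tokenize(text)), q), title)
              for title, text in tabs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop tabs that share no words with the question at all.
    return [title for score, title in scored[:top_k] if score > 0]


def build_prompt(tabs, question, top_k=2):
    """Assemble the retrieved tab text and the question into one prompt."""
    chosen = retrieve(tabs, question, top_k)
    context = "\n\n".join(f"[{title}]\n{tabs[title]}" for title in chosen)
    return (f"Answer using only these tabs:\n\n{context}\n\n"
            f"Question: {question}")


# Hypothetical open tabs, already reduced to plain text.
TABS = {
    "Hotel A": "Hotel A costs 120 dollars per night and includes breakfast and a pool.",
    "Hotel B": "Hotel B costs 95 dollars per night with free parking but no breakfast.",
    "News": "The city council approved a new bicycle lane downtown.",
}

if __name__ == "__main__":
    print(build_prompt(TABS, "Which hotel includes breakfast per night?"))
```

In this sketch the unrelated news tab is filtered out before the prompt is built, which is the point of the retrieval step: the model only sees the tabs relevant to the question.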

ShanShang tested multiple domestic and international e-commerce sites, including Taobao, JD.com, and Amazon, and found that Copilot organizes product models, selling points, prices, reviews, and more across platforms and web pages, then offers detailed purchase suggestions. Most recently updated domestic AI Agent products offer similar functionality.

Copilot summarizing Tmall product pages and offering purchase suggestions

However, that's about its current limit. Like other products claiming Agent capabilities, Copilot doesn't integrate the payment process. Bookings, purchases, or itinerary planning can't achieve the seamless experience showcased in Microsoft's promotional videos, indicating these features are still future prospects.

Copilot can read PDF documents. When a user opens Alibaba's 2024 quarterly financial report, Copilot reads the content and provides targeted answers. It also has access to OpenAI's text-to-image capability and, after several rounds of dialogue, asks whether the user wants the financial data visualized as charts. However, the charts are not directly usable: they contain numerous errors, such as omitting Alibaba's Q1 2024 performance, because Copilot read only three pages.

Copilot reading PDF financial reports

With voice mode, Microsoft added a futuristic Vision feature to Copilot mode, enabling real-time voice conversations. Users can ask questions like "What is this paragraph about?" or "What is this image?" through voice, regardless of the web page. "It will see your current page, read it with you, and discuss your issues. You won't feel lonely facing tab pages anymore," stated Microsoft's previous product documentation.

However, this capability isn't novel. The Comet browser released by Perplexity this month also supports real-time voice interaction but is currently limited to Perplexity Max subscribers ($200/month) and invited users.

Doubao's desktop version has similar functions, but on Mac it requires screen sharing permissions, which share the entire desktop with the AI. Edge doesn't require screen sharing. ShanShang's tests found that Doubao's voice interaction can't be interrupted in real time and requires a manual click on the screen, which makes it feel less interactive. In contrast, even under domestic network conditions, Copilot Vision offers faster responses, more accurate answers, and real-time interruptibility, providing an experience close to instant messaging.

Forget the AI blogger slogans claiming a revolution in web browsing. Based on current experience, many Copilot mode functions aren't vastly different from existing AI browsers. Microsoft emphasizes that Copilot is still experimental, with new features to come. Users can enable or disable it at will.

The chaotic AI browser market hasn't reached its pinnacle yet.

AI browsers emerged before the general-purpose AI Agent concept that took off earlier this year, and they spread faster. Early AI browsers integrated basic AI dialogue or webpage summary functions.

Since the Agent craze, more AI browsers have emphasized autonomous task execution, including The Browser Company's Dia, Perplexity's Comet, and Opera Neon. In promoting Edge's Copilot mode, Microsoft also highlights its proactivity.

Tech companies' enthusiasm for AI browsers is understandable. After over 30 years, the browser remains the primary internet access window on desktop devices, with its core interaction mode largely unchanged, evolving from Netscape and IE to today's 17-year-old Chrome.

Large models have redefined information acquisition, and browsers are expected to change too. Both giants and startups are vying for this market. Market research firm Market.us predicts the global AI browser market will grow from $4.5 billion in 2024 to approximately $76.8 billion in 2034, with a compound annual growth rate of 32.8%.
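The quoted growth rate can be checked against the two endpoint figures. A minimal sketch, assuming the forecast spans the ten years from 2024 to 2034 and uses the standard compound annual growth rate formula; the function name `cagr` is chosen here for illustration.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# Market.us figures as quoted in the article, in billions of US dollars.
rate = cagr(4.5, 76.8, 10)

if __name__ == "__main__":
    print(f"Implied CAGR: {rate:.1%}")
```

The result rounds to 32.8%, so the endpoints and the growth rate in the forecast are consistent with each other.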

Market.us Report

However, distinguishing current AI browsers from AI Agents is challenging – they share similar underlying technologies and aim for the same goals. Microsoft's Copilot mode capabilities mirror OpenAI's functional updates over the past six months. For instance, OpenAI's Operator function enables AI to operate web pages. Microsoft's promotional video depicting Copilot's future capabilities is akin to OpenAI's ChatGPT Agent.

Before launching Manus, Butterfly Effect set out to build an AI browser but ultimately abandoned the idea. Manus co-founder Zhang Tao said the team found the user experience of AI browsers limited and realized that a truly universal AI agent must go beyond the browser interface.

Even Perplexity's decision to launch an AI browser seemed forced. Perplexity CEO Aravind Srinivas said he contacted the Chrome team hoping to make Perplexity the default search engine, but was rejected, prompting him to create his own browser.

Who knows what Srinivas was thinking. The New York Times reported that Google paid Apple approximately $18 billion in 2021 alone to remain Safari's default search engine, so handing Chrome's default search slot to a competitor was out of the question.

Srinivas' view of AI browsers overlaps with the idea of AI Agents. In a podcast interview in April, he said browsers are the best platform for building Agents. "Browsers are like containerized operating systems. They can access third-party services through hidden tabs, scrape client-side page content, and perform reasoning and actions on your behalf."

In essence, AI browsers are a type of AI Agent. However, due to current large model limitations, they can't escape the traditional browser framework, relying on traditional tabs and web interactions. They mostly enhance existing browser experiences rather than revolutionizing them.

It's unclear how many users this intermediate form will persuade to change their habits. Microsoft added AI features to Edge as early as 2023, but two years on, Edge hasn't significantly threatened Chrome, suggesting that AI may matter less as a driving force than the Chromium engine itself.

Wang Junyu, Wandoujia's founder, wrote in a China Business News commentary that while today's AI browsers innovate on the experience, they haven't achieved a qualitative change. AI isn't deeply integrated into browsers and has not truly become the user's "eyes and hands."

Chrome's market dominance is hard to shake, as most browsers touting AI use the Chromium engine. OpenAI is rumored to be launching its own AI browser, yet it is also reportedly interested in acquiring Chrome.

"Even if Dia finds a 10x experience highlight, I see no reason why Chrome can't follow suit," said Wang Junyu. In May, Google added a Gemini entry to Chrome, allowing users to read web pages and converse, offering a Dia-like experience.

AI browsers frequently highlight their capacity to autonomously navigate web pages, yet this capability evokes privacy and security apprehensions, particularly among users reluctant to transmit their browsing data to AI systems.

In the past, Microsoft introduced the Recall feature in Windows, an AI tool embedded at the system level that periodically captures user screens and content in the background, facilitating seamless navigation back to previous points in time. However, upon its release, Recall encountered significant criticism, with some media outlets even labeling it as spyware.

Tech media outlet The Verge reported on Microsoft's incorporation of Copilot mode in the Edge browser. A highly-rated comment beneath the article remarked: "Microsoft has bundled the questionable Copilot software into the latest Edge update."

Another, more spirited comment stated, "Microsoft needs to clarify whether I should use Copilot in the browser, Windows 11, or even every line of a Word document and cell in Excel. Having that Copilot logo everywhere is incredibly annoying."

© Copyright by Shanshang. All rights reserved. Unauthorized reproduction is strictly prohibited.
