Comprehensive Review of Google I/O 2026: Introducing Two New Models, Audio Glasses, and the All-Encompassing Gemini

05/20 2026

After much anticipation, Google I/O 2026 kicked off in the early hours of May 20, 2026, Beijing time. Because Google had already used The Android Show to preview the new features of Android 17, AI took center stage at this year's event.

Unlike other AI companies, Google commands multiple entry points into different internet ecosystems, including Gemini, YouTube, Google web search, and Android. Therefore, a key focus of this year's Google I/O was exploring 'how to leverage AI to empower these ecosystems'.

Gemini Omni and Gemini 3.5: Enhancing Google AI's Versatility

Google officially unveiled its latest and most versatile model, Gemini Omni, at Google I/O. How versatile is it? In essence, 'Gemini Omni can generate any form of output from any form of input.' Moreover, it allows for real-time modifications through dialogue during the generation process.

Image Source: Google

Take, for example, the generation of a music video (MV). Given music, video and image materials, and general visual requirements, Gemini Omni can directly output a corresponding short video. At the event, Google showcased an impressive AIGC demo:

Drawing a simple circle on a sheet of white paper and adding a text description was enough for Gemini Omni to generate a complete special effects video. And that's not all: if users are unsatisfied with certain visual elements or styles, a single sentence such as 'replace the glass building with soap bubbles' can precisely modify the specified elements without altering the others, complete with realistic physical collision effects.

Image Source: Google

In Google's own words, 'Gemini Omni is like the Nano Banana of the video field.'

According to Google, the Gemini Omni Flash model will be accessible on platforms such as the Gemini App, Google Flow, and YouTube Shorts starting today, with corresponding APIs to be released later.

Image Source: Google

In addition to the versatile Gemini Omni, Google also moved Gemini up to version 3.5 at Google I/O, starting with Gemini 3.5 Flash. Compared to Gemini 3.1 Pro, Gemini 3.5 Flash has demonstrated improvements in programming, real-world agent capabilities, large-scale tool invocation, and more.

Of course, for AI models, 'where there's a lightweight Flash, there's also a professional Pro.' Google also announced that Gemini 3.5 Pro will make its debut next month, though no further details were disclosed.

In summary, Google has achieved a balance of being 'fast, efficient, good, and economical' this time.

Google Antigravity and Gemini Spark: Faster and Stronger Intelligent Agents

With advancements in underlying model capabilities, AI agents based on Gemini have naturally been upgraded as well.

Image Source: Google

On the developer front, Google's AI development environment, Antigravity, now runs on Gemini 3.5 Flash. According to Google, with the support of Gemini 3.5 Flash, Antigravity built an operating system kernel in just 12 hours, with total AI API costs for the entire development process coming in under a thousand dollars.

Image Source: Google

Google even used Antigravity and Gemini 3.5 Flash to rebuild the interactive interface of Google Search, introducing a new concept called 'Generative UI.' Frequent users of Google or other AI search engines know that even with AI mode enabled on the search page (meaning question-style knowledge searches like those in AI apps, not regular searches), results are still delivered in a dialogue box (ChatBox) format.

Image Source: Google

For general AI searches, the dialogue box format suffices. However, when users ask questions that call for a visual demonstration, such as 'How does a tourbillon work?', the dialogue box falls short. In response, Google has developed an 'adaptive, self-generating' AI search UI based on Antigravity's programming capabilities.

Image Source: Google

Simply put, when faced with complex questions, Google Search will now use 'Vibe Coding' to write an interactive front-end webpage on the fly as its answer to the user's question.

Unfortunately, this feature will not be available to users until the summer of 2026. However, the good news is that it is part of the Google Search update and does not require a Gemini subscription. Additionally, the UI of the Gemini App itself has been upgraded to align with the new Android visual elements.

Thanks to the multimodal capabilities of Gemini 3.5 Flash, Google Search's AI prediction and multimodal capabilities have also been enhanced. In addition to text and image-based searches, the new Google Search can accept videos or documents directly as input. Search suggestions, previously ranked intelligently based on big data, have now been upgraded to AI search completion powered by Gemini 3.5 Flash.

Image Source: Google

Beyond front-end upgrades, Google has also comprehensively improved the search agent's 'back-end capabilities.' The new search agent can run 24/7 in the background, continuously monitoring specific information according to user requirements. For example, before going to bed, Xiaolei can ask the search agent to keep an eye on AI news from companies like OpenAI, Anthropic, Grok, Perplexity, and X, and to send an email notification whenever an unmissable hot topic comes up, as a cue to start writing an article.
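To make the 'monitor in the background, notify only on hot topics' idea concrete, here is a minimal sketch of the filtering step such an agent might run on each polling cycle. It is not Google's implementation: the Headline type, the score field, the 0.8 threshold, and the pick_alerts function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Headline:
    source: str   # company or outlet the item is about (hypothetical field)
    title: str
    score: float  # hypothetical "hotness" score between 0 and 1


def pick_alerts(headlines, watched, seen, threshold=0.8):
    """Return headlines worth a notification.

    A headline qualifies if it comes from a watched source, its score
    meets the threshold, and its title has not been reported before.
    `seen` is updated in place so repeated polling does not re-notify.
    """
    alerts = []
    for item in headlines:
        if item.source not in watched:
            continue
        if item.score < threshold:
            continue
        if item.title in seen:
            continue
        seen.add(item.title)
        alerts.append(item)
    return alerts


if __name__ == "__main__":
    watched = {"OpenAI", "Anthropic", "Perplexity"}
    seen = set()
    batch = [
        Headline("OpenAI", "New model announced", 0.95),
        Headline("Anthropic", "Minor docs update", 0.20),
        Headline("SomeoneElse", "Unrelated launch", 0.99),
    ]
    for alert in pick_alerts(batch, watched, seen):
        print(f"Notify: [{alert.source}] {alert.title}")
```

Keeping the seen set outside the function is the key design choice: the agent can poll the same feeds all night without sending the same alert twice.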

Speaking of agents, Google also officially released a new agent for personal users at the event—Gemini Spark. Like other AI agents, Gemini Spark can take over users' phones and browsers 24/7. However, unlike current mainstream hosted agents, Gemini Spark will run in a dedicated virtualized environment.

Image Source: Google

Unsurprisingly, Gemini Spark is also powered by Gemini 3.5 Flash and Antigravity, and it supports voice interaction and background responses. In terms of external connectivity, Gemini Spark can not only interact directly with other components of the Google ecosystem (Google Docs, Google Calendar, Gmail, etc.) but also connect with external apps through MCP (Model Context Protocol), enabling more comprehensive task delegation.
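For readers unfamiliar with MCP, the sketch below shows roughly what a tool invocation looks like on the wire. MCP messages are JSON-RPC 2.0, and a tool is invoked with the tools/call method carrying a tool name and its arguments. The tool name order_coffee and its arguments are hypothetical, and nothing here reflects how Gemini Spark itself is implemented.

```python
import json
from itertools import count

# Simple counter for JSON-RPC request ids.
_request_ids = count(1)


def build_tool_call(tool_name, arguments):
    """Build an MCP-style JSON-RPC 2.0 request that invokes one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {
            "name": tool_name,       # hypothetical tool exposed by an external app
            "arguments": arguments,  # tool-specific input, defined by that app
        },
    }


if __name__ == "__main__":
    request = build_tool_call("order_coffee", {"item": "latte", "size": "medium"})
    print(json.dumps(request, indent=2))
```

The agent only needs to know a tool's name and argument schema; the external app decides what actually happens when the call arrives.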

Google did not announce the platform compatibility of Gemini Spark at the event. Leitech expects that Gemini Spark will be available on phones through the Gemini App (iOS) and Google search component (Android).

Image Source: Google

While AI agents are running (whether in the foreground or background), the newly added Android Halo feature will display a persistent agent status marker in the top-left corner of the Android phone screen, allowing users to jump to the agent interface at any time, much like the camera and microphone indicators on today's phones.

From Leitech's perspective, the emergence of Android Halo underscores the importance of agents from another angle. Although Gemini Spark is technically just a 'software feature,' its status is already on par with the camera and microphone as an indispensable core component of the phone.

Image Source: Google

On the computer side, Google mentioned that Gemini Spark will be available on the Chrome browser in the summer of 2026.

However, unlike some Chinese AI agents such as Doubao, which charge only for certain features, Gemini Spark will be available exclusively to paying subscribers, starting with Google AI Ultra subscribers next week.

It is worth mentioning that to distinguish between enterprise users and high-usage personal users, Google has introduced an additional 'Lite' AI Ultra tier (USD 100 per month) between the original AI Pro (USD 20 per month) and AI Ultra (USD 250 per month, temporarily reduced to USD 200 per month).
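To put the three tiers in perspective, the quick calculation below annualizes the monthly prices quoted above. It assumes each price holds for a full 12 months (the USD 200 Ultra price is described as temporary, so that row is only illustrative), and the tier labels in the dictionary are shorthand rather than official plan names.

```python
# Monthly prices in USD as quoted in this article; labels are shorthand.
MONTHLY_USD = {
    "AI Pro": 20,
    "AI Ultra Lite": 100,
    "AI Ultra (discounted)": 200,
    "AI Ultra (list price)": 250,
}


def annual_cost(monthly_usd):
    """Annualize a monthly price, assuming it stays fixed for 12 months."""
    return monthly_usd * 12


if __name__ == "__main__":
    base = MONTHLY_USD["AI Pro"]
    for tier, monthly in MONTHLY_USD.items():
        ratio = monthly / base
        print(f"{tier}: ${annual_cost(monthly)} per year ({ratio:g}x AI Pro)")
```

On these numbers, the new middle tier costs five times as much as AI Pro, while the full-price Ultra tier costs 12.5 times as much.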

Image Source: Google

It is clear that even 'wealthy' Google finds it hard to cover the enormous compute costs of such all-encompassing AI with a free model. In the end, AI relies on computing power, which relies on hardware, which in turn relies on money. In the AI era, internet giants can no longer sustain the costs of AI through hardware sales and service subscriptions alone.

To exaggerate a bit, as the capabilities of AI agents further expand, paid AI services may very well become a part of our 'essential consumption,' just like mobile phone plans.

Audio Glasses Make Their Debut, Gemini Ecosystem Becomes Increasingly Complete

Last year, Google showcased AI glasses with display capabilities. At this year's Google I/O, Google also previewed an 'audio version' of smart glasses. To be clear, although they are called 'audio smart glasses,' these Gemini glasses are not audio-only glasses like Jiehuan; they are audio glasses equipped with cameras, with AI vision and multimodal input capabilities.

Since the glasses will not be released until the fall of 2026 (most likely to coincide with the new chip at the Qualcomm Snapdragon Summit), Google did not announce specific product information such as weight, sensor models, or battery life at Google I/O, only showcasing the product appearance and general functions.

Image source: Google

In terms of design, Google presented smart glasses developed by Samsung in collaboration with two renowned eyewear brands, Gentle Monster and Warby Parker. Functionally, the two pairs are similar to existing AR1 smart glasses, allowing users to activate Gemini using voice commands or the touchpad on the right temple.

Thanks to the capabilities of the underlying Gemini model and the Spark agent, Gemini glasses can automatically break down a user's voice commands into agent actions and execute them in the background on the user's smartphone. For instance, a user can tell the glasses to "buy the cup of coffee I ordered last time," and Gemini on the phone will automatically open the coffee app, add the item to the shopping cart, and place the order once the user confirms by voice (presumably using voiceprint verification technology, similar to domestic AI glasses).

Image source: Google

Google also said that the Gemini AI audio glasses will support both Android and iOS. Under iOS's extremely strict app sandboxing mechanism, however, their capabilities will inevitably be significantly reduced compared to the Android platform.

To expand the capabilities of Gemini, Google has also built AI throughout its office suite (Google Workspace): users can use voice commands to invoke Gemini to search for email information (Gmail Live), write documents (Docs Live), and even generate images (Google Pics).

Image source: Google

Combined with the high-end Googlebook covered in our earlier Android 17 piece, it is clear that Google is sparing no effort this year to integrate Gemini into every piece of ecosystem hardware it can control.

That wraps up the Google I/O keynote. At this point, some may feel that this Google I/O is merely Google's attempt to catch up after falling behind in the AI race. However, from Leitech's perspective, the content of the Google I/O 2026 keynote actually signifies that Google has finally found the right ticket to the AI era.

For example, in answering the question of 'what can AI do,' Google went straight at its 'foundational business,' using Generative UI to change the 'turn-based,' 'one-way interaction' pattern of AIGC. This shift from one-way to two-way interaction is also evident in Gemini smart hardware. For a long time, there were no true 'two-way AI devices' in the AI hardware field: hardware and AI were completely separate.

Image source: Google

This time, the multimodal capabilities of Gemini 3.5 Flash have truly made devices like audio glasses the 'physical organs' of Gemini. Coupled with the Googlebook released last week, Gemini finally has the ability to proactively perceive the world and proactively output results.

More importantly, Gemini is leveraging its 'privileges' within the Android system to build a moat that other manufacturers cannot cross. While overseas players like Anthropic and various domestic large-model agents are still testing the boundaries of sandboxing mechanisms and struggling to achieve cross-app collaboration via MCP, Gemini has already achieved seamless native interoperability at Android's underlying system layer.

Remember how we said at the beginning that 'Google simultaneously controls multiple different internet ecosystem entry points, such as Gemini, YouTube, Google web search, and Android'? At Google I/O 2026, these widely 'blooming' ecosystem entry points have finally reached the season of 'bearing fruit.'

Having said that, while Gemini's heavy dependence on the Google ecosystem poses a challenge to OpenAI and Grok, it also inadvertently creates an opportunity for domestic AI companies.

Regardless of Gemini's dominance overseas or its seamless native interoperability, this ecosystem product suite undoubtedly still struggles to adapt to local conditions in China. However, the business logic of "multimodal input/output + private system + 24/7 managed agent" illuminates a potential development path for domestic AI companies:

Just as Google is audaciously dismantling sandbox restrictions at the native underlying layer overseas, domestic manufacturers can similarly establish their own "independent realms" within customized Android systems.

More significantly, Chinese brands are even more aggressive and committed than stock Android when it comes to localizing agents. At Google I/O 2026, Google played its ace with Gemini. Now it falls to domestic AI giants and smartphone brands to work together and "break through."
