Google 2026 I/O Conference Full Recap: Models Remain Important, but Agents Are Taking Over Everything

05/20 2026

Author | Lin Yi

Editor | Key Points Team

Early on the morning of May 20, Google held its 2026 I/O Conference. CEO Sundar Pichai said at the event that Google's services now process 3.2 quadrillion tokens per month, a sevenfold increase from a year earlier.

This time, Google delivered a comprehensive update of its full-stack technologies and products, ranging from chips and models to applications:

Chip Layer: Google introduced the TPU 8t, optimized for pre-training, and the TPU 8i, optimized for inference. The upgraded global computing clusters are aimed at the industry's pain point of high computing costs.

Model Layer: Google released the Gemini Omni world model, which features physical consistency and lets AI move beyond pure text and pixel generation toward a genuine understanding of three-dimensional space. It also introduced the cost-effective, low-latency Gemini 3.5 Flash, which significantly improves on the previous 3.1 Pro in coding, agent, and tool-calling capabilities. Gemini 3.5 Pro will not be available until next month.

Application Layer: The latest Antigravity 2.0 platform was released, capable of autonomously writing a complete operating system within 12 hours through multi-agent collaboration. Furthermore, the personal agent Gemini Spark, which operates 24/7 in cloud virtual machines, was introduced to accelerate the automation of software engineering.

Industry Standards: In collaboration with Amazon, Microsoft, Meta, and other giants, the UCP and AP2 agent-based e-commerce protocols were launched, establishing commercial interaction norms for the agent era ahead of time.

Terminal Hardware: The first audio smart glasses with built-in Gemini, based on the Android XR platform, were released.

In summary, the conference showed that Google, leveraging its full-stack AI advantages, has built a formidable ecosystem moat with an "Agent-First" approach across search, office, shopping, and hardware. AI has gone from a tool to an always-on productivity force that can autonomously break down and execute complex tasks: agents are taking over everything.

We have compiled key information from this conference. Here are the highlights:

1. Computing Power Foundation: Eighth-Generation TPUs and Substantial Increase in Capital Expenditure

Google released the TPU 8t, optimized for pre-training, and the TPU 8i, designed for inference. Meanwhile, Sundar Pichai disclosed that Google's capital expenditure this year is expected to reach approximately $180 billion to $190 billion, roughly six times the 2022 figure.
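As a quick check on the "roughly six times" claim, the sketch below divides this year's expected range by the 2022 baseline of $31 billion quoted later in the keynote transcript. The variable and function names are illustrative only.

```python
# Figures quoted in the recap, in billions of US dollars.
CAPEX_2022 = 31        # 2022 annual capital expenditure (from the keynote transcript)
CAPEX_NOW_LOW = 180    # low end of this year's expected range
CAPEX_NOW_HIGH = 190   # high end of this year's expected range


def growth_multiple(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


low = growth_multiple(CAPEX_NOW_LOW, CAPEX_2022)
high = growth_multiple(CAPEX_NOW_HIGH, CAPEX_2022)
print(f"Capex multiple vs. 2022: {low:.1f}x to {high:.1f}x")
```

The range works out to about 5.8x to 6.1x, which is consistent with the sixfold description.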

2. Model Updates: Gemini Omni World Model and Gemini 3.5 Flash

Gemini Omni represents a leap in the understanding of physical concepts, overcoming earlier AI systems' difficulty in simulating physical properties such as kinetic energy and gravity. It can generate physically accurate videos from text prompts and lets users edit video elements afterward through direct conversation.

Gemini 3.5 Flash focuses on very high response speed and cost-effectiveness. According to Google's figures, 3.5 Flash generates output about four times as fast, measured in tokens per second, as other frontier models.

3. Software Engineering: Antigravity 2.0 Fully Automated Code Generation Platform

The Antigravity 2.0 platform significantly improves software development efficiency. In a test project, a team of 93 sub-agents worked in parallel for 12 hours, processing 2.6 billion tokens, and wrote and tested a complete operating system from scratch, including a scheduler, memory management, and file system. Supported by the cost advantages of Gemini 3.5 Flash, the end-to-end API call cost for this complex software engineering project was compressed to under $1,000.
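The figures in this test project imply a few back-of-the-envelope numbers. The sketch below derives them from the quoted values only. It assumes the 2.6 billion tokens cover all input and output combined, and it treats $1,000 as an upper bound, so the per-token price is a blended ceiling rather than an official rate.

```python
# Figures quoted for the operating-system test project.
SUBAGENTS = 93            # sub-agents working in parallel
HOURS = 12                # wall-clock duration of the run
TOTAL_TOKENS = 2.6e9      # tokens processed over the whole run
COST_CEILING_USD = 1_000  # reported upper bound on API cost

# Average tokens handled by one sub-agent in one hour.
tokens_per_agent_hour = TOTAL_TOKENS / (SUBAGENTS * HOURS)

# Average aggregate throughput across all sub-agents.
tokens_per_second_overall = TOTAL_TOKENS / (HOURS * 3600)

# Highest blended price per million tokens compatible with the cost ceiling.
max_usd_per_million_tokens = COST_CEILING_USD / (TOTAL_TOKENS / 1e6)

print(f"Tokens per sub-agent per hour: {tokens_per_agent_hour:,.0f}")
print(f"Aggregate tokens per second:   {tokens_per_second_overall:,.0f}")
print(f"Blended price ceiling:         ${max_usd_per_million_tokens:.2f} per million tokens")
```

On these assumptions, each sub-agent averaged roughly 2.3 million tokens per hour, and the blended price could not have exceeded about $0.38 per million tokens.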

4. Agents: 24/7 Operational Gemini Spark

Gemini Spark runs on dedicated virtual machines in Google Cloud and stays online 24/7. Even when the user's device is turned off, Gemini Spark can work autonomously across applications such as Docs, Gmail, and Calendar to carry out complex tasks.

5. Search Architecture: Information Agents and Dynamic Interactive UI

Google Search has been fully upgraded to Gemini 3.5. Users can now set up multiple agents that continuously monitor the web in the background and push highly relevant, customized results when key events occur, such as sharp swings in financial markets or specific product launches.

Additionally, the search results page introduces a dynamic interactive UI supported by the Antigravity platform, capable of generating directly operable data visualization charts in real-time based on users' specific questions.

6. Commercial Infrastructure: Standardized Protocols for Agent-Based E-Commerce

UCP (Universal Commerce Protocol) provides a unified data interaction method for AI agents across various companies, covering the entire process from product search to checkout and logistics inquiry. Currently, companies such as Amazon, Meta, Microsoft, Salesforce, and Stripe have joined to support this standard.

AP2 (Agent Payments Protocol) is used to set funding boundaries for AI shopping and ensure clear accountability. It establishes a verification link based on privacy-preserving technologies among users, merchants, and payment institutions to ensure consistency in return and billing records.

Based on these protocols, the Universal Cart achieves cross-merchant and cross-service functionality, enabling automatic price comparisons, inventory monitoring, and purchase recommendations in the background.
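To make the idea of "funding boundaries" concrete, here is a minimal sketch of a spending guard an agent might consult before placing an order. It is purely illustrative: the `SpendingMandate` class, its fields, and the `authorize` method are hypothetical names invented for this example and are not taken from the AP2 or UCP specifications.

```python
from dataclasses import dataclass, field


@dataclass
class SpendingMandate:
    """Hypothetical user-approved limits for a shopping agent (not the AP2 schema)."""
    per_order_limit_cents: int
    total_budget_cents: int
    spent_cents: int = 0
    audit_log: list = field(default_factory=list)

    def authorize(self, merchant: str, amount_cents: int) -> bool:
        """Approve a purchase only if it stays inside both limits, and log the decision."""
        approved = (
            0 < amount_cents <= self.per_order_limit_cents
            and self.spent_cents + amount_cents <= self.total_budget_cents
        )
        # Every decision is recorded so that accountability can be traced later.
        self.audit_log.append((merchant, amount_cents, approved))
        if approved:
            self.spent_cents += amount_cents
        return approved


mandate = SpendingMandate(per_order_limit_cents=5_000, total_budget_cents=8_000)
first = mandate.authorize("shop-a", 4_000)   # within both limits
second = mandate.authorize("shop-b", 6_000)  # exceeds the per-order limit
third = mandate.authorize("shop-c", 4_500)   # would exceed the total budget
print(first, second, third, mandate.spent_cents)
```

Amounts are kept as integer cents to avoid floating-point rounding, and rejected attempts are logged alongside approved ones so the record stays complete.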

7. Smart Hardware: Audio Smart Glasses Based on the Android XR Platform

In collaboration with Samsung, Gentle Monster, and Warby Parker, Google launched the first audio smart glasses with built-in Gemini, based on the Android XR platform. The device focuses on hands-free interaction: users can complete tasks by voice without taking out their phones, including route navigation with real-time map data, access to local lifestyle apps, and product orders with payment verification.

8. Industry Trust: Cross-Platform Adoption of SynthID Digital Watermarking Technology

In response to societal concerns about the uncontrolled spread of generative AI content, Google announced that it is expanding SynthID digital watermarking to Search and the Chrome browser. To date, the technology has embedded provenance markers in over 100 billion images and videos worldwide. Industry leaders including OpenAI, NVIDIA, Kakao, and ElevenLabs have announced that they are adopting the standard, jointly establishing norms for identifying the source of AI content.

Below is the transcript from the 2026 Google I/O Conference:

1. Innovation in Efficiency and Collaboration Tools: Ask Maps, Ask YouTube, and Docs Live

SUNDAR PICHAI: Hello to the live audience here and to everyone watching around the world. It's great to be back at this year's I/O Conference. The past year has been extraordinary, marked by continuous product launches and rapid technological advancements, placing us in a period of hyper-growth. I truly feel that this has been an incredibly fulfilling year. Let me take you through what I've been up to recently.

The scene in the video where I was plugging in the TPU is quite accurate, but I hope this year involves more than just that. There's still a lot of work to do before it goes to space, and we're working on it. Seriously, this is truly a remarkable moment. It has been 10 years since our transition to an AI-First company. We understand the profound impact AI will have in advancing our mission and improving people's lives on a massive scale. That's why we're adopting a differentiated full-stack AI innovation approach: from custom chips and security foundations to world-class scientific research and models, and then to products and platforms reaching billions of users. This approach enables us to iterate and innovate more rapidly, empowering our company in every way.

What's truly astonishing is how people are using our AI. Students use the Gemini app to prepare for final exams; musicians and artists incorporate generative AI models like Lyria and Veo into their creative processes; developers write code to turn their ideas into reality. I personally use Gemini in various ways in my life. Recently, I've been using it to understand my parents' medical reports, and I'm sure many of you have done similar things. These stories about how people use AI are the best measures of its value and progress.

SUNDAR PICHAI: To give a more intuitive sense of the scale at which people are using AI, there's another excellent metric: tokens. Tokens are the basic units of data processed by models, with each token representing a problem being solved. Two years ago, our services processed 9.7 trillion tokens monthly, already a massive number. At last year's I/O Conference, this figure had grown to approximately 480 trillion. Today, it has increased sevenfold to 3.2 quadrillion tokens per month.

Mentioning "quadrillion" in an I/O keynote is indeed rare, but we've truly achieved it. Some might argue that this is just "number chasing," and there's some truth to that. However, I believe it profoundly reflects the vibrant state of our products and the ecosystem developers are building on these models. Currently, 8.5 million users build new applications and experiences with our models each month. Our model APIs now process approximately 19 billion tokens per minute. Over the past 12 months, more than 375 customers have each processed over 1 trillion tokens, a sign of strong industry demand for AI.
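For readers who want to check the scale of these token figures, the sketch below works through the arithmetic. It assumes the sevenfold multiple is measured against last year's figure of roughly 480 trillion and that a month has 30 days. Both are simplifications, and the results are approximations rather than official numbers.

```python
TRILLION = 10**12
QUADRILLION = 10**15

# Monthly token volumes quoted in the keynote.
two_years_ago = 97 * 10**11      # 9.7 trillion
last_year = 480 * TRILLION       # approximately 480 trillion

# Growth between the two earlier data points.
earlier_multiple = last_year / two_years_ago

# What a sevenfold increase over last year's figure works out to.
sevenfold = 7 * last_year

# Monthly volume implied by the quoted API rate of 19 billion tokens per minute.
MINUTES_PER_30_DAY_MONTH = 60 * 24 * 30
api_monthly = 19 * 10**9 * MINUTES_PER_30_DAY_MONTH

print(f"Growth two years ago -> last year: {earlier_multiple:.1f}x")
print(f"Seven times last year's volume:    {sevenfold / QUADRILLION:.2f} quadrillion per month")
print(f"API volume per 30-day month:       {api_monthly / TRILLION:.1f} trillion")
```

Seven times 480 trillion is about 3.36 quadrillion tokens per month, and the API rate alone corresponds to roughly 821 trillion tokens over a 30-day month.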

Of course, the demand for our various products is also extremely high. We now have 13 products with over 1 billion users, five of which have over 3 billion users. Our Gemini models are attracting more people to use our products and increasing their usage frequency.

It all starts with Search, which has allowed the public to enjoy the benefits of generative AI earlier than any other product in the world. AI Overviews now has over 2.5 billion monthly active users. This is a groundbreaking feature and the largest upgrade to search in our history. People love it, and within just one year, its monthly active user base surpassed 1 billion. When people use AI-driven features in Search, they use search more frequently. I really like the current Search; it's no longer limited to single queries but more like an ongoing conversation that not only provides deeper insights but also seamlessly connects you to the vast online world.

Another area where we're rapidly innovating is the Gemini app. At last year's I/O Conference, the Gemini app had 400 million monthly active users. Today, this number has surpassed 900 million, more than doubling in a year. Meanwhile, daily requests have grown more than sevenfold. This is astonishing growth. We've been adding many unique features, such as Personal Intelligence, which makes AI responses more customized and helpful. Currently, over 50 billion images have been generated by our Nano Banana model. It has shone brightly over the past year, and I know everyone has had a lot of fun with it. In addition to the Gemini app, we've also directly integrated Gemini into many of our products, making conversations more natural.

Recently, Maps received its biggest upgrade in a decade with the introduction of a new feature called Ask Maps. People are using it to ask more complex and longer questions. Here's a real query from a parent: "My child just fell into a duck pond, and the wedding is in 30 minutes. Can I buy a new dress within walking distance nearby?" I'd love to know if she managed to buy one successfully.

We're also bringing this conversational AI to two other products. First is Ask YouTube. People come to YouTube every day with many questions. There are a vast number of excellent videos, but sometimes it's hard to know where to start. Ask YouTube completely reshapes this experience. Suppose you want to teach a 3-year-old child to ride a pedal bicycle after they've only learned to ride a balance bike. You can simply ask YouTube and see completely different search results: the information becomes easy to understand and browse. You'll not only get an overview and practical tips but also the videos that best suit your needs. If you want to try a specific teaching method, you can click to learn more. The best part is that it can jump directly to the most relevant part of the video, which reminds me of teaching my kids to ride bikes. It remembers the context, so you can ask follow-up questions like, "Should I buy one with hand brakes or foot brakes?" This makes it a continuous conversational experience. It can even present information in tabular form for easy comparison. We're now beginning to test Ask YouTube, which will officially launch in the U.S. this summer.

So far, we've showcased conversational text queries. But often, I wish I could complete tasks at the speed of speech. Thanks to our leaps in audio technology, this is now a reality. A new feature called Docs Live takes this experience to a whole new level. Previously, when creating documents with Gemini, you had to input very precise prompts. With Docs Live, you can simply verbalize any thoughts in your mind and let Gemini handle the rest. Let's see how it works through a demonstration by the product team. Note that these are all real-time demonstrations without any acceleration.

Okay, let's give it a try. I just remembered that I need to prepare some speaking points for an alumni speech at my alma mater's high school career day tomorrow, explaining to students what exactly a software engineer like me does. Although I am an engineer myself, I'm not entirely sure where to start. Can you retrieve my resume from my Drive? Looking at a resume might be a bit boring, so maybe you can come up with some interesting analogies to make this speech more engaging for the students? Additionally, the school sent me an email with a subject line roughly like "Career Day Logistics" earlier. You can get the specific details from that email and put them at the top of the document, so I'll know exactly where to go and when. Let's update these requirements and directly generate a draft.

【Generated Result】That's cool, but the content is a bit too dense. Maybe present these analogies in a table format to make it easier for me to skim. Then, help me add a note about how my brother inspired me to become a software engineer. Put it near the top of the document and bold it so I don't miss it. Okay, looks great.

In the future, you'll be able to create new Docs and edit them directly—all using just your voice. Docs Live is launching this summer for Pro and Ultra subscribers, with the same powerful voice capabilities coming to Gmail and Google Keep. Seeing the pace of innovation across our products is truly remarkable.

2. AI Infrastructure Upgrades: 8th-Gen Custom Chips TPU 8t and 8i

SUNDAR PICHAI: To provide massive support to our vast user base while serving businesses and developers globally, we need to make significant investments in infrastructure—and we’ve consistently invested for both now and the future. Our annual capital expenditures in 2022 were $31 billion. This year, we expect that number to grow roughly sixfold, reaching approximately $180–190 billion. A key part of this investment is our custom chips.

A decade ago, we unveiled our first TPU on this I/O stage. Since then, we’ve transformed how the industry builds AI. Most recently, at Cloud Next, we introduced our 8th-gen TPU—the first time we’ve adopted a dual-chip strategy, designing specialized architectures for training and inference: TPU 8t and 8i. While they may look similar, they’re fundamentally different.

The 8t is optimized for large-scale pre-training, delivering nearly triple the raw compute of our previous generation. We’ve taken a fundamentally different approach to training infrastructure. With JAX and Pathways, training is no longer confined to a single massive data center. Instead, it can now seamlessly distribute across multiple sites, scaling to run on over 1 million TPUs globally. This enables us to create the world’s largest training clusters. For model builders, this means training larger, more capable models in weeks rather than months. The TPU 8i is designed for inference. We’ve dramatically increased speed at every step because, after 27 years in search, we know latency matters.

To give you a sense of this speed, here’s a prompt running on a Flash model that will soon be served on the 8i. I’ll ask it to create a Chrome Dino game and hit submit, with the response generated in real time. As you watch, notice the tokens per second in the top right corner. The speed is incredible: nearly 1,500 tokens per second. Typing the request takes longer than the generation itself, and the game is pretty fun too. Beyond speed, we’re also thinking about sustainable scaling. Both chips are significantly more power-efficient, delivering up to twice the performance per watt. The TPUs have been training hard for this year’s I/O. From what I understand, there’s some behind-the-scenes footage to share.

Short Film Characters: Hey, how was your weekend? Pretty good—just folded some proteins in a rare tumor dataset. How about you? I simulated 50 years of climate data. I drew a picture of a pug—have you ever seen a pug dressed like an accountant? No, want to see? Alright, TPUs, listen up—I/O is about to start, and we’ve got work to do. Actually, we’ve got trillions of tasks, so clear your caches. Timmy! Dry off your fans—let’s bring the heat. Hey, what are you doing? Editing a montage. Come on, stop with your montage and get down here! What? Now? Yeah, right now.

SUNDAR PICHAI: I’d bet that after I/O, TPUs like Timmy are ready to lie flat and take a break. Our compute innovations are driving our own progress.

Today, I want to dive deep into three areas—models, programming, and agents—to show you how far we’ve come. Let’s start with the exciting progress in World Models. Through world models, AI is shifting from predicting text to simulating reality. Demis and the Google DeepMind team have been pushing the boundaries of what these models can do. Let me invite Demis out to share more.

3. World Models Breakthrough: Gemini Omni and Omni Flash

DEMIS HASSABIS: Hello, everyone—great to be here. Over the past year, AI capabilities have taken a leap forward. We now have agents that can plan and take action for us. We’re also just a few years away from achieving Artificial General Intelligence (AGI). Today, I’m excited to share our progress toward building AGI.

Last year, I outlined our vision to scale Gemini’s multimodal capabilities into an AI world model that understands and simulates the world. This is key to achieving AGI and will have profound implications—from building AI assistants to training robots. Now, we’re taking the next step, and I’m thrilled to announce Gemini Omni.

This new model can generate anything from any input. It combines Gemini’s intelligence with our best generative media models, enabling entirely new levels of world understanding, multimodality, and editing capabilities. Models like Veo, Nano Banana, and Genie can already create incredibly realistic videos, images, and interactive simulations. While not yet perfect, they demonstrate an impressive intuitive understanding of physical concepts. With Omni, we’ve made even greater strides, achieving breakthroughs in simulating kinetic energy, gravity, and other concepts that previous systems struggled with. Gemini’s world knowledge and reasoning shine in Omni, allowing it to turn complex ideas into highly accurate videos. For example, you could give it a simple prompt like, “Create a stop-motion explainer video about protein folding,” and here’s what you’d get.

Video Narrator: Proteins start as chains of amino acids. They fold into specific patterns—like alpha helices and flat sections called beta sheets—to form their perfect 3D structure.

DEMIS HASSABIS: But initial generation is just the beginning. Creation is rarely a one-step process—it’s iterative. Just as Nano Banana redefined image editing, Omni gives you a more natural way to edit videos through conversational language. What’s really cool is that you can provide your own video—like a selfie—and alter reality in a fun way. You can easily adjust details and styles or even add new elements, with the entire scene evolving to reflect your new vision.

A simple circle could become a black hole, or a sunset stroll could come alive. Anything can become a canvas for creating entirely new realities. Let’s take a look at what Omni can do through a short film. We’re starting with video generation, but over time, Omni will generate any output from any input. That’s always been our goal—and it’s why we built Gemini as a native multimodal model from the start. While it was a harder path, the solid foundational architecture is now paying off.

Today, we’re launching the first model in the Omni series: Gemini Omni Flash. It’s already powering our products, and you’ll hear more about that later. I’m incredibly excited about our progress and will soon share more about Omni Pro. Can’t wait to see your amazing creations—now, back to Sundar.

4. AI Content Transparency: SynthID Expands Across Platforms

SUNDAR PICHAI: Thanks, Demis—this is huge progress. As generative AI gets better, the need for greater transparency grows. Research shows people can only correctly identify high-quality deepfake videos about a quarter of the time.

Three years ago, we introduced SynthID—an invisible watermarking technology. Since launch, SynthID has watermarked 100 billion images and videos, along with 60,000 years’ worth of audio assets. Millions are using the SynthID Detector in the Gemini app to verify AI-generated content.

Now, we’re going further by adding cross-product Content Credentials verification. This will show you whether content came from AI or a camera—and whether it was edited using generative AI tools. In this example, Gemini can tell that this photo was taken with a Pixel camera and then edited using Google Photos. We want to make these tools accessible to more people.

So, we’re expanding SynthID and Content Credentials verification to Search and Chrome. With Circle to Search or a right-click in Chrome, you can simply ask, “Was this generated by AI?” and get a clear answer along with helpful context. For instance, this image went viral on social media last year—but it’s clearly fake because I don’t eat burgers. That might not be obvious to everyone, which is where these tools really help. Of course, this only scales when more partners choose to watermark their AI-generated content. NVIDIA signed on with SynthID last year, and today I’m thrilled to announce that OpenAI, Kakao, and ElevenLabs are adopting SynthID as well. It’s great to see this cross-industry collaboration, and we look forward to expanding it further—setting a new standard for transparency in the AI era.

That wraps up our progress on World Models. Now, let’s talk about what’s next for the Gemini 3 series.

5. Major Model Series Upgrade: Gemini 3.5 Flash and 3.5 Pro

SUNDAR PICHAI: Launched just a few months ago, Gemini 3 features a full family of models and is our most widely adopted series yet. We’ve been thrilled to see developers use Flash as their everyday workhorse model and leverage Pro’s deep reasoning capabilities to build amazing multimodal experiences. We’ve been focused on improving these models—especially for agentic programming, long-running tasks, and real-world workflows.

Today, I’m excited to introduce Gemini 3.5 Flash—our first model family that combines cutting-edge intelligence with action-taking capabilities. There are two key points I want to highlight:

First, compared to 3.1 Pro, 3.5 Flash performs better on nearly every benchmark. It’s made huge strides in programming, delivering an astonishing leap forward on the GDPval benchmark, which covers many tasks with real economic value.

Second, 3.5 Flash is a powerful industry-leading model that rivals the top models but is much faster. That’s why, when you look at the chart of intelligence versus output speed, it sits alone in the top-right quadrant. In terms of tokens per second, it’s four times faster than other leading models—delivering a truly remarkable experience.

This new model has been transformative for Google internally. We’ve been using 3.5 Flash in Antigravity, our reimagined agent-first development platform, and it’s dramatically accelerated our build process. In March, we were processing 500 billion tokens per day for internal developers—a number that doubled every few weeks—and now we’re processing over 3 trillion tokens daily. This scale creates a powerful feedback loop that’s helping us continuously improve 3.5. Today, we’re bringing it to developers everywhere with Antigravity—let me hand it over to Varun to share more.
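The "doubled every few weeks" remark can be checked against the two quoted volumes. The sketch below assumes the 500 billion figure dates from some point in March 2026 and that "now" is the keynote date of May 20, 2026. The exact start date is not given, so both ends of March are tried.

```python
import math
from datetime import date

START_TOKENS_PER_DAY = 500e9   # internal daily volume in March
CURRENT_TOKENS_PER_DAY = 3e12  # internal daily volume at the keynote
KEYNOTE = date(2026, 5, 20)    # assumed "now", taken from the article's dateline

# Number of doublings needed to grow from the March figure to today's.
doublings = math.log2(CURRENT_TOKENS_PER_DAY / START_TOKENS_PER_DAY)


def doubling_period_weeks(start: date) -> float:
    """Average weeks per doubling if growth began on `start`."""
    elapsed_weeks = (KEYNOTE - start).days / 7
    return elapsed_weeks / doublings


slowest = doubling_period_weeks(date(2026, 3, 1))   # growth measured from early March
fastest = doubling_period_weeks(date(2026, 3, 31))  # growth measured from late March

print(f"Doublings required: {doublings:.2f}")
print(f"Implied doubling period: {fastest:.1f} to {slowest:.1f} weeks")
```

A sixfold increase is about 2.6 doublings, which implies a doubling period of roughly three to four and a half weeks under these assumptions.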

6. Agentic Development Platform: Antigravity 2.0 Desktop App

VARUN MOHAN: This truly is an incredible time to be a builder. We’ve moved beyond AI tools that just help write code to agents that actually help execute tasks. These agents are dramatically lowering the barrier to development, enabling anyone to become a builder, even busy CEOs. In fact, Sundar used Google Antigravity to fix a bug in the Google codebase just last week. When we launched the Antigravity IDE in November, we made sure the core agent-driven IDE experience was second to none, and we added an experimental, first-of-its-kind agent surface to show where things are headed. Millions are already using Antigravity, so we’re excited to bring you more today. We’ve observed the diversity of tasks and preferences, listened to candid product feedback, and learned from experience.

Now, Antigravity is massively expanding its agentic capabilities, interfaces, integrations, and product line features.

First, we’re launching a full CLI experience, the Antigravity SDK, native voice support with Gemini audio models, and integrations with multiple interfaces and platforms such as Android, Firebase, and Google AI Studio. All of this is ready for you to try today.

Most importantly, at its core is Antigravity 2.0—a brand-new standalone desktop application that fully realizes our original vision for a truly agent-optimized experience. The new Antigravity is unapologetically agent-first, focusing on core agentic conversation, agent-generated artifacts, and multi-agent orchestration.

As Sundar mentioned, this is the exact experience that Google’s internal teams have long used to create tremendous value. The Antigravity Agent Harness is the invisible scaffolding for Gemini to execute real-world tasks, now made even more powerful with new core primitives like Subagents, Hooks, and async task management.

Underpinning all of this is the Gemini model, with Gemini 3.5 Flash deeply co-optimized with the Antigravity Harness. As engineers, we were curious to see just how far these agents and models could push the boundaries of possibility.

So with the new Antigravity and Gemini 3.5 Flash, we asked agents to take on a task considered highly complex and impressive: build a working operating system from scratch. We were blown away by the results. Antigravity asynchronously broke down the challenge into a coherent plan, handling tasks with parallel subagents—generating, executing, and iterating. In internal testing, 93 subagents worked in parallel for over 12 hours, initiating over 15,000 model requests, processing 2.6 billion tokens, and developing an initially empty project into a fully functional OS kernel. This would have been impossible on Gemini 3.1 Pro, but thanks to the performance and cost-efficiency of Gemini 3.5 Flash, building such a fully functional OS consumed less than $1,000 in API credits.

The Antigravity agents wrote every line of code—from the scheduler to memory and filesystem management—all generated, audited, and tested by an autonomous team of agents. To be clear, developing an OS from scratch is notoriously brutal and can take months. We weren’t just building an app—we were building a fully functional OS on which apps can run.
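The plan-then-fan-out pattern described here can be sketched with Python's standard `asyncio` library. This is only an illustration of parallel sub-task orchestration in general, not Antigravity's implementation: `run_subagent` and `orchestrate` are hypothetical names, and the sub-agent body is a stand-in that does no real work.

```python
import asyncio


async def run_subagent(name: str, task: str) -> dict:
    """Stand-in for a sub-agent; a real one would call a model and tools."""
    await asyncio.sleep(0)  # yield control, simulating asynchronous work
    return {"agent": name, "task": task, "status": "done"}


async def orchestrate(tasks: list[str], max_parallel: int = 3) -> list[dict]:
    """Run one sub-agent per task, with at most `max_parallel` active at once."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(index: int, task: str) -> dict:
        async with semaphore:
            return await run_subagent(f"subagent-{index}", task)

    # gather() returns results in the same order as the submitted tasks.
    return await asyncio.gather(*(guarded(i, t) for i, t in enumerate(tasks)))


plan = ["scheduler", "memory management", "file system"]
results = asyncio.run(orchestrate(plan))
for result in results:
    print(result)
```

The semaphore caps concurrency, which is one simple way to keep a large fan-out from overwhelming a rate-limited backend.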

Let’s do a live demo and actually show this OS running.

I’m now in a terminal window of the OS built by Antigravity. Demonstrating a working OS isn’t easy, so let’s have some fun and see if it works. We can install a fun utility, sl, named after a common typo for the ls command. Without spoiling it, let’s just see what happens. There it is: you can see a cool Antigravity-branded motorcycle riding across the screen.

But obviously, it’s not a real OS unless you can run Doom. I’m now trying to run Doom, but it’s not working—turns out we’re missing some necessary video and keyboard drivers. So let’s try to fix that with the new Antigravity. I’ve got a prompt ready to paste in, but while that runs, let’s take a tour of Antigravity 2.0.

As you can see, Antigravity 2.0 is fully Agent-First—all agent conversations and projects are shown in the sidebar. Let’s quickly look at a conversation I had earlier. For this demo, I was curious about some fun facts about Doom, so I had the agent do some research. It generated charts on the right panel and even created a cool artifact for me at the end. It even used Nano Banana Pro to generate an infographic, using code it just wrote to generate charts, and then some cool tables. As you can see, Antigravity 2.0 is unapologetically Agent-First and optimized to be the best interface for you to interact with agents.

Let’s go back to the previous conversation and see how far we got. Antigravity ultimately did a ton of research, wrote over 100 lines of code, fixed the problem, and rebuilt the OS. Let’s see if Doom runs now. This is the moment of truth. Perfect! That’s awesome! This game never gets old. While running Doom on an OS built by Antigravity is fun and impressive, the progress doesn’t stop there. We’ve already asked agents to build a photo editing suite, a real-time messaging app, and a multi-user collaboration platform, all with the same high-quality results. Thanks to the new Subagent teamwork capabilities, days-long engineering tasks are shrinking to hours or even minutes.

We’re excited to bring you this power in Antigravity as an early research preview. Last but not least, 3.5 Flash is insanely fast. As Sundar said, it’s 4x faster than other cutting-edge models. But as we all know, agent programming is incredibly token-intensive, so we’ve supercharged its performance in Antigravity. We’ve optimized Flash exclusively to be not just 4x faster in Antigravity, but a staggering 12x faster.

We’re so excited to let you experience all of this starting today. What we showed you today isn’t just a vision—it’s how we’re making Antigravity the most complete agent development platform for everyone. We’re doing this through the Google ecosystem, whether by integrating with the tech stacks and tools you already use or by powering the next wave of agent experiences across Google products with Antigravity’s Agent Harness. Today, Antigravity 2.0 is available to everyone, everywhere. Please join our developer keynote where we’ll demo all the new capabilities. Back to you, Sundar.

SUNDAR PICHAI: Thank you, Varun. Incredibly, the entire OS Varun demoed was built by a team of subagents in just 12 hours and at such a low cost. What’s remarkable about Flash is that it delivers cutting-edge capabilities at less than half the price of comparable frontier models.

We’ve heard stories of companies burning through their entire token budget for the year by May. If companies blend Flash with other frontier models, they can save a tremendous amount of money. To put this in perspective, Google Cloud’s top customers process about a trillion tokens per day. If they shifted 80% of their workload from other frontier models to 3.5 Flash, they’d save over $1 billion a year. That’s real money that can be reinvested back into their businesses.
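The savings claim above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below uses only the figures quoted in the keynote (about a trillion tokens per day, an 80% shift, over $1 billion saved per year) and derives the per-token price gap they imply. The function name, the 365-day year, and the reading of "a trillion tokens per day" as the total workload being shifted are assumptions for illustration; no actual model prices are used, since the transcript does not state them.

```python
def implied_savings_per_million_tokens(
    tokens_per_day: float,
    shifted_fraction: float,
    annual_savings_usd: float,
    days_per_year: int = 365,  # assumption: the workload runs every day of the year
) -> float:
    """Price gap (USD per million tokens) needed to reach the stated annual savings."""
    shifted_tokens_per_year = tokens_per_day * shifted_fraction * days_per_year
    return annual_savings_usd / (shifted_tokens_per_year / 1_000_000)


# Figures quoted in the keynote: ~1 trillion tokens/day, 80% shifted, >$1B saved per year.
gap = implied_savings_per_million_tokens(1e12, 0.80, 1e9)
print(f"Implied blended price gap: ${gap:.2f} per million tokens")
```

Under these assumptions, the claim holds whenever 3.5 Flash is at least roughly $3.42 cheaper per million tokens, on a blended basis, than the models it replaces.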

Gemini 3.5 Flash is available to everyone starting today across our products and APIs. We’re also very excited about 3.5 Pro—we’re using it internally, and it’s showing tremendous progress. I know you can’t wait to get your hands on it, so please give us one more month to deliver it to you.

7. Your Personal AI Agent: Gemini Spark

SUNDAR PICHAI: Gemini 3.5 and Antigravity are unlocking a new world of what agents—and agentic capabilities—can do. We’ve been focused on delivering agents for developers and enterprises. Now, we’re fully focused on bringing this power to consumers safely and responsibly—to make it useful for everyone.

Today, you’ll see rich agent experiences across many of our products. I’m incredibly excited about the new capabilities we’re bringing directly into the Gemini App—introducing Gemini Spark. It’s your personal AI agent that helps you with your digital life, taking actions on your behalf under your guidance. It runs on a dedicated virtual machine in Google Cloud and is online 24/7. That’s right—you can close your laptop. Powered by Gemini 3.5 and Google Antigravity Harness, it can effortlessly handle long-running tasks in the background. Spark integrates seamlessly with a wide range of tools—starting with our own and expanding to third-party tools via MCP in the coming weeks. You can collaborate with Spark in the way that’s most convenient for you—whether in the Gemini app or soon through email and chat interactions. Let’s invite Josh to share more.

JOSH WOODWARD: Thanks, Sundar. Great to see you all. Let me show you how Spark works through some examples from my personal life. We’re now in the newly redesigned Gemini, which we’ll talk more about later in the show.

I want to show you Spark here. You can see a dashboard of all the tasks I have running in the background, and it lets you review them. Let me paste in a task, a very straightforward but practical example: “Help me draft an email to the team summarizing everything about our recent Gemini Live launch and what we accomplished last week.” I’m using a slash command to invoke ghost writer, so a few things are happening now. It’s compiling everything across Docs, email, and chat threads and pulling out the most important information needed for this update. It’s also drawing on everything I’ve invoked ghost writer on over the last week. This is a personal skill I wrote, so the email sounds very much like my voice. And the best part is that with Spark, you can upload your favorite skills found online. So we’re going to let that run in the background. You can see it’s already executing various tool calls.

Now I’ll switch to another example from my personal life. We’re planning a big block party. This is a fairly complex prompt, and we want it to help collect all the RSVPs, keep track of who’s bringing what, and remember to email neighbors who haven’t signed up yet.

What’s amazing here is that Spark will step through all of this sequentially and save you a ton of time as it collaborates across various skills and apps. It’ll break down the task and generate documents for you. The first is a live RSVP tracker running directly in Google Sheets. It shows who’s confirmed and who hasn’t. And it’ll actually update automatically because it’s connected to Gmail. When L. Thompson replies with 8 RSVPs, it’ll update automatically—that’s pretty remarkable.

Another thing it’s doing is tracking all the different guests and sending follow-up reminders to those who haven’t signed up yet, again automatically. It’ll create drafts and send them under my control. And finally, the prompt also generated a warm-up presentation for the block party directly in Google Slides, fully integrated with the rest. It even included things like the giant bounce house we’re putting at the end of the cul-de-sac. All of this is happening in the background and under my control. Gemini can even go a step further and pull out details, such as the fact that our neighborhood HOA doesn’t allow setup until the afternoon of Friday, June 5th, which it pulled from a document in my Google Drive. So it does a tremendous job of pulling everything together.

This shows Spark running on the laptop, but it’s equally amazing on mobile—on both Android and iPhone. Opening it up on my phone, you can see our two previous tasks just synced over. They stay in sync across all your devices, which is incredibly helpful.

Spark is amazing at capturing the little fragments of inspiration in your mind. If you’re super busy, you can just toss tasks over the wall to Spark, and it’ll catch them and start working. Watch this: “Help me spin up a few threads. First, find all upcoming meetings with Sundar and flag them bright pink so I don’t miss them. Second, write a note to our new neighbor John and his family, who moved in last night, inviting them to our block party since they weren’t on our original list. Third, create a document listing the most important things my wife and I need to do for our kids before the school year ends, organized by deadline and priority, and make it easy to understand. I don’t want to miss anything.” After I send that request, it captures all the context at the speed I talk and handles the tasks. It starts with a single thread, but in the background, it’s actually going off and breaking these down into individual tasks. Now I can just put my phone away and continue with my day, and Spark will work on this for me in the background. This is the first time on the I/O stage that we’ve been able to put our phones down and let it keep working. This is amazing.

For security reasons, we’re cautiously releasing Spark to trusted testers this week and rolling it out as a beta to Google AI Ultra subscribers in the U.S. next week. We want this new kind of help to be available to as many people as possible, so we’re introducing a new Ultra plan at $100 a month. And for those who need the highest limits, we’re reducing the top Ultra plan from $250 a month to $200 a month.

There’s so much more coming—later this summer, Gemini Spark will run directly in Chrome as your agentic browser across the entire web, taking actions and completing tasks under your guidance. We’re also building a dedicated home base for your agents on your phone—Android Halo—coming later this year. As Sundar said, we’ve now entered a whole new agentic era across Google, and we can’t wait to see what you all build with it. Back to you, Sundar.

SUNDAR PICHAI: Thanks, Josh. It’s great to see Gemini Spark taking actions on your behalf. I’ve tried all sorts of agents, and you can really see the potential here. We’re still early in making agents easy to use, ultra-safe, and genuinely helpful. That’s why I’m so excited about Gemini Spark. We’re laying the foundation to bring all of this safely and responsibly to consumers everywhere—can’t wait to let you all try it.

We’re firmly in the agentic era of Gemini. Gemini Spark is the first experience you’re seeing, powered by the 3.5 model and Antigravity together. This combination gives us new ways to deliver on our mission and fundamentally transform our products to be more helpful. I can’t wait to see how it’ll transform Search—our ultimate moonshot. This past year has shown just how powerful innovation can be, and that’s at the heart of our mission for information. As we enter the agentic era, Search will be more useful and powerful than ever before. Now I’ll hand it over to Liz to share what’s next.

8. Redefining Search Engines: AI Search and Search Agents

LIZ REID: People ask billions of questions on Search every day. Sometimes the whole world is searching for the same thing, but more often, your question is as unique as you are. That’s why we’re committed to letting you ask any question, in any way you want. To make that happen, we’ve been working to combine the best of search with the power of AI.

We began this transformation with AI Overviews. Right here on this stage last year, we launched AI Mode—our most capable AI Search, powered by our cutting-edge Gemini model. As of today, we’re upgrading it to Gemini 3.5.

As Sundar mentioned, AI Mode has already surpassed 1 billion monthly active users, and we’re seeing incredible growth. Since launch, queries in AI Mode have doubled every quarter. And as Search gets better, people are asking more questions—so much so that last quarter, search queries hit an all-time high.

What’s even more remarkable is how specifically and thoughtfully you’re asking real questions because you know Search can actually handle them. You’re having true back-and-forth interactions with Search, diving deeper and deeper. You’re not just asking for nearby hiking trails—you’re asking for a full day trip of nearby hikes with breathtaking views, pet-friendly routes, and a lunch spot with easy parking.

Now, we’re entering a new chapter for Google Search, where amazing AI capabilities aren’t just inside Search—Google Search is now, at its core, an AI Search. It’s an AI Search powered by our most advanced Gemini model, our newest agent capabilities, and the world’s most comprehensive information.

We update over 1 billion facts every minute, index billions of new web pages every day, and connect to an infinitely broad range of human perspectives. So whatever you think of, you can come to Google and ask anything. First, I’m excited to announce that we’re launching an entirely new Search box. The Search box used to be a confined space, but now, it’s been completely reimagined with AI—expanding to follow your curiosity wherever it leads.

As you ask your question, Search helps you refine it with AI-powered suggestions. This goes beyond autocomplete—it offers nuances you might never have thought of, helping you easily articulate the exact question in your mind. This new Search box puts our most powerful AI tools at your fingertips. You can ask cross-modal questions with text, images, files, and video—and search across all those dimensions. This is the biggest upgrade to our iconic Search box since it launched 25 years ago, and it starts rolling out today.

Next, we’re bringing together AI Overviews and AI Mode into one seamless AI Search experience, making conversations with Search even easier. You can effortlessly jump from a question to an answer on the main search results page to a follow-up in AI Mode. Your context stays with you, and the conversation deepens. Your links and sources become even more relevant to your needs, giving you the best of AI and the web, all in one place. This new, seamless AI Search experience is rolling out globally on desktop and mobile starting today.

Earlier, you heard Sundar and Josh share their vision for agents and the possibilities they unlock. Now, we’re taking an exciting step toward that future, where you’ll be able to create and manage multiple AI agents directly in Search to handle tasks for you.

We’re entering the era of Search agents. You’ll have information agents working for you in the background, 24/7. They’ll find exactly what you need and help you take action at critical moments. You’ll be able to launch multiple agents at once in Search to stay updated and make progress on everything that matters to you. These agents, working with Gemini Spark, will help you get even more done. Let’s look at some real-world examples. Say you’re deeply interested in finance and want to stay on top of large-cap biotech stocks with a P/E ratio under 15, positive cash flow, and low debt.

You just ask, and your agent gets to work. It takes your incredibly complex question and develops a strategy, assessing urgency, understanding that you need real-time intelligence, and setting triggers to monitor changing information while selecting the right tools and data hooks for the job. It connects directly to our live financial data, giving you second-by-second updates on stock prices and market insights—so you’re always in the know when markets move. When something changes, your agent sends you a smart, synthesized update. It helps you make sense of the situation, separating signal from noise and surfacing key insights from the chaos. It also points you to highly relevant crowdsourced research platforms, news sites, and social media content. This helps websites and creators get their fresh content discovered by people who truly care, at the exact moment they care most.
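The monitoring behavior described above (a standing screen plus triggers that fire when information changes) can be sketched as a filter and a diff between two snapshots. This is a minimal illustration, not how Search agents are implemented: the `Stock` fields, the large-cap and low-debt thresholds, and the sample tickers are all hypothetical. Only the P/E under 15 and positive cash flow criteria come from the example in the keynote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stock:
    ticker: str
    sector: str
    market_cap_usd: float
    pe_ratio: float
    free_cash_flow_usd: float
    debt_to_equity: float


# Hypothetical thresholds: the keynote does not define "large-cap" or "low debt".
LARGE_CAP_MIN_USD = 10e9
LOW_DEBT_MAX_RATIO = 0.5


def matches_screen(s: Stock) -> bool:
    """Large-cap biotech with P/E under 15, positive cash flow, and low debt."""
    return (
        s.sector == "biotech"
        and s.market_cap_usd >= LARGE_CAP_MIN_USD
        and 0 < s.pe_ratio < 15  # assumption: negative P/E (losses) does not qualify
        and s.free_cash_flow_usd > 0
        and s.debt_to_equity <= LOW_DEBT_MAX_RATIO
    )


def screen_changes(before: list[Stock], after: list[Stock]) -> tuple[set[str], set[str]]:
    """Return (tickers that newly match, tickers that dropped out) between snapshots."""
    old = {s.ticker for s in before if matches_screen(s)}
    new = {s.ticker for s in after if matches_screen(s)}
    return new - old, old - new


# Made-up snapshots: AAA's P/E falls below 15, BBB's cash flow turns negative.
yesterday = [
    Stock("AAA", "biotech", 40e9, 18.0, 2e9, 0.3),
    Stock("BBB", "biotech", 25e9, 12.0, 1e9, 0.2),
]
today = [
    Stock("AAA", "biotech", 38e9, 14.2, 2e9, 0.3),
    Stock("BBB", "biotech", 24e9, 12.5, -1e8, 0.2),
]

entered, left = screen_changes(yesterday, today)
print("entered:", sorted(entered), "left:", sorted(left))
```

In this toy version, a notification would be sent only when `entered` or `left` is non-empty, which is one simple way to separate signal from noise.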

Or say you’re looking for an apartment. You can share all the ideas in your head—location, natural light, availability—and your agent will continuously scan the entire web, covering major sites, social media, and forums. Or if you’re a sneakerhead, you can simply ask to be notified whenever any of your favorite athletes drop a sneaker collab or release. It’ll monitor everything from blogs to our Shopping Graph, so you never miss a thing. This summer, you’ll be able to put information agents to work for you, simply by asking Search to keep you updated on whatever you want to know.

Information agents are among the first agents we’re launching in Search to do more for you. So whether you’re looking to find, verify, book, buy, or do anything else, Search has your back. We’re also bringing agentic programming into Search, enabling it to craft custom experiences for your questions. To show you exactly how this works, let’s bring out Robby.

ROBBY STEIN: We believe the best version of Search is one that’s created just for you, presenting information in the format that’s most helpful for answering your question. We’ve spent years perfecting this. If you’re shopping, we give you products. If you’re asking for data, you see charts. If you’re looking for inspiration, you get beautiful visuals.

Now, we’re taking this to a whole new level by bringing agentic programming—powered by Antigravity and Gemini 3.5 Flash—directly into Search. So Search can instantly and fully customize and build the ideal format for your question, including dynamic layouts, interactive components, or even entire experiences tailored just for you. This is agentic programming at Search scale.

Let me give you an example. Say I’m a college student trying to understand astrophysics. I can go straight to Search and ask how black holes affect spacetime. In AI Overviews, I get an interactive visualization right away. Search realizes that for such a complex concept, I need interactivity to truly understand it. That’s just the starting point, so I follow up—Show me how two orbiting black holes, like a binary system, produce gravitational waves. Search dynamically builds a brand-new interactive visual in real time, completely customized for my specific question. I can adjust parameters like orbital separation and mass ratio, see the waveform patterns change, and watch the smaller black hole spiral around the larger one. Now that I’ve got the basics, I can dive deeper into resources like the LIGO Discovery Papers to learn more.

You might be wondering how exactly Search builds these custom UIs for billions of unique questions. With Gemini 3.5 Flash, Search can plan the ideal response from scratch, handling design, deciding which custom components to build, conducting research, and ultimately deploying the code. To build custom components in the response, Search calls on an agentic coding framework powered by Antigravity, so it can read and write files and execute code in a secure, containerized environment. This is the same technology Varun used to build an entire operating system, and we’re bringing that power directly into Search. Generative UI with Antigravity will be available to everyone for free starting this summer. So no matter what you want to know, whether it’s how a watch actually works or an analysis of the new cost of your commute, you’ll get an answer as unique as your question.

Let’s go even further. Some projects aren’t just one-off questions—they’re ongoing tasks. Now, Search can help you build complete, custom, stateful experiences, complete with tools, trackers, and dashboards. Think of these as your own little apps built right into Search, especially useful for long-running tasks that require ongoing attention, like planning a wedding or managing a move.

Want to build one together? I’ve been thinking about what to do with my family this weekend, so here’s what I just searched. Beyond the great response from AI Mode, Search proactively offered to build me a weekend plan. Just as we saw Search create generative UIs and interactive visuals from scratch, Search can now write code. To show you how this works behind the scenes, as it builds, you’ll see a live stream of its thought process and code generation. Search is thinking through the right components: not just getting information, but presenting it in the best possible way. I’ve chosen to securely connect Gmail, Photos, and Calendar, so it can reference personal context like receipts and calendar entries to make suggestions even more useful. It generates a beautiful plan that already accounts for drive times and weather.

Search knows I have two kids, love animals, and that my oldest is learning chess—so the second option is great for them. But to keep both kids happy, I’m going to select Happy Hollow Park and Zoo. And since it’s synced with my calendar, it’s already blocked off my afternoon to meet a friend for a game. All those super cool restaurant reservations below are beautifully presented on Maps. Now that I’ve seen these agents in action, I want to prioritize the First Lady even more, and my wife and I try to schedule date nights on Friday evenings. So I’m going to keep customizing—add a weekly Friday night date and move it to the top. Just like before, it thinks through the actions needed to adjust the plan, queries real-time information, and even double-checks my preferences—all incredibly fast. It’s building in real time using all of Google’s information, so now at the top, I see the map and the Friday Date Night tab. Scrolling down, I see great restaurant options for after the babysitter arrives. Once we pick one, we’re ready to go.

No weekend plan is complete without approval from my wife, Danielle, so I’m sharing this app with her. When she gets it, this is exactly what she sees on her phone. Danielle’s coming in—she might have some feedback for me when I get home, but we’ll handle that later. All I have to do is add it to my calendar, and Search will add it to all our family calendars, and we’re all set. I could plan a brand-new weekend for my family just like this next weekend too.

We’re bringing Antigravity to Search, with generative UI starting to roll out to subscribers this summer. And in the coming months, you’ll be able to custom-build experiences just like this. From Search agents to agentic programming, this is an AI Search that does more for you. No matter what you ask, agent capabilities will transform all the ways you use Search—including how you shop. To tell you more, here’s Vidhya.
