Silicon Valley Frontier | The First Year of Physical AI: A Trillion-Dollar Gamble on 'How the World Works'



04/03 2026

In March 2026, AMI Labs, founded by Turing Award winner and former Meta Chief AI Scientist Yann LeCun, announced the completion of a $1.03 billion seed funding round.

Almost simultaneously:

World Labs, founded by Fei-Fei Li, completed a new round of financing of approximately $1 billion.

Google DeepMind released the Genie 3 world model.

Tesla continued to advance the deployment of its Optimus humanoid robots in factories.

These events are not isolated but collectively point to a clearer trend: AI is moving from 'understanding the digital world' to 'understanding and acting upon the physical world.'

If 2024 was the expansion phase for large language models (LLMs) and 2025 was the exploration phase for AI agents, then 2026 marks a shift in Silicon Valley's core narrative toward a more fundamental question: Can AI truly understand 'how the world works' and complete tasks in reality?

This is not merely a change in technical direction but signifies a rewriting of the industrial value chain. Over the past two years, the main battlegrounds of AI competition have focused on a few high-barrier areas such as models, computing power, and data centers. However, as AI begins to truly enter the physical world, competition is no longer confined to the model layer but extends simultaneously to hardware, system integration, data collection, simulation environments, supply chain coordination, and real-world deployment. In other words, Physical AI brings not just a single breakthrough but a complete restructuring of the entire infrastructure system.

As such, this round of changes may represent not just a new wave of technological excitement but a rare structural opportunity for the Chinese-speaking world, particularly Chinese entrepreneurs, engineers, and investors. Unlike the previous race, which was dominated by large-scale model training resources and massive capital, Physical AI relies more on composite capabilities: understanding both algorithms and engineering, and combining system coordination with deep work in manufacturing, supply chains, and industrial scenarios. Teams with technical depth, hardware collaboration capabilities, and a global industrial vision that spans both China and the U.S. are better positioned to seize critical roles in this new cycle.

In other words, Physical AI is not just a new story being told in Silicon Valley; it may also be the most important entry ticket for Chinese players in the next round of global technological infrastructure transformation.

01┃ The Century-Long Debate Between Two Paths: LLM Camp vs. World Model Camp

Over the past three years, large language models (LLMs) have nearly dominated AI's developmental path, with their core paradigm based on next-token prediction over massive text datasets. However, the boundaries of this paradigm are gradually becoming apparent: it can 'describe' the physical world but lacks executable understanding; it lacks modeling capabilities for causality and physical constraints; and it performs poorly on continuous decision-making and long-horizon tasks.

Therefore, a faction led by Yann LeCun has begun promoting an alternative path: World Models—predicting 'states' rather than 'text.' The core difference is that LLMs use text as the learning object and language as the output form, essentially remaining at the level of 'cognition and expression,' while world models use the state of the physical world as the modeling object, directly aiming for a closed-loop capability of 'perception-decision-execution.'
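To make the 'predict states, not text' idea concrete, here is a minimal Python sketch of a decision loop built on a state-predicting model. It is a toy illustration only, not JEPA or any company's architecture: the point-mass dynamics, the names `State`, `predict_next_state`, `choose_action`, and `run_closed_loop`, and all parameter values are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Toy physical state: a unit point mass on a line (hypothetical)."""
    position: float
    velocity: float


def predict_next_state(state: State, force: float, dt: float = 0.1) -> State:
    """Toy 'world model': predicts the next state given an action (a force)."""
    velocity = state.velocity + force * dt
    position = state.position + velocity * dt
    return State(position, velocity)


def choose_action(state: State, target: float,
                  candidates=(-1.0, 0.0, 1.0), horizon: int = 10) -> float:
    """Decision step: imagine each candidate force held for `horizon` steps
    inside the model and pick the one whose predicted position ends closest
    to the target."""
    def predicted_cost(force: float) -> float:
        imagined = state
        for _ in range(horizon):
            imagined = predict_next_state(imagined, force)
        return abs(imagined.position - target)

    return min(candidates, key=predicted_cost)


def run_closed_loop(start: State, target: float, steps: int) -> State:
    """Perception -> decision -> execution. In this toy, the 'real world'
    is the same function as the model, so there is no model error."""
    state = start
    for _ in range(steps):
        action = choose_action(state, target)       # decision
        state = predict_next_state(state, action)   # execution
    return state


if __name__ == "__main__":
    print(run_closed_loop(State(0.0, 0.0), target=1.0, steps=5))
```

The contrast with next-token prediction is that the model's output here is a predicted physical state that a planner can act on directly, rather than a description of that state.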

This is not just LeCun's personal judgment. In Q1 2026, the world model direction saw several key advancements almost simultaneously: AMI Labs, with JEPA as its core architecture, clearly bet on a long-term route of 'research first, then product'; World Labs entered from 'spatial intelligence,' attempting to enable AI to truly understand relationships, occlusions, and physical constraints in the three-dimensional world; Google DeepMind advanced real-time interactive dynamic environment generation through Genie 3 and used it for agent training.

While their paths differ, all three companies point to the same trend: AI's next leap is not just about generating better text but about more accurately modeling the world and completing actions within it.

02┃ The Hardware War: Who's Building the 'Body'?

World models address the 'brain' problem—how AI understands the physical world. But the other half of the Physical AI battlefield is equally intense: who is building the 'body'?

In 2026, the humanoid robot race has fully transitioned from 'lab demos' to 'factory mass production.' Here are some key figures:

Tesla Optimus Gen 3: Over 1,000 units have been deployed at Gigafactory Texas and Fremont Factory, performing part handling and assembly tasks. This marks the largest-scale deployment of humanoid robots in factories in human history. Tesla is building a dedicated factory in Giga Texas with an annual capacity of 10 million units, aiming to reduce the cost per unit to $20,000—down from the industry average of $50,000–$250,000 two years ago.

Boston Dynamics Atlas: The production version of Atlas unveiled at CES 2026 stands 6.2 feet tall, has 56 degrees of freedom, and can lift 110 pounds. More notable is its 'soul': Boston Dynamics announced a collaboration with Google DeepMind to integrate cutting-edge foundation models into Atlas. Its full-year 2026 production capacity has already been pre-ordered by Hyundai and Google DeepMind, and a factory with a capacity of 30,000 units per year is being planned.

Figure 03: Figure AI raised $1 billion at a $39 billion valuation. Its Figure 02 participated in the production of over 30,000 BMW X3 vehicles during an 11-month trial run at BMW's Spartanburg plant, moving over 90,000 parts and accumulating 1,250 operating hours. Figure 03 represents a comprehensive upgrade, equipped with 48+ degrees of freedom and a proprietary Helix AI platform.

Mind Robotics: Announced $500 million in financing in March, focusing on industrial-scale AI robot deployment.

However, an underestimated link in this hardware competition is emerging: the dexterous hand.

While the legs of humanoid robots solve mobility and the torso solves load-bearing, it is the hand that ultimately determines whether a robot can operate in complex environments. Take Tesla Optimus as an example: the hand accounts for 17% of the total machine cost, roughly $9,500—the most expensive single component.
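As a quick sanity check on these figures, the short Python sketch below derives the total machine cost implied by a $9,500 hand at a 17% share, and the hand budget that the same share would leave at the $20,000 target price quoted earlier. Holding the share constant at the target price is an assumption made purely for illustration; the variable names are hypothetical.

```python
# Figures quoted in the article.
hand_cost_usd = 9_500
hand_share_of_total = 0.17
target_unit_cost_usd = 20_000

# Total cost implied by the two hand figures above.
implied_total_cost_usd = hand_cost_usd / hand_share_of_total

# Hand budget at the target unit cost, ASSUMING the 17% share stays the same.
hand_budget_at_target_usd = target_unit_cost_usd * hand_share_of_total

if __name__ == "__main__":
    print(f"Implied total cost: ${implied_total_cost_usd:,.0f}")
    print(f"Hand budget at target: ${hand_budget_at_target_usd:,.0f}")
```

Under these assumptions the implied total is roughly $56,000, and reaching the $20,000 target would require the hand to cost about $3,400, which suggests how much cost reduction the hand alone would have to absorb.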

The challenge with dexterous hands lies in a fundamental contradiction: fingers have too little space for large motors; small motors lack sufficient torque, requiring high-reduction gearboxes to amplify force; but high-reduction gearboxes introduce inertial distortion, loss of force feedback, and mechanical wear—three issues that 'poison' AI's learning process at the physical level.

A wave of new companies is attempting to break through this bottleneck. Some use axial flux motor architectures to compress the reduction ratio from 288:1 to 15:1, enabling fully backdrivable dexterous hands; others design data collection gloves in tandem, allowing human operation data to migrate to robot hardware with zero loss. These seemingly minor hardware innovations may prove to be some of the most critical infrastructure in the entire Physical AI ecosystem.
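Why the reduction ratio matters so much can be seen from two standard gearbox relations: the rotor inertia felt at the output scales with the square of the ratio, while output torque scales only linearly with it. The Python sketch below compares the 288:1 and 15:1 ratios mentioned above. It assumes an ideal, lossless gearbox and the same rotor in both cases, which is a simplification; the rotor inertia value and function names are hypothetical.

```python
def reflected_inertia(rotor_inertia: float, gear_ratio: float) -> float:
    """Rotor inertia as seen at the gearbox output: J_out = J_rotor * N^2."""
    return rotor_inertia * gear_ratio ** 2


def output_torque(motor_torque: float, gear_ratio: float,
                  efficiency: float = 1.0) -> float:
    """Torque at the gearbox output: T_out = T_motor * N * efficiency."""
    return motor_torque * gear_ratio * efficiency


HIGH_RATIO = 288.0  # conventional high-reduction design (from the article)
LOW_RATIO = 15.0    # low-reduction design (from the article)

# Hypothetical rotor inertia in kg*m^2; it cancels out in the ratio below.
ROTOR_INERTIA = 1e-7

inertia_ratio = (reflected_inertia(ROTOR_INERTIA, HIGH_RATIO)
                 / reflected_inertia(ROTOR_INERTIA, LOW_RATIO))

# Extra motor torque the 15:1 design needs to match the 288:1 output torque.
motor_torque_ratio = HIGH_RATIO / LOW_RATIO

if __name__ == "__main__":
    print(f"Reflected inertia is {inertia_ratio:.2f}x higher at 288:1")
    print(f"Motor must supply {motor_torque_ratio:.1f}x more torque at 15:1")
```

Under these assumptions, dropping from 288:1 to 15:1 cuts reflected inertia by a factor of about 369, which is what makes the joint easy to backdrive, but the motor itself must then deliver about 19 times more torque. That trade-off is presumably why such designs turn to higher-torque motor architectures.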

03┃ NVIDIA: The 'Shovel Seller' of the Physical AI Era

Every technological wave produces a 'shovel seller.'

During the large model era, NVIDIA became the biggest beneficiary through its GPU and CUDA ecosystem. In the Physical AI era, its role is further upgrading—not just providing computing power but attempting to build an entire infrastructure for the robotics age.

At its GTC Conference in March 2026, NVIDIA unveiled a full platform capability suite for Physical AI: including the vision-language-action model Isaac GR00T for humanoid robots, the Cosmos series for generating large-scale synthetic data, and a toolchain (such as Isaac Lab and OSMO) covering training, evaluation, and deployment. These capabilities are not isolated tools but are gradually forming a complete development and operational system.

Multiple robotics companies, including Boston Dynamics, Caterpillar, Franka Robotics, LG, and NEURA Robotics, are already building next-generation systems on NVIDIA's platform.

Its strategy is also clear:

Not directly participating in end products but becoming the underlying standard for the entire industry.

If Physical AI is a city under construction, NVIDIA is simultaneously providing the cement, steel, and power grid.

04┃ Data: The Most Scarce 'Oil' of Physical AI

In the world of large language models, the internet provides nearly unlimited text data. But in Physical AI, a more fundamental issue emerges:

Real-world manipulation data is extremely scarce.

This makes data one of the most critical and scarcest resources in the entire value chain.

Currently, the industry is exploring three main paths:

Real-world data route. This route is led by Physical Intelligence, whose π0 model is trained on over 10,000 hours of real robot operation data covering multiple robot forms and task types, enabling complex operations such as folding clothes and assembling cardboard boxes. By open-sourcing the model, the company essentially provides the industry with a 'manipulation pre-training foundation.'

Synthetic data route. Google DeepMind's Genie 3 and NVIDIA's Cosmos attempt to generate large-scale simulated environments through world models, completing training in virtual worlds before transferring to the real world. The core challenge of this path lies in the sim-to-real gap, but as simulation precision improves, this gap is gradually narrowing.

Human teleoperation route. By using data collection gloves and other devices, human operations are directly mapped into robot systems. This approach yields the highest-quality data but remains limited in cost and scalability.

Tesla is experimenting with a hybrid path: continuously collecting human operation behaviors through factory videos and using them to train Optimus's motor skills.

In the long run, the competitive landscape of Physical AI will likely depend not on who has the best model but on who possesses the largest and highest-quality dataset of physical world interactions. Once the data flywheel starts spinning, its barriers will grow exponentially.

05┃ What the Money Says: A 2026 Q1 Physical AI Funding Landscape

Numbers don't lie. Here are the key funding events in the Physical AI space during Q1 2026:

[World Model Layer]

· AMI Labs (LeCun) — $1.03 billion seed round, valued at $3.5 billion

· World Labs (Fei-Fei Li) — $1 billion new round, Autodesk invests $200 million

[Foundational Model Layer]

· Physical Intelligence — Negotiating a $1 billion new round, valuation to exceed $11 billion

· RLWRLD — $41 million seed round extension

[Full Humanoid Robots]

· Figure AI — Previously raised $1 billion at a $39 billion valuation (2025)

· Mind Robotics — $500 million for industrial-scale deployment

· Galaxea — $434 million, Series B unicorn

· Humanoid — $290 million seed round, direct unicorn

· Generative Bionics — €70 million seed round

[Infrastructure & Tools]

· NVIDIA — Continuous investment in Isaac GR00T / Cosmos platforms

· RoboForce — $52 million, Physical AI workforce platform

Based on the above public data, Q1 has already surpassed $6.4 billion. And this does not even include internal investments from major players like Tesla, Hyundai/Boston Dynamics, and Google DeepMind.
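For readers who want to check the tally, the Python sketch below sums only the rounds itemized in the list above, in millions to avoid rounding error. The euro-denominated round is kept separate rather than converted, and the split into 'closed in Q1' versus 'other' is an interpretation of the list's own wording (the Figure AI round is labeled 2025 and the Physical Intelligence round is labeled as under negotiation).

```python
# USD rounds itemized in the article, in millions of dollars.
usd_rounds_millions = {
    "AMI Labs": 1_030,
    "World Labs": 1_000,
    "Physical Intelligence (under negotiation)": 1_000,
    "RLWRLD": 41,
    "Figure AI (raised in 2025)": 1_000,
    "Mind Robotics": 500,
    "Galaxea": 434,
    "Humanoid": 290,
    "RoboForce": 52,
}

# Listed separately because it is denominated in euros.
eur_rounds_millions = {"Generative Bionics": 70}

itemized_usd_total = sum(usd_rounds_millions.values())

# Same total without the two rounds the list itself flags as not Q1 closings.
flagged = ("Physical Intelligence (under negotiation)",
           "Figure AI (raised in 2025)")
closed_q1_usd_total = sum(
    amount for name, amount in usd_rounds_millions.items()
    if name not in flagged
)

if __name__ == "__main__":
    print(f"Itemized USD total: ${itemized_usd_total:,}M")
    print(f"Excluding flagged rounds: ${closed_q1_usd_total:,}M")
    print(f"Plus EUR rounds: €{sum(eur_rounds_millions.values())}M")
```

The itemized USD rounds come to about $5.35 billion plus €70 million, so the $6.4 billion figure presumably also counts rounds that are not itemized in the list.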

The flow of capital points to one thing: Physical AI has moved beyond the 'proof-of-concept' stage and entered the 'infrastructure-building' phase. Investors are no longer asking, 'Can robots work?' but rather, 'Whose infrastructure can scale robots the fastest?'

06┃ Sober Reflection: Bubble or Turning Point?

Of course, Silicon Valley has never been short of bubbles. Amid the Physical AI frenzy, a few sober questions deserve consideration:

Demo ≠ Deployment. As industry insiders agreed at Davos 2026, the gap between an impressive demo and a system that can run 10,000 times without failure is far wider than the publicity suggests. Figure 02 did participate in producing 30,000 vehicles at BMW's factory, but it handled relatively standardized part movements, not dexterous assembly.
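A little probability shows how demanding the '10,000 runs without failure' bar is. The Python sketch below assumes every run succeeds independently with the same probability, which real systems will not exactly satisfy, and the function names and example rates are chosen for illustration.

```python
def probability_all_succeed(per_run_success: float, runs: int) -> float:
    """Chance that `runs` independent attempts all succeed: p ** n."""
    return per_run_success ** runs


def required_per_run_success(target_probability: float, runs: int) -> float:
    """Per-run success rate needed so that all `runs` attempts succeed
    with probability `target_probability`: target ** (1 / n)."""
    return target_probability ** (1.0 / runs)


RUNS = 10_000

# A system that succeeds 99.9% of the time looks flawless in a demo...
demo_grade = probability_all_succeed(0.999, RUNS)

# ...but even a coin-flip chance of a clean 10,000-run streak needs far more.
needed_for_even_odds = required_per_run_success(0.5, RUNS)

if __name__ == "__main__":
    print(f"P(10,000 clean runs at 99.9% per run) = {demo_grade:.2e}")
    print(f"Per-run success needed for 50% odds = {needed_for_even_odds:.6f}")
```

Under the independence assumption, a 99.9% per-run success rate gives well under a 0.01% chance of a clean 10,000-run streak, and even 50% odds require a per-run success rate above 99.99%.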

Sim-to-real remains a tough nut to crack. World model fidelity is improving, but the long-tail complexity of the physical world—lighting variations, material differences, unintended collisions—remains the biggest challenge for synthetic data approaches.

Business models remain unproven. LeCun himself said AMI Labs would focus on research in its first year. World Labs is experimenting with a freemium model. Physical Intelligence open-sourced its core model. Currently, these companies generate nearly zero revenue; capital is betting on paradigm monopoly 3-5 years down the line.

The gray rhino of safety and regulation. When thousands of autonomous robots enter factories or even homes, who takes responsibility for accidents? Currently, global regulatory frameworks for Physical AI are virtually nonexistent.

But these very issues suggest we are in the early stages of a technological inflection point, not at the peak of a bubble. Every true paradigm shift—the Internet, smartphones, cloud computing—went through a phase where 'demos far outperformed products' in its early days. The critical difference lies in whether the underlying technology is truly advancing, not just the PowerPoint presentations.

From LeCun's JEPA architecture to Genie 3's real-time world generation, to π0's 68-task generalization capability, to Optimus's 1,000-unit factory deployment—the progress in Q1 2026 represents tangible engineering breakthroughs, not castles in the air.

07┃ Physical AI is not a standalone track—it is the ultimate form of AI.

Physical AI is not a new track; it is more like one of AI's final forms.

As AI shifts from 'understanding the world' to 'entering the world,' what is being rewritten is not just model capability boundaries but also industrial division of labor and value distribution. Future competition will not only occur in model parameters and compute clusters but also in robot bodies, dexterous hands, data acquisition, simulation systems, industrial scenarios, and supply chain organizational capabilities.

This is precisely why this wave is particularly important for the Chinese community.

Because over the past two decades, one of the deepest strengths of the Chinese community has never been a single-dimensional technological label but rather the ability to truly connect cutting-edge technology, engineering execution, hardware manufacturing, and cross-regional industrial collaboration. Whether as entrepreneurs, engineers, investors, or industrial resource organizers, those who can seize this wave of migration from digital intelligence to physical intelligence have the opportunity not just to participate in the trend but to become part of the trend itself at certain critical layers.

In 2026, Physical AI may still be far from mature; but precisely because it is still early, the window has just opened. For the Chinese community, this may not be another cycle of 'follow-the-leader participation' but a new starting point with greater opportunities to deeply penetrate the infrastructure layer, platform layer, and key component layer.

Author's Note: Personal views, for reference only

Solemnly declare: the copyright of this article belongs to the original author. The reprinted article is only for the purpose of spreading more information. If the author's information is marked incorrectly, please contact us immediately to modify or delete it. Thank you.
