06/03 2026

Image Source: Official Product Photo from Cerebras Systems
"Is AI Computing Power About to Change Hands?"
Editor | Yun Shu
Produced by | Jixin
New York/San Francisco, May 16, 2026 – On May 14 local time, Cerebras Systems, a "dark horse" in the AI chip sector, made its market debut on Nasdaq under the ticker symbol CBRS. The stock surged as much as 108% intraday on its first trading day, triggering a circuit breaker, before closing up 68%. As of May 16, the share price was still 51% above its $185 IPO price, giving the company a market cap of $60.2 billion. Having raised $5.55 billion, the offering is the world's largest IPO of 2026 to date and the largest U.S. tech IPO since Uber's.
This chip company, built around "wafer-scale engine" technology, is challenging NVIDIA's decade-long dominance of AI computing through a strategic partnership with OpenAI worth over $20 billion, prompting the global tech industry to rethink how AI infrastructure is built.
01 20x Oversubscribed, Pricing Hits Record Highs
Cerebras' IPO journey has been nothing short of phenomenal. After initially targeting a price range of $115–$125, the company raised it to $150–$160 on overwhelming demand and ultimately priced at $185, about 16% above the top of the revised range. Underwriters said the IPO was more than 20x oversubscribed, attracting top global investors including sovereign wealth funds, hedge funds, and tech giants.
"This isn't just an IPO—it's a vote of confidence in AI computing architecture transformation," said Michael Ng, Morgan Stanley's tech industry analyst. "Cerebras' valuation transcends traditional chip companies; investors are betting on its disruptive potential in AI inference."
Cerebras raised $5.55 billion, with total proceeds potentially reaching $6.38 billion if underwriters exercise their over-allotment option in full. This exceeds Arm's $5.1 billion IPO, the chip industry's largest in 2023, and marks the biggest U.S. tech IPO since Snowflake in 2020. Notably, Cerebras remains unprofitable, reporting $510 million in 2025 revenue and $1.2 billion in net losses. Investors, however, are focused on growth: orders surged 370% after its January 2026 OpenAI partnership, and Q1 2026 revenue jumped 215% YoY.
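The pricing and proceeds figures above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the numbers quoted in this article; the variable and function names are illustrative, and the share count it derives is an implied figure, not a number disclosed here.

```python
# Figures quoted in the article; all names below are illustrative.
IPO_PRICE = 185.0                 # USD per share
INITIAL_RANGE = (115.0, 125.0)    # first marketed price range
REVISED_RANGE = (150.0, 160.0)    # raised price range
BASE_PROCEEDS = 5.55e9            # USD raised at pricing
MAX_PROCEEDS = 6.38e9             # USD if the over-allotment is fully exercised
GAIN_SINCE_IPO = 0.51             # gain over the IPO price as of May 16


def premium(price: float, reference: float) -> float:
    """Fractional premium of `price` over `reference`."""
    return price / reference - 1.0


premium_vs_revised = premium(IPO_PRICE, REVISED_RANGE[1])
premium_vs_initial = premium(IPO_PRICE, INITIAL_RANGE[1])

# Implied figures (assumption: proceeds = shares sold * IPO price, before fees).
implied_shares_sold = BASE_PROCEEDS / IPO_PRICE
overallotment_ratio = MAX_PROCEEDS / BASE_PROCEEDS - 1.0
implied_price = IPO_PRICE * (1.0 + GAIN_SINCE_IPO)

print(f"Premium over revised range top: {premium_vs_revised:.1%}")
print(f"Premium over initial range top: {premium_vs_initial:.1%}")
print(f"Implied shares sold: {implied_shares_sold / 1e6:.1f} million")
print(f"Implied over-allotment size: {overallotment_ratio:.1%}")
print(f"Implied share price at +51%: ${implied_price:.2f}")
```

Under these assumptions, the over-allotment works out to roughly 15% of the base deal, and the final price sits about 48% above the top of the original range.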
02 Wafer-Scale Engine: Redefining AI Computing
Cerebras' core strength lies in its proprietary Wafer Scale Engine (WSE) technology, which contrasts sharply with NVIDIA's multi-GPU cluster approach. A conventional GPU die occupies only a small fraction of a wafer, whereas Cerebras fabricates an entire 12-inch wafer into a single chip, equivalent to ~56 traditional GPUs, with 1.2 trillion transistors, 188GB of on-chip memory, and 24PB/s of bandwidth.
"This isn't just scaling chip size—it's a computing architecture revolution," explained Cerebras CTO Sean Lie. "Traditional GPU clusters require network data transfers, while our WSE-3 integrates all computation and storage on a single wafer, eliminating data movement bottlenecks—key for ultra-low-latency inference."

Image Source: "Cerebras WSE-3 vs Nvidia H100/H200/B200: Detailed Technical Comparison—Who Is the True 'King of Chips' in the AI Era?"
Benchmark data shows Cerebras' CS-3 system dominates AI inference tasks:
- Llama 3.3 70B model inference: CS-3 achieves 2,140 tokens/sec vs. 120 tokens/sec for NVIDIA's flagship DGX B200, roughly 18x faster;
- GPT-OSS-120B model: 3,000 tokens/sec inference, 15x faster than GPU solutions;
- Total Cost of Ownership (TCO) 32% lower and power consumption 33% less than DGX B200.
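To make the throughput figures concrete, here is a small sketch that derives the speedup and the time needed to stream a single response. It uses only the tokens/sec numbers listed above. The 500-token response length is a hypothetical value chosen for illustration, and the calculation ignores time-to-first-token, batching, and network overhead.

```python
# Throughput figures quoted in the benchmark list; names are illustrative.
CS3_LLAMA_TPS = 2140.0        # CS-3, Llama 3.3 70B, tokens/sec
DGX_B200_LLAMA_TPS = 120.0    # DGX B200, Llama 3.3 70B, tokens/sec
CS3_GPT_OSS_TPS = 3000.0      # CS-3, GPT-OSS-120B, tokens/sec
GPT_OSS_SPEEDUP = 15.0        # stated speedup over GPU solutions

RESPONSE_TOKENS = 500         # hypothetical response length (assumption)


def speedup(fast_tps: float, slow_tps: float) -> float:
    """Ratio of two throughputs."""
    return fast_tps / slow_tps


def generation_seconds(tokens: int, tps: float) -> float:
    """Time to emit `tokens` at a steady `tps`, ignoring startup latency."""
    return tokens / tps


llama_speedup = speedup(CS3_LLAMA_TPS, DGX_B200_LLAMA_TPS)
# GPU baseline implied by the "15x faster" claim (not stated in the article).
implied_gpu_gpt_oss_tps = CS3_GPT_OSS_TPS / GPT_OSS_SPEEDUP

cs3_seconds = generation_seconds(RESPONSE_TOKENS, CS3_LLAMA_TPS)
gpu_seconds = generation_seconds(RESPONSE_TOKENS, DGX_B200_LLAMA_TPS)

print(f"Llama 3.3 70B speedup: {llama_speedup:.1f}x")
print(f"Implied GPU throughput on GPT-OSS-120B: {implied_gpu_gpt_oss_tps:.0f} tokens/sec")
print(f"{RESPONSE_TOKENS}-token response: {cs3_seconds:.2f}s on CS-3 vs {gpu_seconds:.2f}s on DGX B200")
```

The exact ratio is about 17.8x, which the article rounds to 18x. For a response of this length, the gap is the difference between a fraction of a second and several seconds of generation time.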
This performance edge is critical for real-time applications. For example, OpenAI's Codex-Spark code generation service, running on Cerebras, responds almost as fast as the developer types: latency falls from hundreds of milliseconds on GPUs to tens of milliseconds, boosting developer productivity by over 40%.
03 OpenAI's $20 Billion Bet: Reshaping the Computing Landscape
Cerebras' bold challenge to NVIDIA is backed by OpenAI's strong support. In January 2026, the two announced a landmark deal: OpenAI committed to purchasing over $20 billion in Cerebras AI computing capacity (~750MW) over several years, along with joint co-design of AI models and hardware. Additionally, OpenAI provided $1 billion in working capital loans to support Cerebras' infrastructure build-out.
"This isn't a vendor relationship; it's strategic tech synergy," said Sachin Katti, OpenAI's VP of Infrastructure. "Cerebras offers dedicated low-latency inference solutions, enabling faster responses, more natural interactions, and a solid foundation for scaling real-time AI to more users."
The partnership targets AI's "inference bottleneck." With large-model parameter counts exceeding a trillion, inference now accounts for over 60% of total AI spending, a key barrier to AI adoption. OpenAI's ChatGPT, with over 900 million weekly active users, faces massive inference costs. Cerebras' technology cuts per-token costs by 32% and latency by over 90%.
Deployment will occur in phases:
- 2H 2026: 150MW for Codex products;
- 2027: 400MW for GPT-5 real-time inference;
- 2028: 750MW across all OpenAI core services.
This will be the world's largest high-speed AI inference deployment, supporting over 1 billion token requests per second.
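The phased rollout can be summarized numerically. The sketch below assumes that the megawatt figures in the schedule are cumulative targets (the 2028 figure matches the ~750MW total of the deal) and treats $20 billion as a lower bound, since the commitment is described as "over $20 billion". All names are illustrative.

```python
# Deployment schedule from the article: (period, cumulative MW target).
# Assumption: each figure is a cumulative total, not an increment.
PHASES = [("2H 2026", 150), ("2027", 400), ("2028", 750)]
DEAL_VALUE_USD = 20e9  # stated as "over $20 billion", so a lower bound


def incremental_capacity(phases):
    """Return (period, MW added in that period) from cumulative targets."""
    increments = []
    previous = 0
    for label, cumulative_mw in phases:
        increments.append((label, cumulative_mw - previous))
        previous = cumulative_mw
    return increments


increments = incremental_capacity(PHASES)
final_mw = PHASES[-1][1]
min_usd_per_mw = DEAL_VALUE_USD / final_mw

for label, added_mw in increments:
    print(f"{label}: +{added_mw} MW")
print(f"Lower bound on spend per MW: ${min_usd_per_mw / 1e6:.1f}M")
```

Read this way, each phase adds more capacity than the one before it, and the commitment implies at least roughly $27 million per megawatt over the life of the deal.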
04 Differentiated Competition: Cracking the Monopoly
Cerebras' rise coincides with a pivotal shift in AI computing. Through CUDA and its high-performance GPUs, NVIDIA holds roughly 90% of the AI training market and 80% of the inference market, with $215.9 billion in FY2026 revenue, about 423x Cerebras' 2025 revenue. Cerebras, however, avoids direct competition by targeting niche markets.
"We're not replacing NVIDIA; we're filling market gaps," said Cerebras CEO Andrew Feldman. "NVIDIA's training and general-purpose computing advantages are unmatched, but our approach excels in ultra-scale, low-latency inference."
This strategy is paying off. Beyond OpenAI, Cerebras partners with AWS for dedicated inference capacity and with Core42 (a G42 company) to deploy GPT-OSS-120B, offering enterprises 3,000 tokens/sec. In finance and healthcare, where real-time performance is critical, Cerebras is replacing some GPU clusters.
For example, JPMorgan Chase uses Cerebras for high-frequency trading data, cutting risk assessment response times from 2 seconds to 0.1 seconds while reducing computing costs by 40%. Mayo Clinic leverages Cerebras for medical imaging analysis, slashing AI-assisted diagnosis turnaround from 4 hours to 15 minutes, improving emergency care efficiency.
05 Ecosystem Building and Ramp-Up
Despite its prospects, Cerebras faces challenges. Building a software ecosystem is critical: NVIDIA's CUDA platform, with more than 15 years of development behind it, is used by roughly 90% of AI developers, while Cerebras' toolchain is still maturing. Wafer-scale chip production yields and supply chain stability also pose long-term tests.
"NVIDIA's ecosystem is its strongest moat," noted SemiAnalysis analyst Myron Xie. "Cerebras needs to attract more developers to optimize models for its platform, and that takes time and sustained investment."
On production, Cerebras relies on TSMC's advanced processes. Wafer-scale chips are far harder to manufacture than traditional GPUs, and yield control is key. Industry sources say WSE-3 yields have improved from 30% to 75%, still below the >90% typical of traditional chips.
Facing these hurdles, Cerebras is accelerating ecosystem growth. It recently launched the Model Zoo initiative, offering WSE-optimized versions of 100+ mainstream large models, and partnered with Hugging Face to simplify model migration. Additionally, Cerebras plans to unveil the WSE-4 chip in 2027, packing 2.4 trillion transistors with double the performance.

Image Source: Cerebras Systems' Nasdaq Listing Bell-Ringing Ceremony
Cerebras' 51% post-IPO gain reflects not just capital market confidence in one chip company but a collective bet on a new AI computing paradigm. Driven by giants like OpenAI, the AI industry is shifting its focus from "parameter scale" to "real-time experience," and Cerebras' wafer-scale technology is well aligned with this trend.
"The next decade of AI will be defined by inference speed," Andrew Feldman stated at the listing ceremony. "Our OpenAI partnership is just the beginning. More companies will recognize the value of low-latency inference, reshaping the AI computing market."
For NVIDIA, Cerebras' rise isn't an existential threat but a catalyst for accelerated innovation. Reports suggest NVIDIA is developing inference-optimized chips, with the GB200 NVL expected in 2027 to boost single-chip inference performance.
Regardless of the final competitive landscape, Cerebras' IPO marks the AI chip industry's entry into an era of diversification, a major boon for the industry's healthy development. With lower computing costs and improved performance, AI will penetrate industries faster, truly realizing the vision of "AI for All."