Qualcomm Unveils AI Inference Chips, Steps into Data Center Market

October 31, 2025

Qualcomm is embarking on a cross-industry transition, expanding from mobile chips into AI data center inference. Leveraging its strengths in power efficiency and cost-effectiveness, the company aims to challenge NVIDIA's dominance.

On October 27, the mobile chip behemoth Qualcomm announced the release of the AI200 and AI250.

These two chips are neither mobile system-on-chips (SoCs) nor automotive-grade parts; they are high-performance accelerators designed specifically for data center inference, seeking a foothold in an AI chip market currently dominated by NVIDIA.

The announcement sent Qualcomm's stock price surging: it jumped as much as 20% in intraday trading and closed up 11%.

The real money in large AI models lies in inference, the stage where an AI system performs practical tasks such as answering questions, generating images, and creating videos. The inference market is currently expanding at roughly 40% per year, yet NVIDIA's mainstream H100 chips are not only expensive and power-hungry but also frequently in short supply.

Qualcomm is capitalizing on this gap by carrying the power-management expertise honed in its mobile chips over to data center chips. The company is promoting a core selling point of 'processing 30% more tokens (the fundamental unit of AI data processing) per dollar spent'.
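To make that pitch concrete, here is a minimal back-of-the-envelope sketch of what a 30% tokens-per-dollar advantage would mean; the baseline throughput and budget figures are invented placeholders, not numbers from Qualcomm or NVIDIA.

```python
# Illustration of the "30% more tokens per dollar" claim; the baseline
# throughput and budget below are invented placeholders, not vendor data.
baseline_tokens_per_dollar = 1_000_000
qualcomm_tokens_per_dollar = baseline_tokens_per_dollar * 1.30  # +30%

budget_usd = 100_000  # hypothetical monthly inference budget
extra = (qualcomm_tokens_per_dollar - baseline_tokens_per_dollar) * budget_usd
print(f"Extra tokens for the same spend: {extra:,.0f}")  # 30,000,000,000
```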

In terms of specifications, the AI200 is equipped with 768GB of LPDDR memory, roughly triple the capacity of similar products, enabling it to effortlessly accommodate large models. The AI250 employs 'near-memory computing' technology, which places computing modules in close proximity to memory. This design increases data transmission bandwidth by a factor of 10 while reducing power consumption, significantly cutting electricity costs for data centers.
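For a sense of scale, a rough calculation (my own arithmetic, not a Qualcomm figure) shows how many model parameters 768GB could hold at common weight precisions:

```python
# Back-of-the-envelope arithmetic: how many model parameters fit in the
# AI200's 768GB if the memory held weights alone?
MEMORY_BYTES = 768e9  # 768 GB, decimal as in marketing specs

for precision, bytes_per_param in [("FP16", 2.0), ("FP8", 1.0), ("INT4", 0.5)]:
    params = MEMORY_BYTES / bytes_per_param
    print(f"{precision}: ~{params / 1e12:.1f} trillion parameters")

# FP16: ~0.4 trillion parameters
# FP8:  ~0.8 trillion parameters
# INT4: ~1.5 trillion parameters
# Real deployments also need memory for the KV cache and activations,
# so the practical ceiling is lower.
```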

Moreover, both chips support cold-plate liquid cooling, allowing a single rack to operate at a power draw of up to 160 kW. This helps reduce a data center's PUE (power usage effectiveness) and minimize energy waste.
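PUE is the standard data center efficiency ratio: total facility power divided by IT equipment power, with 1.0 as the theoretical ideal. A short sketch using the 160 kW rack figure; the overhead multipliers are illustrative assumptions, not measured values:

```python
# PUE = total facility power / IT equipment power; 1.0 is the ideal.
# (Standard industry definition; the overhead multipliers below are
# illustrative assumptions, not measured Qualcomm figures.)
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    return total_facility_kw / it_load_kw

rack_kw = 160  # per-rack draw cited in the article

# Hypothetical air-cooled facility: 50% overhead for cooling and power delivery.
print(f"Air-cooled:    PUE = {pue(rack_kw * 1.50, rack_kw):.2f}")  # 1.50
# Hypothetical liquid-cooled facility: 15% overhead.
print(f"Liquid-cooled: PUE = {pue(rack_kw * 1.15, rack_kw):.2f}")  # 1.15
```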

Regarding software adaptation, Qualcomm has adopted a strategy akin to that of the Android ecosystem: it supports one-click model import from Hugging Face (an open-source AI model platform). Models can be converted to a Qualcomm chip-compatible format with the Transformers Library tooling, and deployment can take as little as 15 minutes. This 'zero-modification migration' lowers the barrier to adoption for cloud providers and other customers.
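As a rough illustration of that flow, the sketch below pulls a checkpoint from the Hugging Face Hub with the standard transformers API; the model ID is arbitrary, and the Qualcomm-specific export step is a hypothetical placeholder, since the article does not name the actual call.

```python
# Sketch of the one-click import flow the article describes. The
# transformers calls are the standard public Hugging Face API; the
# Qualcomm-specific export step is a hypothetical placeholder because
# the article does not name the actual function.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # any Hub checkpoint; chosen arbitrarily here

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Hypothetical hand-off to Qualcomm's toolchain (placeholder name):
# compiled = qualcomm_export(model, target="AI200")

# Sanity-check the unmodified model before deployment.
inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```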

However, it's worth noting the 'time lag' in availability: the AI200 is not expected to launch until 2026, and the AI250 will not arrive until 2027. That window gives NVIDIA time to counter, for instance by pulling forward new-generation chips such as the B100 and Rubin to lock in customers. Will cloud giants use Qualcomm's pricing as leverage against NVIDIA? Silicon Valley will have quite a show to watch next year.

Saudi Arabia's AI company Humain has already placed an early order, planning to build 200 megawatts of computing capacity on the AI200.
