Domestic Super-Node Showdown: Huawei, Alibaba, and Sugon Compete for Dominance

11/07 2025

In 2025, China's domestic super-node market took off. Huawei, Alibaba, and Sugon have all risen to prominence, pushing China's intelligent computing clusters to world-leading scale.

Most recently, following the Ascend 384 and the Panju AL128, Sugon introduced the scaleX640, the world's first single-cabinet 640-card super-node. It delivers a 20-fold gain in compute density over the 384 super-node and takes over from the Panju AL128 as the density leader.

This article takes the products in order of release and draws on each vendor's public disclosures to compare the three major super-node products. (Note: Because the products are on different iteration timelines, some performance figures may since have been updated. This analysis focuses on differences rather than rankings.)

Huawei Ascend 384: A Vertical Integration Leader

The Ascend 384 super-node (Atlas 900 A3 SuperPoD) uses a distributed multi-cabinet cluster design built around LingQu optical interconnect. At launch, it was the industry's largest super-node interconnected over a high-speed bus, combining 384 Ascend NPUs with 192 Kunpeng CPUs. The design relieves the interconnect bottleneck for AI compute while serving both AI and general-purpose computing workloads.

Its strength is the 'network-connected computing' approach: a high-speed interconnect bus lets the whole super-node operate as a single computer. This enables 'one card, one expert' parallel inference, in which each expert of a mixture-of-experts model runs on its own card, and it puts interconnect performance first. However, the 384 super-node supports only the Ascend 910C accelerator card and is built entirely around the CANN ecosystem, which limits its adaptability.
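The 'one card, one expert' idea can be illustrated with a minimal sketch. It is not Huawei's implementation: the function names, the identity placement, and the figure of 256 experts are hypothetical, chosen only to show how tokens would be grouped by the card that hosts their routed expert.

```python
from collections import defaultdict


def place_experts(num_experts: int, num_cards: int) -> dict[int, int]:
    """Give every expert its own card (hypothetical identity placement)."""
    if num_experts > num_cards:
        raise ValueError("one card per expert needs at least as many cards as experts")
    return {expert: expert for expert in range(num_experts)}


def dispatch(token_experts: list[int], placement: dict[int, int]) -> dict[int, list[int]]:
    """Group token indices by the card that hosts each token's routed expert."""
    per_card: dict[int, list[int]] = defaultdict(list)
    for token_idx, expert in enumerate(token_experts):
        per_card[placement[expert]].append(token_idx)
    return dict(per_card)


# 256 experts is an assumed model size; 384 is the card count from the article.
placement = place_experts(num_experts=256, num_cards=384)

# Tokens 0 and 2 are routed to expert 3, token 1 to expert 7, token 3 to expert 255.
batches = dispatch([3, 7, 3, 255], placement)
print(batches)
```

Because every card may receive tokens from every other card, this pattern produces all-to-all traffic, which is why the approach depends so heavily on interconnect performance.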

Currently, the 384 follows a 'hardware-centric, software-open' strategy. At the hardware level, it forms a complete domestic technology stack of Ascend NPUs, Kunpeng CPUs, and LingQu buses. At the software level, Huawei has opened its core CANN computing architecture and MindSpore deep learning framework to the community, aiming to attract developers and close gaps in its ecosystem.

Alibaba Panju AL128: The Full-Stack Optimization Specialist

The Panju AL128 super-node drew attention at Alibaba Cloud's Apsara Conference (Yunqi Conference) for its extreme-density integration. Where a traditional server cabinet typically holds a few dozen AI computing chips, the Panju AL128 houses 128 accelerator cards in a single cabinet, giving it four times the compute density of the 384 super-node.
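The density figures quoted in this article can be cross-checked with simple arithmetic. The sketch below assumes that 'computational power integration' means accelerator cards per cabinet and that the 4x and 20x figures are ratios against the 384 super-node; neither reading is stated explicitly by the vendors.

```python
# Figures quoted in the article.
CARDS_AL128 = 128          # cards in one Panju AL128 cabinet
AL128_VS_384 = 4           # density ratio claimed against the 384 super-node
CARDS_SCALEX640 = 640      # cards in one scaleX640 cabinet
SCALEX640_VS_384 = 20      # density ratio claimed against the 384 super-node
TOTAL_CARDS_384 = 384      # NPUs in the Ascend 384 super-node

# Cards per cabinet that each claim implies for the 384 super-node.
baseline_from_al128 = CARDS_AL128 / AL128_VS_384
baseline_from_scalex640 = CARDS_SCALEX640 / SCALEX640_VS_384

# Number of compute cabinets that baseline implies for 384 cards.
implied_cabinets_384 = TOTAL_CARDS_384 / baseline_from_al128

print(baseline_from_al128, baseline_from_scalex640, implied_cabinets_384)
```

Under these assumptions both claims imply the same baseline of 32 cards per cabinet, so the 4x and 20x figures are mutually consistent, and the 384 super-node would span about 12 compute cabinets.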

Behind this high-density integration is a breakthrough in cooling technology. The product utilizes single-phase immersion liquid cooling, tripling cooling efficiency compared to traditional air cooling. This reduces the data center's Power Usage Effectiveness (PUE) to as low as 1.09, cuts cooling system energy consumption by 30%, and halves the footprint.

This technology tackles heat accumulation issues in high-density computing, ensuring chips operate at peak performance under optimal temperatures.
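PUE is the ratio of total facility power to the power drawn by the IT equipment itself, so a PUE of 1.09 means roughly 9% overhead for cooling, power conversion, and other facility loads. The sketch below works this through for a 1,000 kW IT load, which is a hypothetical figure used only for illustration.

```python
def facility_power_kw(it_load_kw: float, pue: float) -> float:
    """Total facility power implied by an IT load and a PUE value."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_load_kw * pue


def overhead_kw(it_load_kw: float, pue: float) -> float:
    """Non-IT power (cooling, power delivery, etc.) implied by the PUE."""
    return facility_power_kw(it_load_kw, pue) - it_load_kw


IT_LOAD_KW = 1000.0   # hypothetical IT load, not a figure from the article
PUE_AL128 = 1.09      # best-case PUE quoted for the Panju AL128

total = facility_power_kw(IT_LOAD_KW, PUE_AL128)
overhead = overhead_kw(IT_LOAD_KW, PUE_AL128)
print(f"total: {total:.1f} kW, overhead: {overhead:.1f} kW")
```

The same functions can be applied to any other quoted PUE value to compare the overhead for an equal IT load.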

However, Alibaba's real strength is hardware-software co-optimization. The Panju AL128 super-node integrates with Alibaba Cloud's proprietary HPN 8.0 high-performance network, CPFS parallel file storage, and the PAI artificial intelligence platform, creating a system that is vertically optimized from hardware to applications. With this integration, the Tongyi Qianwen model achieved a 3x end-to-end training speedup.

Sugon scaleX640: The Open Architecture Champion

The Sugon scaleX640 super-node is currently the largest of the three in single-cabinet scale. As the world's first single-cabinet 640-card super-node, it targets trillion-parameter large models and is built on an open AI computing architecture. Sugon presents it as next-generation, large-scale, high-efficiency intelligent computing infrastructure with 'ultra-high performance, ultimate efficiency, comprehensive openness, and ultra-high reliability.'

The scaleX640 adopts a 'one-drags-two' high-density architecture that builds a large-scale, high-bandwidth, low-latency super-node communication domain. Two cabinets combine into a 1280-card computing unit, interconnected via high-speed networks. The liquid condensation heat exchange device (CDM) provides up to 1.72 MW of cooling capacity for this thousand-card-class unit, bringing PUE down to as low as 1.04 and raising compute density by up to 20 times.
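The unit size and the cooling budget per card follow directly from the quoted figures. The sketch below assumes the 1.72 MW capacity serves the full two-cabinet unit and is shared evenly across cards, which the article does not state.

```python
CARDS_PER_CABINET = 640       # from the article
CABINETS_PER_UNIT = 2         # two cabinets form one computing unit
COOLING_CAPACITY_W = 1.72e6   # up to 1.72 MW for the unit, from the article

cards_per_unit = CARDS_PER_CABINET * CABINETS_PER_UNIT

# Assumption: capacity is divided evenly across all cards in the unit.
cooling_per_card_w = COOLING_CAPACITY_W / cards_per_unit

print(f"{cards_per_unit} cards, {cooling_per_card_w:.2f} W of cooling capacity per card")
```

The per-card figure is an upper bound on heat removal per card slot, not an accelerator power rating, since the same capacity also has to cover CPUs, switches, and power conversion inside the cabinets.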

More importantly, Sugon chose the most complex route, full-stack integration on an open architecture, bringing together subsystems for computing, storage, networking, power, cooling, and management. With technologies including an ultra-high-speed orthogonal architecture, ultra-high-density blades, immersion phase-change liquid cooling, and high-voltage DC power supply, it improves MoE large-model training efficiency and high-throughput inference performance by 30-40%.

Conclusion:

Beyond the generation-over-generation performance gains, the three products' different approaches reveal distinct development paths.

The vertical integration route, led by a single company, offers short development cycles, low coordination complexity, and rapid early progress. However, it cannot match the open architecture route in the scale of full-stack resources it can integrate, and its scalability is limited.

By contrast, the routes taken by Alibaba and Sugon support multiple AI chip options at the hardware level, with open architectures and compatibility-oriented designs. These routes offer strong industrial vitality, high endogenous potential, and high compute efficiency. However, they face ecosystem barriers arising from the complexity of collaboration across the industrial chain, and standardizing technical interfaces and collaboration norms takes substantial resources and effort.

Overall, the vertical route emphasizes going it alone: industrial chain risks are manageable and the chain leader benefits most, which suits nurturing leaders in niche sub-sectors during the early stages of an industry. The open route stresses upstream-downstream collaboration, forming a community with shared risks and rewards. Participants along the chain are more motivated, which helps build a sustainable foundation in the mid-to-late stages of industry growth.
