Purpose-Built AI Processors in the Data Center
Whether it’s a data center you’re building on-premises at your company or a supercomputer for academic research, Habana AI processors are purpose-built to deliver efficiency, usability and scalability.
Efficiency
Habana® Gaudi® deep learning training and inference processors offer a substantial price/performance advantage, so you get more training and inference compute while spending less on your on-premises deployments. Architected from the ground up and optimized for deep learning performance and efficiency, Gaudi platform solutions enable more customers to run more deep learning workloads at speed while containing operational costs.
Usability
We’re focused on giving data scientists, developers, and IT and systems administrators everything they need to ease the development and deployment of Habana solutions on-premises. Our SynapseAI® Software Platform is optimized for building and implementing deep learning models on Habana AI processors with the PyTorch and TensorFlow frameworks, across a wide array of models including generative AI, large language, computer vision and natural language processing models. And the Habana Developer and Habana GitHub sites provide a broad range of content (documentation, scripts, how-to videos, reference models and tools) to help you get started, so you can easily build new models or migrate existing ones to Habana-based systems.
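To make migration concrete, here is a minimal sketch of moving an existing PyTorch training step onto a Gaudi device. It assumes a SynapseAI installation that provides the habana_frameworks package; the model, shapes and hyperparameters are illustrative only, not a reference workload.

    # Minimal sketch: running a PyTorch training step on a Gaudi HPU.
    # Assumes a SynapseAI install providing `habana_frameworks`; the model
    # and data below are placeholders.
    import torch
    import torch.nn as nn
    import habana_frameworks.torch.core as htcore  # enables the "hpu" device

    device = torch.device("hpu")  # previously: torch.device("cuda")

    model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    inputs = torch.randn(64, 784, device=device)
    labels = torch.randint(0, 10, (64,), device=device)

    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()
    htcore.mark_step()  # flush the accumulated graph in lazy-execution mode
    optimizer.step()
    htcore.mark_step()
    print(f"loss: {loss.item():.4f}")

In most cases the change from a GPU script is confined to the device string, the habana_frameworks import and the mark_step() calls.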
Scalability
With the advent of generative AI and large language models has come a leap in the size and complexity of workloads. Now it’s more essential than ever for data centers to be able to easily and cost-effectively scale up and scale out capacity, with some systems requiring hundreds or even thousands of processors to run training and deployment with speed and accuracy. Habana Gaudi and Gaudi2 processors were designed like no other, expressly to address scalability, with twenty-four 100-Gigabit Ethernet ports of RDMA over Converged Ethernet (RoCE) integrated into every Gaudi2 processor. The result is massive and flexible scaling capacity based on the industry’s networking standard already employed in virtually all data centers.
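As a rough illustration of what scale-out looks like in code, the sketch below initializes a PyTorch data-parallel job across Gaudi cards using Habana’s HCCL backend, whose collective traffic runs over the integrated RoCE ports. The launcher, environment variables and model are assumptions for the example, not a prescribed setup.

    # Minimal sketch: data-parallel training across Gaudi cards with the
    # HCCL backend. Assumes a launcher (e.g. mpirun or torchrun) that sets
    # MASTER_ADDR, MASTER_PORT, RANK and WORLD_SIZE in the environment.
    import os
    import torch
    import torch.distributed as dist
    import habana_frameworks.torch.core  # enables the "hpu" device
    import habana_frameworks.torch.distributed.hccl  # registers the "hccl" backend

    world_size = int(os.environ.get("WORLD_SIZE", "1"))
    rank = int(os.environ.get("RANK", "0"))
    # Gradient all-reduces travel over the processors' 100-GbE RoCE links.
    dist.init_process_group(backend="hccl", world_size=world_size, rank=rank)

    device = torch.device("hpu")
    model = torch.nn.Linear(1024, 1024).to(device)
    # DDP synchronizes gradients across all participating cards each step.
    model = torch.nn.parallel.DistributedDataParallel(model)
    if rank == 0:
        print(f"initialized {world_size} Gaudi workers")

Because the scale-out fabric is standard Ethernet, the same script can span cards within a node or nodes across racks; only the launcher configuration changes.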

Scaling data center training with Gaudi’s RoCE networking
The unique networking design of twenty-four 100-Gigabit RDMA over Converged Ethernet (RoCE) ports integrated onto every Gaudi2 processor gives data center system builders tremendous versatility to flexibly and cost-efficiently scale systems from one to thousands of AI processors. The San Diego Supercomputer Center team is leveraging the flexibility of Gaudi’s RoCE connectivity in its Voyager supercomputer and is using Habana’s first-generation Goya inference processors for model and application deployment. Voyager will serve academic researchers across a range of science and engineering domains, including astronomy, climate sciences, chemistry and particle physics.
Researchers on the Voyager supercomputer benefit from training performance and accuracy with the Supermicro X12 Gaudi Training Server, featuring eight Gaudi cards and a dual-socket 3rd Gen Intel® Xeon® Scalable host CPU, and from inference with the Supermicro SuperServer 4029GP-T, containing eight Goya™ HL-100 Inference PCIe cards and dual-socket 2nd Gen Intel® Xeon® Scalable processors.
Massive AI training compute capacity with easy and efficient scale-out
[Figure: AI Voyager Project scale-out topology. Each first-generation Gaudi processor dedicates 7 of its 100-GbE RoCE ports to all-to-all connectivity within the node and 3 to scale-out, giving each 8-card training node 6 x 400 GbE of scale-out bandwidth. Voyager’s 42 training nodes, arranged in seven 6-node racks, connect through 252 x 400 GbE QSFP-DD links to a centralized networking switch.]
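The link counts in the figure follow directly from the per-card port split; a short calculation for reference:

    # Back-of-the-envelope check of the Voyager scale-out figure above.
    # Per-card port counts come from the diagram; the rest is arithmetic.
    cards_per_node = 8
    scale_out_ports_per_card = 3  # 100-GbE RoCE ports reserved for scale-out
    racks, nodes_per_rack = 7, 6

    nodes = racks * nodes_per_rack                                # 42 training nodes
    scale_out_100gbe = cards_per_node * scale_out_ports_per_card  # 24 x 100 GbE per node
    scale_out_400gbe = scale_out_100gbe // 4                      # 6 x 400 GbE per node
    total_qsfp_dd = nodes * scale_out_400gbe                      # 252 x 400 GbE links

    print(nodes, scale_out_400gbe, total_qsfp_dd)  # 42 6 252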