PROCESSORS IN THE DATA CENTER
Whether it’s a data center you’re building on-premises at your company or a supercomputer
for academic research, Gaudi delivers tangible benefits that make AI models more
cost-effective to train and easier to build and scale.
Habana® Gaudi® AI training processors offer a substantial price/performance advantage, so you get more deep learning training while spending less on your on-premises deployments. Architected from the ground up and optimized for AI training efficiency, Gaudi Training solutions let more customers run more deep learning training workloads with speed and performance while containing operational costs.
We’re focused on giving data scientists, developers, and IT and systems administrators what they need to develop and deploy Habana solutions on-premises. Our SynapseAI® Software Platform is optimized for building and implementing deep learning models on Habana AI processors with the TensorFlow and PyTorch frameworks, including computer vision and natural language processing models. The Habana Developer Site and Habana GitHub provide a wide array of content, including documentation, scripts, how-to videos, reference models and tools, to help you get started and to build new models or migrate existing ones to Habana-based systems.
As datasets increase in size and complexity, it’s essential for data centers to be able to scale up and scale out capacity easily and cost-effectively, with some systems requiring hundreds or even thousands of processors to run training with speed and accuracy. Habana Gaudi AI training processors were designed expressly to address scalability, with ten 100-Gigabit Ethernet ports supporting RDMA over Converged Ethernet (RoCE) integrated into every Gaudi processor. The result is massive and flexible scaling capacity based on the networking standard already employed in virtually all data centers.
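To put the port count in perspective, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the function name `raw_roce_gbit` is hypothetical, the figures are raw line rates taken from the text above, and the sketch ignores protocol overhead and however the ports are divided between in-server and scale-out links.

```python
# Figures from the text: ten RoCE ports per Gaudi processor, 100 Gb/s each.
PORTS_PER_GAUDI = 10
GBIT_PER_PORT = 100


def raw_roce_gbit(num_processors: int) -> int:
    """Raw aggregate RoCE line rate in Gb/s, summed over all processors.

    Hypothetical helper for illustration; it reports line rate only,
    not achievable training throughput.
    """
    if num_processors < 0:
        raise ValueError("num_processors must be non-negative")
    return num_processors * PORTS_PER_GAUDI * GBIT_PER_PORT


if __name__ == "__main__":
    for count in (1, 8, 1000):
        total = raw_roce_gbit(count)
        print(f"{count:>5} processor(s): {total:,} Gb/s ({total / 1000:g} Tb/s) raw")
```

A single processor therefore exposes 1 Tb/s of raw Ethernet line rate, and the total grows linearly with the number of processors added.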
Learn More About Habana
Habana partners with OEM and systems integrator Supermicro to assist our customers in on-premises deployments of Habana AI processor systems.
Habana Labs and Supermicro are collaborating with DDN, a leading AI data management and storage system provider, to enhance storage capacity and manage data flows for AI training on the Gaudi-based Supermicro X12 Gaudi AI Training System.
Scaling data center training with Gaudi’s RoCE Networking
The unique networking design of ten 100-Gigabit RDMA over Converged Ethernet (RoCE) ports integrated onto every Gaudi processor gives data center systems builders the versatility to scale systems flexibly and cost-efficiently from one to thousands of AI processors. The San Diego Supercomputer Center team is leveraging the flexibility of Gaudi’s RoCE connectivity in its build-out of the Voyager supercomputer and is using Goya processors for inference applications. Voyager will serve academic researchers across a range of science and engineering domains, including astronomy, climate sciences, chemistry and particle physics, to name a few.
Voyager is being built with two server types: the Supermicro X12 Gaudi Training Server, featuring eight Gaudi cards and integrated dual-socket 3rd Gen Intel® Xeon® Scalable host processors, and the Supermicro SuperServer 4029GP-T, containing eight Goya™ HL-100 inference PCIe cards and dual-socket 2nd Gen Intel® Xeon® Scalable processors.
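For a rough sense of how a target processor count translates into hardware, the sketch below estimates how many eight-card training servers a deployment would need. The helper `servers_needed` and the example targets are hypothetical and are not Voyager's actual configuration; the only figure taken from the text is eight Gaudi cards per X12 server.

```python
import math

# From the text: each Supermicro X12 Gaudi Training Server holds eight Gaudi cards.
GAUDI_PER_SERVER = 8


def servers_needed(target_processors: int) -> int:
    """Smallest number of eight-card servers that provides the target count.

    Hypothetical sizing helper; it rounds up because a partially
    populated target still needs a whole server.
    """
    if target_processors < 1:
        raise ValueError("target_processors must be at least 1")
    return math.ceil(target_processors / GAUDI_PER_SERVER)


if __name__ == "__main__":
    # Illustrative targets only, chosen to span "one to thousands".
    for target in (1, 64, 1000):
        print(f"{target:>5} processors -> {servers_needed(target)} server(s)")
```

A real design would also account for switches, storage, and inference nodes, which this estimate leaves out.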