Optimizing Large Language Model Inference on Gaudi2 with Hugging Face Optimum-Habana
We have optimized additional large language models on Hugging Face using the Optimum Habana library.
DeepSpeed, Hugging Face, Inference
BLOOM 176B Inference on Habana Gaudi2
With support for DeepSpeed Inference in Habana’s SynapseAI 1.8.0 release, users can run inference on large language models, including BLOOM 176B.
BLOOM, DeepSpeed, Inference