Habana Blog

News & Discussion
Tagged: DeepSpeed

Fine-Tuning Llama2-70B with DeepSpeed ZeRO-3 and Low-Rank Adaptation (LoRA) on Intel® Gaudi®2 AI Accelerator

With Habana’s SynapseAI 1.13.0 release, users can fine-tune the Llama2 70B model using only eight Gaudi2 accelerators; a minimal LoRA configuration sketch appears below.
DeepSpeed, Fine Tuning, Llama, LoRA
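
As a flavor of the technique this post covers, here is a minimal sketch of attaching LoRA adapters with the Hugging Face PEFT library. The rank, alpha, and target module names are illustrative assumptions, not values from the post, and the ZeRO-3 configuration that shards the frozen base weights is supplied separately to the DeepSpeed launcher.

```python
# Hedged sketch: LoRA fine-tuning setup with Hugging Face PEFT.
# The hyperparameters below are assumptions for illustration only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-70b-hf")

lora_config = LoraConfig(
    r=8,                                  # low-rank dimension (assumed)
    lora_alpha=16,                        # LoRA scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable
```

Because ZeRO-3 partitions parameters, gradients, and optimizer states across the eight accelerators, and LoRA freezes the base weights so only the small adapters need optimizer state, the 70B model becomes trainable on a single node.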

Training Llama and Bloom 13 Billion Parameter LLMs with 3D Parallelism on Habana® Gaudi2®

One of the main challenges in training Large Language Models (LLMs) is that they are often too large to fit on a single node, and even when they do fit, training can be too slow. To address this, training can be parallelized across multiple Gaudi accelerators (HPUs); the sketch below shows how the three parallelism dimensions multiply out.
3D-Parallelism, DeepSpeed, GenAI, Large Language Models
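
For intuition, here is a back-of-the-envelope sketch of how 3D parallelism carves up a set of devices. The degrees below are illustrative assumptions, not the settings used in the post.

```python
# Hedged sketch: how the three parallelism dimensions combine.
# The degrees below are illustrative assumptions.
tensor_parallel = 2    # shards each layer's weight matrices across 2 HPUs
pipeline_parallel = 2  # splits the layer stack into 2 sequential stages
data_parallel = 2      # replicates the model over 2 groups fed different batches

world_size = tensor_parallel * pipeline_parallel * data_parallel
print(f"HPUs required: {world_size}")  # 8
```

Tensor parallelism reduces per-device memory for individual layers, pipeline parallelism reduces it for the layer stack, and data parallelism scales throughput; Megatron-DeepSpeed lets you tune all three degrees independently.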

Porting a model to Megatron-DeepSpeed with Habana Gaudi

If you want to train a large model with Megatron-DeepSpeed but that model is not included in the package, you can port it yourself. Assuming the model is transformer-based, you can add your implementation with little effort by basing it on the existing code; a rough sketch of the usual pattern follows below.
3D-Parallelism, DeepSpeed, GenAI, Large Language Models
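
As a rough illustration of the kind of code you would be adapting, here is a tensor-parallel MLP block in the usual Megatron-LM style. This is an assumption-laden sketch: the exact module paths and layer signatures may differ in Habana's Megatron-DeepSpeed fork, so treat it as a pattern rather than drop-in code.

```python
# Hedged sketch: a tensor-parallel transformer MLP in the Megatron-LM style.
# Module paths and signatures are assumptions about the fork, not verified API.
import torch
import torch.nn.functional as F
from megatron import mpu  # tensor-parallel building blocks (path assumed)

class ParallelMLP(torch.nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        # Column-parallel: each device holds a vertical slice of the 4h projection.
        self.dense_h_to_4h = mpu.ColumnParallelLinear(
            hidden_size, 4 * hidden_size, gather_output=False)
        # Row-parallel: partial outputs are summed across devices with an all-reduce.
        self.dense_4h_to_h = mpu.RowParallelLinear(
            4 * hidden_size, hidden_size, input_is_parallel=True)

    def forward(self, x):
        h, _ = self.dense_h_to_4h(x)  # Megatron linears return (output, bias)
        h = F.gelu(h)
        out, _ = self.dense_4h_to_h(h)
        return out
```

The point of reusing these layers is that the sharding and collective communication are already handled; a ported model mostly swaps its dense layers for their parallel counterparts.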

Optimizing Large Language Model Inference on Gaudi2 with Hugging Face Optimum-Habana

We have optimized additional large language models from the Hugging Face Hub using the Optimum-Habana library; a minimal generation example appears below.
DeepSpeed, Hugging Face, Inference
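
To show the shape of the workflow, here is a minimal text-generation sketch with optimum-habana. The model choice and generation settings are illustrative assumptions; the library's own example scripts add DeepSpeed launching and further optimizations discussed in the post.

```python
# Hedged sketch: text generation on Gaudi with optimum-habana.
# Model name and generation settings are assumptions for illustration.
import habana_frameworks.torch.core  # registers the "hpu" device with PyTorch
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.habana.transformers.modeling_utils import adapt_transformers_to_gaudi

adapt_transformers_to_gaudi()  # patch transformers with Gaudi-optimized code paths

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval().to("hpu")

inputs = tokenizer("DeepSpeed is", return_tensors="pt").to("hpu")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```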

BLOOM 176B Inference on Habana Gaudi2

With DeepSpeed Inference support in Habana’s SynapseAI 1.8.0 release, users can run inference on large language models, including BLOOM 176B; a sketch of the initialization call follows below.
BLOOM, DeepSpeed, Inference
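
The core of the approach is DeepSpeed's inference engine, which shards the model across devices with tensor parallelism. Here is a hedged sketch of the initialization; the parallel degree and dtype are assumptions, and the real BLOOM example loads sharded checkpoints rather than naively materializing 176B parameters on one host as shown here.

```python
# Hedged sketch: DeepSpeed-Inference initialization for a large model.
# Parallel degree and dtype are assumptions; real scripts load sharded checkpoints.
import torch
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("bigscience/bloom")
model = deepspeed.init_inference(
    model,
    mp_size=8,             # shard across 8 Gaudi2 devices (assumed degree)
    dtype=torch.bfloat16,  # assumed inference dtype
)
```

Launched under the DeepSpeed runtime, each rank holds only its shard of the weights, and activations are exchanged with collectives during the forward pass.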

Pre-Training the BERT 1.5B model with DeepSpeed

In this post, we show you how to run Habana’s DeepSpeed-enabled BERT 1.5B model from our Model-References repository; a sketch of the kind of DeepSpeed config such a run consumes appears below.
BERT, DeepSpeed, developer, Gaudi, Gaudi2, PyTorch, SynapseAI
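
For orientation, here is a hedged sketch of a DeepSpeed configuration file of the kind such a pre-training run consumes. The actual Model-References script ships its own JSON config; every value below is an assumption.

```python
# Hedged sketch: writing a DeepSpeed JSON config for a BERT pre-training run.
# All values are illustrative assumptions, not the repository's settings.
import json

ds_config = {
    "train_micro_batch_size_per_gpu": 8,  # assumed per-device batch size
    "bf16": {"enabled": True},            # Gaudi training commonly uses bfloat16
    "zero_optimization": {"stage": 1},    # ZeRO-1: shard optimizer states only
    "gradient_clipping": 1.0,
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```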

Fine-Tuning GPT2 with Hugging Face and Habana Gaudi

In this tutorial, we demonstrate fine-tuning a GPT2 model on Habana Gaudi AI processors using the Hugging Face optimum-habana library with DeepSpeed; a condensed sketch appears below.
DeepSpeed, developer, Fine Tuning, Gaudi, GPT, GPT2, Hugging Face
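
Condensed to its essentials, the flow looks roughly like the sketch below. The dataset, batch size, and DeepSpeed config path are illustrative assumptions; the Gaudi config name points at the configs Habana hosts on the Hugging Face Hub.

```python
# Hedged sketch: GPT2 fine-tuning with optimum-habana's GaudiTrainer.
# Dataset choice and hyperparameters are assumptions for illustration.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling)
from optimum.habana import GaudiTrainer, GaudiTrainingArguments

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
train_dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

args = GaudiTrainingArguments(
    output_dir="./gpt2-finetuned",
    use_habana=True,                  # run on HPU
    use_lazy_mode=True,               # Gaudi's lazy execution mode
    gaudi_config_name="Habana/gpt2",  # Gaudi config hosted on the Hub
    deepspeed="ds_config.json",       # assumed path to a DeepSpeed config
    per_device_train_batch_size=4,    # assumed
    num_train_epochs=3,               # assumed
)

trainer = GaudiTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```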

Memory-Efficient Training on Habana® Gaudi® with DeepSpeed

One of the key challenges in Large Language Model (LLM) training is reducing memory requirements without sacrificing compute/communication efficiency or model accuracy; the config sketch below shows the knobs DeepSpeed’s ZeRO optimizer exposes for this.
DeepSpeed, developer, Gaudi, Large Language Models
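
DeepSpeed's main lever here is the ZeRO optimizer, whose stages progressively shard training state across devices. The sketch below shows a representative configuration; the stage choice and batch size are assumptions.

```python
# Hedged sketch: ZeRO memory optimization in a DeepSpeed config.
# Stage choice and batch size are illustrative assumptions.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,  # assumed
    "zero_optimization": {
        # Stage 1 shards optimizer states, stage 2 also shards gradients,
        # and stage 3 additionally shards the parameters themselves.
        "stage": 2,
        "overlap_comm": True,          # overlap communication with the backward pass
        "contiguous_gradients": True,  # reduce gradient-memory fragmentation
    },
}
```

Each higher stage cuts per-device memory further at the cost of extra communication, which is exactly the efficiency trade-off the post examines.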