Habana Blog

News & Discussion
Tagged: 3D-Parallelism

Training Llama and Bloom 13 Billion Parameter LLMs with 3D Parallelism on Habana® Gaudi2®

One of the main challenges in training Large Language Models (LLMs) is that they are often too large to fit on a single node, and even when they do fit, training may be too slow. To address this, training can be parallelized across multiple Gaudi accelerators (HPUs).
3D-Parallelism, DeepSpeed, GenAI, Large Language Models
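The "3D" in 3D parallelism refers to combining data, tensor (model), and pipeline parallelism. A minimal sketch of the arithmetic behind this decomposition — how a pool of accelerators is factored across the three dimensions — is shown below; the function name and the example device counts are illustrative, not taken from the post:

```python
def data_parallel_degree(world_size: int, tensor_parallel: int, pipeline_parallel: int) -> int:
    """Return the data-parallel degree implied by a 3D-parallel layout.

    The accelerators are factored as
        world_size = tensor_parallel * pipeline_parallel * data_parallel,
    so each model replica spans tensor_parallel * pipeline_parallel devices,
    and the remaining factor is the number of replicas (data parallelism).
    """
    devices_per_replica = tensor_parallel * pipeline_parallel
    if world_size % devices_per_replica != 0:
        raise ValueError("world size must be divisible by TP degree * PP degree")
    return world_size // devices_per_replica

# Illustrative example: 64 HPUs with 8-way tensor parallelism and
# 2-way pipeline parallelism leave 64 // (8 * 2) = 4 data-parallel replicas.
print(data_parallel_degree(64, 8, 2))  # → 4
```

In frameworks such as Megatron-DeepSpeed, the tensor- and pipeline-parallel degrees are chosen by the user and the data-parallel degree falls out of the total device count in exactly this way.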

Porting a model to Megatron-DeepSpeed with Habana Gaudi

If you want to train a large model with Megatron-DeepSpeed but that model is not among the included implementations, you can port it to the Megatron-DeepSpeed package. Assuming your model is transformer-based, you can add your implementation with relative ease by basing it on the existing code.
3D-Parallelism, DeepSpeed, GenAI, Large Language Models