In this post, we will learn how to run PyTorch V-diffusion inference on the Habana Gaudi processor, which is designed specifically to accelerate AI deep learning models efficiently.
Art Generation with V-diffusion
AI is opening up a wide array of new opportunities in the creative domain, and text-to-image applications are one of them.
In this tutorial we will learn how to generate artwork from an input prompt using a pre-trained V-diffusion model. We will use the Habana PyTorch v-diffusion reference model, based on the work done by Katherine Crowson (@RiversHaveWings) and Chainbreakers AI (@jd_pressman).
The models are denoising diffusion probabilistic models, which are trained to reverse a gradual noising process, allowing the models to generate samples from the learned data distributions starting from random noise. DDIM-style deterministic sampling is also supported. The models are also trained on continuous timesteps. They use the ‘v’ objective from Progressive Distillation for Fast Sampling of Diffusion Models, hence the v in v-diffusion.
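To make the 'v' objective more concrete, here is a minimal pure-Python sketch. It assumes a cosine schedule, alpha = cos(pi * t / 2) and sigma = sin(pi * t / 2), for a continuous timestep t in [0, 1]. The function names are illustrative and are not taken from the reference model's code, and plain lists stand in for image tensors.

```python
import math
import random

def alpha_sigma(t):
    """Signal and noise scales for a continuous timestep t in [0, 1].
    Assumption: cosine schedule, so alpha**2 + sigma**2 == 1."""
    return math.cos(t * math.pi / 2), math.sin(t * math.pi / 2)

def add_noise(x0, eps, t):
    """Forward (noising) process: x_t = alpha * x0 + sigma * eps."""
    alpha, sigma = alpha_sigma(t)
    return [alpha * x + sigma * e for x, e in zip(x0, eps)]

def v_target(x0, eps, t):
    """The 'v' quantity the network is trained to predict: v = alpha * eps - sigma * x0."""
    alpha, sigma = alpha_sigma(t)
    return [alpha * e - sigma * x for x, e in zip(x0, eps)]

def recover(x_t, v, t):
    """Invert the two relations above to get the clean sample and the noise from (x_t, v)."""
    alpha, sigma = alpha_sigma(t)
    x0 = [alpha * z - sigma * w for z, w in zip(x_t, v)]
    eps = [sigma * z + alpha * w for z, w in zip(x_t, v)]
    return x0, eps

def ddim_step(x_t, v, t, t_next):
    """One deterministic DDIM-style step: estimate x0 and eps, then re-noise to t_next."""
    x0, eps = recover(x_t, v, t)
    return add_noise(x0, eps, t_next)

if __name__ == "__main__":
    rng = random.Random(0)
    x0 = [rng.gauss(0, 1) for _ in range(4)]   # stand-in for an image
    eps = [rng.gauss(0, 1) for _ in range(4)]  # Gaussian noise
    x_t = add_noise(x0, eps, 0.7)
    v = v_target(x0, eps, 0.7)                 # a perfect model would output this
    print(ddim_step(x_t, v, 0.7, 0.0))         # stepping to t = 0 recovers x0
    print(x0)
```

In the real model, v comes from the network rather than from the known x0 and eps, and sampling repeats the DDIM-style step over a decreasing sequence of timesteps.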
Clone the Habana Model-References repository (release 1.6.0) and, from the v-diffusion model directory inside it, install the requirements:
git clone -b 1.6.0 https://github.com/HabanaAI/Model-References
python3 -m pip install -r requirements.txt
Download the pre-trained model, which generates 256×256 artworks. As the checkpoint name cc12m_1_cfg indicates, the model was trained on the Conceptual 12M (CC12M) dataset.
mkdir -p checkpoints && wget https://the-eye.eu/public/AI/models/v-diffusion/cc12m_1_cfg.pth && mv cc12m_1_cfg.pth checkpoints/
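Large downloads occasionally end up truncated, so it can be worth checking the checkpoint before running inference. The sketch below is a hypothetical helper, not part of the reference model: it reports the file size and SHA-256 digest so you can compare them with whatever values the model's publisher lists. No expected digest is hard-coded here.

```python
import hashlib
from pathlib import Path

def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large checkpoints are never loaded fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def describe_checkpoint(path):
    """Return a one-line summary of the checkpoint, or a hint if it is missing."""
    p = Path(path)
    if not p.is_file():
        return f"{p} is missing; re-run the download step"
    size_mb = p.stat().st_size / 1e6
    return f"{p}: {size_mb:.1f} MB, sha256={sha256_of_file(p)}"

if __name__ == "__main__":
    # Path taken from the download command above.
    print(describe_checkpoint("checkpoints/cc12m_1_cfg.pth"))
```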
The two examples below run the pre-trained checkpoint on Habana Gaudi and generate images from the given prompts.
./cfg_sample.py "a girl at the end of the world":5 -n 1 -bs 1 --seed 0 --device 'hpu' --hmp
You should see output similar to the example below.
The generated image is saved in the local folder.
Here is a second example:
./cfg_sample.py "a robot, by Picasso":5 -n 1 -bs 1 --seed 0 --device 'hpu' --hmp
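In both commands the prompt is followed by ":5". The sketch below shows one way to read that suffix. It assumes the number after the last colon is a per-prompt weight that acts as the classifier-free guidance scale, and that the unconditional prediction receives weight 1 minus the sum of the prompt weights. The helper names are hypothetical and the script's actual parsing may differ in detail.

```python
def parse_prompt(arg, default_weight=1.0):
    """Split 'text:weight' on the last colon; use default_weight if no numeric suffix exists."""
    text, sep, tail = arg.rpartition(":")
    if sep:
        try:
            return text, float(tail)
        except ValueError:
            pass  # the colon was part of the prompt text, not a weight separator
    return arg, default_weight

def guidance_weights(cond_weights):
    """Assumed weighting: the unconditional branch gets 1 - sum(prompt weights)."""
    return [1.0 - sum(cond_weights), *cond_weights]

def combine(v_uncond, v_conds, cond_weights):
    """Weighted sum of the unconditional and conditional predictions, element by element."""
    weights = guidance_weights(cond_weights)
    outputs = [v_uncond, *v_conds]
    return [sum(w * o[i] for w, o in zip(weights, outputs)) for i in range(len(v_uncond))]

if __name__ == "__main__":
    # The shell strips the quotes, so the script receives the prompt and ":5" as one string.
    text, weight = parse_prompt("a robot, by Picasso:5")
    print(text, weight)
    print(guidance_weights([weight]))
```

With a single prompt of weight w, this weighting is equivalent to v_uncond + w * (v_cond - v_uncond), so larger values push the sample more strongly toward the prompt.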
You can try different prompts or different configurations for running the model. You can find more information on the Habana v-diffusion reference model GitHub page.
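If you want to try several prompts or seeds in one go, a small driver script can assemble the command lines for you. The sketch below is a hypothetical helper that uses only the flags shown in the commands above. It prints the commands rather than running them, and the default weight of 5 simply mirrors the two examples.

```python
import shlex

def build_command(prompt, weight=5, seed=0, n=1, batch_size=1):
    """Assemble the argument list for one cfg_sample.py run on Gaudi."""
    return [
        "./cfg_sample.py", f"{prompt}:{weight}",
        "-n", str(n), "-bs", str(batch_size),
        "--seed", str(seed), "--device", "hpu", "--hmp",
    ]

if __name__ == "__main__":
    prompts = ["a girl at the end of the world", "a robot, by Picasso"]
    for prompt in prompts:
        for seed in (0, 1):
            # shlex.join quotes the prompt so the line can be pasted into a shell.
            print(shlex.join(build_command(prompt, seed=seed)))
```

To execute a command instead of printing it, you could pass the same argument list to subprocess.run(argv, check=True) from the model directory.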