Can you see the GOYA vs. T4 performance difference?

Yes, this is what >3x inference throughput looks like.

While the Habana GOYA™ AI Inference processor is relatively new to AI processing, having been introduced only in September 2018, its performance is redefining what customers can expect from a processor that’s custom-designed and optimized for AI inference.

On the ResNet-50 benchmark, GOYA outpaces its closest rival, the T4 processor, by a factor of more than 3. GOYA delivers inference throughput of 15,393 images-per-second, versus the T4’s Nvidia-reported 4,944 images-per-second. As you see here, 3x makes a tangible difference: 3 times faster processing = 3 times quicker processing of deep learning workloads = 3 times the productivity.

The key factors used in assessing inference performance are throughput, power efficiency, latency and the ability to support small batch sizes. In this same ResNet-50 benchmark, GOYA offered power efficiency of 149 images-per-second-per-Watt (IPS/W) vs. T4’s power efficiency of 71 IPS/W. GOYA also delivers latency of 1.01 ms (well below the industry requirement of 7 ms) vs. T4’s whopping 26 ms. In addition, GOYA’s performance is linear and sustained even at small batch sizes.
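
For readers who want to check the arithmetic behind these comparisons, here is a minimal Python sketch that works out the ratios from the figures quoted in this article. The dictionary keys and variable names are illustrative only, and the 7 ms latency budget is the industry requirement cited above. This is not part of any Habana or Nvidia tooling.

```python
# Figures as quoted in this article for the ResNet-50 benchmark.
# Key names are illustrative, not taken from any vendor tool.
goya = {"throughput_ips": 15393, "efficiency_ips_per_w": 149, "latency_ms": 1.01}
t4 = {"throughput_ips": 4944, "efficiency_ips_per_w": 71, "latency_ms": 26.0}

# Latency requirement cited in the article, in milliseconds.
LATENCY_BUDGET_MS = 7.0

# Higher is better for throughput and efficiency, so divide GOYA by T4.
throughput_ratio = goya["throughput_ips"] / t4["throughput_ips"]
efficiency_ratio = goya["efficiency_ips_per_w"] / t4["efficiency_ips_per_w"]

# Lower is better for latency, so divide T4 by GOYA.
latency_ratio = t4["latency_ms"] / goya["latency_ms"]

# Check each processor's quoted latency against the cited budget.
goya_meets_budget = goya["latency_ms"] <= LATENCY_BUDGET_MS
t4_meets_budget = t4["latency_ms"] <= LATENCY_BUDGET_MS

print(f"Throughput advantage: {throughput_ratio:.2f}x")
print(f"Power-efficiency advantage: {efficiency_ratio:.2f}x")
print(f"Latency advantage: {latency_ratio:.1f}x")
print(f"GOYA within {LATENCY_BUDGET_MS:.0f} ms budget: {goya_meets_budget}")
print(f"T4 within {LATENCY_BUDGET_MS:.0f} ms budget: {t4_meets_budget}")
```

On the quoted numbers, the throughput ratio comes out at roughly 3.1x, the power-efficiency ratio at roughly 2.1x, and the latency ratio at roughly 26x.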

For more information on the GOYA AI Inference Processor, check out the whitepaper.

