
GOYA INFERENCE
8-CARD SERVER
Coming soon: Greco™, the second-generation inference processor for data center deployments.
Goya Inference Performance
BERT-Base SQuAD Performance
Goya performance on BERT-Base, a popular natural language processing model, is based on the following configuration:
Hardware: 1x Goya HL-100; host CPU: Xeon Gold [email protected] GHz.
Software: Ubuntu 18.04; SynapseAI v0.11.0-477.
Performance on ResNet50
Goya delivers high throughput and ultra-low latency on ResNet50, a popular image recognition model. These performance metrics are based on:
Hardware: Goya HL-100 PCIe card; host CPU: Xeon Gold [email protected] GHz.
Software: Ubuntu 18.04; SynapseAI v0.11-447.
Habana® SynapseAI® Software Suite for Inference
The SynapseAI® Software Suite provides flexible and easy development and deployment of a wide array of computer vision and natural language processing workloads, integrating support for the TensorFlow and PyTorch frameworks. SynapseAI supports customization to address user-specific solution requirements. Designed for flexibility, Habana's inference software builds on foundational, flexible blocks: Habana's programmable Tensor Processor Core (TPC) SDK; optimizer, graph compiler, and runtime software; a rich, extensible kernel library; and data center deployment and management tools.

TPC KERNEL LIBRARY & SDK
- Programmable TPC with SDK
- Extensive TPC kernel libraries support user customization
- TPC tools (simulator, compiler, debugger) for custom kernel development

OPTIMIZER, COMPILER & RUNTIME
- Seamlessly integrates with existing frameworks
- Can be interfaced with C or Python API
- Supports development flow from graph to optimized recipe
- Enables quantization, pipelining, layer fusing

PERFORMANCE & DEPLOYMENT TOOLS
- Deployment validation toolset: test and monitor functionality and performance
- Libraries and tools for run-time detection and reporting
- Run-time and orchestration plug-ins
