Inference Performance

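| Model | Input size | Imported from | Batch size | Latency (msec) | Throughput (samples per sec) | Precision | Model source |
| --- | --- | --- | --- | --- | --- | --- | --- |
| inception V3 | | TF | 1 | 0.437 | 3432 | INT8 | Source |
| | | | 4 | 1.155 | 4422 | INT8 | |
| | | | 8 | 2.198 | 4447 | INT8 | |
| | | | 10 | 2.674 | 4457 | INT8 | |
| bninception | | ONNX | 1 | CF | CF | INT8 | |
| | | | 4 | CF | CF | INT8 | |
| | | | 8 | CF | CF | INT8 | |
| | | | 10 | CF | CF | INT8 | |
| | | | 20 | CF | CF | INT8 | |
| resnet18 V1 | 224x224 | ONNX | 1 | 0.152 | 14547 | INT8 | Source |
| | | | 4 | 0.272 | 27669 | INT8 | |
| | | | 8 | 0.514 | 30690 | INT8 | |
| | | | 10 | 0.57 | 31571 | INT8 | |
| resnet18 V2 | 224x224 | ONNX | 1 | 0.214 | 8701 | INT8 | Source |
| | | | 4 | 0.517 | 13547 | INT8 | |
| | | | 8 | 0.823 | 14607 | INT8 | |
| | | | 10 | 0.987 | 14823 | INT8 | |
| resnet34 V1 | 224x224 | ONNX | 1 | | 285 | INT8 | Source |
| | | | 4 | 0.429 | 15784 | INT8 | |
| | | | 8 | 0.702 | 17400 | INT8 | |
| | | | 10 | 0.823 | 18220 | INT8 | |
| resnet34 V2 | 224x224 | ONNX | 1 | 0.316 | 5460 | INT8 | Source |
| | | | 4 | 0.656 | 9107 | INT8 | |
| | | | 8 | 1.109 | 9976 | INT8 | |
| | | | 10 | 1.385 | 10206 | INT8 | |
| resnet50 V1 | 224x224 | TF | 1 | 0.232 | 7238 | INT8 | Source |
| | | | 4 | 0.501 | 12873 | INT8 | |
| | | | 8 | 0.824 | 14027 | INT8 | |
| | | | 10 | 0.951 | 14662 | INT8 | |
| | | ONNX | 1 | 0.223 | 7505 | INT8 | |
| | | | 4 | 0.481 | 13571 | INT8 | |
| | | | 8 | 0.793 | 14768 | INT8 | |
| | | | 10 | 0.905 | 15491 | INT8 | |
| resnet50 V1 | 160x160 | ONNX | 1 | 0.201 | N/A | INT8 | Source |
| | | | 4 | 0.322 | N/A | INT8 | |
| | | | 8 | 0.525 | N/A | INT8 | |
| | | | 10 | 0.58 | N/A | INT8 | |
| resnet50 V1 slim | 224x224 | TF | 1 | 0.223 | 7439 | INT8 | Source |
| | | | 4 | 0.468 | 14477 | INT8 | |
| | | | 8 | 0.759 | 15972 | INT8 | |
| | | | 10 | 0.871 | 16799 | INT8 | |
| resnet50 V2 | 224x224 | TF | 1 | 0.362 | 4694 | INT8 | Source |
| | | | 4 | 0.727 | 7472 | INT8 | |
| | | | 8 | 1.335 | 8155 | INT8 | |
| | | ONNX | 1 | 0.366 | 4468 | INT8 | Source |
| | | | 4 | 0.81 | 6830 | INT8 | |
| | | | 8 | 1.488 | 7363 | INT8 | |
| resnet101 V1 | 224x224 | ONNX | 1 | 0.342 | 4582 | INT8 | Source |
| | | | 4 | 0.687 | 8214 | INT8 | |
| | | | 8 | 1.244 | 9171 | INT8 | |
| | | | 10 | 1.403 | 9802 | INT8 | |
| resnet101 V2 | 224x224 | ONNX | 1 | 0.541 | 2764 | INT8 | Source |
| | | | 4 | 1.294 | 4164 | INT8 | |
| | | | 8 | 2.145 | 4511 | INT8 | |
| | | | 10 | 2.605 | 4594 | INT8 | |
| resnet152 V1 | 224x224 | TF | 1 | 0.495 | 1566 | INT8 | open source model |
| | | | 4 | 1.001 | 3379 | INT8 | |
| | | | 8 | 1.763 | 3527 | INT8 | |
| | | | 10 | 1.946 | 3565 | INT8 | |
| | | ONNX | 1 | 0.7 | 2792 | INT8 | Source |
| | | | 4 | 1.641 | 5256 | INT8 | |
| | | | 8 | 2.926 | 5823 | INT8 | |
| | | | 10 | 3.548 | 6176 | INT8 | |
| resnet152 V1 slim | 224x224 | TF | 1 | 0.753 | 1724 | INT8 | Source |
| | | | 4 | 0.999 | 4931 | INT8 | |
| | | | 8 | 1.55 | 6612 | INT8 | |
| | | | 10 | 1.773 | 7099 | INT8 | |
| resnet152 V2 | 224x224 | ONNX | 1 | 0.7 | 1929 | INT8 | Source |
| | | | 4 | 1.641 | 2944 | INT8 | |
| | | | 8 | 2.926 | 3218 | INT8 | |
| | | | 10 | 3.548 | 3267 | INT8 | |
| ResNext50-32_4d | 224x224 | ONNX | 1 | 0.403 | 3825 | INT8 | open source model |
| | | | 4 | 0.875 | 6004 | INT8 | |
| | | | 8 | 1.56 | 6522 | INT8 | |
| | | | 10 | 1.889 | 6657 | INT8 | |
| resnext101_32_4d | 224x224 | ONNX | 1 | 0.496 | 3064 | INT8 | open source model |
| | | | 4 | 1.134 | 4616 | INT8 | |
| | | | 8 | 1.936 | 5094 | INT8 | |
| | | | 10 | 2.187 | 5433 | INT8 | |
| tiny yolo v2 | 1088x1920 | PyTorch | 1 | 2.107 | N/A | INT8 | Source |
| | 320x320 | | 1 | 0.199 | N/A | INT8 | Source |
| | 416x416 | | 1 | 0.26 | N/A | INT8 | Source |
| | 608x608 | | 1 | 0.499 | N/A | INT8 | Source |
| | 960x960 | | 1 | 0.933 | N/A | INT8 | Source |
| Yolo V2 | 1088x1920 | PyTorch | 1 | 6.618 | N/A | INT8 | Source |
| | 320x320 | | 1 | 6.593 | N/A | INT8 | Source |
| | 416x416 | | 1 | 0.578 | N/A | INT8 | Source |
| | 544x736 | | 1 | 0.791 | N/A | INT8 | Source |
| | 608x608 | | 1 | 1.514 | N/A | INT8 | Source |
| | 960x960 | | 1 | 1.454 | N/A | INT8 | Source |
| yolo v3 | 416x416 | PyTorch | 1 | 1.179 | N/A | INT8 | Source |
| | 960x960 | PyTorch | 4 | | N/A | INT8 | Source |
| bert squad LARGE (max sequence length = ??? Check with Dror) | | MX | 1 | 13.524 | N/A | INT8 | Source |
| | | | 24 | 35.705 | N/A | INT8 | |
| bert mrpc BASE (max sequence length = 128) | | MX | 1 | 2.841 | N/A | INT8 | |
| | | | 4 | 2.911 | N/A | INT8 | |
| | | | 8 | 2.952 | N/A | INT8 | |
| | | | 10 | 3.468 | N/A | INT8 | |
| | | | 10 | 5.934 | N/A | INT16 | |
| | | | 12 | 7.003 | N/A | INT16 | |
| bert squad BASE (max sequence length = ??? Dror?) | | MX | 1 | 2.872 | N/A | INT8 | |
| | | | 4 | 2.897 | N/A | INT8 | |
| | | | 8 | 2.936 | N/A | INT8 | |
| | | | 10 | 5.953 | N/A | INT8 | |
| | | | 10 | 3.488 | N/A | INT16 | |
| | | | 12 | | N/A | INT8 | |
| | | | 24 | | N/A | INT8 | |
| bvlc_googlenet | 224x224 | ONNX | 1 | 0.296 | NC | INT8 | Source |
| googlenet_bn_no_lrn | 224x224 | ONNX | 1 | 0.159 | 12809 | INT8 | Developed in-house, based on GoogLeNet with batch norm and without LRN |
| | | | 4 | 0.423 | 17542 | INT8 | |
| | | | 8 | 0.701 | 18431 | INT8 | |
| | | | 10 | 0.823 | 18531 | INT8 | |
| squeezenet1.1 | 224x224 | ONNX | 1 | | 887 | INT8 | Source |
| | | | 4 | 0.237 | 36420 | INT8 | |
| | | | 8 | 0.426 | 37840 | INT8 | |
| | | | 10 | 0.547 | 37927 | INT8 | |
| ssd-vgg16 | 300x300 | MX | 1 | 0.9 | N/A | INT8 | Source |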