Use Low-Precision Optimizations for High-Performance Deep-Learning Inference Applications

Publication Date
01 Mar 2022

With advances in hardware acceleration and support for low-precision data types, deep-learning inference delivers higher throughput and lower latency. However, data scientists and AI developers often need to make a trade-off between accuracy and performance. There are also deployment challenges due to the computational complexity of quantizing models for inference. This webinar covers techniques and strategies for overcoming these challenges, such as automatic accuracy-driven tuning for post-training quantization and quantization-aware training.
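To make the accuracy-driven tuning idea concrete, here is a minimal sketch in PyTorch. It assumes you already have a trained model and an evaluate() helper that returns validation accuracy (both are placeholders, not part of the webinar material), and it illustrates the general tuning loop rather than the exact workflow of Intel's tool: quantize, measure the accuracy drop, and only accept the quantized model if the drop stays within a tolerance.

import torch
import torch.nn as nn

def accuracy_driven_ptq(model, evaluate, max_accuracy_drop=0.01):
    """Post-training quantization gated by an accuracy criterion.

    model:             trained FP32 torch.nn.Module (assumption: eval mode)
    evaluate:          callable returning accuracy on a validation set (hypothetical helper)
    max_accuracy_drop: largest tolerated drop versus the FP32 baseline
    """
    baseline = evaluate(model)  # FP32 reference accuracy

    # Post-training dynamic quantization: weights of Linear layers -> INT8.
    quantized = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    quant_acc = evaluate(quantized)
    if baseline - quant_acc <= max_accuracy_drop:
        return quantized  # accuracy criterion met, keep the INT8 model
    return model          # otherwise fall back to the FP32 model

In practice, accuracy-driven tuning tools extend this loop by searching over per-layer precision choices and calibration settings instead of a single all-or-nothing decision.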

Join us to learn about Intel’s new low-precision optimization tool and how it helped CERN openlab reduce inference time while maintaining the same level of accuracy on convolutional Generative Adversarial Networks (GANs). The webinar gives insight into how to handle the strict precision constraints that inevitably arise when applying low-precision computing to generative models.