Hanguang 800
World-leading High-performance AI Inference Chip
T-Head released its first data center chip, Hanguang 800, in September 2019. Hanguang 800 is a 12 nm high-performance artificial intelligence inference chip that integrates 17 billion transistors and delivers a peak computing power of 820 TOPS. In the industry-standard ResNet-50 test, it achieves an inference performance of 78,563 IPS (images per second) and an energy efficiency of 500 IPS/W.
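As a rough sanity check on these figures, the short Python sketch below derives the power draw and the average time per image that the quoted numbers would imply. It assumes that the throughput and the energy efficiency were measured at the same operating point, which the text above does not state, so the results are only indicative. The function names are illustrative and not part of any T-Head tool.

```python
# Figures quoted for the ResNet-50 test.
THROUGHPUT_IPS = 78_563       # images per second
EFFICIENCY_IPS_PER_W = 500    # images per second per watt


def implied_power_watts(throughput_ips, efficiency_ips_per_w):
    """Power implied by dividing throughput by energy efficiency.

    Assumption: both figures come from the same operating point.
    """
    return throughput_ips / efficiency_ips_per_w


def mean_time_per_image_us(throughput_ips):
    """Average time budget per image in microseconds (inverse throughput).

    This is not end-to-end latency: batching and pipelining mean a single
    request can take much longer than this average.
    """
    return 1e6 / throughput_ips


power = implied_power_watts(THROUGHPUT_IPS, EFFICIENCY_IPS_PER_W)
per_image = mean_time_per_image_us(THROUGHPUT_IPS)
print(f"Implied power: {power:.1f} W")            # Implied power: 157.1 W
print(f"Mean time per image: {per_image:.1f} us")  # Mean time per image: 12.7 us
```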
Built on T-Head's self-developed architecture, Hanguang 800 relies on software-hardware co-design to achieve its performance breakthrough. T-Head also developed the accompanying AI chip software development kit in-house, which lets developers achieve high throughput and low latency when building deep learning applications on Hanguang 800. Hanguang 800 has been deployed in data centers, edge servers, and similar environments.
Architecture Features
Accelerate convolution and matrix multiplication, support deconvolution, dilated convolution, 3D convolution, interpolation, ROI, etc.
Deeply optimized for ResNet-50, SSD/DSSD, Faster-RCNN, Mask-RCNN, DeepLab and so on
High-density computation to greatly improve processing efficiency
Hardware-software co-design to support sparse compression of weights and quantized computation
Besides INT8/INT16 quantized acceleration, also covers FP16/BF16 vector calculation
Accelerate various activation functions such as ReLU, Sigmoid and Tanh, with support for new activation functions in the future
Each chip contains four cores, which can be flexibly configured according to computing power requirements, for example single-card single-core or multi-card four-core
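To illustrate what INT8 quantized acceleration means in general, here is a minimal Python sketch of symmetric per-tensor quantization, a common textbook scheme. It is not taken from Hanguang 800 or HGAI documentation, and the chip's actual quantization method may differ. The function names `quantize_int8` and `dequantize` are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization to integers in [-127, 127].

    Generic textbook scheme, used here only for illustration.
    """
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate floating-point values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print(q)  # [50, -127, 0, 100]
# Each restored value differs from the original by at most half a step.
print(all(abs(w - r) <= scale / 2 + 1e-12 for w, r in zip(weights, restored)))  # True
```

Storing weights as 8-bit integers plus one scale factor cuts memory traffic to a quarter of FP32 and lets the hardware use integer multiply-accumulate units, at the cost of a bounded rounding error per value.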
Industry Applications
Cloud Computing Service
Intelligent Search of E-commerce
E-commerce Marketing
Technical Features
Deep optimization of CNN and vision algorithm
Extensible to support other DNN models
High energy efficiency, low latency
Support programmable model extension through the instruction set
Complete software stack to support TensorFlow, MXNet, Caffe, ONNX and other frameworks
World-leading single-chip INT8 inference computing power
Software and Hardware Collaboration
To make the Hanguang 800 acceleration chip convenient to use, T-Head provides the HGAI (Hanguang Artificial Intelligence) software development kit, which lets developers achieve high throughput and low latency when building deep learning applications on the chip. HGAI mainly covers model front-end conversion to Graph IR (intermediate representation), quantization, compilation and execution. A model that has been converted and compiled by HGAI can be easily integrated into popular deep learning inference frameworks, so users can conveniently use the Hanguang 800 chip to accelerate inference. HGAI currently supports the following popular deep learning frameworks: TensorFlow, MXNet, Caffe and ONNX; support for more frameworks is planned. In addition, users can use NPUSMI for online monitoring of Hanguang 800, including clock frequency, memory utilization, computing power utilization, etc.