Cadence organizes on-device AI into three families
Cadence Design Systems’ Tensilica group has organized its machine-learning platforms into three families intended to cover a wide range of on-device AI applications, based on a combination of custom instructions and a reworked accelerator engine.
The low end of the range is covered by AI Base, in which the existing Tensilica configurable processors are provided with instruction-set extensions to handle common inferencing operations. For higher-performance work, the company has defined the AI Boost platform, which combines extensible processors with the company’s latest generation of accelerator, which it calls the neural network engine (NNE). The third, AI Max, is intended as a turnkey option for applications such as ADAS, in which multiple accelerators are combined to support throughput of up to 32 TOPS in the initial products.
According to Pulin Desai, group director of marketing and business development at Cadence, the NNE has native support for most common layers, offering acceleration for activations, pooling and LSTM-type layers as well as convolutions.
The cores are accompanied by a software framework and compiler that perform optimizations such as pruning and weight clustering, which are commonly used to shrink models trained on a server into versions that can run on edge devices. In addition, the accelerator cores support tensor compression, in which both weight and input data are compressed before transfer to main memory and decompressed on the fly before being processed by the execution pipeline. This compression reduces the amount of data that has to be moved to and from memory, which helps to reduce overall energy consumption. The software environment supports models created by TensorFlow, ONNX, PyTorch, Caffe2, TensorFlow Lite, and MXNet.
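To make the weight-clustering idea concrete, the sketch below shows the general technique in plain NumPy: a layer's weights are snapped to a small set of shared centroid values found by one-dimensional k-means, so each weight can be stored as a short index into a lookup table rather than a full 32-bit float. This is an illustrative sketch of the generic optimization, not Cadence's actual compiler flow; the function name and parameters are hypothetical.

```python
import numpy as np

def cluster_weights(weights, n_clusters=16, n_iters=20, seed=0):
    """Snap weights to n_clusters shared values via 1-D k-means (illustrative)."""
    rng = np.random.default_rng(seed)
    flat = weights.ravel()
    # Initialize centroids by sampling distinct existing weight values.
    centroids = rng.choice(flat, size=n_clusters, replace=False)
    for _ in range(n_iters):
        # Assign each weight to its nearest centroid.
        assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each centroid to the mean of its assigned weights.
        for k in range(n_clusters):
            members = flat[assign == k]
            if members.size:
                centroids[k] = members.mean()
    assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids[assign].reshape(weights.shape), assign

# A 64x64 layer: 4096 distinct float32 weights collapse to at most 16
# shared values, so each weight index fits in 4 bits instead of 32.
w = np.random.default_rng(1).normal(size=(64, 64)).astype(np.float32)
clustered, idx = cluster_weights(w)
print(np.unique(clustered).size)  # at most 16
```

Because the clustered tensor contains so few distinct values, it also compresses far better than raw weights, which is the same property the article's tensor-compression step exploits to cut memory traffic.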