Altera finds a way to cheaper floating point in FPGAs
Altera has revealed that the DSP blocks in the Arria 10 FPGAs contain the logic needed to make them work as IEEE754-compliant floating-point multipliers and adders, claiming that it has found a near overhead-free way of implementing the functions alongside the existing fixed-point capabilities. The company plans to take the blocks forward into the Stratix 10 products that will be built on Intel’s 14nm process.
Altera architect Martin Langhammer said he took a novel approach to implementing floating point to make it silicon-efficient enough to build in as standard.
“There has been a bunch of academic work on putting floating point into FPGAs but none of these approaches have been commercially viable,” said Langhammer. “The trick was to come up with these circuits, they are very different to the way that floating point is normally built.”
The company plans to publish papers later on some of the techniques that were used, he said. Tool support will be released in the second half of the year.
By hardwiring the floating-point logic, Langhammer claimed, it becomes possible to implement DSP units that would otherwise have outgrown existing FPGAs because of the number of lookup tables (LUTs) needed to extend the fixed-point units to support floating point. On the order of 700 programmable-logic elements are needed to perform the point-shifting work, not including the error-checking steps required by IEEE754.
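To illustrate why the point-shifting work is costly in soft logic, the following is a minimal Python sketch of a floating-point add. It is not Altera's circuit: the function name `fp_add`, the 24-bit mantissa width and the restriction to finite, non-negative inputs are all assumptions made for illustration, and the sketch truncates rather than rounds, so it is not IEEE754-compliant. What it does show is that the operands have to be shifted into alignment before the fixed-point add and that the result may need to be shifted again afterwards.

```python
import math

MANT_BITS = 24  # assumed mantissa width, similar to single precision


def fp_add(a, b):
    """Add two finite, non-negative floats using explicit shift steps.

    Illustrative model only: truncates instead of rounding and ignores
    signs, infinities, NaNs and subnormals.
    """
    # Split each operand into an integer mantissa and an exponent.
    frac_a, exp_a = math.frexp(a)
    frac_b, exp_b = math.frexp(b)
    mant_a = int(frac_a * (1 << MANT_BITS))
    mant_b = int(frac_b * (1 << MANT_BITS))

    # Make operand "a" the one with the larger exponent.
    if exp_a < exp_b:
        mant_a, mant_b = mant_b, mant_a
        exp_a, exp_b = exp_b, exp_a

    # Alignment shift: move the smaller operand to the larger exponent.
    mant_b >>= exp_a - exp_b

    # Plain fixed-point addition of the aligned mantissas.
    total = mant_a + mant_b
    exp = exp_a

    # Normalization shift: bring the sum back into MANT_BITS bits.
    while total >= (1 << MANT_BITS):
        total >>= 1
        exp += 1

    return math.ldexp(total, exp - MANT_BITS)


print(fp_add(1.5, 0.25))
print(fp_add(1.0, 2.0 ** -30))  # the small operand is shifted out entirely
```

Both shifts depend on the data, which is why a soft-logic version needs wide shifters in addition to the adder itself.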
Building the core elements into each DSP block makes it possible to select floating-point support on a block-by-block basis. Langhammer said the support should help speed up DSP implementation because most algorithms are adapted from prototypes built using floating-point arithmetic. Today, designers either need to use larger FPGAs to build floating-point engines or take the time to convert the algorithm to fixed point, which makes it harder for the architects to alter their reference model if they find problems.
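As a rough illustration of what converting to fixed point involves, the Python sketch below quantizes values to a 16-bit Q15 format and multiplies them. The choice of Q15 and the helper names `to_q15`, `from_q15` and `q15_mul` are assumptions for this example, not anything Altera has described. The designer has to pick a scaling, saturate values that fall outside the range and accept that small values are lost, and all of that has to be revisited whenever the floating-point reference model changes.

```python
Q15_SCALE = 1 << 15          # Q15: 1 sign bit, 15 fractional bits
Q15_MAX = Q15_SCALE - 1      # largest representable value, just under 1.0
Q15_MIN = -Q15_SCALE         # smallest representable value, exactly -1.0


def to_q15(x):
    """Quantize a float to Q15, saturating at the ends of the range."""
    return max(Q15_MIN, min(Q15_MAX, round(x * Q15_SCALE)))


def from_q15(q):
    """Convert a Q15 integer back to a float."""
    return q / Q15_SCALE


def q15_mul(a, b):
    """Multiply two Q15 values and rescale the product back to Q15."""
    # The raw product has 30 fractional bits; shifting right by 15
    # discards the low bits (flooring, not rounding).
    return (a * b) >> 15


half = to_q15(0.5)
print(from_q15(q15_mul(half, half)))  # 0.25 survives the conversion
print(to_q15(1e-6))                   # too small for Q15: quantizes to 0
print(to_q15(1.0))                    # 1.0 is out of range: saturates
```

In floating point the same values would need no manual scaling, which is the convenience the hardened blocks are meant to preserve.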
Image: Communication paths between DSP blocks support vectorizable matrix and dot-product operations
Altera sees a large opportunity for DSP-enhanced FPGAs among more software-oriented developers as languages such as OpenCL take hold, and it expects IEEE754 support to be important to those users because it will make it easier to port code over from desktop environments.
Langhammer said important elements of the floating-point unit’s design are its low latency and its ability to pass data to other blocks, both of which matter for the matrix-intensive operations and vectorized topologies that the company sees as vital to many of the target applications. As well as systolic forms of FIR filters – which are more common in traditional FPGA designs – the architecture will support direct-form implementations.
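The difference between the two filter structures can be sketched in software. The Python model below is an assumption-laden illustration, not a description of the Arria 10 hardware or its tools: `fir_direct` computes each output as one dot product across a delay line, while `fir_chained` passes partial sums from one multiply-add stage to the next. The chained version is the transposed form, used here as a simplified stand-in for a systolic structure, which would add further pipeline registers between stages.

```python
def fir_direct(taps, samples):
    """Direct form: each output is a dot product of taps and delay line."""
    delay = [0.0] * len(taps)
    out = []
    for x in samples:
        # Shift the newest sample into the front of the delay line.
        delay = [x] + delay[:-1]
        out.append(sum(h * d for h, d in zip(taps, delay)))
    return out


def fir_chained(taps, samples):
    """Chained (transposed) form: partial sums move stage to stage."""
    state = [0.0] * (len(taps) - 1)  # one partial-sum register per link
    out = []
    for x in samples:
        # Every stage multiplies the same input sample by its own tap.
        products = [h * x for h in taps]
        out.append(products[0] + (state[0] if state else 0.0))
        # Each register takes its stage's product plus the next register.
        state = [
            products[i + 1] + (state[i + 1] if i + 1 < len(state) else 0.0)
            for i in range(len(state))
        ]
    return out


taps = [0.5, 0.25, 0.125]
samples = [1.0, 2.0, 3.0, 4.0]
print(fir_direct(taps, samples))
print(fir_chained(taps, samples))
```

The two functions produce the same output for these inputs. In general the order of floating-point additions differs between the structures, so results can differ in the last bits for values that are not exactly representable.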
“The design allows us to pipeline these structures easily and run them without any stalls,” said Langhammer, adding that BDTI has evaluated the approach. “We can stick together any number of DSP blocks.”