Ceva has decided to take its very long instruction word (VLIW) architecture into the world of IoT sensor nodes and standalone smartwatches with the launch of the X1 processor core.
The company sees the introduction of the LTE-based IoT cellular communications standards as an opportunity to replace the traditional pairing of a 32bit general-purpose processor and a dedicated RF baseband controller with a unified processor core. With the option of adding accelerators for functions such as Viterbi and Turbo coding, Emmanuel Gresset, business development director for Ceva’s wireless business unit, said the X1 can support the new Cat M1 and Cat NB1 wide-area wireless protocols in software.
“There are a lot of new markets and use-cases that can take advantage of something like NB1 because it can be delivered at the same cost as Bluetooth LE,” Gresset said. “As operators deploy NB-IoT and Cat-M1, the growth will be significant.
“Although DSP is in our DNA, we made significant additions to make an efficient CPU,” Gresset claimed. “Cost is vastly improved because you only have one CPU subsystem instead of two. And in software we can run multiple modes [of RF communication].”
The X1 reduces the number of arithmetic units to just one, armed with a multiplier-accumulator able to handle two 16bit operations in parallel or one at 32bit resolution. The four-way VLIW processor can run an instruction through this unit in parallel with a load, a store and a program-control operation such as a jump or branch. To overcome the code bloat associated with a conventional VLIW – a four-way instruction packet can be 128 bits wide – the X1 supports variable-length packets and includes support for 16bit instructions for a subset of commonly used operations.
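As a rough illustration of what a multiplier-accumulator with those two modes does, the sketch below models the behaviour in plain C. It is not Ceva's instruction set: the function names (mac_dual16, mac_single32, pack16, dot16), the 64bit accumulator width and the operand packing order are all assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical behavioural model of a MAC unit with two modes.
   Names, packing order and accumulator width are illustrative assumptions. */

/* Pack two signed 16-bit values into one 32-bit operand word. */
static uint32_t pack16(int16_t hi, int16_t lo)
{
    return ((uint32_t)(uint16_t)hi << 16) | (uint16_t)lo;
}

/* Dual 16-bit mode: two 16x16 products are formed from the packed
   halves of a and b, and both are added to the accumulator. */
static int64_t mac_dual16(int64_t acc, uint32_t a, uint32_t b)
{
    int16_t a_lo = (int16_t)(a & 0xFFFFu);
    int16_t a_hi = (int16_t)(a >> 16);
    int16_t b_lo = (int16_t)(b & 0xFFFFu);
    int16_t b_hi = (int16_t)(b >> 16);
    return acc + (int32_t)a_lo * b_lo + (int32_t)a_hi * b_hi;
}

/* Single 32-bit mode: one 32x32 product is added to the accumulator. */
static int64_t mac_single32(int64_t acc, int32_t a, int32_t b)
{
    return acc + (int64_t)a * b;
}

/* Dot product of n 16-bit samples (n assumed even), consuming two
   samples per MAC operation in dual mode. */
static int64_t dot16(const int16_t *x, const int16_t *y, int n)
{
    int64_t acc = 0;
    for (int i = 0; i + 1 < n; i += 2)
        acc = mac_dual16(acc, pack16(x[i + 1], x[i]), pack16(y[i + 1], y[i]));
    return acc;
}
```

In this model a filter or correlation over 16bit samples needs half as many MAC operations in dual mode as it would processing one sample at a time, which is the kind of saving the parallel 16bit mode is aimed at.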
“Because these devices usually have onchip memory, code size is the most important factor for area,” Gresset said.
Ceva’s designers opted for a variable-length pipeline that extends to as many as ten stages when more complex DSP operations are running. The team added dynamic prediction to the existing static prediction to try to reduce the penalty of branches. Gresset added that the longer pipeline allows the use of slower local memory at a typical clock rate in the region of 150MHz. To help minimise branch penalties further, the X1 supports predicated execution, so that simple conditional operations can be executed inline without a branch.
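The idea behind predicated execution can be sketched in C by comparing a branching clamp with a version that computes a predicate and then selects the result without any jump. This is only an analogy for what the hardware does with predicated instructions; the function names are hypothetical and nothing here reflects the X1's actual encoding.

```c
#include <stdint.h>

/* Branching form: the processor must predict or resolve the branch
   before it knows which instructions to fetch next. */
static int32_t clamp_branch(int32_t x, int32_t limit)
{
    if (x > limit)
        x = limit;
    return x;
}

/* Predicated form (illustrative): the condition becomes a data value
   that selects the result, so control flow never changes. */
static int32_t clamp_predicated(int32_t x, int32_t limit)
{
    int32_t p = -(int32_t)(x > limit);  /* all ones if true, zero if false */
    return (limit & p) | (x & ~p);      /* pick limit when p is set, else x */
}
```

Both functions return the same value for every input. The second has no branch to mispredict, which matters more as the pipeline gets longer.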
On general-purpose code, Gresset said the X1 scores approximately the same as the ARM Cortex-M4 on the EEMBC CoreMark-per-MHz metric. “And we can run a complete IoT modem in software, they can’t. If you run Cat-M and GPS, you still have room for sensor processing at 160MHz,” Gresset said, assuming that the tasks can be run at different times during the sensor node’s duty cycle.
For software support, Ceva has ported FreeRTOS, which is used in the Pebble smartwatch, to the core, along with the IoT-oriented Zephyr operating system.